Where can I learn cryptography/cryptanalysis the hard way, without going to school ? Any good book?
I'm not so bad at mathematics:
I know what are p-list and p-combinations, I know matrix algebra, I know what a XOR is, I know how to tell if number is a prime, etc: I'm not the programmer who hates math because he is bad at it, but I don't have a PhD eitherway.
I'm not bad a computer sciences either, well at least in term of general computer science culture:
I know C, C++ (both learned at school), python, some haskell, what text encoding are out there, how UNICODE works, I know how a file can be compressed or encrypted, what common algorithms are out there (diffie-hellman, the LZMA algorithm, DES, AES, Serpent, Blowfish, SHA, MD5...). I got a lot interested in cryptography on wikipedia or other websites, but I don't think wikipedia can teach me cryptography without detailing algorithms or without practice; for example I know what is synchronous cryptography and what is asynchronous (public/private key).
I'd like to learn how to properly and securely implement the most popular algorithms, and how to make them reliable: a book or good tutorials or courses. I've quickly searched on Khan Academy, but this subject is not trivial and requires both knowledge in math, computer sciences and/or electronics.
I don't want to read pages and pages of just theory about the basic things I might already know or might be not really relevant to today's cryptography, like a paper written by a researcher, just something practical, with problems, and cryptanalysis problems, for students.
I have currently much free time, I'm only 26, and I'm sure I can learn this stuff, not only for the pay increase it can bring me but also because I've always been fascinated by cryptography without actually understanding it, I just can't find any good material.
Are you talking about learning the *theory*, how algorithms are defined, etc, OR how to implement practical solutions? These are almost mutually exclusive, for most situations.
more by reading, but also by doing, I mean cryptography is meaningless without cryptanalysis.
(LZMA is a compression algorithm, not cryptographic.)
For the purpose of implementing cryptographic algorithms, the generic method is getting the relevant descriptive standard, grabbing your keyboard, and trying. Most standards include "test vectors", i.e. sample values which let you know whether your implementation returns the correct answers. At that point, things differ, depending on what kind of algorithm you are considering.
Symmetric algorithms cover symmetric encryption, hash functions, and message authentication codes (MAC). You do not need to know much mathematics to handle these; most of it is about additions of 32-bit and 64-bit integers (that's modular arithmetic, with 232 or 264 as modulus) and bitwise operations (XOR, AND...).
Such code is usually done in C. Good performance is achieved by having some notions on how the C compiler will understand and translate the code into instructions for the CPU; knowledge of assembly is not strictly mandatory, but quite useful. An important parameter is cache memory: loop unrolling is usually a good tool, but if you overdo it, performance drops sharply.
I suggest beginning by implementing the classical hash functions (the SHA family, described in FIPS 180-3) and trying to make them fast. As a comparison point, get OpenSSL and use the command-line tool
openssl speedto see what kind of performance can be obtained (this tool is already included in any decent Linux distribution, and it works on Windows and MacOS too). For instance, on my PC:
$ openssl speed sha256 Doing sha256 for 3s on 16 size blocks: 4842590 sha256's in 3.00s Doing sha256 for 3s on 64 size blocks: 2820288 sha256's in 2.99s Doing sha256 for 3s on 256 size blocks: 1262067 sha256's in 2.99s Doing sha256 for 3s on 1024 size blocks: 395563 sha256's in 3.00s Doing sha256 for 3s on 8192 size blocks: 53564 sha256's in 3.00s OpenSSL 0.9.8o 01 Jun 2010 built on: Wed Feb 23 00:47:27 UTC 2011 options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr2) compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wa,--noexecstack -g -Wall -DMD32_REG_T=int -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM available timing options: TIMES TIMEB HZ=100 [sysconf value] timing function used: times The 'numbers' are in 1000s of bytes per second processed. type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes sha256 25827.15k 60367.37k 108056.57k 135018.84k 146265.43k
which means that OpenSSL includes a SHA-256 implementation hand-optimized in assembly, which achieves 146 MB/s when processing 8 kB messages. On the same machine, a pure C implementation ought to get to at least 130 MB/s.
For an example of how hash functions are implemented in C and Java, and how hashing speed can be measured in a meaningful way, see sphlib.
Afterwards, you can try symmetric encryption, in particular the AES (FIPS 197). It helps a bit to know what a finite field of characteristic 2 is, but the standard is clear enough to guide you through a perfunctory implementation. Then, try to optimize things. OpenSSL can serve as a comparison point, and get inspiration from the AES implementations of Brian Gladman. As for security, there has been some concern about what key-dependent information could be leaked through use of look-up tables in the implementation (try to search for "AES cache timing attack"); trying to reproduce that kind of attack is a very good exercise (mind you, it is not easy, but if you succeed in demonstrating it in lab conditions then you will have learned a good deal on how cryptographic implementations work).
Asymmetric cryptography is about the algorithms which involve more than one party. This includes asymmetric encryption (RSA, ElGamal), key exchange (Diffie-Hellman) and digital signatures (RSA again, DSA...). The maths contents are much bigger there, and optimization is a much broader subject than for symmetric cryptography, because there are several ways to implement each algorithm, instead of a single "obvious" implementation path.
A good reference is the Guide to Elliptic Curve Cryptography. Although it is mainly about elliptic curves, it includes a general treatment of the implementation of operations in finite fields, and it so happens that this is the sample chapter which can be downloaded for free at the URL linked to above. So get it and read it now. Another indispensable reference is the Handbook of Applied Cryptography, which can be freely downloaded; chapter 14, in particular, is about efficient implementation.
RSA is simple enough, and is adequately described in PKCS#1. There are possible timing attacks on RSA, which are countered by masking (yes, this is a paper "written by a researcher", but in the subject of cryptography, researchers are the people who understand what is going on). If you get the hang of modular arithmetic, you can try to implement DSA (FIPS 186-3). Diffie-Hellman is mathematically simple (it needs nothing more than is needed to implement DSA) but its describing standard (ANSI X9.42) is not downloadable for free.
Elliptic curves are a popular future replacement for modular arithmetic; EC variants of DSA and Diffie-Hellman are faster and believed more secure with shorter public keys. But that's more mathematics. There again, the Guide to Elliptic Curve Cryptography is the must-have reference.
There are other kinds of asymmetric cryptography algorithms, e.g. the McEliece cryptosystem (asymmetric encryption; there is a variant for signatures described by Niederreiter) and algorithms based on lattice reduction. But they do not (yet) benefit from published standards which take care of the implementation details, and there are not so many existing implementations to compare with. You'd better begin with RSA and DSA.
Cryptanalysis uses a much higher dose of mathematics than implementation.
For symmetric cryptography, the two main tools are differential and linear cryptanalysis; see this tutorial.
My own path to cryptography began by implementing DES, and then implementing Matsui's linear cryptanalysis on a reduced version of DES (8 rounds instead of 16). DES is described in FIPS 46-3, which is officially withdrawn, but still available. From DES can be defined Triple-DES (three DES instances, with three distinct keys, the middle one being use in "decryption" direction) and there are published test vectors for Triple-DES (also known as "TDES", "3DES", or sometimes "DES", which is arguably confusing).
For asymmetric algorithms, cryptanalysis mostly involves working on the mathematical structure of the keys, e.g. by trying to factor big non-prime integers in order to break RSA variants. Mathematics here range from the non-trivial to the totally unimaginable, so this might be too steep a learning curve to begin cryptography by trying to break RSA...
Wow. Great response. You offered a nice survey of cryptography in easy to understand language with valuable references. Thanks.
Two things, really:
- Get a good book. Bruce Schneier's "Applied Cryptography" is adequate.
- Learn the 'openssl' tools, and learn to use them.
The most important thing about crypto to learn is humbleness. You don't want to ever create a new solution a problem -- you want to copy as best you can the solutions that have been well-tested by others. Most crypto failures are due to people having a bright idea, thinking they could make some optimization to improve upon an existing solution. It is only the extremely humble that eventually succeed in finding new ways of doing things.
The next lesson is that you have unlearn your prejudices that you've gotten from TV and movies, where a hacker sits down at a computer and breaks crypto. These are either unrelated to crypto, or a dramatization of what really goes on. For example, the movie "Sneaker" is a dramatization of what would happen if somebody developed a chip that could factor large integers.
The hardest thing in learning crypto is to distinguish between the technical concepts that are necessary to understand the field in general, and those which you'll need only when you specialize in a narrow area. Take, for example, the highly rated post above. You need to understand the difference between a "symmetric" algorithm vs. "asymmetric" algorithm vs. a "hash", but when the author of that post says "It helps a bit to know what a finite field of characteristic 2 is", I would disagree: it's only meaningful to PhDs researching crypto, not to the rest of us that simply want to figure out how to use it correctly.
A good way to wade through the technical details is to pick a target, and work backwards. For example, today, Apple updated the iPhone/iPad operating system to version 4.3.5 to fix a bug in the validation of X.509 certificate chains. Understanding the problem and why they had to fix it is exactly the sort of thing you are discussing in your original post. Figure out what an X.509 certificate is, what the chains are, and why they need to be validated, and why if you don't, a hacker using a tool like 'sslsniff' can defeat the encryption. Once you fully understand all that, you'll have achieved much of your goal that you describe in your original post.
Another example is a blog post on Verifying the Comodo Hacker's key
Again, find out what the Comodo hacker did (created signed certificates for Google and Yahoo), how certificate revocation works, and how to use the tools to validate that certificate. I suggest it as a good post because it's a good starting point for using the 'openssl' tools, which are standard in our industry.
Register for Stanford's online Cryptography class that begins next January. It's free, online, includes both theory (video lectures and quizzes) and practice (programming assignments), let's you work at your own pace and you will get a statement of accomplishment if you succeed. Given the various echoes I got on Stanford's previous online courses session, I'm definitely registering for this class (as well as the Computer Security one).
- On the theory side:
Cryptography Engineering Design Principles and Practical Applications by Niels Ferguson and Bruce Schneier. Book is intended to be an update of the venerable Applied Cryptography, authors are well known in the field and reviews are good.
- On the practice side:
You may give a look at various CTF (Capture The Flag) hacking/security competitions. They generally include Cryptography challenges. They're fun and take you out of your comfort zone to solve the problems within a limited time. Here's a good CTFs Calendar. Also, look at some write-ups of previous CTFs, I found many to be very educational and well explained.
seems very interesting and quite exciting, I subscribed, but I don't know, will it be free ?
For crypto algorithms:
Stinson's Cryptography: Theory and Practice
goes through the math of many crypto algorithms in a way that would make them fairly easy to implement, if that's what you wanted to do.
Scheiner's Applied Cryptography is also a leading book on the topic. Probably overlaps quite a bit but with some different algorithms.
As far as a book explicitly telling you how to implement them - I've got nothing. Commercially, these aren't always implemented in software, and it's a fairly niche industry. From a playing around perspective, I'd say get a book explaining the algorithm, implement it, and compare your outcome with a commonly used library for the same algorithm.
Similarly, I've got nothing on cryptanalysis, although I suspect if you pick an algorithm and google for things like "weakness in" and "weak keys" you'll find some interesting papers and other information. Last time I had to write a paper on anything like this (10 years+ ago) that's essentially what I did...
I recommend Cryptography Engineering: Design Principles and Practical Applications. It is the perfect book for you. It describes how to design and implement cryptosystems, from the perspective of a systems designer and implementor. It is a very pragmatic book, with a perspective derived from the authors' years of experience. The word "engineering" in the title is truly apt. I think you will find it an invaluable resource helping you prioritize what you should spend most of your energy worrying about, what can go wrong, and how to make sure those bad things don't happen to you.
I would also add you have a look at Matasano crypto challenges as you already have some programming skills.
From their website :
HOW MUCH CRYPTO DO I NEED TO KNOW? None. That's the point.
SO WHAT DO I NEED TO KNOW? You'll want to be able to code proficiently in any language. We've received submissions in C, C++, Python, Ruby, Perl, Visual Basic, X86 Assembly, Haskell, and Lisp.
All their challenges are based on real-world crypto vulnerabilities, so you should learn nice stuff resolving these.
Learning cryptography from scratch can sometimes feel like a hard thing but if you have the right resources you will absolutely love it. The following are my favourite sites to get you started learning cryptography today.