To conclude this module I want to say a few words about how RSA is used in practice. So first of all, if you want to speed up RSA encryption, it's perfectly fine to use a small encryption exponent e. So rather than using a random e one can use a small value of e and in fact the minimum value that's possible is e=3. So it's not difficult to see that the smallest possible value for e is in fact e=3. And let's see why. Well e=1 is a bad idea because that's not particularly hard to invert with e=1. e=2 is not a valid RSA exponent because remember in the definition of RSA, e had to be relatively prime to phi(N). phi(N), if you remember, is (p-1) times (q-1), which is an even number. If p and q are odd primes, (p-1) times (q-1) is an even number, but if e is even, if e is equal to two, it's not going to be relatively prime to phi(N). So e=2 is not valid either. And then e=3 is the first minimum value that can be used, and then we just have to make sure that in fact, p is equal to two mod three, and q is equal to 2 mod 3 so that (p-1) times (q-1) is not divisible by 3. So in fact this is a fine public exponent to use, however the recommended value is 2 to the 16 plus 1. So 65537. It's a good idea to use this recommended value of e. To compute x^3 mod N, you would basically need three multiplications. To compute x^65537 mod N using repeated squaring, you would need 17 multiplications. Basically what you would do is you would repeatedly square 16 times and then you would multiply by x one more time. Ok? So just with 17 multiplications you can do this exponentiation, so this is still much, much faster than using a random e, which would require something like 2000 multiplications. So this leads us into what's called the asymmetry of RSA, where in fact encryption is quite fast: encryption only requires 17 multiplications. However decryption is much, much, much slower; it requires something on the order of 2,000 multiplications. I should point out that there is a standard trick for slightly speeding up RSA decryption. This is called RSA-CRT; this stands for RSA with Chinese remaindering. It's a method for speeding up RSA decryption by about a factor of four, but nevertheless it's still gonna be much, much, much slower than encryption. The ratio of encryption to decryption on RSA is roughly a factor of ten to thirty, so encryption could be up to like thirty times faster than decryption, depending on the size of your modulus. Interestingly, this is a special property of RSA where encryption is so much faster than decryption. And other public key systems--for example, in the next module we're gonna look at ElGamal encryption-- don't have this property where both encryption and decryption take roughly the same amount of time. We've already discussed key lengths for RSA before. So I just wanted to flash these numbers to remind you that if you're using a 128-bit AES key, really, you should be using a 3,000-bit modulus. Although everybody just uses a 2,048 bit modulus. And if you're really using a long AES key, like a 256 bit AES key, the RSA modulus gets to be quite big. Now, I wanted to mention a number of implementation attacks on RSA. These are attacks that have been demonstrated against particular, mathematically correct implementations of RSA. However, these implementations were vulnerable to certain side channel attacks that make the implementation completely insecure. So the first example of this is due to Paul Kocher back in '97. He showed a timing attack where all you do is you measure the time for an RSA decryption. And simply by measuring the time, you can actually expose the secret exponent d. And so, this says that if you are going to implement an RSA decryption, you had better make sure that the decryption time is independent of the arguments. Another attack also by Paul Kocher two years later showed that if you have a smart card that's implementing RSA decryption you can actually measure the power consumption of the card while it's doing RSA decryption and then simply by looking at the peaks and troughs you can literally read off the bits of d one bit at a time as the smart card is running through the repeated squaring algorithm. So using a power analysis attack it's actually fairly easy to get the secret bits of the secret key unless the smart card defends against power analysis attacks. And finally another attack called a fault attack shows that the RSA is very vulnerable to decryption errors and in particular, if for some reason an error occurs during an RSA decryption, one error is actually completely enough to reveal the secret key. So this attack is actually fairly significant. It's just, one error completely reveals your secret key. And as a result, many cryptolibraries will actually check the result of the RSA decryption before returning it to the caller. And the way you would check it is, you would take the output of this exponentiation, and simply raise it to the power of e, and check that you actually get back c modulo N. And if so, you know that your decryption was done correctly. Now the reason you can do this is because again, e is much smaller than d, therefore checking takes much less time than actually raising something to the power of d. Nevertheless, you know, even though checking is ten times faster than the actual decryption, this still introduces a 10% slowdown. And so sometimes this is actually turned off. But nevertheless, it's actually a good idea to check that your RSA output is correctly computed. And so all these attacks kind of again make the point that if you just implement RSA naively it would be mathematically correct, it would work, however there would be all these potential attacks on the implementation and as a result you should never implement RSA yourself. Always, always use standard libraries and just use the implementation available there. So to be concrete, I wanted to show you an example of one of these attacks. And in particular I'll show you the fault attacks on RSA. And again, this will be a fault attack on what's called RSA with Chinese remaindering. So in fact, as I said at the beginning of the segment, RSA decryption is often implemented as follows: What you do is, you decrypt the cipher text c modulo p, then you decrypt the cipher text c modulo q. And then you combine the two to actually get the decryption modulo N. And this combination is done using what's called a Chinese remainder theorem. Which I'm not going to explain here but it's not too difficult to see how that works. Basically, once you have the result of the decryption mod p and the decryption mod q you can combine it to get the decryption mod N. And it turns out this gives about a factor of four speed up over the naive implementation of directly doing the exponential modulo N. Okay. So, let's see why this is vulnerable to faults. Suppose it so happens that when your decryption library is computing the decryption modulo q, for some reason the processor makes an error. And, actually, what it gets is not the correct xq. It gets an incorrect xq, so let's call this xq hat. However when it computed the decryption modulo p, you know, no error occurred. So these errors are fairly infrequent. And so let's just assume that an error occurred modulo one prime but it did not occur modulo the other prime. Well, if that's the case our computation is correct modulo p and incorrect modulo q. That says when we combine the two results we're actually going to get an output, I'll call it x prime, such that the output is correct modulo p. So, x prime is really equal c to the d mod p but is incorrect modulo q. If we raised both these equations to the power of e, what we obtain is the following two relations. Well, let's see. This guy we raise it to power of e. What happens is the left hand side becomes x prime to the e. The right hand side, here, it's c to the d. If I raise c to the d to the power of e--e and d, remember are inverses of one another-- And as a result, if I raise c to the d to the power of e, both exponents cancel out and I simply get c back. So I know that x prime to the e is equal to c. But modulo q, there was a mistake. So x prime to the e is not equal to c modulo q. So therefore, if I look at this difference, x prime to the e minus c. We know that it's zero modulo p, and it's not zero modulo q. So now if we compute the GCD of this value with N, what do we get? Well, as I said, this is zero mod p, but it's not equal to zero mod q. Which means that this quantity here is divisible by p, but is not divisible by q. So therefore when I compute the GCD, what I'll get is simply p. And again, this is because p divides this quantity here when q does not divide this quantity. So now basically what I have just obtained is the factorization of N. Once I have the factorization of N I can compute phi(N) and then given phi(N) I can actually compute myself the decryption exponent from the public key. So now I have just recovered the secret key from the public key from a single mistake that happened during decryption. So this is why typically when you do RSA decryption it's a good idea to check the results especially when you use Chinese remaindering to speed up RSA decryption. The last attack I want to talk about is a very recent observation that was observed by Heninger et al and Lenstra et al that shows that RSA key generation can be problematic when it's done with bad entropy. So here's how things go wrong. So the way open SSL generates RSA keys is as follows. Well, it starts by basically seeding the pseudo random number generator. And then it uses random bits from a pseudo random number generator to generate the first prime p. Then it will go ahead and seed the random number generator some more, and will generate bits from the pseudo random number generator to generate q. And finally, it will output the product of p and q. Okay so the two steps, where we see the seed the pseudo random number generator. Now suppose that this is implemented on a router or a firewall of some sort, and suppose that the key generation happens right after the firewall starts up. So the firewall boots up. At the time of the boot, there's not a lot of entropy in the pseudo random number generator, and as a result the firewall is likely to generate a prime p that comes from a very low entropy pool. Which means that this p is gonna be one of a small number of possibilities. However, after we've generated p, generating the prime actually takes a little bit of time, a few microseconds. And so, by then the firewall will have generated more entropy and so after we add more entropy to the entropy pool, the prime q say is generated from a much larger entropy pool and is therefore unique to this firewall. Now the problem is that many different firewalls if they generate an RSA key in this way many of them will actually end up using the same prime p but a different prime q. So what this says is that if we look at two RSA moduli from two different firewalls, N1 and N2. If we compute the GCD of N1 and N2, well both of them had different q's but the same p. So if we compute the GCD, actually we will end up with a factorization of N, of both N1 and N2 and then we can actually figure out the private key both for N1 and for N2. So this has actually been observed in practice. Both of these groups, what they did is they scanned the web and recovered all of the public keys out there that are used on various web servers. They then ran a massive GCD, using some arithmetic tricks they were able to compute this massive GCD of all pairs of public keys N1 and N2. And lo and behold, they actually realized that a fair number of these keys have a common factor. So they were actually able to factor these moduli. So in the experiment, they were actually able to factor about .4% of all public SSL keys. This is an amazing fact, that, in fact, so many web server public keys out there are vulnerable just because they happened to generate the RSA key using low entropy. So they have a common factor with somebody else's factor and GCDing the two together gives you the factorization. So, the lesson from all this is when you generate keys, no matter whether they are RSA keys or ElGamal keys or symmetric keys, it's crucial that the number--, that your generator is properly seeded. So don't generate keys immediately after start up. You have to kind of make sure the seeding of the generator has had enough time to actually generate enough entropy. And only then you can start generating keys. So this is a really nice example of how a, a bad random number generator can mess up your RSA public keys. Okay so this is the end of our discussion of public encryption from RSA. I wanted to mention a few further readings if you want to read more about this. So there's a nice paper by Victor Shoup that talks about why chosen cipher text security is so important in the public key settings. So if the Bleichenbacher attack wasn't convincing enough, there are many other attacks like this that are possible if you don't use a chosen cipher-text secure system. So if you want to learn more about chosen cipher-text security, please look at Victor Shoup's paper. There's a survey that I guess I wrote a couple years ago, that looks at many different attacks on the RSA system. I guess I wrote this when RSA was twenty, I actually need to update this to 30 years of attack on the RSA cryptosystem. There've been some developments in the last decade, but for now this is actually a decent survey to look at and read about more attacks on RSA. The OAEP results that I mentioned are referenced here, OAEP reconsidered. And finally, if you're interested in key length analysis of RSA and other public key systems, there's a nice paper by Arjen Lenstra that discusses how you should choose key lengths for your public key systems, and even for your symmetric key systems. Okay, so those are the four references. I hope you have time to look through them. And I will stop here. And, in the next module we're going to look at another family of public key encryption systems, this time based on discrete log.