In the last segment in this module, I wanna show you a general attack that
affects many implementations of Mac algorithms. And there's a nice lesson to
be learned from an attack like this. So let's look at a particular implementation
of HMAC verification. This happens to be an implementation from the Keyczar library,
that happens to be written in Python. So here's the code that's used to verify a
tag generated by HMAC. This code is actually simplified. I just wanted to
kinda simplify it as much as I can to get the point across. So basically, what the
inputs are, the key, the message, and the tag bytes. The way we verify it is, we
re-compute the HMAC on the message and then we compare say the resulting sixteen
bytes. To the actual given signature bites. So this looks perfectly fine.
In fact, anyone might implement it like this. And, in fact, many people have implemented
it like this. The problem is, that if you look at how the comparison is done, the
comparison, as you might expect, is done byte by byte. There's a loop inside of the
Python interpreter that loops over all sixteen bytes. And it so happens that the
first time it finds an inequality, the loop terminates and says the strings are
not equal. And the fact that the comparator exits when the first inequality
is found introduces a significant timing attack on this library. So let me show you
how one might attack it. So imagine you're the attacker, and you have the message m,
for which you want to obtain a valid tag. Now, your goal is to attack a server that
has an HMAC secret key stored in it. And the server exposes an interface that
basically takes message MAC pairs. Checks that the MAC is valid, if the MAC is valid
it does something with the message. And if the MAC is not valid, it says reject.
Okay, so it's back to the originator or the message rejects. So now this attacker
has an opportunity to basically submit lots of message it appears and see if it
can deduce the tags of the particular message for which it once attacked. Here's
how we might use the broken implementation from the previous slide to do just that.
So what the attacker is gonna do is to submit many message tag queries, where the
message is always the same. But with a tag, he's gonna experiment with lots and
lots and lots of different tags. So in the first query, what he's gonna do is just
submit a random tag along with the target message. And he's gonna measure how long
the server took to respond. The next query that he's gonna submit, is he's gonna try
all possible first bytes for the tags. Let me explain what I mean by that. So the
remaining bytes of the tags that he submits are just arbitrary, doesn't really
matter what they are. But for the first bite, what he'll do is he'll submit a tag
starting with a byte zero. And then he's gonna see whether the server took a little
bit longer to verify the tag than before. If the server took exactly the same amount
of time to verify the tag as in step one, then he's gonna try again, this time with
bytes at one. If still the server responded very quickly, he's going to try
with byte sets of two. If the server responded quickly then he's going to try
with byte sets of three, and so on until finally, let's say, when the byte sets of
three the server takes a little bit longer to respond. What that means is actually
when it did the comparison between the correct MAC and the MAC submitted by the
attacker. The two matched on this byte, and the rejection happened on the second
bytes. Aha. So now the attacker knows that the first bite of the tag is set to three
and now he can mount exactly the same attack on the second bite. So here. It's
going to submit the tag. And the second, back here. Here This should go here. So
it's gonna submit a tag when the second byte is set to zero. And it's gonna
measure whether this took a little bit longer than in step two. If not, he's
gonna change this to be set to one, and he's gonna measure if the server's
response time is a little longer than before. Eventually, let's say when he sets
this to, I don't know. When the byte is set to, to 53, say, all of a sudden, the
server takes a little bit longer to respond. Which means that now, the
comparator matched on the first two bytes. And now the attacker learned that the
first two bytes of the Mac are three and 53. And now he can move on and do the
exact same thing on the third byte and then on the fourth byte and so on and so
forth. Until finally, the server says, yes, I accept. You actually gave me the
right Mac. And then we'll go ahead and act on this bogus message. That, attack our
supply. So this is a beautiful example of how a timing attack can reveal the value
of a MAC, the correct value of the MAC. Kind of byte by byte, until eventually,
the attacker obtains all the correct bytes of the tag, and then he's able to fool the
server into accepting this message tag pair. The reason I like this example is
this is a perfectly reasonable way of implementing a MAC verification routine.
And yet, if you right it this way, it will be completely broken. So what do we do? So
let me show you two defenses, the first defense, I'll write it in again in python
is, is as follows. In fact the Keyczar library exactly implemented this defense.
This code is actually taken out of the updated version of the library. The first
thing we do is we test if the signature bytes submitted by the attacker are of the
correct length, say for HMAC this would be say, you know 96 bits or 128 bits, and
if not we reject that as an invalid MAC. But now, if the signature bytes really do
have the correct length, what we do is implement our own comparator. And it
always takes the same amount of time to compare the two strings. So in particular,
this uses the zip function in Python, which will, essentially, if you are giving
it two sixteen byte strings. It will create sixteen pairs. Of bytes. So it'll
just create a, a list of sixteen elements, where each element is a pair of bytes. One
taken from the left and one taken from the right. And then you loop, you know, you
loop through this list of pairs. You compute the XOR of the first pair, and the
OR into the result. Then you compute the XOR of the second pair, and
you OR that into the result. And you note that, if at any point in this
loop, two bytes happen to be not equal, then the XOR will evaluate to something
that's non zero. And therefore, when we OR'ed it into the result. The result
will also be counting on zero, and then we'll return false, at the end of the
comparison. So the point here is that now the comparator always takes the same
amount of time. Even if it finds difference in byte number three, it will
continue running down the both strings until the very end. And only then will it
return the results. So now the timing attack supposedly is impossible. However,
this can be quite problematic, because compilers tried to be too helpful here. So
an optimized compiler might look at this code and say, hey, wait a minute. I can
actually improve this code by making the four loop end. As soon as an incompatible
set of bytes is discovered. And so, an optimizing compiler could be your, kind
of, Achilles heel when it comes to making programs always take the same amount of
time. And so a different defense, which is not as widely implemented, is to try and
hide from the adversary, what strings are actually being compared. So let me show
you what I mean by that. So again, here we have our verification algorithm. So it
takes as inputs, a key, a message, and a candidate's MAC from the adversary. And
then, the way we do the comparison is we first of all, compute the correct MAC on
the message. But then instead of directly comparing the MAC and the signature
bytes adversary, what we're gonna do is we're gonna hash one more time. So we
compute a hash here of the MAC. We compute a hash of the signature bytes. Of course,
if these two happen to be the same, then the resulting HMACs will also be the
same, so the comparison will actually succeed. But the point is now, if sig
bytes happen to equal MAC on the first byte, but not on the remaining bytes.
Then, when we do this additional hash layer, it's likely that the two resulting
values are completely different. And as a result, the byte by byte comparator will
just output on the first iteration. The point here is that the adversary doesn't
actually know what strings are being compared. And as a result, he can't
mount a timing attack that we discussed earlier. Okay, so this is
another defense. At least now, you're not at the mercy of an optimizing compiler.
The main lesson from all of this is that you realize that people who even are
experts at implementing cryptolibraries, get this stuff wrong. And the right code
that words perfectly fine and yet is completely vulnerable to a timing attack
that completely undo all security of the system. So the lesson here is of course
you should not be inventing your own crypto but you shouldn't even be
implementing your own crypto because most likely it'll be vulnerable to the side
channel attacks. Just use a standard library like OpenSSL. Keyczar is actually a
fine library to use that would reduce the chances that you're vulnerable to these
types of attacks.