[SOUND]. In this lecture, we'll talk about formal definitions of security for Digital signatures. First of all, we'll define a digital signature scheme. A signature scheme, as we've seen already, is defined by three probabilistic polynomial time algorithms. The key generation algorithm, the signing algorithm, and the verification algorithm. The key generation algorithm takes as input the security parameter as usual, and outputs a pair of keys, a public and private key. The signing algorithm takes as input the private key and a message m, and outputs a signature sigma. And they'll often denote it this way with the private key sub-scripted, as we've see for other algorithms as well. The verification algorithm takes as input the public key pk, a message m, and a signature sigma and then outputs either 1 or 0. Where as we've said already a one indicates acceptance or validity, and a zero indicates rejection or invalidity. And of course the correctness condition says that for all messages m and all public private key pairs output by the key generation algorithm. If we sign a message m using the private key, and then verify m along with that signature using the public key, then verification should, should succeed, i.e., verification should output 1. Now the Security model for digital signatures is exactly analogous to the Security model we've seen already for message authentication codes. In fact the definition of Security here for digital signatures came first, and message authentication codes were patterned after signatures. So recall that the Threat model is that of an adaptive chosen message attack. That is, we assume that the attacker can induce the sender to sign messages of the attacker's choice, and the Security goal is Existential Unforgeability. That is, the attacker should be unable to forge a valid signature on any message that was not explicitly signed by the sender. And the main difference between our setting and the setting of message authentication codes, is that because we're in the public key setting, we're again always going to assume that the attacker is given the public key. And it can use that public key when it tries to generate messages for the sender to authenticate. It can also use that public key when it's trying to construct its forgery, and this makes the problem, of course, harder. Formally, if we fix a signature scheme pi, and then some attacker A, we can define the randomized experiment Forge based on A and pi. On security parameter n, what this experiment does is first run the key generation algorithm to obtain a public and private key pair. And the attacker A is then given the public key, and also given the ability to interact with a signing oracle. The singing oracle is parametrized by the private key generated in the first step, and it allows the attacker to submit messages m, and get back their corresponding signatures. We can let capital M denote the set of all messages that the attacker sends to this oracle i.e., all messages whose signature the attacker has obtained. After this interaction the attacker outputs a pair containing a message and a signature sigma. And we'll say that the attacker succeeds and the experiment evaluates to 1 if first of all, the message signature pair that the attacker output is valid. That is, if we verify the message signature pair using the public key generated in the first step, then that will result in output of 1. And furthermore, the message M should be a message whose signature the attacker has now previously obtained. So the message M should not be in the set capital M of messages whose that the attacker sent to the signing oracle. And then of course we'll, we'll define that the scheme pi is secure if for all probabilistic polynomial time attackers A. There's a negligible function epsilon such that the probability that the forge experiment we've just seen outputs one, is negligible. Now again, as in the case of message authentication codes, the definition we've just given does not take into account Replay attacks at all. And of course, does nothing preventing an attacker from taking a prior message signature pair, and replaying that to a recipient. And the receiver will then verify that, just as they did the first time around. Now in many settings, this can be problematic. For example, imagine the patch distribution case that we talked about in the previous lecture. And imagine that the software company issues a new patch. And what the attacker does is to replace that new patch with a patch from last year. Well that patch was signed a year ago. That signature is still going to appear valid, and so the client may in fact install the year old patch. And that may have the effect actually of rolling back the client to a previous version of a software. That doesn't violate the definition of security of a signature scheme. And if we did want to prevent that sort of an attack, we would have to handle it by some other mechanism. In this particular setting it would not be hard to do that. We could, for example, include a date inside what we sign, and then the client would check that the date on the patch is later than the date on the last patch that it applied. The point is, however, that we can't handle this using signatures alone. We have to instead handle it using some higher level mechanism. The last thing I want to talk about in this lecture is the Hash-and-sign paradigm. And in fact, we've already seen this paradigm in the context of message au, message authentication codes as well. However, in the case of message authentication codes it's the case in practice that not every message authentication code requires that paradigm, or in fact, not every message authentication code uses that paradigm. We've seen actually in this, in this course that CBC-MAC doesn't need to use any hashing before generating a message authentication code whereas HMAC does that as part of its processing anyway. In the context of signatures, Hash-and-sign is really ubiquitous. And we'll explore why in just a minute, it will be sort of obvious, actually. The basic idea is that we're given a signature scheme pi that can handle, say, short messages of length n. The exact length is not important, but it has to be long enough that it can serve as the output length of a Hash function H. So in addition to the scheme we have a Hash function that say can map arbitrary length inputs to strings of length n. We can then construct a signature scheme, pi prime, that can handle, that can sign arbitrary-length messages. The key generation algorithm in this modified scheme is the same. The signing algorithm now, rather than signing the rather then, then trying to sign the long message m. What it does simply is to first hash the message m to this short end bit string, and then apply the underlying signing algorithm for the original scheme pi. Verification of a signature sigma on a message m, works by simply running the verification algorithm of the underlying signature scheme on the hash of the message. And the theorem we can prove here is that if the original scheme pi is secure for, for signing short messages, and if the hash function H is collision-resistant. Then the modified scheme pi prime, that can handle arbitrary length messages, is itself a secure signature scheme. And we can go through the proof at a very high level. So imagine that the sender authenticates some messages m1, m2, m3 etc. And let's denote h i to be the hash of the ith message. When the attacker outputs it's forgery m comma sigma, we know that it, for the attacker to be successful it must be the case that this message m is different from mi for all i. Well there are two cases now and we can just do a case analysis. If H of m is equal to h i for some i, then we have a collision in the hash function, right? Because m is not equal to mi, but yet H of m is equal to the hash of mi, and so that means that the attacker has managed to find a collision in H. But that's something that shouldn't happen except with negligible probability, if H is a collision-resistant hash function. The other possibility is that H of m is not equal to h i for all i. But in that case what's happened is that the attacker has effectively obtained signatures on the quote messages h1, h2, h3, etc. And then forged a valid signature on the message H of m, which is distinct from everything that was signed prior to that. And what means is that the attacker has effectively been able to forge a signature on a new message in the underlying signature scheme. And if the underlying signature scheme for short messages was secure by assumption, then this happens only with negligible probability. So putting it together, either way the attacker tries to defeat our modified scheme pi prime, it will not be able to do so except with negligible probability. Now, one thing to note about the Hash-and-sign paradigm is that it's sort of Analogous to hybrid encryption in the sense that what it gives us is the functionality of digital signatures at the asymptotic cost of a symmetric-key operation. And I say that because if the message that you're signing is very, very long, then the time to sign will start becoming dominated by the time to compute the hash over that long message. And hashing is a symmetric-key operation, so it's going to be very fast. And in that sense it's analogous to hybrid encryption that gave us a functionality of public key encryption at the asymptotic cost of private key encryption. And as I said a few minutes ago, the hash-and-sign paradigm is used extensively in practice. And really I, I, I just about any signature scheme that would be used to find actual content would use the hash-and-sign paradigm to do that. In the next lecture we'll begin exploring some constructions of signatures. And we'll start here by looking at signatures based on the RSA problem.