In this segment we'll continue our study of probability theory and talk about the probabilities of multiple events: an event happening now plus an event happening a little later, or an event that has already happened whose outcome we use to estimate the likelihood of something else happening down the road. The first thing to ask when we talk about multiple events is whether those events are in fact independent. Consider two coins, a and b. What is the probability that a lands heads and b also lands heads? Mathematically, we write this as the probability of a equals heads and b equals heads. If a and b are independent, the probability of a equals H and b equals H is simply the product of those two things: the probability of a equals heads times the probability of b equals heads. a and b are just two separate coins; they don't talk, they don't communicate. If they're independent, the joint probability, which we write as P(a = H, b = H), is simply the product of the two. This is the definition of independence. More abstractly, we have P(a and b) = P(a) times P(b).

Let's introduce one more piece of notation, called conditional notation. What is the probability that a equals heads given that b is already heads? We flip b first, we observe it, we know it's heads, and now we ask the question: what is the likelihood that a is heads? If the two coins are independent, the probability of a equals H given that b equals H is just equal to the probability of a equals H. The fact that I tell you what happened to b tells you nothing about a. This is what independence means for probabilities and random variables. The next piece of knowledge you should see at least once is called Bayes' rule, and it is the formula that relates all of these quantities. This is Bayes' rule.
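The coin argument above can be checked by exhaustively enumerating the four equally likely outcomes; here is a minimal sketch in Python (the variable names and helper function are my own, not from the lecture):

```python
from fractions import Fraction
from itertools import product

# Enumerate all equally likely outcomes of two fair coins a and b.
outcomes = list(product("HT", repeat=2))  # [('H','H'), ('H','T'), ('T','H'), ('T','T')]
p = Fraction(1, len(outcomes))            # each outcome has probability 1/4

def prob(event):
    """Probability that `event` holds, summed over all outcomes."""
    return sum(p for o in outcomes if event(o))

p_a_heads = prob(lambda o: o[0] == "H")        # P(a = H) = 1/2
p_b_heads = prob(lambda o: o[1] == "H")        # P(b = H) = 1/2
p_joint   = prob(lambda o: o == ("H", "H"))    # P(a = H, b = H)

# Independence: the joint probability equals the product of the marginals.
assert p_joint == p_a_heads * p_b_heads == Fraction(1, 4)

# Conditional probability: P(a = H | b = H) = P(a = H, b = H) / P(b = H).
p_a_given_b = p_joint / p_b_heads
assert p_a_given_b == p_a_heads  # knowing b tells us nothing about a
```

Using exact fractions rather than floats keeps the equalities exact, which is convenient for checking identities like these.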
The probability of a given b is the probability of b given a, times the probability of a, divided by the probability of b: P(a | b) = P(b | a) P(a) / P(b). It's one of the most fundamental relationships in all of statistics and probability. The probability of a given b is known as the posterior probability: the probability of something given that something else has happened. We'll see an example in the next slide. This functions effectively as belief updating. How likely a is, is P(a); how likely a is after b has happened is P(a | b). If I tell you that somebody is driving down the street and ask for the probability of that person getting into an accident, you have some notion of what that likelihood is; maybe it's 0.1, maybe it's 0.01. Now if I tell you that person is drunk, that is b. What is a given b? That's a very different probability. Now you think the likelihood of them getting into an accident is significantly higher, and unfortunately so is the likelihood for all the other people on the road driving at the same time as this person.

Let's look at this more concretely with an example. We have two dice, A and B. We know that these dice are independent, but we're going to complicate things a little bit here. If the dice are fair, the probability of A coming up six is 1/6, and the same for B. What we really want to think about is the sum S = A + B. One of the dice will have been rolled and we'll see its value, and then we want to know the probability of the sum once we roll the second die. If we know nothing prior to rolling the dice, the probability of S is 1/36 for 2: the only way to get two is for both dice to come up one. It's 2/36 for 3, because we can have one and two, or two and one. It's 6/36 for a sum of 7: we can get one and six, two and five, three and four, and their complements. We end up with a distribution with a triangular shape peaking at seven. The most likely sum if you roll two dice is seven. If you play backgammon, you know that well.
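The triangular distribution of the sum can be tabulated by enumerating all 36 equally likely rolls; a small sketch, again with names of my choosing:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (A, B) rolls give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dist = {s: Fraction(n, 36) for s, n in counts.items()}

assert dist[2] == Fraction(1, 36)    # only 1+1
assert dist[3] == Fraction(2, 36)    # 1+2 or 2+1
assert dist[7] == Fraction(6, 36)    # 1+6, 2+5, 3+4 and their complements
assert max(dist, key=dist.get) == 7  # seven is the most likely sum
```

Printing `dist` for sums 2 through 12 reproduces the triangular shape described above: probabilities rise from 1/36 up to 6/36 at seven, then fall back symmetrically.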
Now, let's say we roll die A first and get a value of six. What is now the probability of the sum S? We're about to roll B. The probability of S given A = 6 is a uniform distribution: it has the same value for every sum from 7 to 12. Because B is independent of A and we already have a six, B will come up somewhere from 1 to 6, so each sum from 7 to 12 has probability 1/6. This is belief updating. If we know nothing, we have a distribution over sums from 2 to 12 with that triangular shape; once I tell you that the first die came up six, the distribution changes, because I gave you some information. We're now computing the conditional probability of S given A = 6, and this is how the distribution updates.

Let's look at another concrete example, this time from the medical world, to illustrate this point in more detail. This is what's called the positive predictive value. What is the likelihood that you have a disease if you exhibit a sign, that is, if you test positive for it? Let's examine this in more detail. What is the probability that you have cancer if you exhibit some sign S? I'm going to leave the sign abstract for now. Let's put some facts on the table. One hundred percent of cancer patients have the sign: if you don't have the sign, you do not have cancer, but all of the cancer patients have the sign S. Five percent of non-cancer patients also have the sign. And we know from prevalence studies that cancer prevalence in the population we study is two percent. Let me ask you: knowing nothing else, what is the likelihood that you have cancer? The answer is two percent. What if you have the sign? What is the probability now that you have cancer, the probability of C given that S is true? To answer that question, we'll use a tabular solution as opposed to an algebraic one. Let's start out: we have 1,000 people.
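The conditioning step can be reproduced by restricting the enumeration to the rolls consistent with A = 6 and renormalizing; a minimal sketch (the variable names are mine):

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all 36 equally likely (A, B) rolls

# Condition on A = 6: keep only the consistent rolls and renormalize.
given = [(a, b) for a, b in rolls if a == 6]  # 6 rolls remain, each now 1/6
cond = {}
for a, b in given:
    s = a + b
    cond[s] = cond.get(s, Fraction(0)) + Fraction(1, len(given))

# P(S | A = 6) is uniform: 1/6 for every sum from 7 to 12.
assert cond == {s: Fraction(1, 6) for s in range(7, 13)}
```

Filtering to the outcomes consistent with the observation and renormalizing is exactly what the conditional notation P(S | A = 6) describes, which is why the triangular prior collapses to a flat distribution here.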
This is a nominal population, 980 of whom do not have cancer and 20 of whom have cancer. Remember, the prevalence of cancer in this population is two percent. All of the people who have cancer have the sign, so we have 20 people here for whom cancer is yes and the sign is yes. Of the 980 people who do not have cancer, five percent have the sign, so 49 people do not have cancer but do have the sign. In total, 69 people have the sign. As for the rest: nobody has cancer without the sign, so that cell is 0, and 931 people have neither cancer nor the sign. All these numbers add up. Now let's think a little about the implications. What is the probability that cancer is true given that the sign is yes? Well, 69 people have the sign, of whom 20 actually have cancer, so the probability of having cancer given the sign is 20/69, about 29 percent, which is a lot less than 100 percent. Most of the people who have the sign in fact do not have cancer. This is belief updating in action. The probability of cancer is just two percent, but once you know that you have the sign, it goes up to 29 percent. This is the kind of reasoning you can do with knowledge of this type. Right now, in the pandemic, what is the probability that you have COVID given a positive test? Given that the tests are not perfectly accurate, if most of the population where you live is actually free of COVID, the probability that you have COVID given a positive test is actually fairly low, maybe 30 or 40 percent, certainly not 100 percent. Of course, that depends on the sensitivity of the test; test design for event detection is driven by other factors, such as making sure we don't miss anybody, and that's a story we'll talk about later. This concludes our initial discussion of probability. In the next segment, we'll introduce basic concepts from statistics.
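The tabular argument above is Bayes' rule in disguise; here is a sketch computing the same 29 percent both ways (the variable names are my own, and the numbers are the lecture's):

```python
from fractions import Fraction

# Facts from the lecture.
prevalence  = Fraction(2, 100)   # P(cancer) = 2%
sensitivity = Fraction(1)        # P(sign | cancer) = 100%
false_pos   = Fraction(5, 100)   # P(sign | no cancer) = 5%

# Tabular solution with a nominal population of 1,000 people.
n = 1000
cancer_with_sign    = int(n * prevalence * sensitivity)      # 20 people
no_cancer_with_sign = int(n * (1 - prevalence) * false_pos)  # 49 people
ppv_table = Fraction(cancer_with_sign,
                     cancer_with_sign + no_cancer_with_sign)  # 20/69

# Bayes' rule: P(cancer | sign) = P(sign | cancer) P(cancer) / P(sign),
# where P(sign) sums over both ways of having the sign.
p_sign = sensitivity * prevalence + false_pos * (1 - prevalence)
ppv_bayes = sensitivity * prevalence / p_sign

assert ppv_table == ppv_bayes == Fraction(20, 69)  # about 29%
```

The denominator P(sign) is where the 980 healthy people enter: because they vastly outnumber the patients, their five percent false-positive rate contributes 49 of the 69 signs, which is what drags the posterior down from 100 percent to 29 percent.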
Thank you.