Hi, and welcome back. In this video we're going to start to learn about independence. As you'll see in future videos and in future modules, independence is a key concept in statistics and data science.
So what is independence? Two events are independent if knowing the outcome of one event does not change the probability of the other. For example, suppose you flip a coin twice. If you know you got a head on the first flip, does that change the probability of getting a head or a tail on the second flip? No. If it's a fair coin, the probability is still a half for a head and a half for a tail. What about rolling a die? If you roll a die and get a 1 on the first roll, does that change the probabilities for the outcome of the second roll? No.
What, on the other hand, about polling? Suppose you ask two randomly selected people about their political affiliation. You might think that if the people are truly chosen at random, the answer from the first person will not affect the answer from the second person. But what if the two people are from the same family, or what if they're friends? Then knowing the answer from the first person might change the probabilities for the answer from the second. And so that situation would not be independent.
Let's see if we can make this definition more concrete. We say two events A and B are independent if the probability of A given B is the same as the probability of A. So remember, the probability of A is the prior, and the probability of A given B is the posterior. So finding new information from event B doesn't change the probability for event A if the two events are independent. And by symmetry, the same holds for the probability of B given A. So knowing A, what's the probability of B? That's still going to be the probability of B if A and B are independent. Recall from our definition that the probability of A given B is the probability of A intersect B divided by the probability of B.
Now, if the events are independent, the probability of A given B is going to be the same as the probability of A. Multiplying both sides by the probability of B, we get what's called the multiplication rule for independent events: the probability of A intersect B is the probability of A times the probability of B.
Now, we can extend this definition to multiple events. We say A1 through An are mutually independent if, for every possible subset of k of those events, the probability of their intersection is the same as the product of their individual probabilities. So the probability of the first event in the subset intersect the second and so on is the probability of the first, times the probability of the second, and so on. Just to make clear, this is every possible grouping of events. Every pairwise grouping is included in this when k equals 2, when k equals 3 it's every triple, and so on.
Now, we can use the definition of independence in two ways. The first way is, if I have two events A and B, I can see whether they're independent or not by calculating the probability of A, the probability of B, and the probability of A intersect B, and then checking to see if the probability of A intersect B is the same as the probability of A times the probability of B. If they're equal, then we say events A and B are independent. If they're not equal, then they are dependent. We can use the definition another way: if we know two events are independent, then we can find the probability of their intersection. Sometimes independence is an assumption we make, and sometimes we know it from other sources.
Let's do some examples. Here's example 1, our favorite example. We roll a six-sided die twice. We've seen this example before. The cardinality of the sample space is 36, and every single one of those outcomes is equally likely. Let E be the event that the sum is 7, F the event that the first roll is a 4, and G the event that the second roll is a 3. What can you say about the independence of E, F and G?
Let's calculate some things. The probability of E is the probability of getting that 7. The outcomes are (1,6), (2,5), (3,4), (4,3), (5,2) and (6,1), so that's going to be one sixth, because there are 6 outcomes out of 36 and each one of them is equally likely. Also, you can find that the probability of F and the probability of G are each one sixth. You can calculate that yourself.
Now, what about the probability of E intersect F? This is the probability that the sum is 7 and, because it's an intersection, that the first roll is a 4. There's only one way to get that. If the first roll is a 4 and the sum is 7, we had to get a 4 and a 3. That's going to have probability 1 over 36. And indeed that is the probability of E times the probability of F. We can also calculate the probability of E intersect G. E is a sum of 7, G is a second roll of 3. The only way for that to happen is again with a 4 and a 3, and that's also 1 over 36. And that's the probability of E times the probability of G. Finally, we can calculate the probability of F intersect G. That's a 4 on the first roll and a 3 on the second, and we get 1 over 36, which is the probability of F times the probability of G.
Now, what do we see from this? Any pair of E, F and G is independent, so the three events are pairwise independent. What about all three, are they mutually independent? Recall that for mutual independence we also need the probability of E intersect F intersect G to be the probability of E times the probability of F times the probability of G. Well, when we calculate that product, it's going to be one sixth cubed, which is 1 over 216. On the other hand, E intersect F intersect G is still going to be that single roll of (4,3). So its probability is 1 over 36. And so we see that they are not mutually independent. They're pairwise independent: any pair of them is independent, but all three together are not.
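The dice calculation above can be checked by listing all 36 outcomes. This is only a sketch, not part of the lecture: the helper `prob` and the variable names are made up for illustration, and exact fractions are used so the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered pairs (first roll, second roll).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a set of equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

E = {o for o in outcomes if o[0] + o[1] == 7}  # the sum is 7
F = {o for o in outcomes if o[0] == 4}         # the first roll is a 4
G = {o for o in outcomes if o[1] == 3}         # the second roll is a 3

# Pairwise independence: the multiplication rule holds for every pair.
pairwise = all(
    prob(a & b) == prob(a) * prob(b)
    for a, b in [(E, F), (E, G), (F, G)]
)

# Mutual independence additionally needs the rule to hold for all three.
mutual = pairwise and prob(E & F & G) == prob(E) * prob(F) * prob(G)

print(pairwise)  # True
print(mutual)    # False
print(prob(E & F & G), prob(E) * prob(F) * prob(G))  # 1/36 1/216
```

The last line shows the mismatch directly: the triple intersection has probability 1/36, while the product of the three probabilities is 1/216.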
And if you think about this a little bit, with E, F and G, if you know two of them have occurred, then the third one must have occurred. So knowing that the first two have occurred tells you something about the probability that the third event will occur.
Here's a second example. Suppose we have a school with 1200 students. 250 are juniors, and 150 students from the whole school are taking a stats course. Furthermore, we know that 40 students are juniors and taking a statistics course. We want to let J be the event that a randomly selected student is a junior, and S be the event that the selected student is taking statistics. If the randomly selected student is a junior, then what is the probability that they are also taking stats?
So what is this question asking? From what we're given, the probability of S is 150 over 1200, the probability of J is 250 over 1200, and the probability of S intersect J is 40 over 1200. That's just given from the problem. The first part asks: if you know the person you selected is a junior, what's the chance that they are also taking statistics? This is going to be the probability of S intersect J divided by the probability of J. So we get 40 over 1200 divided by the probability of being a junior, which is 250 over 1200. That's 4 over 25, and that's the same as 0.16. Remember, this is the posterior probability. This is the probability that they are taking statistics given that we now have the additional information that they're a junior. The prior is the probability of S, the probability that a randomly selected person is taking statistics, which is 150 over 1200, or 0.125.
The second part asks: are J and S independent? Let's check. The probability of S given J, 0.16, is not equal to the probability of S, 0.125.
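The comparison between the posterior and the prior can be worked through with exact fractions. This is a small sketch using only the counts given in the problem; the variable names are illustrative.

```python
from fractions import Fraction

total = 1200                     # students in the school
p_S = Fraction(150, total)       # taking statistics
p_J = Fraction(250, total)       # junior
p_S_and_J = Fraction(40, total)  # junior and taking statistics

# Conditional probability: P(S | J) = P(S and J) / P(J)
p_S_given_J = p_S_and_J / p_J

print(p_S_given_J, float(p_S_given_J))  # 4/25 0.16
print(p_S, float(p_S))                  # 1/8 0.125
print(p_S_given_J == p_S)               # False
```

Knowing that the student is a junior moves the probability of taking statistics from 0.125 to 0.16.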
So J and S are not independent. Also note that we could have answered this question without calculating the first part. We could have calculated the probability of S times the probability of J. That's going to be 150 over 1200 times 250 over 1200, which is about 0.026. And you can check that that's not the same as the probability of S intersect J, which is 40 over 1200, or about 0.033. Just throw that into your calculator.
Let's do one more example. In this example I'm going to assume that each component works independently of every other component, and I want to calculate the probability of some event. Let's look at this. I'm going to have a system of five components, and A sub i will be the event that the ith component works. I'm going to assume the probability of A sub i is equal to 0.9. I'm also going to assume that the components work independently of each other. For the entire system to work, you need a path of working components from the start to the finish. If we think about it, we could have a path through components 1 and 2, through 1 and 3, or through 4 and 5. So there are three separate paths through the system that make the system work. You can think about it as water flowing through the system, where these are individual gates. You can think of it as electricity flowing through the system. Or you could think of it as some other complicated system where you have redundancy built in so that the probability of the system working is as good as possible.
Now, what does the sample space look like? If we had to come up with a sample space, its elements would look like (X1, X2, X3, X4, X5), where X sub i is 1 if the ith component works and X sub i is 0 if the ith component doesn't work. Because each one of the X sub i's has two choices, 0 or 1, we can see that the cardinality is 2 to the 5th. So there are 32 possible outcomes in the system. But because each component works with probability 0.9 and doesn't work with probability 0.1, we know that the elements of S are not equally likely. So, for example, take the probability of the outcome 00000, where none of the components is working.
That's going to be 0.1 to the 5th. The probability of the outcome 1, 0, 1, 0, 1, where components 1, 3 and 5 are working and 2 and 4 are not, would be 0.9 cubed times 0.1 squared.
All right. So how are we going to calculate the probability that the system works? We would need ones in the 1st and 2nd spots, or the 1st and the 3rd, or the 4th and the 5th. So let's think about this. The probability that the system works is the probability of A1 intersect A2, union A1 intersect A3, union A4 intersect A5. The elements in A1 intersect A2 are 1, 1, and then I don't know what the 3rd, 4th and 5th spots are, and it doesn't matter. The system will work if I have a one in the first spot and a one in the second. Likewise, the elements in the event A1 intersect A3 have a one in the first spot and a one in the third spot. And then I don't know what's in the 2nd, 4th or 5th spot. They could be 0s or 1s. It doesn't matter, because the system will still work. And the same goes for A4 intersect A5.
So how do we calculate the probability of this event? Well, recall that we showed in one of the previous videos that the probability of A union B union C is the probability of A, plus the probability of B, plus the probability of C, minus the probabilities of all of the pairwise intersections, plus the probability of the intersection of all three. So we're going to think of A1 intersect A2 as the event A, A1 intersect A3 as the event B, and A4 intersect A5 as the event C. And we're just going to apply that process. We're going to get the probability of A1 intersect A2, plus the probability of A1 intersect A3, plus the probability of A4 intersect A5. That's the sum of the probabilities of the three events. Then we're going to subtract all the pairwise intersections. If we have A1 intersect A2 and we intersect that with A1 intersect A3, I only have to write the A1 once. So we get A1, A2, A3. That's going to be the pairwise intersection of the first two.
Then we'll subtract the probability of A1 intersect A2 intersect A4 intersect A5. That's intersecting the first path with the third. And then the last pairwise intersection is the probability of A1, A3, A4 and A5. And then we're going to add back the intersection of all three events, which is the intersection of all five components, A1 through A5.
But now remember, we're assuming that all of the A sub i's are independent of each other, so A1 and A2 function independently. And so we can use the fact that the probability of A1 intersect A2 is the same as the probability of A1 times the probability of A2, and that's going to be 0.9 squared. Now we have that three times, because we also have A1 intersect A3 and A4 intersect A5. So we get 3 times 0.9 squared. Then the A1, A2, A3 term is going to be minus 0.9 cubed. Then the two four-component terms are each 0.9 to the fourth, and we subtract both of them. And then the last one is plus 0.9 to the fifth. So we are using the fact that each component works independently of the others. When you put this into a computer or calculator you get 0.97929. So this is the overall probability that the system works.
Now, the probability of A1 intersect A2, we mentioned, is 0.9 squared, and that's going to be 0.81. So let's think about this. Just looking at components one and two, that single path works with probability 0.81. But we have three possible pathways through the system. We can go through A1 and A2, we could go through A1 and A3, or we could go through 4 and 5. And so because of that redundancy, we increase the overall chance that the whole system will work, up to 0.97929. So the key to increasing the probability that the system works is redundancy. There's another way, of course, that you could improve the chance that the system will work. And that is to improve every single component so that it works with probability higher than 0.9.
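The 0.97929 figure can be cross-checked by brute force: add up the probabilities of all 32 outcomes in which at least one path works, and compare with the inclusion-exclusion expression. This is a sketch under the lecture's assumptions (independent components, each working with probability 0.9); the names `paths`, `system_works` and `outcome_prob` are made up for illustration.

```python
from itertools import product

p = 0.9                           # assumed probability that each component works
probs = [p] * 5                   # one entry per component
paths = [(1, 2), (1, 3), (4, 5)]  # the three paths through the system

def system_works(state):
    """True if every component on at least one path is working."""
    return any(all(state[i - 1] for i in path) for path in paths)

def outcome_prob(state, probs):
    """Probability of one outcome (x1, ..., x5), using independence."""
    result = 1.0
    for x, q in zip(state, probs):
        result *= q if x else 1 - q
    return result

# Sum the probabilities of all outcomes in which the system works.
brute_force = sum(
    outcome_prob(state, probs)
    for state in product([0, 1], repeat=5)
    if system_works(state)
)

# Inclusion-exclusion expression from the lecture.
formula = 3 * p**2 - p**3 - 2 * p**4 + p**5

print(round(brute_force, 5), round(formula, 5))  # 0.97929 0.97929
```

Both routes agree. Setting `p` to a value above 0.9 in this sketch raises the result as well, which is the component-quality route to a more reliable system.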
But given that they are each 0.9, we can improve the chance that the system will work by adding additional redundancy into the system.
All right, now, we could make this problem harder. We assumed that every valve, or every component, worked with probability 0.9. Maybe some of the components are more delicate than others. Maybe some of them don't work with such a high probability. Maybe some work with higher probability. And so we could have had each component with a different probability. The way we would calculate the probability would be exactly the same. It just wouldn't be all 0.9 in here. We'd have other values as well.
Okay, one final question for this video. Suppose you have two events, A and B, and they're mutually exclusive. That is, A intersect B is the empty set. So the question is, are A and B independent? You might initially think yes: since A and B are mutually exclusive, maybe knowing about one wouldn't influence the probability of the other. But let's think about this. The probability of A given B is the probability of A intersect B divided by the probability of B. If you know that A and B are mutually exclusive, then the probability of the intersection is zero, since A intersect B equals the empty set. So the probability of A given B is zero, and that's not the same as the probability of A unless A itself has probability zero. In other words, if you know A and B are mutually exclusive, knowing that one has occurred means the other one can't occur. Knowing B has occurred means A cannot occur. So A and B are dependent, as long as both have positive probability.
Okay, so hopefully this has given you some sense of what it means for two events, or more than two events, to be independent, and some ways to work with that information. We'll see you next time. Bye.