Sometimes we can check whether two events are independent by calculating some probabilities. More often, though, we assume that certain events are independent based on general considerations and use that knowledge to find other probabilities. Let us consider an example.

Consider a professor who is going to a lecture. To get there, he has to drive through the city, and he may get stuck if there is a traffic jam. To avoid traffic jams, the professor decides to leave earlier, so he has to set an alarm clock. Unfortunately, the alarm clock can fail. We want to find the probability that the professor has to cancel the class.

So we have two events: event A, the alarm clock failed, and event T, a traffic jam occurred. Even if the alarm clock failed, if no traffic jam occurred, everything is fine and the professor will be on time for the lecture. But if both of these events occur, that is, the alarm clock failed and a traffic jam occurred, then unfortunately the professor has to cancel the class. So we have event C, class canceled. Now we can say that C is the intersection of A and T: C occurs only if A and T occur simultaneously.

Suppose we know that P(A) = 0.1, that is, the alarm clock fails one time out of ten, and assume that the probability of a traffic jam is P(T) = 0.3, which means that on approximately 3 out of 10 days we have a traffic jam. We are interested in the probability of C.

It is clear that the professor's alarm clock does not change the probability of a traffic jam. These are completely independent things. In the same way, it is clear that a traffic jam does not affect the professor's alarm clock, so it does not change the probability that the alarm clock fails. This means that in this case, from our understanding of the world, we can safely assume that A and T are independent events. We can use this assumption to find the probability of C. In this case, the probability of C, which is the probability of the intersection of A and T, equals the product of their probabilities: P(C) = P(A ∩ T) = P(A) · P(T).
This is one of the definitions of independence. Now we can compute it: P(C) = 0.1 · 0.3 = 0.03. We see that the probability that the professor has to cancel the class is rather small.

Let us consider a different example. Suppose we are a company that has to make a decision, for example, whether or not to buy a startup. To make this decision, the company can ask an expert whether buying the startup is a good idea. Unfortunately, any expert can be wrong. How can the company decrease the probability of getting wrong advice? For example, it can ask several experts. Let us denote by E1 the event that the first expert is wrong and, in the same way, by E2 the event that the second expert is wrong. Let us assume that we are in the lucky situation where both experts agree in their recommendations. Then they are either both wrong or both right. So we are interested in the event that they are both wrong, which is the intersection of E1 and E2. What can we say about the probability of this event? It depends on how the decisions are made.

Let us first assume that the two experts make their decisions independently of each other. This means that E1 and E2 are independent events. In this case, the probability that they are both wrong equals the product of the probabilities of E1 and E2: P(E1 ∩ E2) = P(E1) · P(E2). Of course, we assume that these probabilities are less than 1, since no expert gives wrong answers every time (and greater than 0, since no expert is always right). In this case, the product is less than either of its factors. For example, it is less than the probability of E1. So if both experts give the same advice, the probability that they are both wrong is less than the probability that the first expert is wrong. In a sense, then, a committee of two experts is better than a single expert in terms of the probability of a mistake. However, it is also possible that the experts' decisions are not independent of each other.
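Before turning to that dependent case, the two product calculations above can be checked numerically. The following is only a minimal sketch: the values P(A) = 0.1 and P(T) = 0.3 come from the lecture, while the expert error rates 0.2 and 0.25, the function names, the number of simulated days, and the random seed are all assumptions chosen for illustration.

```python
import random

P_ALARM_FAILS = 0.1   # P(A), from the lecture
P_TRAFFIC_JAM = 0.3   # P(T), from the lecture


def product_rule(p_first, p_second):
    """Probability that two independent events both occur."""
    return p_first * p_second


def simulate_cancellations(n_days=200_000, seed=0):
    """Estimate P(C) by drawing A and T independently on each simulated day."""
    rng = random.Random(seed)
    cancelled = 0
    for _ in range(n_days):
        alarm_failed = rng.random() < P_ALARM_FAILS
        traffic_jam = rng.random() < P_TRAFFIC_JAM
        if alarm_failed and traffic_jam:  # C is the intersection of A and T
            cancelled += 1
    return cancelled / n_days


p_cancel_exact = product_rule(P_ALARM_FAILS, P_TRAFFIC_JAM)
p_cancel_simulated = simulate_cancellations()

# Hypothetical error rates for the two experts (not given in the lecture).
p_e1, p_e2 = 0.2, 0.25
p_both_wrong = product_rule(p_e1, p_e2)

print(f"P(C) exact:     {p_cancel_exact:.3f}")
print(f"P(C) simulated: {p_cancel_simulated:.3f}")
print(f"P(E1 and E2):   {p_both_wrong:.3f}  (P(E1) = {p_e1}, P(E2) = {p_e2})")
```

The simulated frequency should land close to the exact product, and the probability that both experts are wrong comes out smaller than either expert's individual error rate, as the argument above says.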
For example, consider what is in a sense the extreme case of dependence: the second expert just copies the answer of the first expert. In this case, they are both wrong or both right at the same time, which means that event E1 equals event E2. Now, the probability of their intersection is the same as the probability of E1 intersected with E1, and this intersection is just the event E1: P(E1 ∩ E2) = P(E1 ∩ E1) = P(E1). So we see that in this case, consulting the second expert gives us no new information, and the probability of a mistake stays the same.

Of course, these are two extreme cases: the experts are fully independent, or one simply copies the other's answer. In a real situation, there may be some dependence between their decisions, but not the full dependence described here. In any case, the more independent their decisions are, the better from the point of view of this probability.

In fact, we use the same idea in machine learning. If we have several predictive models that make independent predictions, we can combine them into a new model, which is called an ensemble. If the models behave more or less independently of each other, the performance of the ensemble will be better than the performance of a single model.
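To make the range between the two extremes concrete, here is a small sketch built on assumptions that are not part of the lecture. The function both_wrong uses a toy model in which the second expert copies the first with probability copy_prob and otherwise decides independently. The function majority_vote_error assumes one particular kind of ensemble: three independent models on a yes/no question, combined by majority vote. The lecture does not say how the ensemble combines its models, and the error rates used are hypothetical.

```python
def both_wrong(p1, p2, copy_prob):
    """P(both experts wrong) in a toy partial-dependence model (an assumption).

    With probability copy_prob the second expert copies the first, so both are
    wrong exactly when the first is wrong (probability p1). Otherwise the
    second expert decides independently and is wrong with probability p2.
    """
    return copy_prob * p1 + (1 - copy_prob) * p1 * p2


def majority_vote_error(p):
    """Error of a majority vote over three independent models on a yes/no
    question, each wrong with probability p (assumed setup)."""
    # The vote is wrong when exactly two models are wrong or all three are.
    return 3 * p**2 * (1 - p) + p**3


p1, p2 = 0.2, 0.25  # hypothetical expert error rates
for copy_prob in (0.0, 0.5, 1.0):
    print(f"copy_prob={copy_prob:.1f}: "
          f"P(both wrong)={both_wrong(p1, p2, copy_prob):.3f}")

p_model = 0.3  # hypothetical error rate of each model
print(f"single model error:  {p_model:.3f}")
print(f"majority vote error: {majority_vote_error(p_model):.3f}")
```

With copy_prob set to 0 the result is the product P(E1) · P(E2), with copy_prob set to 1 it is P(E1), and values in between fall between the two. The majority vote only beats a single model in this setup when each model's error rate is below 0.5.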