[MUSIC] In supervised machine learning, we build predictive models that predict the value of some variable using the values of other variables. This is possible only if there is some connection between these variables. To study these kinds of connections, we have to study conditional probabilities and the notion of independence of events.

Let us begin with an example. Assume that we are performing a study of how marriage is related to happiness. To perform this study, we conducted a survey, and now we have a data set that records, for each participant, whether he or she is married and whether he or she is happy. So we have a table that can look like the following: one column is the participant ID, one is Married, and one is Happy. For example, participant number 1 is married but not happy, participant number 2 is married and happy, participant number 3 is not married but happy, and so on. It is possible that we have a lot of participants.

What we are interested in is whether there is a connection between the value of the variable Married and the value of the variable Happy. For example, we are interested in the probability of meeting a happy person, depending on whether he or she is married or not.

To ease the processing of this data, let us construct a new table from this one. It is called a contingency table, and it is constructed in the following way. We have two variables, Happy and Married; the values of Happy are yes and no, and the values of Married are also yes and no. Now let us put some numbers into the cells. For example, assume that in our study we have 42 participants who are happy and married; this means that in the original table there are 42 rows with yes in both columns. The rest of the values (with the cell counts implied by the totals below) look like this:

                Married: yes   Married: no   Total
    Happy: yes       42             6          48
    Happy: no        28            24          52
    Total            70            30         100
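The construction of the contingency table can be sketched in code. The survey records below are hypothetical, chosen only to reproduce the counts used in this example:

```python
from collections import Counter

# Hypothetical survey records matching this example:
# (married, happy) pairs for 100 participants.
records = (
    [("yes", "yes")] * 42   # married and happy
    + [("yes", "no")] * 28  # married, not happy
    + [("no", "yes")] * 6   # not married, happy
    + [("no", "no")] * 24   # not married, not happy
)

# The contingency table is just a count of each (married, happy) combination.
table = Counter(records)
print(table[("yes", "yes")])  # 42 participants are both married and happy
```

With real survey data, `records` would of course come from the data set itself rather than being written out by hand.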
From this table, we can find that the overall number of married participants is 70 and the overall number of unmarried participants is 30. We can also find the overall number of happy participants, which is 48, and the overall number of unhappy participants, which is 52.

Now, let us consider the following random experiment: we pick a random participant from our study, or, equivalently, a random row of the original table, using equal probabilities for all participants. The set of all outcomes is then the set of all participants, and the number of outcomes equals the overall number of participants, which is 100; you can find it by summing the numbers in the table.

Now that we have defined a probability space, we can ask probabilistic questions, that is, questions about events. For example, we can consider the event H, that the chosen participant is happy, and the event M, that the chosen participant is married. We can find the probabilities of these events. What is the probability of H? We see that we have 48 happy participants out of 100, so it is 48/100 = 0.48. In the same way, we can find the probability that the chosen person is married: the probability of M is 70/100 = 0.7.

Now, assume that I know the chosen person is married; for example, I see a wedding ring on this person. What can I say now about the probability that this person is happy? If I know nothing about whether this person is married or not, then this probability equals 48/100. But if I know that the person is married, does that give me any new information about the probability of H? You can look at the table and try to guess, but we can also just calculate. If we already know that the person we chose is married, it means that he or she is one of these 70 persons, each with equal probability.
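These probabilities can be computed directly from the contingency-table counts. A minimal sketch using the numbers from this example:

```python
# Cell counts from the contingency table in this example.
happy_married = 42
happy_unmarried = 6
unhappy_married = 28
unhappy_unmarried = 24

total = happy_married + happy_unmarried + unhappy_married + unhappy_unmarried  # 100

p_h = (happy_married + happy_unmarried) / total    # P(H) = 48/100 = 0.48
p_m = (happy_married + unhappy_married) / total    # P(M) = 70/100 = 0.7

# Restricting attention to the 70 married participants gives
# the conditional probability of being happy given married.
p_h_given_m = happy_married / (happy_married + unhappy_married)  # 42/70 = 0.6

print(p_h, p_m, p_h_given_m)
```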
That is because there is no reason to assume that one person from this set is chosen with higher probability than another. So the probability that a randomly chosen participant is happy, provided that this person is married, can be found in the following way: we are interested in the 42 happily married persons out of these 70 persons, so it equals 42/70 = 0.6. We see that this probability is higher than the probability that a person is happy. This is a different probability; it is denoted by the symbol P(H | M), the probability of H under the condition M. This is called a conditional probability, and the symbol between H and M is a vertical bar, or pipe. So we see that this probability equals 0.6.

Can you find the probability that a person is happy under the assumption that this person is not married? In this case, if we know that the person is not married, the event M did not occur. It means that we choose our participant among the 30 unmarried participants, and the corresponding conditional probability equals 6/30 = 0.2.

So we see that if we know that a person is married, we can predict, for example in some machine learning prediction model, that this person is happy with probability 0.6, which is larger than the probability that this person is happy if we know nothing about whether he or she is married. So in this case, knowledge of one variable, the variable Married, gives us some information about another variable, the variable Happy. This kind of connection between variables is a key reason why machine learning algorithms can work. Now, let us study conditional probabilities systematically. [MUSIC]