If we have two random variables and are interested in the interaction between them, we have to consider the joint distribution and the joint probability density function. Let X and Y be two random variables defined on the same probability space. Let us begin with the joint cumulative distribution function. The joint CDF of X and Y is a function of two variables that equals the probability that X is less than or equal to x and, at the same time, Y is less than or equal to y. To visualize a pair of random variables, let us use the Cartesian plane. We have two axes: the random variable X takes values on the x-axis and the random variable Y takes values on the y-axis, so a pair of values is denoted by a point on this plane. Now consider a segment from x-naught to x-naught plus Delta x on the x-axis and a segment from y-naught to y-naught plus Delta y on the y-axis, and the event that X lies in the first segment and, at the same time, Y lies in the second. It means that the x-coordinate of the point lies in the first segment and the y-coordinate lies in the second, so the point itself lies somewhere in a small rectangle.

Now the joint probability density function of X and Y is a function of two variables, x and y, defined at the point (x-naught, y-naught) as a limit as Delta x tends to zero and Delta y tends to zero. We are interested in the probability of getting into this rectangle. But as Delta x and Delta y become smaller, the rectangle becomes smaller, and the probability of getting into it becomes smaller as well. So we are interested in the relation between this probability and the area of the rectangle, and this area is Delta x times Delta y. So we have a ratio: in the numerator, the probability of this event; in the denominator, the area of the rectangle, which is Delta x times Delta y. The limit of this ratio is the joint probability density function of the two variables X and Y.
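This limit definition can be checked numerically: sample many points, count how many land in a small rectangle, and divide that fraction by the rectangle's area. The sketch below assumes a pair of independent standard normal variables and the point (0, 0); both are illustrative choices, not given in the lecture. At (0, 0) the true joint density is 1/(2*pi), so the ratio should land near that value.

```python
import random
import math

# Sketch: estimate the joint PDF at (x0, y0) as
#   P(x0 <= X <= x0 + dx, y0 <= Y <= y0 + dy) / (dx * dy)
# for small dx, dy. X and Y are assumed independent standard normals
# (an illustrative choice); the exact density at (0, 0) is 1 / (2*pi).

random.seed(0)
n = 1_000_000
x0, y0 = 0.0, 0.0
dx = dy = 0.1

hits = 0
for _ in range(n):
    x = random.gauss(0, 1)
    y = random.gauss(0, 1)
    # count samples that fall into the small rectangle
    if x0 <= x <= x0 + dx and y0 <= y <= y0 + dy:
        hits += 1

estimate = (hits / n) / (dx * dy)   # probability of the rectangle / its area
exact = 1 / (2 * math.pi)           # true joint density at (0, 0)
print(estimate, exact)
```

Shrinking dx and dy makes the rectangle probability a better local description of the density, but fewer samples land inside, so in practice the rectangle size trades bias against noise.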
A function of two variables can be visualized either by a three-dimensional graph or by its level curves. Let us use level curves to sketch how this probability density function can look. It is possible, for example, that in the innermost area the values of the probability density function are large, say greater than two; in the next area they are between one and two; then between one-half and one; and in the outermost area less than one-half. It means that it is more probable to find our point in a small rectangle in the innermost area than in a similar rectangle farther out. So the probability density function tells us how likely it is for our random variables to take values that together give a point near some particular point. If we sample from a pair of random variables whose probability density function looks like this one, we will most often get points in the innermost area, somewhat fewer points in the next area, even fewer farther out, and almost no points in the outermost area. This is the picture we obtain when we generate values from the pair of random variables under consideration. Now, as in the discrete case, we can consider marginal distributions. If we have the joint probability density function of a pair of variables x and y, how can we get the probability density function of one variable, for example X? In the discrete case, we had to sum over all possible values of the other variable, in this case y. In the continuous case, we have to replace summation by integration. So we have the marginal density, defined as follows: the PDF of the random variable X at some point x equals the integral of the joint density function with respect to y, taken from negative infinity to plus infinity. Geometrically, this means that we fix some value of X and want to find the probability density function of X at that point.
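The sampling picture can be made concrete: generate many points from a joint distribution and compare how many fall into two rectangles of equal size, one in a high-density region and one in a low-density region. The distribution below (independent standard normals) and the two rectangles are illustrative assumptions, not taken from the lecture.

```python
import random

# Sketch: with equal-size rectangles, the one in a high-density region
# collects far more sample points than the one in a low-density region.
# X and Y are assumed independent standard normals (illustrative choice).

random.seed(1)
near = 0   # hits in a rectangle near the peak of the density
far = 0    # hits in a same-size rectangle far from the peak

for _ in range(100_000):
    x, y = random.gauss(0, 1), random.gauss(0, 1)
    if 0.0 <= x <= 0.5 and 0.0 <= y <= 0.5:
        near += 1
    if 2.0 <= x <= 2.5 and 2.0 <= y <= 2.5:
        far += 1

print(near, far)
```

The counts differ by roughly two orders of magnitude here, which is exactly the "many points here, almost no points there" picture drawn with the level curves.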
In this case, we have to draw a vertical line through that value, and along this line the joint density becomes a function of the single variable y. The integral of this function gives the value of the probability density function of X at that point. When we change the point, we change the line, and we get different values. This is very close to what we did with discrete random variables; we only have to replace probabilities with probability density functions and summation with integration. In the same way, we can define the probability density function of Y if we know the joint density: we just have to swap the roles of x and y and integrate over x. So, as you see, the joint probability density function of a pair of random variables is a notion that is very similar to the joint probability of discrete random variables.
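The marginalization step can be sketched numerically: fix a value of x, treat the joint density as a function of y along that vertical line, and integrate it. The joint density below (independent standard normals) is again an illustrative assumption; its marginal in x is the standard normal density, which gives us a known value to compare against.

```python
import math

# Sketch: recover the marginal PDF of X by numerically integrating the
# joint PDF over y along the vertical line through x.
# Joint density assumed here: independent standard normals,
#   f(x, y) = exp(-(x^2 + y^2) / 2) / (2 * pi)   (illustrative choice)

def joint_pdf(x, y):
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

def marginal_x(x, y_min=-8.0, y_max=8.0, steps=4000):
    # midpoint-rule approximation of the integral of f(x, y) dy;
    # [-8, 8] stands in for (-inf, +inf), where the density is negligible
    dy = (y_max - y_min) / steps
    return sum(joint_pdf(x, y_min + (i + 0.5) * dy) for i in range(steps)) * dy

approx = marginal_x(0.0)
exact = 1 / math.sqrt(2 * math.pi)   # standard normal density at x = 0
print(approx, exact)
```

Swapping the roles of x and y in `marginal_x` would give the marginal density of Y in exactly the same way, which mirrors the symmetry described above.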