Welcome to calculus. I'm Professor Ghrist, and we're about to begin Lecture 43 on probability densities. We've seen that fair, or uniform, probabilities lead to geometry: to counting, length, area, and volume. But what happens when probability is not fair? In this lesson, we'll define and describe probability density functions.

In our last lesson, we computed probabilities under the assumption of fairness, namely, that any point is as likely as any other to be chosen at random. This is not always a good assumption. There are many instances where there's a bias, where certain outcomes are more likely than others. This bias is encoded in the notion of a probability density function, sometimes called a PDF. This is a function over a domain that tells you which outcomes are more likely than others, such as exam scores or heights.

We define a probability density function rho as a function that satisfies the following two criteria. First, rho is non-negative. Second, the integral of rho is equal to one. We have to specify a little bit more, namely a domain D on which we are discussing the PDF. So in particular, the integral of rho over D equals one. Now that's the definition, but it's certainly not a very intuitive one. What does it mean?

Before answering that, let's consider a specific example in the context of a collection of light bulbs. These light bulbs will eventually fail, but the question is when. It happens with some sort of randomness, but how is this randomness regulated? Well, there's some underlying probability density function. Let's assume that it's exponential: the light bulb is more likely to fail early, and less likely to fail later on. This would be a function rho of t of the form e to the minus alpha t, let's say, where t is time and alpha is some positive constant. Is this a PDF? It certainly satisfies the first criterion: it is non-negative.
As for the second criterion, let's specify a domain D for the time: zero to infinity. In this case, what would the integral over this domain be? Well, integrating an exponential function is easy enough: the antiderivative is negative one over alpha times e to the minus alpha t. Evaluating from zero to infinity, we get one over alpha. This is not going to work unless, of course, alpha is equal to one. So what we can do is modify the PDF by adding a coefficient of alpha out in front. If we do that, then the integral equals one.

Now that's a good example of a PDF, but we still don't know quite what it means. Let's consider that meaning in the context of fairness, which we already have some experience with. Fairness connotes a uniform density function, that is, a PDF that is constant on the domain. What would that constant be? Well, the integral of rho over D equals this constant times the volume of the domain. In order to be a PDF, that integral has to equal one. So what does that tell us about rho? This constant must be one over the volume of the domain.

Let's see what that looks like when the domain is an interval, say from a to b. In this case, rho is one over the length of the interval, that is, one over b minus a. What would it look like in the case of a discrete, or zero-dimensional, domain? Well, let's say we had a single die. Then the domain consists of six points, the different outcomes for the faces, and the PDF is one over the volume of this domain, where volume in dimension zero is simply counting. This means that rho equals the constant one-sixth. If we had a different discrete set, say for flipping a coin, then since we have only two points in that domain, heads and tails, rho would equal one-half.
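The normalization step can be checked numerically. Here is a minimal sketch; the `integrate` helper and the sample value alpha = 0.7 are illustrative choices, not part of the lecture, and the upper limit of 50 stands in for infinity since the exponential tail is negligible by then.

```python
import math

def integrate(f, a, b, n=100_000):
    """Simple midpoint-rule numerical integration of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

alpha = 0.7  # any positive constant works; 0.7 is an arbitrary example

# Without the leading coefficient, the integral is 1/alpha, not 1.
raw = integrate(lambda t: math.exp(-alpha * t), 0, 50)
print(round(raw, 4))          # ~ 1/alpha

# With the coefficient alpha out in front, the density integrates to 1.
normalized = integrate(lambda t: alpha * math.exp(-alpha * t), 0, 50)
print(round(normalized, 4))   # ~ 1.0
```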
Now consider this more carefully, because what we have in general is that for a discrete set of n points, the uniform density rho is the constant one over n. In the case of flipping a coin, notice that the value of rho is precisely the probability of getting that outcome: you have a 50/50 chance of getting heads. If you roll a six-sided die, your probability of landing on any one outcome is one-sixth. Notice also what happens if you consider the probability of landing in a collection of outcomes. Say, what's the probability of getting a four or a five? Well, we add up the corresponding values of rho: one-sixth plus one-sixth is one-third.

Now, does that intuition carry over into the continuous case? No. The probability of landing at any single point in an interval is not one over the length of that interval. Not at all. However, if we take a sub-interval, then we can make sense of the probability in terms of lengths. If we ask with what probability a randomly chosen point in a domain D lies within a subset A of D, then we have already answered this question in the case of a uniform probability density function. We know that the probability of landing in A is the volume fraction, that is, the volume of A divided by the volume of D. We could write that as the integral over A of one over the volume of D. But that is precisely the integral of the uniform PDF rho, the constant one over the volume of D, integrated over A rather than over all of D. This leads us to the more general formula that the probability, capital P, of landing in A with a point chosen at random is the ratio of the integral of rho over A to the integral of rho over D. And this explains why we want the integral of the PDF rho over all of D to equal one: so that we can simply write the probability of landing in A as the integral of the PDF over the subdomain A. This holds in the uniform case, but it also holds in general.
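Both the discrete and the continuous uniform cases can be sketched in a few lines of code. The interval endpoints below are hypothetical numbers chosen for illustration; the die and coin values come straight from the lecture.

```python
from fractions import Fraction

# Discrete uniform: a fair six-sided die, rho = 1/6 on each face.
rho = Fraction(1, 6)
A = {4, 5}                     # the event "roll a four or a five"
p_discrete = rho * len(A)      # sum rho over the outcomes in A
print(p_discrete)              # 1/3

# Continuous uniform on D = [a, b]: rho = 1/(b - a), and
# P(landing in a sub-interval) = length(A) / length(D).
a, b = 0.0, 10.0               # hypothetical domain
A_lo, A_hi = 2.0, 5.0          # hypothetical sub-interval
p_continuous = (A_hi - A_lo) / (b - a)
print(p_continuous)            # 0.3
```

Note that in the discrete case summing rho over outcomes gives the probability directly, while in the continuous case only sub-intervals, never single points, carry positive probability.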
If we have a non-constant PDF and we want to know the probability of lying, or landing, in a subset A, we integrate the probability element, rho of x dx, over the domain A. Let us interpret these results in the simple case where the domain is the interval from a to b. Given our PDF rho, what is the probability that a randomly chosen point in that domain lies between a and b? Well, by our definition, this probability P is the integral of rho of x dx as x goes from a to b. That integral is, by definition, one. What does that mean? When you see a probability of one, that means yes, it will happen. But let's keep going. What's the probability that a randomly chosen point is exactly a? Well, that probability is the integral of rho of x dx as x goes from a to a. From what we know about integrals, that equals zero. When you have a probability of zero, this means no, it's not going to happen. What's the probability that a randomly chosen point is closer to a than to b? Well, we would simply integrate rho of x dx from the left endpoint a to the midpoint of the domain.

For concreteness, consider the example of a company that advertises that half of its customers are served within five minutes. What are your odds of having to wait for more than ten minutes? Let's assume an exponential PDF, rho of t equals alpha e to the minus alpha t, over the domain from zero to infinity. Our first problem is that we don't know alpha. But we do know the probability of your serving time being in the interval from zero to five. That is, by definition, the integral of alpha e to the minus alpha t dt as t goes from zero to five, and we're told that this probability is one-half. We can do that integral easily enough, evaluating at the limits and then doing a little bit of algebra to solve for alpha. I'm going to leave it to you to follow the computations and see that alpha is 1/5 times log of two.
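The algebra left as an exercise can be verified numerically. A short sketch: the integral from zero to five of alpha e to the minus alpha t is 1 minus e to the minus five alpha, and setting that equal to one-half forces alpha = (log 2)/5.

```python
import math

# P(served within 5 minutes) = 1 - e^{-5*alpha} = 1/2
# => e^{-5*alpha} = 1/2  =>  5*alpha = log 2  =>  alpha = (log 2)/5
alpha = math.log(2) / 5
print(round(alpha, 4))           # ≈ 0.1386

# Check against the stated condition: the probability at t = 5 is 1/2.
p_within_5 = 1 - math.exp(-alpha * 5)
print(round(p_within_5, 10))     # 0.5
```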
With that in hand, we can now address the question of the probability of having to wait for more than ten minutes. We would compute the probability of being in the interval from ten to infinity. Thus, we perform the same integral as before, but evaluate it at the limits as t goes from ten to infinity. This yields e to the negative ten alpha. If alpha is 1/5 log two, what is negative ten alpha? It's negative two times log of two, that is, log of two to the negative two power. When we exponentiate, we get one-half squared, or one-fourth. That means you have a 25% chance of having to wait for more than ten minutes. That doesn't sound so good. But what are the odds of having to wait for more than thirty minutes? Well, we follow the same computation and need to compute negative thirty alpha. That is, log of two to the negative sixth power. Exponentiating gives two to the negative six, or one sixty-fourth: odds of about 1.5%.

There's one type of PDF that is of crucial importance, one you're going to see again and again. This is called a Gaussian, or sometimes a normal, PDF. This is the function rho of x equals one over the square root of two pi, times e to the minus x squared over two. You've probably seen this before. It is sometimes called a bell curve: it has a peak around x equals zero, and then drops off. There are a few things to observe. First of all, in this case your domain is the entire real line; that is, this is a setting of infinite extent. Anything could happen. Your PDF is certainly non-negative; in fact, it's strictly positive. But the tricky thing is verifying that it's a PDF, that is, verifying that the integral over the entire real line equals one. You're going to have to trust me on that for now; you don't quite have enough at your disposal to prove it. Now, you will often see Gaussians that are translated about some middle point, or mean, and you'll often see them stretched out or rescaled somehow.
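Both claims above, the exponential tail probabilities and the Gaussian normalization, lend themselves to a numerical sanity check. This is only a sketch: the midpoint-rule integration and the truncation of the real line at plus or minus ten are simplifications, chosen because the Gaussian's tails beyond that range are vanishingly small.

```python
import math

# Tail probabilities for the service-time example: P(wait > T) = e^{-alpha*T}.
alpha = math.log(2) / 5
p_over_10 = math.exp(-10 * alpha)
p_over_30 = math.exp(-30 * alpha)
print(round(p_over_10, 4))    # 2^{-2} = 0.25
print(round(p_over_30, 4))    # 2^{-6} = 1/64 ≈ 0.0156

# Numerical check that the standard Gaussian integrates to (nearly) 1,
# using the midpoint rule on [-10, 10].
def gauss(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

n, lo, hi = 200_000, -10.0, 10.0
h = (hi - lo) / n
total = sum(gauss(lo + (i + 0.5) * h) for i in range(n)) * h
print(round(total, 6))        # ≈ 1.0
```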
What I want you to know about Gaussians, for the moment, is that they are everywhere and all about. Gaussians come up in somewhat surprising places. If you look at the binomial coefficients that you obtain from Pascal's triangle and consider what the rows look like, you notice that the rows tend to go up in the middle and down at the sides, in a manner reminiscent of a shifted Gaussian. In fact, if you were to divide these binomial coefficients by two to the n, where n is the row number, then you'd obtain something that, in the limit as you go down the triangle, converges to something very much like a Gaussian. This is a hint at one of the deeper truths of mathematics: that Gaussians are limits of compounded individual decisions. Left or right. Heads or tails. These decisions compound upon one another to converge to such distributions. Gaussians are indeed everywhere.

So now we see not only what a probability density is, but also how to compute probability by means of integration. In our next lesson, we'll introduce a few of the main characters of probability theory and see what role they have to play in our story of calculus.
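For readers following along at a keyboard, the Pascal's-triangle observation can be illustrated numerically. The row number n = 50 is an arbitrary choice, and the comparison formula is the standard Gaussian approximation to a binomial distribution with mean n/2 and variance n/4 (a fact not proved in this lecture).

```python
import math

# Row n of Pascal's triangle divided by 2^n: the probabilities for
# the number of heads in n fair coin flips.
n = 50
row = [math.comb(n, k) / 2**n for k in range(n + 1)]

# The row sums to 1, peaks in the middle, and falls off at the sides.
print(round(sum(row), 10))       # 1.0
print(row.index(max(row)))       # 25, i.e. the middle entry k = n/2

# Gaussian approximation at the peak: 1 / sqrt(pi * n / 2),
# from rho(k) ≈ exp(-(k - n/2)^2 / (n/2)) / sqrt(pi * n / 2).
approx = 1 / math.sqrt(math.pi * n / 2)
print(round(row[n // 2], 4), round(approx, 4))
```

The two printed peak values agree to about half a percent already at n = 50, and the agreement improves as you go further down the triangle.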