0:02

So previously we talked about expected values, their properties, and a little bit of how to calculate them. Now let's talk about the expected value operator itself. The expected value operator is in fact a linear operator, and that'll greatly simplify calculating expected values for some relatively complicated things.

So let's say 'A' and 'B' are not random; they're fixed numbers. When you think about 'A' and 'B', think of 'A' as something like five: a number you can plug in. And 'X' and 'Y' are two random variables. Then the expected value of 'AX plus B' works out to be 'A' times the expected value of 'X', plus 'B', exactly what you'd hope it would work out to be. And the expected value of 'X plus Y' works out to be the expected value of 'X' plus the expected value of 'Y'.
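These two rules are easy to check numerically. Here is a quick sketch (an editorial illustration, not part of the lecture) that verifies the first rule for a fair die, with a = 2 and b = 3 chosen arbitrarily:

```python
# Sketch: check E[aX + b] = a*E[X] + b for a discrete random variable,
# using E[g(X)] = sum over the support of g(x) * P(X = x).
def expectation(values, probs, g=lambda v: v):
    return sum(g(v) * p for v, p in zip(values, probs))

die = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

a, b = 2.0, 3.0
lhs = expectation(die, probs, g=lambda x: a * x + b)  # E[aX + b]
rhs = a * expectation(die, probs) + b                 # a * E[X] + b
print(lhs, rhs)  # both about 10.0, since E[X] = 3.5
```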

And the reason it works out in these cases is because the expected value is a linear operator. It is not always the case that the expected value of g of X is equal to g of the expected value of X, where g is some general function that's not linear. That can happen for specific random variables and specific choices of g, but in general it's not the case.

The most famous example where it's not the case is that the expected value of X squared is not equal to the expected value of X, the whole thing quantity squared. Now, let's talk about what the difference between these two entities is. Here X is a random variable.

X squared is the random variable you obtain by squaring X. So, for example, if X is a die roll, it can take the values one, two, three, four, five, and six. X squared can then take the values one, four, nine, sixteen, twenty-five, and thirty-six, and it takes those values with probability one sixth each. So the expected value of X squared represents the expected value of the squared random variable. On the other hand, the expected value of X, quantity squared, represents what you obtain if you first calculate the expected value of X and then square the result. And these two things are not equal.
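As a quick numeric sketch of this (an editorial illustration, not from the lecture), you can compute both quantities for the fair die:

```python
# Sketch: E[X^2] versus (E[X])^2 for a fair six-sided die.
die = range(1, 7)
e_x = sum(x / 6 for x in die)        # E[X] = 3.5
e_x2 = sum(x * x / 6 for x in die)   # E[X^2] = 91/6, about 15.17
print(e_x2, e_x ** 2)                # about 15.17 versus 12.25 -- not equal
```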

This is a well-known example where the expected value of g of X is not equal to g of the expected value of X, and we'll see in a couple of slides why it is such a well-known example of that property. But in general, I would like you to remember that if g is not a linear function, then you can't just move the expected value from the outside of g to the inside of g. If g is a linear function, you can always do it; if g is not linear, then in general you cannot.

The expected value rules hold no matter what the distributions of X and Y are. X could be discrete, continuous, or mixed discrete and continuous. Y could be discrete, continuous, or mixed, and the rules still hold.

So, let me go through an example. Suppose you flip a coin X and, as we normally do, X is zero if it's tails and one if it's heads, and you simulate a uniform random number Y, where the random number is between zero and one. What's the expected value of their sum? Well, the sum of a coin flip and a uniform random variable has a weird distribution. It's not obvious, especially if all you've had is the handful of lectures from this class, how you would calculate that distribution and then, from that distribution, calculate the expected value. However, we do know how to calculate the expected value of a uniform random variable and the expected value of a coin flip, and so the expected value of their sum is the sum of their expected values.
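As a simulation sketch (an editorial illustration, not part of the lecture), you can watch the sample mean of the sum land near E[X] + E[Y] without ever working out the sum's distribution:

```python
import random

# Sketch: average many draws of X + Y, where X is a fair coin (0 or 1)
# and Y is uniform on [0, 1). The sample mean should approach
# E[X] + E[Y] = 0.5 + 0.5 = 1.
random.seed(0)
n = 100_000
total = sum(random.randint(0, 1) + random.random() for _ in range(n))
print(total / n)  # close to 1
```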

We know the expected value of the coin flip is 0.5. We know that the expected value of the uniform random variable is 0.5, so the expected value of the sum is one. So you can see how these expected value operator rules make calculating things associated with expected values a lot easier.

Another example: suppose you roll a die twice. What is the expected value of the average of the two die rolls? You often roll two dice when you're playing a board game, for example. Okay, let's let X1 be the result of the first die and X2 be the result of the second die.

Now, the variable that we're interested in, let's call it Y, is equal to X1 plus X2, divided by two. One way you could calculate the expected value of Y is to figure out the distribution of the average of two die rolls. So, let me give you a sense of this really quickly. The reason we model the distribution of a single die roll as probability one sixth on each number is that if you roll a die a lot of times, about one sixth of the rolls are one, one sixth are two, one sixth are three, one sixth are four, and so on. And then, kind of geometrically, we are modeling the process as if all the faces are equally likely, and so that's why we're going to model the population of die rolls as having probability one sixth on each number.

Now this implies a distribution on the average of two die rolls, right? The smallest value it could take is one: one plus one divided by two, the average if you were to roll two 1s. And the largest it could take is six plus six divided by two, or six, if you were to roll two 6s. But it takes different values in between, and they're not all equally likely. A one has probability one out of thirty-six, but some of the middle values have higher probabilities.
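A brute-force sketch of that distribution (an editorial illustration, assuming fair dice):

```python
from collections import Counter
from itertools import product

# Sketch: tabulate the distribution of the average of two fair die rolls
# by enumerating all 36 equally likely (first, second) outcomes.
counts = Counter((a + b) / 2 for a, b in product(range(1, 7), repeat=2))
pmf = {y: c / 36 for y, c in sorted(counts.items())}
print(pmf[1.0])  # 1/36 -- only (1, 1) averages to one
print(pmf[3.5])  # 6/36 -- the middle value is six times as likely
```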

So, at any rate, our variable Y itself has a distribution, and you could get a pretty good sense of it. Maybe you could do this by taking two dice, rolling them, taking the average, rolling them again, taking the average, doing that over and over again, and plotting a bar plot of the frequencies of the averages that you get. That would give you a good sense of what the population distribution is. Or you could work out with pen and paper what the distribution actually is. And then once you have that worked out, you could use your expected value formula to calculate the expected value of Y directly, summing over all the possible values of y times p of y.
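That first approach can be sketched in a few lines (an editorial illustration, not the lecturer's code): enumerate the distribution of Y, then apply the summation formula directly.

```python
from collections import Counter
from itertools import product

# Sketch: compute E[Y] = sum over y of y * p(y), where Y is the average
# of two fair die rolls, using the brute-forced distribution of Y.
counts = Counter((a + b) / 2 for a, b in product(range(1, 7), repeat=2))
e_y = sum(y * c / 36 for y, c in counts.items())
print(e_y)  # about 3.5
```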

Another way to do it is to directly use the fact that the expected value is a linear operator. In this case, the expected value of X1 plus X2 divided by two is one half times the quantity expected value of X1 plus expected value of X2, because the one half is a non-random number that we can just pull out, and the expected value distributes across the sum to give expected value of X1 plus expected value of X2. That then yields 3.5 plus 3.5, divided by two, which is 3.5.

Now after hearing this you might be thinking, "Oh, that's interesting: the expected value of the average of two die rolls is exactly the same as the expected value of an individual die roll." And that is exactly the case. But you're probably also wondering, "Does this extend beyond that? Is the expected value of the average of N die rolls equal to 3.5 as well?" And the answer is yes, that's exactly true.

In fact, that's a nice segue into our next slide, where we actually derive the property that we were hinting at on the previous slide: namely, that the expected value of the average of a collection of random variables from the same distribution is the same as the expected value of the individual random variables.

So let's let Xi, for i equal one to n, be a collection of random variables, each from a distribution with mean mu. I just want to also point out that we tend to use Greek letters to represent population quantities; in this case, the population mean of the distribution is mu.

So let's calculate the expected value of the sample average of the Xi. Well, we want the expected value of the sample average, which is one over n times the summation from i equals one to n of the Xi. The one over n pulls out because it's not random. The expected value commutes across the sum. And the expected value of each of those Xi's is itself mu. So we get the summation from i equals one to n of mu, that is, mu added up n times, which is n mu; divided by the n on the outside, we get mu.
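Here is a simulation sketch of that fact (an editorial illustration; the exponential distribution and mu = 2.0 are arbitrary choices, not from the lecture). The average of many sample means lands near mu, whatever the underlying distribution is.

```python
import random

# Sketch: draw many samples of size n from an exponential distribution
# with mean mu, and average the sample means. Unbiasedness says this
# grand average should be close to mu.
random.seed(1)
mu, n, reps = 2.0, 10, 20_000
sample_means = [
    sum(random.expovariate(1 / mu) for _ in range(n)) / n
    for _ in range(reps)
]
print(sum(sample_means) / reps)  # close to mu = 2.0
```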

So what this says is, it doesn't matter what the distribution of the individual X's is: the distribution of the mean of the X's has the same mean as the individual X's. So let me just summarize one more time. The expected value of the sample mean is the population mean that it's trying to estimate. The population mean of the distribution of the sample mean of n observations is exactly the population mean that it's trying to estimate. And when this happens, when the expected value of an estimator is what it's trying to estimate, that's a good thing. We say that the estimator is unbiased. So sample means are unbiased estimators of population means. And again, there were some assumptions for this to be true, right? All the Xi's have to be from a distribution that has mean mu, mu being the value you want to estimate; then the sample mean is an unbiased estimator of the population mean.

We're finally getting to the point where we can talk about how we're going to connect our probability modeling to the data that we observe. We're not quite there yet, but we're getting closer and closer, and I want you to remember that we're throwing around the term "mean" a lot. So if you get confused, I want you to qualify which mean we're talking about: whether it's a population quantity, a component of the probability distribution, or a sample quantity, an empirical quantity that you compute from the data. And remember, our goal in probability modeling is to connect our sample observations to the population using our probability model.

Â