[MUSIC]

We round off week three with a further look at this concept of variance, and

I'd also like to extend our discussion from the previous

session on the normal distribution as well.

So, variance.

When we introduced this as S squared, a few sessions ago,

we were looking at the sample variance of a set of data.

Now what I'd like to consider here is the so-called population variance, i.e.,

the variance of a theoretical probability distribution.

So let's backtrack a little bit and think back to week two,

where we introduced some simple probability distributions.

Let's return to the concept of the score on a fair die.

So remember for that, we had our sample space, the possible values which this

variable x could take, namely those positive integers, one, two, three, four,

five, and six, and we said, if it was a fair die, those six outcomes were each

equally likely, and we developed the probability distribution, and

we assigned a probability of 1/6 to each of those six possible outcomes.

We then introduced the concept of the expectation of X.

We viewed this as an average, effectively a mean, but

this was a mean with respect to some population or theoretical distribution.

So remember, we consider the expectation of X as a probability-weighted average,

whereby we took each value of X,

multiplied it by its respective probability, and added them all together.

So there we found that the expectation of X, where X was the score on a fair die,

was equal to 3.5.
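That probability-weighted average can be sketched in a couple of lines of Python (the die values and probabilities are exactly those from the lecture; the code itself is just an illustration):

```python
# Expectation of X, the score on a fair six-sided die,
# computed as a probability-weighted average.
values = [1, 2, 3, 4, 5, 6]
prob = 1 / 6  # each outcome equally likely on a fair die

expectation = sum(x * prob for x in values)
print(round(expectation, 6))  # 3.5
```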

We also noted that this was never an observable value in any

single roll of the die, rather we viewed this as a long-run average.

So, distinguish the expectation of X, which we might denote by the Greek

letter mu, to indicate a population or theoretical mean, and

contrast that with the sample mean, X bar, which we've seen this week, which is

the mean just of a set of our observations drawn from some wider population.

So having introduced X bar, we then also considered the sample variance

S squared as a measure of dispersion, but again with respect to the sample.

So I think now we are in a position to work out the equivalent

concept of the variance, but at the theoretical level,

the so-called population variance with respect to a probability distribution.

So remember, when I introduced S squared to you, I asked you

to think of it like an average:

the average squared deviation about the mean, and we had our formula,

of course the mean we're talking about here, was the sample mean X bar.

What if we want to work out the variance for a theoretical distribution?

We really want the same kind of concept, i.e., we need an average, i.e.,

an expectation, of the squared deviation about the mean.

So whereas previously our expectation was the expectation of X,

the expectation of that random variable, we still require an expectation,

but now the expectation of X minus mu, all squared, i.e.,

the expected squared deviation of X about the mean.

So just as E of X was a probability-weighted average,

the expectation of X minus mu,

all squared, is also a probability-weighted average.

It's just that now,

we don't multiply the X values by their corresponding probabilities, rather

we multiply the X minus mu squared values by their corresponding probabilities.

So let's revisit the score on a fair die and

calculate the true variance of such a score.

So we know the values of X are one, two, three, four, five and six.

We've already determined that the expectation of X,

which hereafter, we can denote by mu, was 3.5.

So for each value of X, for example, the 1, we subtract mu, so

(1 - 3.5), square that value,

and do a similar operation on the other remaining five values.

We then multiply each of these by the corresponding probabilities of occurrence,

but of course, as this was an assumed fair die,

each of those scores has the same probability of occurrence of 1/6.

So we multiply each one of these by 1/6, and

add them all together, and doing so you will get a total of 2.92,

and this represents the variance for the score on a fair die.
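The same calculation can be sketched in a few lines of Python (the numbers are exactly those worked through above; only the code is mine):

```python
# Population variance of X, the score on a fair die:
# the probability-weighted average of squared deviations about mu.
values = [1, 2, 3, 4, 5, 6]
prob = 1 / 6
mu = sum(x * prob for x in values)  # the expectation, 3.5

variance = sum((x - mu) ** 2 * prob for x in values)
print(round(variance, 2))  # 2.92 (exactly 35/12, about 2.9167)
```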

If we wanted to, we could take the positive square root and

consider the standard deviation of the score on the fair die, but

do be conscious of the notation being applied.

So sigma squared will correspond to a population variance, and

its positive square root, sigma, to the population standard deviation.

Be clear about the conceptual distinction between those quantities,

which are derived from a theoretical probability distribution, and

their sample counterparts, the sample variance S squared and

the sample standard deviation S.

Now we're going to make much more use of these different means, and

variances, and standard deviations as we progress to

the statistical inference part of the course over the next couple of weeks.

But perhaps just a nice way to round off our week three, is to revisit the normal

distribution, because now we have perhaps a clearer understanding about what mu,

the population mean, and sigma squared, the population variance, represent.

So we mentioned in the previous section that really there are an infinite number

of different normal distributions, each characterized by different combinations

of values for those parameters of mu and sigma squared.

Now it would be helpful if we could perhaps have some kind of standardized

normal distribution.

One that is very easy to relate to.

Well such a distribution exists, called the standard normal distribution.

Now because this is so special, we will assign it its own special letter of Z.

So whenever you come across the letter Z in statistical courses,

think in terms of standardized variables.

Now why on Earth are these things of any great importance to us?

Well, first of all, let's define what we mean by a standardized variable.

This is one which has a mean of 0, and variance of 1, and of course,

given the standard deviation as the positive square root of the variance,

if the variance is 1, by extension, so too, is the standard deviation.

So in notation we might say Z, as a standard normal variable,

is distributed as a normal distribution, with a mean of 0, that's the value for

mu in this special case, and a variance, sigma squared, taking the special value of 1.
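To make that concrete: standardizing any variable, by subtracting its mean and dividing by its standard deviation, yields a new variable with mean 0 and variance 1. Here is a sketch using the fair-die distribution from earlier (the standardized die score is of course not normally distributed; the point is just the mean-0, variance-1 property):

```python
# Standardising a variable: Z = (X - mu) / sigma.
values = [1, 2, 3, 4, 5, 6]
prob = 1 / 6
mu = sum(x * prob for x in values)
sigma = sum((x - mu) ** 2 * prob for x in values) ** 0.5

z_values = [(x - mu) / sigma for x in values]
z_mean = sum(z * prob for z in z_values)
z_var = sum((z - z_mean) ** 2 * prob for z in z_values)
# z_mean is 0 and z_var is 1 (up to floating-point rounding)
```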

So why are standardized variables of use to us?

Well, we've previously mentioned the concept of an outlier.

Remember when we were comparing means and medians, and

which one might be a preferable measure of central tendency, we did note

that means are very sensitive to the inclusion of any outliers.

But, as yet, we haven't really offered any sort of formal definition of

what an outlier might be, other than it's a sort of extreme observation.
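One common informal convention, ahead of any formal definition in the course (the cutoff of 2 and the data below are my own assumptions, purely for illustration), is to flag observations whose standardized value is extreme:

```python
# Hypothetical sample with one extreme observation.
data = [4.8, 5.1, 5.0, 4.9, 5.2, 9.7]
n = len(data)
x_bar = sum(data) / n                                        # sample mean
s = (sum((x - x_bar) ** 2 for x in data) / (n - 1)) ** 0.5   # sample sd

# Standardise each observation and flag |z| > 2 as a potential outlier
# (the cutoff 2 is a common rule of thumb, not a formal definition).
flagged = [x for x in data if abs((x - x_bar) / s) > 2]
print(flagged)  # [9.7]
```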