
[MUSIC]

At the end of week two,

we considered a few examples of different families of probability distributions.

You may recall things like the Bernoulli, the binomial, and

the Poisson distributions which we covered.

Now, one of the most important distributions

in statistics is something called the normal distribution.

Now, there's actually a bit of a distinction between those Bernoulli,

binomial and Poisson distributions considered previously, and the normal distribution,

whereby the previous ones were what we call discrete distributions,

such that the values which the variable could take tended to be integer values only.

Now, we've introduced the concept of measurable variables within

this week of the course.

And the normal distribution is a distribution appropriate for measurable

variables which could be measured along a continuum, along a continuous interval.

So I'd just like to show you a few examples of the normal distribution.

Both from a more empirical sampling perspective,

as well as the theoretical characteristics of the normal distribution.

So many of you may have come across the normal distribution but may not have

known its name.

But it's this sort of familiar bell-shaped curve.

So if we return briefly to the previous section where we introduced

the hypothetical returns on two stocks, the red stock and the black stock,

in those histograms

you could see a reasonable approximation to a bell-shaped curve.

Indeed, in finance, quite often it is assumed that the returns of stocks,

shares, equities tend to follow a normal distribution.

However, you may recall back in week one of the course I spoke about the importance

of simplifying assumptions in modeling, but one should always be cautious

about whether these assumptions are truly borne out in reality.

And in fact, there's some debate about whether the returns on stocks truly follow

a normal distribution, or whether in reality the tail areas of the distribution of

returns might perhaps be slightly fatter than those of a normal distribution,

indicating that extreme returns are more likely than a normal distribution might predict.

But ignoring those technicalities, and putting them to one side, one can say that

stock returns such as these follow a normal distribution reasonably well.
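As a minimal sketch of this empirical side, here is a small Python simulation that draws hypothetical normally distributed "returns" and prints a crude text histogram; the mean and standard deviation used are assumed values, purely for illustration, not real market figures. The tallest bar sits at the centre, giving the familiar bell shape:

```python
import random

random.seed(42)

# Hypothetical daily returns: assumed normal with mean 0.05% and s.d. 1%
returns = [random.gauss(0.0005, 0.01) for _ in range(10_000)]

# A crude text histogram: count returns falling into bins one s.d. wide
bins = {}
for r in returns:
    b = round(r / 0.01)          # bin index in units of one s.d.
    bins[b] = bins.get(b, 0) + 1

for b in sorted(bins):
    print(f"{b:+d} | {'#' * (bins[b] // 100)}")
```

Running this, the row of hashes for bin 0 (returns within half a standard deviation of zero) is the longest, tapering off symmetrically on either side.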

Another example, let's take one from medicine: if we collected some data

on the diastolic blood pressure of a large sample of patients and

produced a histogram of this, remembering that histograms are great

visual displays for a single measurable variable.

And if we look at a histogram of such diastolic blood pressures across a large

sample of patients, one can see that a normal bell-shaped

curve could easily be superimposed on top of that histogram.

Now, in this case, we are thinking about drawing a sample of observations

from a wider population. This is not the diastolic blood pressure of every human

being, but rather a random sample drawn from that wider population, and we're going to

discuss more about sampling from populations in the next week of the course.

But ideally any sample of our observations we observe should be fairly

representative of that wider population.

So it might be reasonable to assume,

that's our idea of making an assumption again, that perhaps the diastolic

blood pressure of all people follows a normal distribution.

And this large random sample we've drawn from that population roughly reflects

those same population characteristics within our sample or histogram.

So there could be many situations in life where a normal distribution could

reasonably be assumed, perhaps just one more variation of that.

If we return to our look at GDP per capita, with the histogram reproduced for

this across a large sample of countries, here

we see a distribution which is very much non-normal.

Indeed, what we would call a heavily skewed distribution.

A large proportion of countries have a very low GDP per capita, with perhaps one or

two outliers which on a GDP per capita basis are very wealthy indeed.

So income here on a GDP per capita basis,

clearly does not tend to follow a normal distribution.

Indeed, if we weren't looking across countries but

rather within a country itself, I think it's fair to say that a similarly

heavily positively skewed distribution could be expected to be found.

Namely, the vast majority of people in a country earning fairly

modest incomes, with just a select few earning a very high income.

These might be, for example,

those professional footballers, the CEOs of some top companies among others.

So although income clearly does not tend to follow a normal distribution,

sometimes we may be able to apply an appropriate transformation to

a variable in order to make it sort of converge more to normality.

Now, these are little tricks in modeling which you may come across in more advanced

courses.

But quite often, we may not work with income directly, but

the logarithm of income.

And sometimes this so-called log transformation, taking the original

variable and applying the logarithm to it, can turn some distributions which

appear very non-normal into ones which seem more normal. And indeed, for

those very much interested in this, perhaps read up a little bit more on

so-called log-normal distributions.
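As a small illustration of that log transformation, here is a sketch in Python assuming hypothetical log-normally distributed incomes; the parameters are invented for illustration only. The raw incomes are heavily right-skewed, with the mean pulled well above the median by the few very large values, while the logged incomes are roughly symmetric:

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical incomes: log-normal, i.e. log(income) ~ N(10, 0.8^2)
incomes = [math.exp(random.gauss(10.0, 0.8)) for _ in range(10_000)]

# Raw incomes are right-skewed: the mean sits well above the median
print(statistics.mean(incomes), statistics.median(incomes))

# After the log transformation the skew largely disappears:
# mean and median of log(income) nearly coincide
log_incomes = [math.log(x) for x in incomes]
print(statistics.mean(log_incomes), statistics.median(log_incomes))
```

Comparing mean against median is a quick skewness check: for a symmetric (for example, normal) distribution they roughly coincide, while for a right-skewed one the mean exceeds the median.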

So in short, there are many situations in the real world where we could potentially

assume normality, either directly or perhaps through some log transformation.

But up to now, we've just considered some sample data sets.

I'd like to end this section

by considering the more theoretical characteristics of a normal distribution.

Now, we introduced, within this week, for example,

the letter X to denote some random variable.

We've already considered X as, say, stock returns; the height of human beings

would perhaps be another great example where we might assume normality.

Think of all the people you know.

You will perhaps know a few very tall people, a few very short people.

But pretty much everyone else of more sort of moderate height.

And if you do the histogram of the heights of everyone you knew,

you're likely to get something resembling a normal distribution.


So let's say X represented the height of people.

What's our formal notation to say that X follows some normal distribution?

Well, I would write it as follows: X, then this tilde sign,

which we would translate as "is distributed as",

then N for the normal, and then a couple of Greek letters.

Now, these represent the parameters of the normal distribution.

Now, recall we introduced the concept of parameters back in week two.

For example with the Bernoulli distribution.

You may recall that pi, the probability of success,

represented the parameter of the Bernoulli family of distributions.

For the binomial family, we had two parameters, the number of trials, N, and

also, this constant probability of success across those N trials.

Well, the normal distribution is another distributional family, but

particularly appropriate for measurable or continuous kinds of random variables.

And this is a two parameter family, whereby each member of this

family of normal distributions is distinguished by these two parameters,

the mean and the variance, which we're going to use Greek letters for.

Namely Mu to represent the so called population mean, the expected value

of X as well as sigma squared representing the variance of this distribution.
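Written out, the notation just described reads:

```latex
X \sim N(\mu, \sigma^2), \qquad E(X) = \mu, \qquad \operatorname{Var}(X) = \sigma^2
```

where mu is the population mean (the expected value of X) and sigma squared is the population variance.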

Now, within week three, we considered those measures of central tendency such

as the mean also measures of dispersion such as the sample variance.

But the x-bar sample mean and the s-squared sample variance

were both with respect to a sample, a set of data.

Here though, theoretically,

we are considering things more at the population level.

So imagine we were looking at the heights not just of maybe 100 people randomly

chosen from humanity, but of the entire human race. Then we might say that X,

the height of humans, follows approximately a normal distribution with

a mean of mu and a variance of sigma squared.

So this mu would be the mean, the average, not of a sample, but

the average across the entire population.

The sigma squared would be the variance, not of a sample, but

the variance of height across the entire population.

Now, as you can perhaps anticipate, it may not be very easy necessarily,

to gather data on the heights of every human being on Earth, and

hence calculate the true population mean and the true population variance.

Indeed, this is a major problem and we'll address these issues next week,

when we move onto the realm of point estimation.
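As a simulated preview of that idea, here is a sketch assuming a hypothetical normal population of heights with mu = 170 cm and sigma squared = 64 (the figures are invented for illustration). The sample mean and sample variance of a large random sample land close to those population values, which is the intuition behind point estimation:

```python
import random
import statistics

random.seed(1)

mu, sigma = 170.0, 8.0   # hypothetical population mean and s.d. of height (cm)

# Draw a large random sample from the assumed N(mu, sigma^2) population
sample = [random.gauss(mu, sigma) for _ in range(10_000)]

x_bar = statistics.mean(sample)       # sample mean, estimates mu
s_sq = statistics.variance(sample)    # sample variance, estimates sigma^2
print(x_bar, s_sq)                    # close to 170 and 64, respectively
```

In practice, of course, we only ever see the sample, not the population parameters; the simulation just lets us check that the sample statistics track the values we built in.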

But nonetheless, a normal distribution is characterized by its mean, its measure of

central tendency, i.e., where is the center of that normal distribution.

And also, sigma squared the variance, how spread out that normal distribution is.

So as we vary either one or

both of those parameters, this would give rise to different normal distributions.

Now, I did mention back in week two the concept of parameter space,

the possible values which a parameter can take.

Well, mu, the population mean, can in fact be any real number.

It can span anywhere from minus infinity to plus infinity.

But sigma squared, being a variance and perhaps to extend

our variance discussion from earlier, well, variances can never be negative.

Now, when we worked this out for the sample variance in the previous section,

remember we were taking an average of the squared deviations about the mean.

Well, when we take squared deviations, those values can never be negative.

So both sample variances, and by extension population variances as well,

can never be negative, and for the normal distribution the variance

sigma squared is strictly greater than zero.
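A quick numerical check of that point, using a small set of hypothetical data values: every squared deviation is non-negative, so the sample variance, which averages them, cannot be negative either:

```python
# Hypothetical data values, purely for illustration
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

x_bar = sum(data) / n                        # sample mean (here 5.0)
sq_devs = [(x - x_bar) ** 2 for x in data]   # squared deviations, each >= 0

s_sq = sum(sq_devs) / (n - 1)                # sample variance, n - 1 divisor
print(s_sq)                                  # 32/7, about 4.57
```

The only way the variance can reach zero is if every observation equals the mean, so any data set with any spread at all has a strictly positive variance.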

So there are an infinite number of normal distributions each

distinguished by its mean and its variance.

So the normal distribution is arguably the most widely used,

and the most important, distribution in statistics, for a variety of reasons.

Namely, it can adequately represent a wide variety

of real-world phenomena, such as the heights of people.

Potentially, the returns on various stocks.

And we'll also meet it again as perhaps

a distributional assumption within different types of models.

We'll briefly touch on something called regression in our

final week of the course.

As well as something called the central limit theorem.

Now, this is really at the heart of a lot of the statistical inference procedures

which we're going to be looking at in the next couple of weeks of the course.

So I hope you appreciate that the normal distribution is

perhaps the most important one we could think of within statistics,

and we're going to see much more use of the normal distribution going forward.

[MUSIC]