0:00

In this lecture, we're going to cover some basic distributional results.

And before I cover the instance where we're talking about a response and a predictor, let me just talk about some results where X is normal with a mean vector mu and a variance matrix Sigma. So X, I'll assume, is n by 1. Now, we all know that if we take Sigma to the minus one-half times (x minus mu), that's normal (0, I). So if I were to take (x minus mu) transpose Sigma inverse (x minus mu), well, that's merely the inner product of that vector with itself. So it's the sum of a bunch of squared iid standard normals, which means it has to be chi squared n.
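As a quick numerical sanity check of this result (not part of the original lecture), here's a short NumPy sketch; the mean vector, covariance matrix, and seed are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Illustrative mean vector and invertible covariance matrix.
mu = np.arange(1.0, n + 1)
B = rng.standard_normal((n, n))
Sigma = B @ B.T + n * np.eye(n)  # symmetric positive definite

# Draw many samples of x ~ N(mu, Sigma) and form (x - mu)' Sigma^{-1} (x - mu).
m = 200_000
x = rng.multivariate_normal(mu, Sigma, size=m)
d = x - mu
q = np.einsum("ij,jk,ik->i", d, np.linalg.inv(Sigma), d)

# A chi-squared n variable has mean n and variance 2n.
print(q.mean(), q.var())
```

The sample mean and variance of the simulated quadratic form should land close to n and 2n, the moments of a chi squared n distribution.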

But we can actually make a stronger statement about quadratic forms. Let A be an n by n symmetric matrix, not necessarily of full rank; say the rank of A is p, which is not necessarily equal to n. Now consider the quadratic form (x minus mu) transpose A (x minus mu). Well, that will be chi squared p, where p is again the rank of A, if and only if A Sigma is idempotent.

1:38

So let's prove this result, and it's surprisingly easy to prove. Well, we're going to prove one direction at least.

So the fact that A Sigma is idempotent means that A Sigma times A Sigma is equal to A Sigma. But I'm going to rewrite this in a way that's going to be useful for me, by getting rid of one of those Sigmas. Remember, we're assuming that x is a nonsingular normal vector, so that Sigma is invertible. So that means I can write that A Sigma A is equal to A.

And then let me write the eigenvalue decomposition of A as V D V transpose, where D is a p by p diagonal matrix of the nonzero eigenvalues, and V is n by p, so V transpose is p by n. Because it's the eigenvalue decomposition, V transpose V is equal to I, the p by p identity matrix.

So take the statement A Sigma A equals A, and write it out using the eigenvalue decomposition. I get V D V transpose Sigma V D V transpose is equal to V D V transpose. But now imagine that I premultiply this by V transpose and postmultiply it by V; I'd have to do that same operation on both sides. And I would get that D V transpose Sigma V D is equal to D.

3:34

Now, I regret this notational decision; it would have been cleaner to have written the decomposition with D squared. But now I want to multiply both sides by D to the minus one-half, on the left and on the right. The square root here is easy to think about, because D is diagonal: it's just the square root of each element down the diagonal. So I get D to the one-half V transpose Sigma V D to the one-half is equal to I. Okay, so our idempotence condition and our eigenvalue decomposition imply this relationship.
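To see this identity numerically (my own illustration, not from the lecture), here is one way to manufacture a symmetric rank-p matrix A with A Sigma idempotent, and then check that D to the one-half V transpose Sigma V D to the one-half comes out as the p by p identity; the construction A = S P S with S the symmetric inverse square root of Sigma and P a rank-p projection is an assumed recipe for the demo:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 4, 2

B = rng.standard_normal((n, n))
Sigma = B @ B.T + n * np.eye(n)

# Illustrative construction of a symmetric rank-p A with A Sigma idempotent:
# A = S P S, where S is the symmetric inverse square root of Sigma and
# P = Q Q' is a rank-p orthogonal projection.
w, U = np.linalg.eigh(Sigma)
S = U @ np.diag(w ** -0.5) @ U.T
Q = np.linalg.qr(rng.standard_normal((n, p)))[0]  # orthonormal n x p
A = S @ (Q @ Q.T) @ S

AS = A @ Sigma
assert np.allclose(AS @ AS, AS)  # A Sigma is idempotent

# Eigendecomposition of A, keeping only the p nonzero eigenvalues.
vals, vecs = np.linalg.eigh(A)
keep = np.abs(vals) > 1e-10
D = np.diag(vals[keep])  # p x p
V = vecs[:, keep]        # n x p, with V' V = I

# The derived identity: D^{1/2} V' Sigma V D^{1/2} should be the p x p identity.
M = np.sqrt(D) @ V.T @ Sigma @ V @ np.sqrt(D)
print(np.round(M, 8))
```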

4:30

Now consider the vector D to the one-half V transpose times (x minus mu). Well, that's for sure going to have a normal distribution, because whenever we have a normal vector and we just do linear operations to it, it continues to be normal. Its mean is going to be 0; I think that's pretty easy to see. And its variance is just going to be D to the one-half V transpose Sigma V D to the one-half. So the variance of this vector is exactly the quantity that we showed above is equal to I, the p by p identity matrix. So this quantity right here is exactly normal (0, I).

But now if I were to take this quantity times its own transpose, I'd get (x minus mu) transpose V D to the one-half D to the one-half V transpose (x minus mu), and that's equal to (x minus mu) transpose A (x minus mu).

So what does this imply? This implies that this quadratic form is the sum of p squared iid standard normals, and it is therefore chi squared p.
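A simulation makes the conclusion concrete (again my own sketch, with an assumed construction of a rank-p A satisfying the idempotence condition, and arbitrary seed and dimensions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 5, 2

mu = np.zeros(n)
B = rng.standard_normal((n, n))
Sigma = B @ B.T + n * np.eye(n)

# Illustrative rank-p A with A Sigma idempotent: A = S P S with
# S the symmetric inverse square root of Sigma, P a rank-p projection.
w, U = np.linalg.eigh(Sigma)
S = U @ np.diag(w ** -0.5) @ U.T
Q = np.linalg.qr(rng.standard_normal((n, p)))[0]
A = S @ (Q @ Q.T) @ S

# Simulate the quadratic form (x - mu)' A (x - mu); here mu = 0.
m = 200_000
x = rng.multivariate_normal(mu, Sigma, size=m)
q = np.einsum("ij,jk,ik->i", x, A, x)

# Compare with chi-squared p: mean p, variance 2p.
print(q.mean(), q.var())
```

Even though A has rank 2 inside a 5-dimensional problem, the simulated quadratic form matches the chi squared p moments, not chi squared n.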

Let's go through an example, where now the variance is not a general Sigma matrix but just sigma squared down the diagonal. I'm going to reuse some notation here, so I'm drawing a horizontal line to represent the fact that I'm now going to start over on notation for this lecture. I'm going to assume that y, my outcome, is normally distributed with mean X beta and variance sigma squared I.

6:34

Now consider the sum of the squared residuals: e transpose e divided by sigma squared. Okay, that's equal to, and I hope this is old hat for everyone now, y transpose times (I minus the hat matrix) times y, over sigma squared. But I could just as easily write this as (y minus X beta) transpose times (I minus the hat matrix) times (y minus X beta), over sigma squared. And the reason I can do that is because (I minus the hat matrix) times X is 0, so adding those terms in doesn't change anything.

Well, what is this? This is a normal vector minus its mean, in a quadratic form with a matrix in the middle. So, according to my result, this will be chi squared if and only if the product of that matrix and the variance matrix is idempotent. So I take (I minus H of X) over sigma squared, which is my A matrix from the notation before, and multiply it by the variance matrix from before, which I labeled Sigma; here that's sigma squared I. That product is (I minus H of X) over sigma squared times sigma squared I, which is equal to I minus H of X. And we've seen on many occasions that that's idempotent.

And let's go through an argument about the rank. So the rank of this matrix, the rank of I minus H of X: the rank of a symmetric idempotent matrix is its trace, so the rank equals the trace. That's the trace of I minus the trace of H of X, where H of X is X times (X transpose X) inverse times X transpose. The trace of I, remember this is an n by n matrix, is n. And for the trace of H of X, I can cycle, because the trace of AB is the trace of BA: I can take the trace of (X transpose X) inverse times X transpose X. So this is equal to n minus the trace of a p by p identity matrix, which is equal to n minus p. Now, that's the rank of I minus H of X. And the rank of (I minus H of X) over sigma squared is the same thing, because I've just multiplied by a scalar.
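Both facts about I minus H of X, its idempotence and its trace, are easy to check numerically; here's a small sketch with an assumed design matrix (an intercept plus random predictors, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 3

# Illustrative design matrix: an intercept plus p - 1 random predictors.
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
H = X @ np.linalg.inv(X.T @ X) @ X.T  # the hat matrix

IH = np.eye(n) - H
assert np.allclose(IH @ IH, IH)  # I - H is idempotent

print(np.trace(IH))  # equals n - p, here 47
```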

And so what we get, according to this result, is that our residuals, e transpose e divided by sigma squared, are exactly chi squared n minus p. Another way to write this is that (n minus p) times S squared, our variance estimate, divided by sigma squared, is chi squared n minus p. And notice, as a special case of this, we get the ordinary chi squared result for normal data with just a mean; that's the case where we have only an intercept in our linear regression model. This simply proves that that quantity is chi squared with n minus 1 degrees of freedom, exactly like we show in an introductory statistics class, okay?
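A simulation of the residual result rounds things out (my own illustration; the design, coefficients, and noise level are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, sigma = 20, 3, 2.0

# Illustrative design, coefficients, and noise level.
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
beta = np.array([1.0, -2.0, 0.5])
H = X @ np.linalg.inv(X.T @ X) @ X.T

# Simulate many datasets y ~ N(X beta, sigma^2 I) and collect e'e / sigma^2.
m = 100_000
y = X @ beta + sigma * rng.standard_normal((m, n))
e = y @ (np.eye(n) - H)          # residuals (I - H) y, one dataset per row
q = (e ** 2).sum(axis=1) / sigma ** 2

# Should match chi-squared n - p: mean n - p, variance 2(n - p).
print(q.mean(), q.var())
```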

This is a very handy result for proving chi squared results for general quadratic forms, and we'll find it very useful throughout the class.
