So now we'll think about what two stage least squares is, and

why it might work.

So, two stage least squares is a method for

estimating a causal effect in an instrumental variables setting.

So first, we'll assume that Z is a valid instrumental variable, so

it affects treatment and the exclusion restriction is met.

And so, what is stage 1?

So, two stage least squares is well named, because there's two stages.

So in stage 1, what we'll do is regress the treatment received, A, on

the instrumental variable, Z.

And so here, our error term is assumed to be independent,

with mean 0 and constant variance.

And we randomized Z, or we've assumed Z is randomized.

It's an instrument, so we've assumed it's randomized.

So, because of that Z and the error term should be independent.

So we have what looks like a standard kind of linear model.

So Z is related to A.
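
Written out, the stage 1 model being described would look something like the following. The names alpha 0 and alpha 1 come from the lecture; the symbols used for the error term and its variance are notation assumed here for illustration.

```latex
A_i = \alpha_0 + Z_i \alpha_1 + \varepsilon_i,
\qquad
E[\varepsilon_i] = 0,
\quad
\operatorname{Var}(\varepsilon_i) = \sigma^2,
\quad
\varepsilon_i \text{ independent of } Z_i
```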

And then we can obtain a prediction.

So we can estimate these two parameters, we have two parameters, alpha 0 and

alpha 1.

We can estimate those using least squares.

And we'll call those alpha-zero-hat and

alpha-one-hat, and then we can obtain a predicted value of A for each person.

So we'll call that A-hat.

So A-hat i is a predicted value of treatment for subject i.

And that can be just written as alpha 0 hat plus Zi times alpha 1 hat,

so Zi is the value of Z for person i.
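
As a sketch, the prediction just described can be written as below, together with the usual closed-form least squares estimates for a regression with one predictor. The bar notation for sample means is assumed here and is not taken from the lecture.

```latex
\hat{A}_i = \hat{\alpha}_0 + Z_i \hat{\alpha}_1,
\qquad
\hat{\alpha}_1 = \frac{\sum_i (Z_i - \bar{Z})(A_i - \bar{A})}{\sum_i (Z_i - \bar{Z})^2},
\qquad
\hat{\alpha}_0 = \bar{A} - \hat{\alpha}_1 \bar{Z}
```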

So in other words,

what we're doing here is we're getting a predicted value of A given Z.

So person i, for example, will have a particular value of Z.

And then what we're calling A-hat is what we would predict,

based only on Z, their treatment would be, right?

So this isn't treatment received, this is just based on their instrument,

based on the value of the instrument.

What do we think they would get?

So that's what A-hat is.

So in stage one, you just carry out a standard regression,

standard least squares, and then you can get a predicted value of the outcome.

The outcome here is treatment received.

So stage one is very standard, you can just use regression software and

get a predicted value for each person.
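
To make stage 1 concrete, here is a minimal sketch in Python using NumPy. The function name stage_one, the simulated data, and the probabilities 0.2 and 0.7 are all hypothetical choices made for illustration; the lecture does not specify any software or data. It fits the regression of A on Z by least squares and returns a predicted treatment value for each subject.

```python
import numpy as np


def stage_one(z, a):
    """Regress treatment received A on instrument Z by ordinary least squares.

    Returns (alpha0_hat, alpha1_hat, a_hat), where a_hat[i] is the
    predicted treatment for subject i based only on z[i].
    """
    z = np.asarray(z, dtype=float)
    a = np.asarray(a, dtype=float)
    # Design matrix: an intercept column and the instrument.
    X = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(X, a, rcond=None)
    alpha0_hat, alpha1_hat = coef
    # Predicted treatment: alpha0_hat + Z_i * alpha1_hat.
    a_hat = alpha0_hat + z * alpha1_hat
    return alpha0_hat, alpha1_hat, a_hat


# Simulated data (hypothetical): Z is a randomized binary instrument that
# raises the probability of receiving treatment from 0.2 to 0.7.
rng = np.random.default_rng(0)
n = 1000
z = rng.integers(0, 2, size=n)
a = rng.binomial(1, 0.2 + 0.5 * z)

alpha0_hat, alpha1_hat, a_hat = stage_one(z, a)
print("alpha0_hat:", alpha0_hat)
print("alpha1_hat:", alpha1_hat)
```

With a binary instrument like this one, A-hat takes only two values: the observed proportion treated among subjects with Z = 0, and the observed proportion treated among subjects with Z = 1.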