This video is on compliance classes. We begin by looking at how potential values of treatment can be used to classify subjects into what are known as principal strata. We'll also aim to understand what is meant by compliers, and then we'll distinguish between population causal effects and local causal effects. So as a reminder, we have this idea of potential values of treatment. The potential values of treatment are, in this case, A0 and A1, where A0 is the treatment that you would receive if you were randomized to the control condition. In other words, if the instrument was Z = 0, or if you were randomized to not receive encouragement, essentially. So we're imagining that everyone has this potential value, A0. It's the treatment that you would receive if you had been randomized to the Z = 0 group. Whereas A1 is the value of treatment that you would receive if you were randomized to the Z = 1 condition, that is, if you had been assigned treatment. Right, so these are potential values of treatment. And we imagine these exist for everybody, we just don't necessarily see them, and we never see both of them for the same person. But what we can do is take this pair, A0 and A1, and then classify people or label people based on their pair of values, okay? So if we look at this first row, for example. The first row is people who, if they were randomized to the control condition, would not take treatment. So A0 equals 0 means that if they were randomized to the control condition, they would not take treatment. But for these individuals also, if they were randomized to receive the treatment, so if they were in the Z = 1 arm, they still would not take the treatment. So we'll call these never-takers, and that of course just means that they never take the treatment. So no matter what you assign them, whether you assign them to the control group or the treatment group, they're just never going to take the treatment.
So they could be people who are just not interested in that treatment, for whatever reason. And you can do the same thing with the other possible combinations. So you have people who, if they were assigned to the control condition, don't take treatment. But if they were assigned to treatment, they do take it, and we'll call them compliers. So these are people who are doing what they're assigned to do. And then we also have two other groups. There are the defiers, who do the opposite of what they're told. So if you assign them to the control condition, they take the treatment. And if you assign them to the treatment condition, then they don't take treatment. And finally, there are the always-takers, and they just always take treatment no matter what they're assigned. So this sort of layout is also what's known as part of the Rubin causal model. And it's also a general kind of approach that's known as principal stratification. So we'll briefly talk about each of these subpopulations. But one thing to note is to really think of these as subpopulations of people. So think of never-takers as one group of people. Again, these are people who, no matter what they're assigned, are not going to take treatment. So encouragement for this group does not work. One thing to note about this population is that, even if we knew who this population was, we wouldn't be able to learn anything about the causal effect of treatment for them, right? Because there would be no actual variation in treatment received. In this population, they never take treatment, so we would never observe an outcome under treatment for anyone in this group. All right, so there's no way from data that we could learn about the causal effect of treatment for this population, at least without making some other strong assumptions. So in general, we don't have variability in treatment in this group, and we don't have much hope of learning about the causal effect of treatment for that group.
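The four-row layout described above can be sketched as a small lookup. This is only an illustration: the function name `compliance_class` and the dictionary are made up here, not something from the lecture.

```python
def compliance_class(a0: int, a1: int) -> str:
    """Label a subject by the pair of potential treatments (A0, A1).

    a0: treatment the subject would take if assigned Z = 0 (control)
    a1: treatment the subject would take if assigned Z = 1 (treatment)
    """
    labels = {
        (0, 0): "never-taker",   # takes treatment under neither assignment
        (0, 1): "complier",      # does what they are assigned to do
        (1, 0): "defier",        # does the opposite of the assignment
        (1, 1): "always-taker",  # takes treatment under both assignments
    }
    return labels[(a0, a1)]


# Print the four rows of the table.
for a0 in (0, 1):
    for a1 in (0, 1):
        print(f"A0={a0}, A1={a1}: {compliance_class(a0, a1)}")
```

Keep in mind that in real data we never see both A0 and A1 for the same person, so this function could only be applied in a thought experiment or a simulation.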
Then we have this population of compliers. So they take treatment when they're encouraged to, and they don't otherwise. So treatment received for this group is always equal to treatment assigned. So in this group, we get variation in treatment received. In this population, some people will take the treatment and some people won't, and it's entirely based on this coin flip. It's based on randomization. So this is a population where we have a lot of hope of learning about a causal effect of treatment, since we're directly randomizing Z, treatment assignment, and they do what they're told. So we actually are directly randomizing treatment received in this population. So this is a population that we hope we can learn something about. Then we have defiers, and they do the opposite of what they're encouraged to do. So in this group you could think of treatment received as randomized, but just sort of in the opposite way than you intend. So there's still sort of randomization happening here, but they're just doing the opposite of what they were told. So in principle, we also could hope to learn about the causal effect of treatment in this group. As we'll see later, this is a group that we tend to think would either be very small or not exist. In some cases, like in randomized trials, the group assigned to no treatment, or to the control condition, might not even have access to the treatment. So it could be some new drug, and they might not even have access to it, for example. Or it could be some specific intervention that they wouldn't have access to. So in some cases, we will imagine this population doesn't exist. But if it does exist, typically we'll think it's pretty small. Because if Z is encouragement, we wouldn't think there are many people who would do the opposite of what they're told. So if you're a parent and you have teenagers, perhaps your encouragement would cause them to do the opposite. But in general, we think these are probably pretty unusual.
Then there are the always-takers, so they always take treatment. No matter what, whether you encourage them or not, they take treatment. And again, this is a group where there's no variation in treatment received. All right, so we don't expect to have information about the causal effect in this population. So one of the motivations for using instrumental variable methods is this concern about unmeasured confounding. And just as a reminder, if there is unmeasured confounding, then we can't average or marginalize over all the confounders, right? Because some of the confounders are unobserved, all right? So we can't condition or match on these confounders and then average over them, because we don't observe them all. I'm emphasizing this here because the causal methods that we focused on in other videos really do focus on causal effects in the whole population, right. But if you have unmeasured confounding, it would be very difficult to actually obtain causal effects for the whole population, considering that really the way to do that is to average over all confounders. So instrumental variable methods are not going to focus on the average causal effect for the whole population. So remember, with instrumental variable methods, we're hoping that they can still be used to estimate a valid causal effect even if there's unmeasured confounding. But as I just mentioned, if you have unmeasured confounding, you really shouldn't expect to be able to estimate a causal effect for the whole population. So instrumental variable methods are not going to do that. Instrumental variable methods are going to try to estimate a causal effect locally. That's what we'll call a local average treatment effect, and I'll define what we mean by that. So here's what's meant by a local average treatment effect. In instrumental variable methods, our target of inference is going to be this. So the first thing to note is we're contrasting means of potential outcomes.
So the potential outcomes here are under either Z equal to 1 or Z equal to 0. So what you're assigned to, or what you're prescribed, what you're randomized to. And it's a valid causal effect because we're comparing the same subpopulation. So you'll notice that what we're conditioning on, here and here, is exactly the same. So it's the same subpopulation of people. Any time you contrast potential outcomes in the same subpopulation of people, you have a valid causal effect. So now what we need to do, though, is think about who this subpopulation of people is. So I'll just clear this off a little bit so you can see better. What you'll notice is that A0 = 0 and A1 = 1. That's actually just the population of compliers, right? So we're looking at the subpopulation of people who, if they were prescribed treatment or assigned treatment, would take it. So that's A1 = 1. But if they were assigned to the control condition, then they wouldn't take it, that's A0 = 0. We're conditioning on that. So we're saying, restrict to the subpopulation of people who, if assigned treatment, take it, and if not assigned treatment, don't take it. Well, that's the subpopulation of compliers. So what we're really looking at here is the average causal effect of treatment assigned in the population of compliers. So this is what we mean by local. It's local in the sense that it's an inference about a subpopulation, and the subpopulation happens to be compliers. And it actually turns out that, if you restrict to the subpopulation of compliers, I can do this little thing here where I go from indexing by Z to indexing by A. So now I'm actually comparing potential outcomes based on treatment received, as opposed to treatment assigned. And so as a thought experiment, or just something to think about, I'm asking why. Why was I able to just swap what I'm indexing the potential outcomes by? So I went from Z = 1 to A = 1, and from Z = 0 to A = 0.
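Written out, the target of inference described above looks roughly like the following. The slide itself is not part of this transcript, so the notation here is an assumption: superscripts index potential outcomes and potential treatments, and CACE is just a label for the quantity.

```latex
\begin{align*}
\text{CACE}
  &= E\left(Y^{Z=1} \mid A^{0}=0,\, A^{1}=1\right)
   - E\left(Y^{Z=0} \mid A^{0}=0,\, A^{1}=1\right) \\
  &= E\left(Y^{Z=1} - Y^{Z=0} \mid \text{compliers}\right) \\
  &= E\left(Y^{a=1} - Y^{a=0} \mid \text{compliers}\right)
\end{align*}
```

The first two lines index potential outcomes by treatment assigned, and the last line indexes them by treatment received.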
So now, this would be a good place to pause the video, think about it for a minute or two, and come up with your answer. I'll assume you've done that. Well, remember that we're restricting to the subpopulation of compliers. Compliers are people who do what they're told. So Z = 1 for compliers is always going to correspond exactly to A = 1. And for a complier, Z = 0 is always going to correspond exactly to A = 0. Meaning Z = 0 implies A = 0 for compliers, and Z = 1 implies A = 1 for compliers. So as long as I'm restricting to the subpopulation of compliers, I can use Z and A interchangeably. So as I mentioned, this is a causal effect. It's contrasting counterfactuals in a common population, and in fact, it's a causal effect of treatment received, right? Because you see that I have potential outcomes indexed by treatment received, so it is a causal effect of treatment received. But it's in a subpopulation, so it's local. It's only looking at compliers. Sometimes this is referred to as a local average treatment effect. That's kind of the general term for causal effects in subpopulations. But in this specific situation, where you are in a randomized trial with noncompliance, this is referred to as the complier average causal effect, sometimes abbreviated CACE. So you'll notice that what I'm saying is that, in an instrumental variable analysis, the target of inference is the causal effect of treatment received among compliers. And you'll notice that we're not talking about defiers, always-takers, or never-takers. So we're not actually, in instrumental variable analysis, going to make inference about those populations. And we'll discuss that more in other videos. But hopefully it's especially clear why we won't be making inference about always-takers and never-takers. And that's because for those groups, there's no variation in treatment received.
And so we shouldn't expect to learn anything about the causal effect in those subpopulations. Okay, so that was the concept. We have a target of inference, but that target of inference involves potential outcomes. So we need to start thinking about observed data, because at some point we're going to have to go from observed data to potential outcomes. We're going to have to estimate this. So it would be good to think about observed data here. What I'm writing down here is what we observe, and then also these potential treatments. So one thing we observe is Z, that's treatment assigned. Another thing we observe is A, treatment received. And you'll notice that what I've done in these two columns is put all four possible combinations of those two variables. So it could be 00, 01, 10, or 11. Think of these as the four combinations that you could observe in your data. And now I'm also going to write down what the corresponding potential treatments are. So the first one, this one here, is the potential treatment if assigned to control. And then the other one is the potential treatment if assigned to treatment. And so you'll notice that if you're in the Z = 0, A = 0 group, we observe this one, right? So this is somebody who was actually assigned to the control condition. And in this case, they did what they were told, they didn't actually take treatment. So we know that A0 is equal to 0. But we don't know what they would have done had they been assigned to treatment, so we don't know A1. So all we know about this group of people is that they're either never-takers, right, they might be people who just would never take the treatment regardless of what they were assigned, or they might be compliers, they might always do what they're told. So we've narrowed it down to two choices, but we don't know for sure which one they are. So you can do a similar exercise with these other cases.
So take somebody who was assigned to the control condition but actually took treatment, right? Well, we would know that if assigned Z = 0, they take treatment, which means they're either always-takers or defiers. If they're always-takers, that would mean A1 equals 1. If they were defiers, that would mean A1 equals 0. But we don't know which one they are. Then you could do the same thing where you have Z = 1, A = 0. In that case, they were assigned treatment and they didn't take it. So they're either never-takers or defiers. And then finally, if you're assigned treatment and take it, then we know you're either an always-taker or a complier. So you see that from the observed data, we can narrow it down to two options for each person. We don't know which group you're in for sure, but we can narrow it down to two options. So what I'm calling compliance classes are also known as principal strata. These things that we're labeling never-takers, compliers, defiers, and so on, are things that you could stratify on. And in fact, to get the causal effect we want, we stratify on compliers. Those are the principal strata. We could also think of those as compliance classes, or as classes or labels based on characteristics of people. But they're latent, meaning they're unobserved, right? So we don't actually know which of these groups you're in. We can only narrow it down to two based on your observed data, all right? So that creates a challenge, because we want to make inference about the complier average causal effect. That's a subpopulation of people, and we don't actually know for sure who they are. So how do we estimate a complier average causal effect? In future videos, we're going to discuss how we would actually estimate it and what assumptions we would need to do it.
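The narrowing-down exercise above, going from an observed (Z, A) pair to the two compatible compliance classes, can be sketched as follows. The names `CLASSES` and `possible_classes` are hypothetical and only for illustration.

```python
# Potential treatments (A0, A1) that define each compliance class.
CLASSES = {
    (0, 0): "never-taker",
    (0, 1): "complier",
    (1, 0): "defier",
    (1, 1): "always-taker",
}


def possible_classes(z: int, a: int) -> list:
    """Return the classes consistent with observing treatment a under assignment z.

    Observing (z, a) reveals only one potential treatment: A0 if z == 0,
    A1 if z == 1. The other one stays unknown, so two classes remain.
    """
    return sorted(
        label
        for (a0, a1), label in CLASSES.items()
        if (a1 if z == 1 else a0) == a
    )


for z in (0, 1):
    for a in (0, 1):
        print(f"Z={z}, A={a}: {possible_classes(z, a)}")
```

Each observed combination comes back with exactly two candidate classes, which is why the classes are latent: the observed data alone never pins down a single one.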
So the purpose of this video was really to set up the problem and to think about why we care about the causal effect among the compliers. And now we see that there is a challenge in terms of estimation that we'll have to address.
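To make these ideas concrete, here is a minimal simulation sketch in which, unlike in real data, the compliance classes are known by construction. Everything in it is an assumption made for illustration: the stratum proportions, the absence of defiers, the constant treatment effect of 2.0, and the names `STRATA`, `simulate`, and `mean`. It is not an estimation method, since it uses the latent class labels directly.

```python
import random

# Potential treatments (A0, A1) per stratum. Defiers are left out,
# which is an assumption of this sketch.
STRATA = {"never-taker": (0, 0), "complier": (0, 1), "always-taker": (1, 1)}


def simulate(n=20000, effect=2.0, seed=0):
    """Simulate a randomized trial with noncompliance and known strata."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Hypothetical stratum proportions: 30% / 50% / 20%.
        stratum = rng.choices(list(STRATA), weights=[0.3, 0.5, 0.2])[0]
        a0, a1 = STRATA[stratum]
        z = rng.randint(0, 1)                  # randomized assignment
        a = a1 if z == 1 else a0               # treatment actually received
        y = effect * a + rng.gauss(0.0, 1.0)   # outcome depends on treatment received
        rows.append((stratum, z, a, y))
    return rows


def mean(values):
    return sum(values) / len(values)


rows = simulate()
compliers = [r for r in rows if r[0] == "complier"]

# Among compliers, treatment received always equals treatment assigned,
# so indexing by Z and indexing by A give the same contrast.
print(all(a == z for _, z, a, _ in compliers))

# Contrast of mean outcomes by assignment among compliers.
cace_by_z = (
    mean([y for _, z, _, y in compliers if z == 1])
    - mean([y for _, z, _, y in compliers if z == 0])
)
print(round(cace_by_z, 2))

# Never-takers and always-takers show no variation in treatment received.
for name in ("never-taker", "always-taker"):
    print(name, {a for s, _, a, _ in rows if s == name})
```

In this made-up setup the contrast among compliers should land close to the simulated effect, while the other two groups each show a single value of treatment received, which is the reason given above for not expecting to learn about them.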