Hi. In this video, we're going to talk about an alternative criterion, the disjunctive cause criterion. The objective is to understand what the criterion is and, given a DAG, how to use it to identify a set of variables to control for. Imagine that you're interested in selecting variables to control for in an analysis. One method for doing that is what's known as the disjunctive cause criterion: you select the set of variables that are causes of the exposure, the outcome, or both. And by the set of variables, what we really mean is all observed variables. So, imagine that you have a lot of variables in your data set and you want to know which of them you should control for. It's possible that there are also unobserved variables, which of course you cannot control for. The advantage of this method is that you do not have to know the whole causal graph. You simply have to be able to identify which variables affect the exposure or the outcome. One property of this criterion is that if there exists a set of observed variables that satisfies the backdoor path criterion, then the set of variables selected based on the disjunctive cause criterion will be sufficient to control for confounding. In other words, as long as, on a given DAG, there's some set of observed variables that you can use to control for confounding, then selecting the variables that are causes of the exposure, the outcome, or both will also be sufficient to control for confounding. So, to illustrate, let's consider an example where we have three observed pre-treatment variables that we'll call M, W and V. And let's imagine that there are also some unobserved pre-treatment variables, U1 and U2. In practice, of course, there would typically be many more observed variables and far more than just two unobserved variables, but we're going to keep things simple and say there are three observed variables and two unobserved variables.
So of course it's impossible to control for the unobserved variables directly in an analysis. Now suppose we also know that W and V are causes of either the treatment A, the outcome Y, or both. And let's assume that M is not a cause of either A or Y. There are a number of things you could then do to select variables to control for. You could draw a DAG and use the backdoor path criterion to select some set of variables. But here we're going to imagine that we actually don't know what the DAG is, although we might have some information about the variables. So one thing you could do is just use all pre-treatment covariates. This is what some people might view as playing it safe: you just decide, I'm going to control for everything. In this example, that would mean controlling for M, W and V. So you could think of this as one way to select variables, which is just to use everything you have. Alternatively, you could use the disjunctive cause criterion, and in this case that would select just W and V, because on the previous slide we noted that we're assuming W and V are causes of the treatment, the outcome, or both. So what we're going to do in the next few slides is look at some hypothetical DAGs and see which of these criteria would be sufficient to control for confounding in those different situations. So here's one example, where you see the true DAG. In this DAG you can see that V and W are causes of A, Y, or both, and you can also see that M does not affect either A or Y. So that meets the assumptions we had on the previous slide. So we're imagining that this is the true DAG. If you use the criterion where you use all pre-treatment covariates, so that we control for M, W and V, you'll see that that does satisfy the backdoor path criterion, because there is only one backdoor path from A to Y, and that's through V and W, and we block that path. So that's fine. Those variables are sufficient to control for confounding.
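The two selection rules being compared here can be sketched as plain set operations. This is a minimal illustration, not something shown in the lecture: the function names are hypothetical, and the split of which variable causes A and which causes Y is an assumption made only so the example runs (the lecture says only that W and V are causes of A, Y, or both).

```python
# Hypothetical helpers; the names are illustrative, not from the lecture.

def all_pretreatment(observed):
    """'Control for everything': return every observed pre-treatment covariate."""
    return set(observed)


def disjunctive_cause(observed, causes_of_exposure, causes_of_outcome):
    """Return observed covariates believed to cause the exposure, the outcome, or both."""
    believed_causes = set(causes_of_exposure) | set(causes_of_outcome)
    # Unobserved causes cannot be adjusted for, so keep only the observed ones.
    return set(observed) & believed_causes


observed = {"M", "W", "V"}
# Assumed subject-matter knowledge (illustrative split; U1 and U2 are unobserved).
causes_of_A = {"W", "U1"}
causes_of_Y = {"V", "U2"}

everything = all_pretreatment(observed)
selected = disjunctive_cause(observed, causes_of_A, causes_of_Y)

print(sorted(everything))  # ['M', 'V', 'W']
print(sorted(selected))    # ['V', 'W']
```

Note that the second rule needs no graph at all, only a list of believed causes, which is the practical appeal of the criterion.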
If you look at the second one here, where we use the disjunctive cause criterion, we simply control for W and V. We don't include M because that's not a cause of A or Y. Well, it turns out that also satisfies the backdoor path criterion, because we are blocking that one backdoor path from A to Y by controlling for W and V. So here's an alternative true DAG where there are again three variables that we might want to control for: V, M, and W. In this case, we actually don't need to control for any variables, because there's no unblocked backdoor path from A to Y: there's a collision at M. So technically, you wouldn't have to control for any variables here. But if you do control for all pre-treatment covariates, which is M, W and V, that's fine. It will satisfy the backdoor path criterion, because even though conditioning on M opens a path between V and W, we're blocking that path by controlling for V and W. So there's no problem there. And similarly, the disjunctive cause criterion is also fine. It controls for W and V, it doesn't condition on the collider, and it doesn't create any new confounding, so either of these would work in this example. So here's another hypothetical DAG, where you see that W affects A, V affects Y, and then there's a variable M that doesn't affect A or Y at all. But here we also have two unmeasured variables, U1 and U2, and I use these dashed arrows just as a reminder that we don't observe U1 and U2. So those are not variables that we can control for. And again, we can note that we actually don't need to control for anything in this DAG, because the only backdoor path from A to Y has a collision at M. Because there's a collider there, there's no unblocked backdoor path from A to Y. So we don't actually need to control for anything. But if you didn't know the DAG, then you wouldn't know that that's true.
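The DAG just described can also be checked mechanically. The sketch below is a reconstruction under assumptions: the edge list is inferred from the spoken description (W → A, V → Y, U1 → A, U1 → M, U2 → M, U2 → Y, A → Y) and may differ from the slide, and the helper names are hypothetical. It tests the backdoor path criterion by deleting the arrows out of A and then checking whether A and Y are d-separated, using the standard ancestral moral graph test. It assumes every candidate variable is pre-treatment, so the "no descendants of A" condition is not checked.

```python
from itertools import combinations


def ancestors_of(edges, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(edges, x, y, z):
    """True if x and y are d-separated given z (ancestral moral graph test)."""
    keep = ancestors_of(edges, {x, y} | set(z))
    nbrs = {n: set() for n in keep}
    parents = {}
    for u, v in edges:
        if u in keep and v in keep:
            nbrs[u].add(v)
            nbrs[v].add(u)
            parents.setdefault(v, set()).add(u)
    # Moralize: link every pair of parents that share a child.
    for ps in parents.values():
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # Search from x in the undirected graph, never passing through z.
    seen, stack = {x}, [x]
    while stack:
        for m in nbrs[stack.pop()] - set(z):
            if m == y:
                return False
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return True


def satisfies_backdoor(edges, a, y, z):
    """Backdoor check for pre-treatment z: drop arrows out of a, then test d-separation."""
    without_effects_of_a = [(u, v) for u, v in edges if u != a]
    return d_separated(without_effects_of_a, a, y, z)


# Assumed edges, reconstructed from the description; U1 and U2 are unobserved.
dag = [("W", "A"), ("V", "Y"), ("U1", "A"), ("U1", "M"),
       ("U2", "M"), ("U2", "Y"), ("A", "Y")]

print(satisfies_backdoor(dag, "A", "Y", set()))            # True: nothing is needed
print(satisfies_backdoor(dag, "A", "Y", {"W", "V"}))       # True: disjunctive cause set
print(satisfies_backdoor(dag, "A", "Y", {"M", "W", "V"}))  # False: conditioning on M
```

Under these assumed edges, the only set that fails is the one containing the collider M.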
So, suppose that because you don't know what the DAG is, you decide you're going to control for M, W and V. In other words, you control for all pre-treatment covariates. In that case you would not satisfy the backdoor path criterion. And again, the reason is that you control for M, there's a collision at M, and conditioning on it opens a path between U1 and U2, so you can go from A to U1 to U2 to Y. So there is confounding on this graph if you control for M. Using all pre-treatment covariates in this case would end up creating confounding when there was none, and so you wouldn't be controlling for confounding with that criterion. But if you use the disjunctive cause criterion, where you just control for W and V here, it does satisfy the backdoor path criterion. We didn't control for M, and therefore we didn't open a path between the U's. In this next example, the true DAG is one where there is no way to satisfy the backdoor path criterion just by controlling for the observed variables. We'll illustrate that here, where W and V both affect Y, there are two unmeasured variables, U1 and U2, and there's also a variable M that doesn't affect anything. So M is just an independent variable. What we'll see here is that, if you can only control for observed variables and not unobserved ones, there is a backdoor path from A to Y that goes through W, so leaving W uncontrolled leaves that path open. But there's also a collision at W. Because there is a collision at W, controlling for W opens a path from U1 to U2, and then you can get from A to Y using that backdoor path. And so, in this case, if you select all pre-treatment covariates, M, W and V, that won't satisfy the backdoor path criterion, because again you open up this path from U1 to U2 that allows A to be associated with Y in a non-causal way.
And similarly, if you just control for W and V using the disjunctive cause criterion, you also won't satisfy the backdoor path criterion. So in this example, there's no set of observed variables that you could control for that would satisfy the backdoor path criterion. In that case, there's nothing you can do. There's no set of observed variables that would solve the problem, and therefore the disjunctive cause criterion is also not going to work. So, to summarize the disjunctive cause criterion: it's not always going to select the smallest set of variables. As we saw earlier, in some cases it will select variables in situations where you didn't even need to control for anything. But it's conceptually simple, in that you're just listing variables that are causes of the treatment, the outcome, or both. And it's guaranteed to select a set of variables that is sufficient to control for confounding, as long as such a set exists, that is, as long as your data set contains a set of observed variables that is sufficient to control for confounding. Importantly, you also have to correctly identify all of the observed causes of A and Y. So there's an additional burden there: you have to know something about the causal structure. You don't have to know the entire causal graph, but you do have to know enough about the relationships between these variables that you can list the ones that are causes of A or Y. So now that we have ideas on how to select variables to control for, we need to think about how we actually go about controlling for them. Some general approaches for doing that include matching and inverse probability of treatment weighting. These and other methods will be discussed in future videos.
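The condition "as long as such a set exists" in the summary can be illustrated by brute force on the last example DAG. As before, this is a sketch under assumptions: the edges are inferred from the spoken description (W → A, W → Y, V → Y, U1 → A, U1 → W, U2 → W, U2 → Y, A → Y, with M isolated) and may not match the slide exactly, the helper names are hypothetical, and all candidate variables are assumed to be pre-treatment. The code tries every subset of the observed variables and reports which ones satisfy the backdoor path criterion.

```python
from itertools import combinations


def ancestors_of(edges, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(edges, x, y, z):
    """True if x and y are d-separated given z (ancestral moral graph test)."""
    keep = ancestors_of(edges, {x, y} | set(z))
    nbrs = {n: set() for n in keep}
    parents = {}
    for u, v in edges:
        if u in keep and v in keep:
            nbrs[u].add(v)
            nbrs[v].add(u)
            parents.setdefault(v, set()).add(u)
    # Moralize: link every pair of parents that share a child.
    for ps in parents.values():
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # Search from x in the undirected graph, never passing through z.
    seen, stack = {x}, [x]
    while stack:
        for m in nbrs[stack.pop()] - set(z):
            if m == y:
                return False
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return True


def satisfies_backdoor(edges, a, y, z):
    """Backdoor check for pre-treatment z: drop arrows out of a, then test d-separation."""
    return d_separated([(u, v) for u, v in edges if u != a], a, y, z)


# Assumed edges for the last example; M has no edges, U1 and U2 are unobserved.
dag = [("W", "A"), ("W", "Y"), ("V", "Y"), ("U1", "A"),
       ("U1", "W"), ("U2", "W"), ("U2", "Y"), ("A", "Y")]
observed = ["M", "W", "V"]

valid_sets = [set(c)
              for r in range(len(observed) + 1)
              for c in combinations(observed, r)
              if satisfies_backdoor(dag, "A", "Y", set(c))]

print(valid_sets)                                      # []: no observed set works
print(satisfies_backdoor(dag, "A", "Y", {"W", "V"}))   # False: disjunctive cause set
print(satisfies_backdoor(dag, "A", "Y", {"W", "U1"}))  # True, but U1 is not observed
```

The last line shows that, under these assumed edges, the failure comes from U1 and U2 being unobserved rather than from the selection rule itself.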