So far in our discussions of graphing, we've considered data obtained from one variable, either categorical or quantitative, and we've learned how to describe the distribution of that single variable using the appropriate visual displays, as well as numerical measures of center and spread. Next, we're going to visualize our association of interest by exploring the relationship between two variables.
>> Recall that our basic research question is simply whether or not two variables are associated with one another. So, is there a relationship between gender and test scores?
>> Is there a relationship between the type of light a baby sleeps with and whether or not the child develops nearsightedness? Are the smoking habits of a person related to that person's gender? Or how well can we predict a student's freshman year GPA from his or her SAT score? Before we test our own association of interest with inferential statistics, we're going to work on visually describing that relationship. It's important to understand that when studying two variables, each variable has a role to play. That is, a variable may be either a response variable, also known as the dependent variable or outcome variable, or an explanatory variable, also known as the independent variable or predictor variable.
>> At this point, I'm going to ask you to impose a causal model on your research question. To do this, you'll need to designate which of your variables is the explanatory variable and which is the response variable. Note that we can only impose a causal model, rather than actually test for causation, because we're working with what's known as observational data. In other words, the datasets that we're using are based on studies where the sample is simply observed, rather than being manipulated or influenced in any way, as it would be in an experiment.
Even though we won't be able to be certain that the association we're testing is causal, for the purposes of exploring our question, it remains important to determine the role that each variable will play in our model.
>> This Role-Type classification can be summarized and easily visualized in this table. This classification system serves as the structure for the rest of this course. You'll find that not only does it help you to construct graphs, but it's also the basis for selecting the statistical tools that can be used to explore the relationship between the variables that you're interested in. The tools for statistical analysis and for visually representing the relationship between variables are based on the role and type of each variable: whether it is response or explanatory, and whether it is categorical or quantitative.
To get the hang of this, let's go back to some examples and determine which of the role types represents each research question. If we want to explore whether the outcome of the study, the test score, is affected by the test-taker's gender, we would designate gender as the explanatory variable and test score as the response variable. If we want to explore whether the nearsightedness of a person can be explained by the type of light that person slept with as a baby, light type would be the explanatory variable and nearsightedness would be the response variable. If we are examining whether a student's SAT score is a good predictor of their freshman year GPA, the SAT score would be the explanatory variable and the freshman year GPA would be the response variable. If we want to see whether a person's pass/fail outcome on a driving test can be explained by the length of time that they practiced driving prior to the test, practice time would be the explanatory variable and the driving test outcome the response variable.
For our sample research question, we've decided that smoking will be the explanatory, or independent, variable, and nicotine dependence the response, or dependent, variable.
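To make the role-type idea concrete, here is a minimal Python sketch that records the four example questions above and reports which role-type combination each one falls into. The class name, the field names, and the "C"/"Q" labels are illustrative choices rather than anything from the course materials, and the type assigned to each variable is a reading of the examples (for instance, nearsightedness is treated as categorical because the question is whether or not the child develops it).

```python
from dataclasses import dataclass

# Illustrative labels for the two variable types discussed in the lecture.
CATEGORICAL = "C"
QUANTITATIVE = "Q"


@dataclass(frozen=True)
class ResearchQuestion:
    """One association of interest, with a role and a type for each variable."""
    explanatory: str        # independent / predictor variable
    explanatory_type: str   # CATEGORICAL or QUANTITATIVE
    response: str           # dependent / outcome variable
    response_type: str      # CATEGORICAL or QUANTITATIVE

    def role_type(self) -> str:
        """Return the combination as 'explanatory type -> response type'."""
        return f"{self.explanatory_type}->{self.response_type}"


# The four examples from the lecture; the types are assumptions based on
# how each variable is described in the text.
QUESTIONS = [
    ResearchQuestion("gender", CATEGORICAL, "test score", QUANTITATIVE),
    ResearchQuestion("light type", CATEGORICAL, "nearsightedness", CATEGORICAL),
    ResearchQuestion("SAT score", QUANTITATIVE, "freshman year GPA", QUANTITATIVE),
    ResearchQuestion("practice time", QUANTITATIVE, "driving test (pass/fail)", CATEGORICAL),
]

for q in QUESTIONS:
    print(f"{q.explanatory} -> {q.response}: {q.role_type()}")
```

Under these assumptions, the four examples happen to cover all four combinations of a categorical or quantitative explanatory variable with a categorical or quantitative response variable.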
More specifically, we're interested in the level of smoking at which nicotine dependence is experienced. How do we examine the association between two variables graphically? When we graph the association between two variables, the independent, or explanatory, variable is plotted on the X axis. The dependent, or response, variable is plotted on the Y axis. This is an important convention to follow when graphing relationships. However, before we actually construct our graph, there are a few questions we need to ask about the types of explanatory and response variables that we'll be working with.
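The axis convention described above can be written down as a small helper. This is only a sketch: the function name `axis_assignment` and the dictionary keys are hypothetical, and no plotting library is assumed. It simply records which variable goes on which axis for the sample research question.

```python
def axis_assignment(explanatory: str, response: str) -> dict:
    """Apply the graphing convention: explanatory on X, response on Y."""
    return {"x": explanatory, "y": response}


# Sample research question from the lecture: smoking is the explanatory
# variable and nicotine dependence is the response variable.
axes = axis_assignment(explanatory="smoking", response="nicotine dependence")
print(f"X axis: {axes['x']}")
print(f"Y axis: {axes['y']}")
```

Passing the roles by keyword, as in the call above, makes it harder to swap the two variables by accident when setting up a graph.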