Now it's time to practice, and you'll have to do the work yourself. So open up Minitab and get ready. As a teacher, I am interested in what helps students to graduate. One of the factors we have hypothesized about is knowledge of mathematics. For 30 students, we recorded their final high school grade for math, and we want to know whether this affects their chances of passing the first year at university. So my question to you is the following: is the probability of a successful first year related to the final math score in high school? A small hint before you start: you will need to transform the data, which is easily done using a cross tabulation. Now pause the video, load your data, and answer the question before you continue. And of course, good luck.
So, did you find the answer? Let's have a look at the solution. As a first step, you always have to determine which is your Y-variable and which is your X-variable. The Y-variable is whether or not the student passed their first year, and the X-variable is the final grade that the student received in high school for math. The Y-variable is categorical, and the X-variable is numerical. Our tree diagram shows us that we should perform a logistic regression analysis. Do you remember the three steps of such an analysis?
Let's start by putting the data in the event/trial format. Remember, this is the event/trial format. In a previous example, we saw that Calls was the number of trials, HungUp was the number of events, and HoldTime was our X-variable. Now in this example, this is our data. Let's take a look at what to do. You need a column with your X-variable values, so with all the possible grades. Then a column with the number of events, that is, the number of students with each grade who passed the first year of university. Then the number of trials is the total number of students with that grade. As the hint said, you can use a cross tabulation.
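If you want to see what the event/trial transformation does outside Minitab, here is a minimal Python sketch. The student records are hypothetical: only the grade-seven counts (four students who did not pass, one who did) come from this example, and the rest are made up so that the total is 30. The function name `to_event_trial` is also just an illustration.

```python
from collections import Counter

# Hypothetical raw records, one (math grade, passed first year?) pair per student.
# Only the grade-7 counts (4 "No", 1 "Yes") come from the lecture; the rest are invented.
records = (
    [(6, "No")] * 7 + [(6, "Yes")] * 1
    + [(7, "No")] * 4 + [(7, "Yes")] * 1
    + [(8, "No")] * 4 + [(8, "Yes")] * 5
    + [(9, "No")] * 1 + [(9, "Yes")] * 4
    + [(10, "Yes")] * 3
)


def to_event_trial(rows, event_label="Yes"):
    """Return (x, events, trials) tuples, one per distinct x value, sorted by x."""
    trials = Counter(x for x, _ in rows)  # all students with that grade
    events = Counter(x for x, outcome in rows if outcome == event_label)  # those who passed
    return [(x, events[x], trials[x]) for x in sorted(trials)]


table = to_event_trial(records)
for grade, passed, total in table:
    print(f"grade {grade}: {passed} passed out of {total}")
```

Each row of `table` corresponds to one row of the event/trial worksheet: the grade, the number of events, and the number of trials.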
So let's go to Minitab and make a cross tabulation to get the data in this format. This is your data in Minitab, with the student in the first column, the math grade for that student in the second, and whether or not that student passed in the third column. To make a cross tabulation, you go to the Stat menu, then to Tables, and then to Cross Tabulation. Now, in the rows we want our X-variable, that's Math grade. In the columns we want the outcome we are counting, that's Pass. Okay. Minitab now prints your cross tabulation in the session window, so let's have a look. For example, if you look at grade seven, we have four students who did not pass the first year and one who did, so in total five students with a grade of seven. Now, copy this data and paste it into your worksheet here. Minitab asks whether you want it in separate columns: yes. Now we have the data in our worksheet. Let's give the columns some headers to make life easier. So this is our Math grade, this is the number of students that did not pass, this is the number of students that passed, and this is the total number of students.
Now that our data is in the correct format, we can go to the second step: we have to make a fitted line plot and evaluate the fit. Let's return to Minitab to see what to do. To make a fitted line plot, you go to Stat, then Regression, and then Binary Fitted Line Plot. Our data is in the event/trial format, so select that. Our event name is Pass, whether or not they passed the first year. The number of events is our column C7, Pass. The number of trials is the number of students with that grade. And our predictor, the influence factor or X-variable, is of course Math grade. Okay? Then Minitab gives you this fitted line plot and a lot of output in your session window. Let's study this. The fitted line plot shows us a line that curves slightly upward.
This indicates that the chance of passing the first year of university increases as the math grade that students receive in high school improves. Students with a seven for math have about a 20% probability of passing the first year. For a student with a nine for math, this probability increases to almost 80%. However, we still have to decide whether these results are statistically significant, and this is the third step of logistic regression. For this step, we take a look at the session window output that Minitab already provided for us. We tested whether the math grade influences the likelihood that students pass the first year in university. The null hypothesis states that the variables are not related, whereas the alternative states that they are. The Minitab output shows us a very small p-value. This means that we found a statistically significant relationship, and the alternative hypothesis is supported.
Note that you should always be careful with causality. Math grade is a predictor of passing probability. But this does not mean that if we had offered additional math tutoring to our students in the first year, these students would immediately have had a higher probability of passing the first year. It could, for example, also be that there is a third variable, like intellect or discipline, that influences both the math grade and the probability of passing the first year.
In summary, we transformed the data into the event/trial format by making a cross tabulation. The next step was to make a fitted line plot, which showed us a positive relationship between math grade in high school and the chances of passing the first year at university. The p-value showed that the results are statistically significant, so they are unlikely to be due to random fluctuations alone.
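To make steps two and three a bit more concrete, here is a rough Python sketch of fitting the logistic curve to event/trial data and testing the relationship. Everything in it is an assumption for illustration. The table is hypothetical (only the grade-seven row, one of five passed, comes from this example). The fit uses a plain Newton-Raphson iteration. The test is a likelihood-ratio chi-square test with one degree of freedom, which is not necessarily the exact statistic Minitab reports in its session window.

```python
import math

# Hypothetical event/trial table: (math grade, students who passed, students with that grade).
# Only the grade-7 row (1 of 5 passed) is taken from the lecture; the other rows are invented.
table = [(6, 1, 8), (7, 1, 5), (8, 5, 9), (9, 4, 5), (10, 3, 3)]


def prob(b0, b1, x):
    """Logistic model: P(pass) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def log_likelihood(b0, b1, rows):
    """Binomial log-likelihood, leaving out the constant binomial coefficients."""
    total = 0.0
    for x, events, trials in rows:
        p = prob(b0, b1, x)
        total += events * math.log(p) + (trials - events) * math.log(1.0 - p)
    return total


def fit_logistic(rows, iterations=25):
    """Newton-Raphson estimate of the intercept b0 and the slope b1."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, events, trials in rows:
            p = prob(b0, b1, x)
            resid = events - trials * p      # observed minus expected events
            w = trials * p * (1.0 - p)       # binomial variance weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01          # determinant of the 2x2 information matrix
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(table)

# Null model: the same passing probability for every grade (slope fixed at zero).
p_bar = sum(e for _, e, _ in table) / sum(n for _, _, n in table)
ll_null = log_likelihood(math.log(p_bar / (1.0 - p_bar)), 0.0, table)
ll_model = log_likelihood(b0, b1, table)

# Likelihood-ratio statistic and its chi-square (1 df) p-value.
g_stat = 2.0 * (ll_model - ll_null)
p_value = math.erfc(math.sqrt(g_stat / 2.0))

print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"P(pass | grade 7) = {prob(b0, b1, 7):.2f}, P(pass | grade 9) = {prob(b0, b1, 9):.2f}")
print(f"G = {g_stat:.2f}, p-value = {p_value:.4f}")
```

A positive slope corresponds to the upward-bending line in the fitted line plot, and a small p-value corresponds to rejecting the null hypothesis that math grade and passing are unrelated. As in the lecture, neither says anything about causality.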