0:05

So here we are in our studio, having just completed the Mann-Whitney U

test on two levels of IDE, Visual Studio and Eclipse.

And now we're moving to three levels,

which will require that we use a one way ANOVA.

So let's read in our new data file, IDE3,

because it has three levels, and let's view that as we normally do.

And we can see the first 20 subjects use Visual Studio.

The next 20 used Eclipse, and the final 20 now used PyCharm all with times in

minutes for how long it took them to write these programs in the various languages.

As is our practice, we'll turn subject into a nominal factor and

we can summarize over each of those columns here.

We can see the mean time is 353 minutes for writing these programs.
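
As a rough sketch of the steps just described, here is what reading, factoring, and summarizing might look like. The file name ide3.csv, the column names Subject, IDE, and Time, and the level labels are assumptions, and the data are simulated stand-ins rather than the lecture's actual file.

```r
# Simulated stand-in for the lecture's data (hypothetical values): 60 subjects, 3 IDEs.
set.seed(1)
sim <- data.frame(
  Subject = 1:60,
  IDE     = rep(c("VStudio", "Eclipse", "PyCharm"), each = 20),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)
path <- file.path(tempdir(), "ide3.csv")          # assumed file name
write.csv(sim, path, row.names = FALSE)

ide3 <- read.csv(path, stringsAsFactors = TRUE)   # IDE is read in as a factor
# View(ide3)                                      # opens the data viewer in an interactive session
ide3$Subject <- factor(ide3$Subject)              # subject IDs are labels, not quantities
summary(ide3)                                     # per-column summary, including the mean of Time
```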

We can also look by the three levels of IDE now with our ddply

command as we've done in the past.

We can see means and standard deviations and other information there.

Remember, we're back on the time measure here,

not log time as we transformed our data last time.
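
A minimal sketch of that per-level summary, assuming the plyr package is installed and using simulated data with assumed column names (IDE, Time) in place of the lecture's file:

```r
library(plyr)

# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)

# One row per IDE level, with the mean and standard deviation of Time.
by_ide <- ddply(ide3, ~ IDE, summarise,
                Time.mean = mean(Time), Time.sd = sd(Time))
print(by_ide)
```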

1:25

We can see Visual Studio and Eclipse haven't changed, the histogram for

PyCharm is new, and there it is, we can see there were

quite a few students who took between 200 and 300 minutes for these programs.

And a box plot now compares all three.

Looks like PyCharm might have a little edge on Visual Studio, and

both look like they were faster to do than Eclipse.
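
The plots could be produced along these lines. This is only a sketch on simulated data with assumed column names and level labels, so the shapes will not match what is on screen in the lecture.

```r
# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)

# Histogram of Time for the new level only.
pycharm_time <- ide3[ide3$IDE == "PyCharm", ]$Time
hist(pycharm_time, main = "PyCharm", xlab = "Time (minutes)")

# Box plot comparing all three levels side by side.
boxplot(Time ~ IDE, data = ide3, xlab = "IDE", ylab = "Time (minutes)")
```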

1:50

As we did before, we'll test for

normality on the Time response for the new level, PyCharm, and

the p value is significant, so we have a departure from normality in the response.

But we should test it on the residuals as is more proper and so,

we fit our model and test the residuals.

Again, seeing a departure and we can see that again with our QQ plot,

quite a departure obviously here on the end for normality.
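
Those normality checks might be written roughly as follows, using base R only. The data are simulated and the column names are assumptions, so the p values will differ from the lecture's.

```r
# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)

# Shapiro-Wilk test on the raw response for the new level.
shapiro.test(ide3[ide3$IDE == "PyCharm", ]$Time)

# Fit the one-way model, then test its residuals instead.
m <- aov(Time ~ IDE, data = ide3)
shapiro.test(residuals(m))

# Q-Q plot of the residuals with a reference line.
qqnorm(residuals(m))
qqline(residuals(m))
```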

2:18

We can test lognormality as we did before of the PyCharm level.

We already did it previously on Visual Studio and Eclipse.

And the lognormality test with the KS test shows that we're

not statistically significant in a departure from lognormality.

So again as before, that gives us an indication that the PyCharm times may

be lognormally distributed.
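
One way to run a lognormality check like this is to estimate the lognormal parameters with fitdistr from the MASS package and pass them to a Kolmogorov-Smirnov test. The sketch below assumes that approach and uses simulated data. Because the parameters are estimated from the same sample, the test tends to be conservative.

```r
library(MASS)

# Simulated stand-in for the PyCharm times (hypothetical values).
set.seed(1)
pycharm_time <- rlnorm(20, log(280), 0.4)

# Estimate meanlog and sdlog, then compare the sample to that lognormal.
fit <- fitdistr(pycharm_time, "lognormal")$estimate
ks.test(pycharm_time, "plnorm",
        meanlog = fit[["meanlog"]], sdlog = fit[["sdlog"]], exact = TRUE)
```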

So let's create the LogTime column and we'll go ahead and

view that having been created just as before.

The only difference from before is that we have PyCharm now as well with a log time

result.

3:00

And we can do the normality test on log time as the response and

see that in fact we are now no longer significantly different from a normal

distribution according to the Shapiro-Wilk test.

And we can also do the same test on the residuals now for log time.

We can see that departure, while still present, is not as severe and

we have a 0.08 result with the Shapiro-Wilk test.

So it's nearly a departure from normality, but not technically and not quite.

So we'll proceed with some confidence there since ANOVA is somewhat

robust to mild departures anyway.
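
A sketch of the log transform and the repeated normality tests, again on simulated data. The column name logTime is an assumption based on how the column is referred to above.

```r
# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)

ide3$logTime <- log(ide3$Time)                       # natural log of the times

# Normality of the transformed response for the new level.
shapiro.test(ide3[ide3$IDE == "PyCharm", ]$logTime)

# Normality of the residuals from the model fitted on logTime.
m <- aov(logTime ~ IDE, data = ide3)
shapiro.test(residuals(m))
qqnorm(residuals(m))
qqline(residuals(m))
```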

We'll also do our homoscedasticity test with Levene's test, the Brown-Forsythe

version, and we see that we're not significantly different there,

so that means that we don't have a violation with our log time result.

3:56

Our variances are similar enough.
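
With the car package, the Brown-Forsythe version of Levene's test is obtained by centering on the median. A minimal sketch on simulated data with assumed column names:

```r
library(car)

# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)
ide3$logTime <- log(ide3$Time)

# center = median gives the Brown-Forsythe variant of Levene's test.
lev <- leveneTest(logTime ~ IDE, data = ide3, center = median)
print(lev)
```

A non-significant p value here means we have no evidence that the variances differ across the levels.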

And so now, we're going to fit the actual one way ANOVA, so we fit the model and

then we use the ANOVA command to calculate and report the ANOVA.

And so again, we see the F value,

this column here is the column of particular interest,

and it shows an F statistic of 8.796.

And of course, the p value is much less than 0.05.
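
Fitting and reporting the one-way ANOVA might look like this. The sketch uses simulated data, so its F statistic will not be the 8.796 shown in the lecture.

```r
# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)
ide3$logTime <- log(ide3$Time)

m   <- aov(logTime ~ IDE, data = ide3)   # fit the model
tab <- anova(m)                          # ANOVA table: Df, Sum Sq, Mean Sq, F value, Pr(>F)
print(tab)
```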

What that means is the overall ANOVA or the omnibus test,

as it's called, shows that there's some difference among,

I'll go back to here, among these levels of IDE.

It does not tell us exactly what the difference is, nor

does it tell us where exactly the difference lies in terms of comparisons

between each of these IDEs, so we have to look further.

But that first test being significant, that omnibus test gives us permission

to do what are called post hoc tests, meaning follow-up tests that

are pairwise comparisons that will tell us where those differences lie.

So now we can go back to an independent-samples t-test, a between-subjects t-test,

between the levels of Eclipse, PyCharm and

Visual Studio to see which two are different.

And just looking at the graph we might ask,

is PyCharm different from Visual Studio?

Because those are the ones that obviously look close together.

The Eclipse level being so different is reason enough for

the overall F test to be significant.

5:40

We'll load in the multcomp library for multiple comparisons.

And we'll run this line here.

I'll explain a little bit here.

The glht command is doing the test for

us, and mcp is a command for multiple comparisons.

We say which factor we're testing over (of course we have just one, IDE, with three levels),

and when we say Tukey here, it's shorthand for all the pairwise comparisons.

And as we've done before, we're adjusting for multiple

tests, because each one has a 1 in 20 chance of being significant by chance alone.

And so we adjust with the Holm sequential Bonferroni procedure, which accounts for

the fact that we're making multiple comparisons, so

we don't get an inflated chance of seeing significance where there isn't any.
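
To make the Holm adjustment concrete, here is a small sketch that applies it to three made-up raw p values, once with base R's p.adjust and once by hand. The p values are hypothetical and are not the lecture's results.

```r
p_raw <- c(0.010, 0.020, 0.670)          # hypothetical raw p values for 3 comparisons

# Built-in Holm adjustment.
p_holm <- p.adjust(p_raw, method = "holm")

# The same thing by hand: sort ascending, multiply the i-th smallest by
# (number of tests - i + 1), keep the results non-decreasing, cap at 1.
n_tests <- length(p_raw)
ord     <- order(p_raw)
by_hand <- numeric(n_tests)
by_hand[ord] <- pmin(1, cummax((n_tests - seq_len(n_tests) + 1) * p_raw[ord]))

print(p_holm)
print(all.equal(by_hand, p_holm))
```

The smallest p value gets the full Bonferroni multiplier, and each later one gets a smaller multiplier, which is why Holm is never less powerful than plain Bonferroni.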

Okay, so let's go ahead and run that.
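
The line being run is along these lines. This sketch assumes the multcomp package is installed and uses simulated data with assumed column names, so the estimates and p values will differ from those on screen.

```r
library(multcomp)

# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)
ide3$logTime <- log(ide3$Time)
m <- aov(logTime ~ IDE, data = ide3)

# "Tukey" requests all pairwise contrasts among the IDE levels;
# the p values are then adjusted with Holm's procedure.
posthoc <- summary(glht(m, mcp(IDE = "Tukey")), test = adjusted(type = "holm"))
print(posthoc)
```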

6:35

And we can see, here are our pairwise values.

And the t statistic and the p value in the right column here.

So PyCharm versus Eclipse is significantly different and

we can see PyCharm is faster.

Visual Studio versus Eclipse, we found from before and that hasn't changed.

And then Visual Studio versus PyCharm has a p value of 0.67.

So these are not detectably different according to this test.

7:06

Just as a matter of completeness in R,

I show another way that we can use the glht command.

This time with the lsm command

which allows us to specify that we want pairwise comparisons.

This is another way of getting exactly the same result, and you may find it of value.

You can do ?mcp and ?lsm and

read more about how exactly these are formulated.

But for completeness, I've run the same analysis, and

you can see the result there.
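
A sketch of that alternative form follows. It assumes the lsmeans package is installed and still provides lsm for use with glht. That package has since been superseded by emmeans, so treat the exact call as an assumption to check against ?lsm. The data are simulated.

```r
library(multcomp)
library(lsmeans)   # assumed to provide lsm(); check ?lsm in your installation

# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)
ide3$logTime <- log(ide3$Time)
m <- aov(logTime ~ IDE, data = ide3)

# pairwise ~ IDE asks for all pairwise comparisons among the IDE levels.
posthoc_lsm <- summary(glht(m, lsm(pairwise ~ IDE)), test = adjusted(type = "holm"))
print(posthoc_lsm)
```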

We've just completed our analysis of IDE2 and IDE3.

And we just saw our first F test on IDE3.

Recall IDE had three levels, Visual Studio, Eclipse, and PyCharm.

We found that both Visual Studio and

PyCharm were significantly shorter in programming time needed than Eclipse, but

not significantly different from each other, Visual Studio and PyCharm.

8:41

Often the numerator degrees of freedom can be thought of as being related to the number of

levels here.

We have three and so

we have the number of levels minus 1 as the numerator degrees of freedom.

For simple analysis like this, that will hold.

The 57 is also called the residual degrees of freedom, and

that's where you'd read it in the table in your R output.
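
The degrees of freedom can be checked with a little arithmetic: three levels give 3 - 1 = 2 in the numerator, and 60 subjects minus 3 levels give 57 residual degrees of freedom. A sketch on simulated data of the same shape (assumed column names), reading the same numbers from the ANOVA table:

```r
# Simulated stand-in data (hypothetical values and level labels).
set.seed(1)
ide3 <- data.frame(
  Subject = factor(1:60),
  IDE     = factor(rep(c("VStudio", "Eclipse", "PyCharm"), each = 20)),
  Time    = c(rlnorm(20, log(300), 0.4),
              rlnorm(20, log(450), 0.4),
              rlnorm(20, log(280), 0.4))
)
ide3$logTime <- log(ide3$Time)

k <- nlevels(ide3$IDE)     # number of levels of the factor
N <- nrow(ide3)            # number of observations
df_num <- k - 1            # numerator degrees of freedom
df_res <- N - k            # residual (denominator) degrees of freedom

m <- aov(logTime ~ IDE, data = ide3)
print(c(numerator = df_num, residual = df_res))
print(anova(m)$Df)         # same two values, as read from the ANOVA table
```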

9:07

Then we have the F statistic, the value produced in this particular analysis,

and a p value that is less than 0.001.

You'll recall the range of acceptable p values that you can report.

So that's how we report a significant overall or

omnibus ANOVA result for a one way ANOVA.
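
As an illustration of putting those pieces together, the sketch below builds a report string from the degrees of freedom and F statistic given in this lecture (2, 57, and 8.796). The helper report_f is hypothetical, and the "p < .001" wording is one common reporting convention rather than necessarily the one on the slide.

```r
# Hypothetical helper: format an omnibus F test result for reporting.
report_f <- function(df1, df2, f) {
  p <- pf(f, df1, df2, lower.tail = FALSE)        # upper-tail p value of the F distribution
  p_txt <- if (p < 0.001) "p < .001" else sprintf("p = %.3f", p)
  sprintf("F(%d, %d) = %.2f, %s", as.integer(df1), as.integer(df2), f, p_txt)
}

report <- report_f(2, 57, 8.796)
print(report)
```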