To evaluate whether two samples could originate from the same distribution, a t-test is often used to check whether the means of the two samples differ. To test the same basic hypothesis you could also look at other statistics, for example the mean rank. The test using this statistic is called the Wilcoxon test for two samples. In this video, I'll explain how the test works.

It's probably good to start with an example. The best confectioner in town not only makes delicious pastries, her shop also looks fantastic. However, she has this feeling that the cakes placed at the left side of the counter are sold less often than the ones at the right. This table shows today's sales for the cakes that were placed at the two sides. There were three cakes at the left and four at the right. It's not possible to conduct a paired comparison here, because none of the cakes was present at both sides. Also, it's not wise to use a two-sample t-test, because the immensely popular fruit cake is kind of an outlier. But the Wilcoxon test for two samples would work just fine. The null hypothesis for this test would be that there is no difference between the sales of cakes at the left and right.

To conduct this test, the first step is to combine the values for the two groups and rank them. If you have ties, you give each of the tied cases the average of the ranks they share. Next, the ranks for the cakes at the left are summed, as are the ranks for the cakes at the right. Now, if the sales of the cakes at the two sides are similar, the two rank sums would be similar too, once you allow for the different group sizes. There is in fact a fixed relation between the rank sums for the two groups: if there are m cases in one group and n cases in the other, the two rank sums always add up to (m + n)(m + n + 1)/2, the sum of all the ranks. So how would you decide whether the value you found for the rank sum in one group is unlikely, given the null hypothesis?
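The ranking step described above can be sketched in a few lines of Python. This is only an illustration: the sales table from the video is not reproduced in the transcript, so the numbers below are made up, chosen so that there is one large outlier on the left and the left-side rank sum comes out at 15. The function names are likewise hypothetical.

```python
def average_ranks(values):
    """Rank values 1..N; tied values all get the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_sums(x, y):
    """Pool both groups, rank the pooled values, and return (W_x, W_y)."""
    ranks = average_ranks(list(x) + list(y))
    return sum(ranks[:len(x)]), sum(ranks[len(x):])


# Made-up sales counts (NOT the table from the video).
left = [9, 12, 60]       # 60 plays the role of the popular fruit cake
right = [5, 7, 10, 14]
w_left, w_right = rank_sums(left, right)
n_total = len(left) + len(right)
print(w_left, w_right)                   # 15.0 13.0
print(n_total * (n_total + 1) / 2)       # 28.0, the sum of both rank sums
```

Note that the outlier of 60 only contributes its rank, 7, no matter how large it is. That is why the test is not thrown off by the fruit cake.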
If you have small sample sizes, with fewer than ten cases in both groups, you can look up the critical values in a statistical table. But it's also possible to evaluate all possible orderings of the original data, calculate the rank sum for one group in each ordering, and then make a probability distribution of all these rank sums. Finally, you evaluate how extreme the observed value of the rank sum is in this distribution. For the seven cakes, the null distribution of the rank sum for the smaller group of three is given here. As you see, a rank sum as high as the observed 15, or higher, occurs in one-fifth of the cases (7 of the 35 ways to assign three of the seven ranks to one side). This is not particularly extreme, and therefore the null hypothesis of equal sales for the left and right side of the counter is not rejected.

If group sizes are bigger than ten, the rank sums become approximately normally distributed, so then a Z-test can be used. If you have group sizes of m and n for variables X and Y respectively, then the mean of the rank sum for X under the null hypothesis is m(m + n + 1)/2. And the standard deviation under the null hypothesis is the square root of mn(m + n + 1)/12. You can then take the difference between the observed rank sum and this mean, divide it by the standard deviation, and determine its significance with the Z-test. If the Z value is strongly negative or strongly positive, it would mean that the rank sums of the two variables differ by more than chance alone would explain, which leads to rejection of the null hypothesis.

Let's see how the calculation with big samples works through an example. Assume you have obtained observations on two groups with sizes of 12 and 17 cases respectively, and you would like to know whether the values in the smaller group are on average higher than those in the bigger group. You order all the data and find that the rank sum of the smaller group is 220, and that of the bigger group is 215. You can then calculate the Z score using the equation. This leads to a Z value of 1.77 and a corresponding one-sided p-value of 0.038. So, at a 5% significance level, you would reject the null hypothesis of equal averages.
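Both ways of judging a rank sum can be checked with a short Python sketch. It makes two assumptions that the video does not state: there are no ties (so the pooled ranks are simply 1 to m + n), and the normal approximation uses the standard null mean m(m + n + 1)/2 and standard deviation sqrt(mn(m + n + 1)/12) without a continuity correction. The function names are hypothetical.

```python
from itertools import combinations
from math import comb, erfc, sqrt


def exact_upper_p(w_obs, m, n):
    """P(W >= w_obs) under H0, by enumerating every way to give m of the
    m + n ranks to one group. Assumes no ties."""
    ranks = range(1, m + n + 1)
    hits = sum(1 for subset in combinations(ranks, m) if sum(subset) >= w_obs)
    return hits / comb(m + n, m)


def normal_approx(w_obs, m, n):
    """Z score and one-sided (upper-tail) p-value for the rank sum of the
    group of size m. No continuity or tie correction is applied."""
    mean = m * (m + n + 1) / 2
    sd = sqrt(m * n * (m + n + 1) / 12)
    z = (w_obs - mean) / sd
    p_upper = 0.5 * erfc(z / sqrt(2))  # upper tail of the standard normal
    return z, p_upper


# Cake example: three cakes on one side, four on the other, observed rank sum 15.
print(exact_upper_p(15, 3, 4))         # 0.2, i.e. 7 of 35 assignments

# Large-sample example: groups of 12 and 17, rank sum 220 for the smaller group.
z, p = normal_approx(220, 12, 17)
print(round(z, 2), round(p, 3))        # 1.77 0.038
```

The enumeration grows quickly with the group sizes, which is one reason to switch to the normal approximation for larger samples.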
The Wilcoxon test for two independent samples is often the most suitable test to compare measures of central tendency. It can be applied to data with ordinal as well as numerical measurement levels. For large sample sizes it's almost as powerful as a two-sample t-test, while for small sample sizes with unknown distribution it can even be more powerful. The Wilcoxon test is sometimes said to test for a difference in medians between two samples. This is indeed the case if you accept the assumption that the distributions of the two samples are identical apart from a possible shift. But if you test for a difference in mean ranks, you don't need this assumption.

Let me summarize what I have explained in this video. The Wilcoxon test can be seen as a nonparametric counterpart of the two-sample t-test. It assumes independent random samples from two groups and requires that the data have at least an ordinal measurement level. The null hypothesis in this test is that the two groups originate from the same population. This would imply that the mean ranks for the two groups do not differ. The test statistic in the test is the difference between the mean ranks for the two groups. For small sample sizes, the value of this test statistic is compared against the null distribution that is determined by evaluating all possible orderings of the data. But with larger samples, the test statistic adjusted according to this formula approximately follows a standard normal (Z) distribution.
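The summary names the difference in mean ranks as the test statistic, while the worked examples used the rank sum. A short derivation shows that the two carry the same information. The notation is assumed here, not taken from the slides: W_X and W_Y are the rank sums, m and n the group sizes, and N = m + n.

```latex
\begin{aligned}
\bar{R}_X - \bar{R}_Y
  &= \frac{W_X}{m} - \frac{W_Y}{n},
  \qquad W_X + W_Y = \frac{N(N+1)}{2}, \quad N = m + n \\
  &= \frac{N}{mn}\left( W_X - \frac{m(N+1)}{2} \right)
\end{aligned}
```

Since m(N + 1)/2 is the mean of W_X under the null hypothesis, the mean-rank difference is just the rank sum's deviation from that mean, rescaled by a constant, so both lead to the same test. For the example with groups of 12 and 17 and a rank sum of 220, this gives (29/204) × 40 ≈ 5.69, which matches 220/12 − 215/17.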