In this brief video, we will define what we mean by robust statistics and

discuss robust measures of center and spread.

We define robust statistics as measures on which

extreme observations have little effect.

Let's give a quick example.

We start with a small data set of values between one and six, and the mean and

the median for these data are both 3.5.

What if we change one of the values in the data set to be much larger?

Say 1000.

The mean increases greatly, but the median stays the same at 3.5.

In other words, the median is robust to the extreme observation.

This is because the mean depends on all observations in the data set

(it is the arithmetic average, after all),

while the median only depends on the midpoint of the distribution, and

the values at the end points are irrelevant to its calculation.
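
To make the example concrete, here is a minimal sketch in Python. The exact values are an assumption: the video only says the data lie between one and six with a mean and median of 3.5, so the integers 1 through 6 are used here, and the largest one is the value swapped for 1000.

```python
from statistics import mean, median

# Assumed data: the integers 1..6 (mean and median are both 3.5).
original = [1, 2, 3, 4, 5, 6]

# Same data with one value (here the 6, an arbitrary choice) changed to 1000.
with_outlier = [1, 2, 3, 4, 5, 1000]

print("original:     mean =", mean(original), " median =", median(original))
print("with outlier: mean =", mean(with_outlier), " median =", median(with_outlier))
```

With these assumed values the mean jumps from 3.5 to roughly 169, while the median is still the average of the two middle values, 3 and 4.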

We just established that the median is a more robust statistic of center than

the mean.

Going along with this, the IQR, which is based on the quartiles, is a more robust

statistic of spread than the standard deviation, which is calculated using the mean,

and than the range, which relies solely on the most extreme observations.
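
The same comparison can be sketched for the measures of spread. The data values are again an assumption (1 through 6, with one value changed to 1000), and the helper names `iqr` and `value_range` are hypothetical. Quartiles are computed here as the medians of the lower and upper halves of the sorted data, which is only one of several common conventions; other conventions can give different numbers on a data set this small.

```python
from statistics import median, stdev

def iqr(values):
    """IQR as the gap between the medians of the upper and lower halves.

    Assumes at least two values; with an odd count, the middle value
    is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return median(upper) - median(lower)

def value_range(values):
    """Range: distance between the largest and smallest observation."""
    return max(values) - min(values)

# Assumed data, not given explicitly in the video.
original = [1, 2, 3, 4, 5, 6]
with_outlier = [1, 2, 3, 4, 5, 1000]

for name, data in [("original", original), ("with outlier", with_outlier)]:
    print(name, "IQR =", iqr(data), " SD =", round(stdev(data), 2),
          " range =", value_range(data))
```

Under these assumptions the IQR is 3 for both data sets, while the standard deviation and the range both grow by orders of magnitude once the 1000 is introduced.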

Robust statistics are most useful for describing skewed distributions or

those with extreme observations,

while non-robust statistics like the mean and standard deviation are useful for

describing symmetric distributions.