A conceptual and interpretive public health approach to some of the most commonly used methods from basic statistics.

From the course by Johns Hopkins University

Statistical Reasoning for Public Health 1: Estimation, Inference, & Interpretation

136 ratings

From the lesson

Module 3A: Sampling Variability and Confidence Intervals

Understanding sampling variability is the key to defining the uncertainty in any sample-based estimate from a single study. In this module, sampling variability is explicitly defined and explored through simulations. The resulting patterns from these simulations give rise to a mathematical result that is the underpinning of all statistical interval estimation and inference: the central limit theorem. This result will be used to create 95% confidence intervals for population means, proportions, and rates from the results of a single random sample.

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

Okay, let's do some practice exercises related to the material we've covered in this set of lecture sections, in lecture six.

So first let's look at how our simulation results compare with what the Central Limit Theorem would predict them to be.

So we've learned about the CLT and what it tells us about the theoretical sampling distribution of a sample statistic. And we showed, in sections B and C, the results of some simulations to illustrate the CLT. Let's check the piece about the standard error: the variability in our statistics being a function of the population-level variation and the sample size each statistic is based upon.

So let's recall the samples taken on the number of kidney or urinary DRG discharges from the hospital population. It has a mean of 69.2 discharges and a standard deviation of 58.4. What I want you to do is compare the observed standard errors from the simulations, based on 2,000 random samples for each of the three sample sizes (50, 250, and 400), to what is actually predicted by the CLT.

And just to recall, here are the results of the estimated sampling distributions. Just to remind ourselves: even though we call it a standard error, the standard error simply measures the variability in a set of numbers, where the numbers happen to be summary statistics computed across multiple random samples. So what we have here are the standard errors estimated by simulation. For example, the estimated standard error for means based on samples of size 50 from this population is 8.1.

Now let's do a binary example. Recall the simulated samples taken on Baltimore residents, drawn from a population in which the true proportion of residents in poverty was 22.9%. I wanted you to compare the observed standard errors from the simulations, based on 1,000 random samples for each of the sample sizes 50, 150, and 500, to what is predicted by the CLT. And here are the results from our estimated sampling distributions, just to refresh your memory.
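Simulation-based standard errors like these can be reproduced in a few lines of code. The sketch below is illustrative only: since the actual hospital discharge data aren't available here, it uses a hypothetical stand-in population (a right-skewed gamma distribution tuned to the lecture's stated mean of 69.2 and SD of 58.4) and then repeats the lecture's recipe of taking 2,000 random samples per sample size.

```python
import random
import statistics

random.seed(1)

# Hypothetical stand-in for the hospital discharge population: a
# right-skewed gamma distribution whose parameters are chosen so the
# population mean is ~69.2 and the population SD is ~58.4, matching
# the values reported in the lecture.
shape = (69.2 / 58.4) ** 2
scale = 58.4 ** 2 / 69.2
population = [random.gammavariate(shape, scale) for _ in range(100_000)]

sim_se = {}
for n in (50, 250, 400):
    # 2,000 sample means, each computed from a random sample of size n
    means = [statistics.fmean(random.sample(population, n)) for _ in range(2_000)]
    # The SD of these 2,000 sample means is the simulated standard error
    sim_se[n] = statistics.pstdev(means)
    print(f"n={n}: standard error estimated by simulation = {sim_se[n]:.2f}")
```

With any reasonable seed, the printed values track the CLT prediction sigma/sqrt(n), shrinking as the sample size grows.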

So for example, we took 1,000 proportions, each based on a random sample of size 50. The observed variability in these 1,000 proportions was 0.058. This, again, is the variability, or standard deviation, of these 1,000 sample proportions. But to distinguish it from variation in individual values in the sample or population, we call this variation in statistics the estimated standard error; in this case, estimated by simulation.

Okay. So let's look at another example: weight change and diet type. These are the results from an article we've looked at before, "A Low-Carbohydrate as Compared with a Low-Fat Diet in Severe Obesity," in which 132 severely obese subjects were randomized to one of two diet groups and followed for a six-month period.

So what we're going to try and do with this is estimate the characteristics of a sampling distribution from a single sample, as we did in lecture 16. I want you to focus on the sample of persons randomized to the low-carb diet. There are 64 people. Their mean weight change, post-diet minus pre-diet, was negative 5.6 kilograms, and the standard deviation of these 64 individual weight changes was 8.6.

So I'd like you to use the CLT, what you know the CLT tells us, and these sample results to estimate the characteristics of the sampling distribution for sample means of weight change from samples of size 64 from the low-carb diet population. And then one more example: the maternal-infant HIV transmission study that we've looked at in several places in the course thus far. Let's focus our efforts on the placebo group now, as opposed to the entire sample of data. Of the 183 births to mothers in the placebo group, 40 infants were HIV infected. So using the CLT and these sample results, I'd like you to estimate the characteristics of the sampling distribution for sample proportions of children contracting HIV within 18 months of birth, based on samples of 183 HIV-infected pregnant women who were not treated with AZT or any other treatment.

Now, I'll give you a minute to pause the video and go back and do these exercises. When you resume, we'll look at my take on the solutions.

Okay, welcome back. I hope you found these exercises useful. So let's first look at comparing the observed variation in the sampling distributions we simulated, by taking multiple random samples from the same population, to what we'd expect the variability to be given the Central Limit Theorem.

So let's recall the results from our estimated sampling distributions. For samples of size 50 from this hospital discharge population, the observed variability in the 2,000 sample means we computed was 8.1 discharges. Now, the Central Limit Theorem tells us that the theoretical standard error of means of size 50 from this population will be the population standard deviation of the individual values, the between-hospital variation in discharge counts, divided by the square root of our sample size. Ordinarily we would have to estimate this from a single sample. But in this simulation, we know how variable the observations in the population are, because we know the population we've been sampling from. So I'm going to plug this in. It was 58.4.

So if we divide 58.4 by the square root of 50 (you can check my math), it turns out to be about 8.25 discharges. So the CLT prediction, while in the same range, is slightly higher than the 8.1 we observed across those 2,000 sample means. But remember, that 8.1 is just an estimate based on 2,000 samples; the Central Limit Theorem gives us the standard error among means from all possible random samples of size 50. So these two are similar in value.

So in short, what we saw in our simulation results was pretty close to what we predict. Or what the Central Limit Theorem tells us it should be.

How about when we're dealing with samples of size 250? Well, the same logic applies. The standard error is, formulaically, pretty much the same, except that we replace the denominator with our new sample size of 250. And if we work this out, we get 58.4 divided by the square root of 250, which gives about 3.7. So here again, our observed estimated standard error is slightly less than what the Central Limit Theorem tells us it should be. But remember, it is just an estimate of the true standard error based on only 2,000 samples. So in all, these sync up pretty closely.

Finally, let's look at the results for samples of size 400. The drill is the same here. The Central Limit Theorem tells us that the standard error of means based on samples of 400 from this population should equal the true variation across the hospitals in discharges, divided by the square root of the sample size, the square root of 400. Here again we have,

the true variation of 58.4, divided by the square root of 400, which is approximately 2.9. So again, our simulation slightly underestimates what the Central Limit Theorem tells us it should be, but it is just an estimate. Had we taken a different 2,000 samples and computed their sample means, we might get something closer to 2.9, or above it, just by chance, because we're only looking at 2,000 such estimates from an almost infinite set of possibilities. But on the whole, these two things look pretty similar. So what we're seeing is that the results from our simulations track, more or less, with what we'd expect them to be based on the Central Limit Theorem. This is just trying to show you that the theorem has some validity, by comparing what we actually observed in simulations to what we'd expect. Let's do this again: let's compare what we observe to what we'd expect via the Central Limit Theorem with our binary outcomes.
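To recap the means example, the three CLT predictions are just the population standard deviation divided by the square root of each sample size, using the population SD of 58.4 given in the lecture:

```python
from math import sqrt

sigma = 58.4  # population SD of individual discharge counts (from the lecture)
for n in (50, 250, 400):
    # CLT: standard error of the sample mean = sigma / sqrt(n)
    print(f"n={n}: predicted SE = {sigma / sqrt(n):.2f}")
# n=50: 8.26, n=250: 3.69, n=400: 2.92
```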

These are the simulated samples taken on Baltimore residents, from a population in which the true proportion of residents in poverty was 22.9%. And I asked you how the standard errors from the simulations, based on 1,000 random samples for each of the three sample sizes, compared to what is predicted by the CLT.

Here are the results from our estimated sampling distributions for samples of size 50. We took 1,000 samples of size 50, computed 1,000 sample proportions, and this is the observed variability in those 1,000 sample proportions. Now let's compare that with what we expect from the CLT. The CLT says, look, if you're sampling from a population of binary data and you compute a sample proportion.

Its standard error should be equal to the square root of the true proportion times 1 minus the true proportion, over the size of the sample the estimate was based on. Now, generally we don't know the true proportion, but in this simulation we do. So for samples of size 50, we'd expect the standard error to be the square root of 0.229 times (1 minus 0.229), over 50. I'll let you work this out and verify my math, but it comes out to be 0.059, or 5.9%: very close to the actual variability we observed in our estimated sampling distribution of proportions based on 50 observations at a time. Let's do the same thing and see how it syncs up for samples of 150 from this binary outcome situation. We took 1,000 samples of 150 people each. The variation in those 1,000 sample proportion estimates was 3.4%. What would we get vis-a-vis the Central Limit Theorem? What would it tell us?

So this is not reality. But the true standard error, based on the population we're sampling from, would look like this.

And in this case, that actually equals 0.034. So we have a perfect match. That won't always happen between the simulated results and what we predict; as we saw before, they were slightly off in the discharge example, though very close. Here they're actually the same.

So what we saw among these 1,000 estimates is exactly what we would have predicted from the Central Limit Theorem.

Finally, let's do this for the samples of 500 each. The standard error of our sample proportions based on samples of 500, and I'll write this out more quickly because it's more of the same, would be the square root of 0.229 times (1 minus 0.229), over 500, which equals approximately 0.019. We saw 0.018. These are probably closer than they look, because we've rounded both; slightly off, but pretty much the same thing. So again, what we observed in our simulation syncs up with what the Central Limit Theorem predicts. Hopefully this starts to give you some evidence that the Central Limit Theorem works, in terms of what it tells us about the characteristics of sampling distributions: the shapes we've seen, where they're centered, and now that the formula it gives us for variability syncs up with what we've observed.
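The same quick check works for all three proportion predictions, using the formula sqrt(p(1 - p)/n) with the true p = 0.229 known by construction in this simulation:

```python
from math import sqrt

p = 0.229  # true proportion in poverty, known here because we built the simulation
for n in (50, 150, 500):
    # CLT: standard error of a sample proportion = sqrt(p * (1 - p) / n)
    print(f"n={n}: predicted SE = {sqrt(p * (1 - p) / n):.3f}")
# n=50: 0.059, n=150: 0.034, n=500: 0.019
```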

Now let's get into the situation we will have in reality, where we only get to observe one sample. We won't be able to do the simulations we've done; we won't be able to verify or reconcile the differences, which were minimal as we saw, between what the Central Limit Theorem tells us and our simulations. We're only going to have one chance to characterize the behind-the-scenes sampling distribution, based on the results of a single sample of data.

So what I've asked you to do here is look at this sample of subjects who were given a low-carb diet, and characterize the sampling distribution for mean estimates from samples of size 64 from a theoretical population of subjects on the low-carb diet. Okay, so what do we know? Well, a couple of things we know in advance. The CLT tells us, look, if you did this study over and over again, and kept getting different subsets of persons from the population under study who are put on the low-carb diet, and you did a histogram of your sample mean weight change estimates.

That histogram would be well approximated by a normal distribution. It would be centered at the true mean weight change amongst everybody in the population being put on a low-carb diet.

Centered at that truth, which we can't directly observe. And the variation in this, well, the standard error of means based on 64 persons each, would be the true population variability in weight change amongst everyone given the low-carb diet, divided by the square root of the sample size; the sample we had was 64. But again, we don't know that true variability. So what can we do? Well, we have an estimate of it from our original sample of size 64: 8.6. So we can estimate the standard error, and it turns out to be about 1.1 with rounding. So we've fully characterized what could have happened across all random samples of size 64, in terms of the distribution of the resulting sample mean estimates, using the results from the CLT and this estimate from our sample. Pretty powerful stuff.
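That single-sample calculation is just the sample standard deviation over the square root of the sample size, with the values given in the lecture:

```python
from math import sqrt

s, n = 8.6, 64            # sample SD of weight change (kg) and sample size
se_hat = s / sqrt(n)      # estimated standard error of the sample mean
print(f"estimated SE = {se_hat:.3f} kg")  # 1.075, about 1.1 with rounding
```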

Let's look at one more example of characterizing a sampling distribution using the results from a single sample. Remember, in the maternal-infant HIV transmission study, of the 183 births to mothers in the placebo group, those who were not treated with AZT or anything else, 40 infants were HIV infected within 18 months of birth.

So I wanted you to use the CLT and these sample results to estimate the characteristics of the sampling distribution for sample proportions of children contracting HIV, based on samples of size 183. The CLT again tells us, look, if you were to do this study over and over again,

and get random samples of 183 HIV-positive mothers, not treat them with anything, and look at the proportion whose children contracted HIV; if you were to look at the distribution of these proportion estimates across multiple random samples of 183 women and do a histogram, it would be well approximated by a normal curve.

Furthermore, the center of this histogram would be the true transmission proportion: on average, some estimates would be above it and some below it, but they'd be centered at that truth.

How variable would these estimates be around that truth? Well, the true standard error is something we can't directly compute, because it's a function of the true proportion and the sample size of 183. But we can estimate the standard error based on our sample results, by plugging in our best guess for the true proportion, which is our estimate of 22%. Our estimated standard error works out to be 0.031, or 3.1%. Alright, onward and upward to the next section, where we'll take what we've done here to the next level: creating what's called a confidence interval for the underlying, unobservable truth we're trying to estimate.
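Before moving on, here's the arithmetic from this last example as a quick check, plugging the sample proportion into the standard error formula:

```python
from math import sqrt

x, n = 40, 183                          # infected infants out of placebo-group births
p_hat = x / n                           # sample proportion, which the lecture rounds to 22%
se_hat = sqrt(p_hat * (1 - p_hat) / n)  # estimated SE of the sample proportion
print(f"p-hat = {p_hat:.3f}, estimated SE = {se_hat:.3f}")  # 0.219 and 0.031
```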
