A conceptual and interpretive public health approach to some of the most commonly used methods from basic statistics.


From the course by Johns Hopkins University

Statistical Reasoning for Public Health 1: Estimation, Inference, & Interpretation

136 ratings

Johns Hopkins University



From the lesson

Module 3B: Sampling Variability and Confidence Intervals

The concepts from the previous module (3A) will be extended to create 95% CIs for group comparison measures (mean differences, risk differences, etc.) based on the results from a single study.

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

So confidence intervals for single population summary measures, like a single population mean or proportion, can help give a range of possible values for some underlying truth. But one of the ways in which these intervals become extremely useful is when comparing populations on some outcome of interest. We've already seen how to estimate such comparisons: mean differences, differences in proportions, relative risks, et cetera. In this lecture we will work on putting confidence limits on these measures. This will allow us to look at a range of possible values for the difference between the populations we're comparing, and also to ascertain whether the difference is real after accounting for the uncertainty of our estimates.

Okay, so in this next set of lectures we're going to be considering how to estimate confidence intervals for two population comparison measures or measures of association. Things like a difference in means between two populations.

So we're just going to give an overview in this section to get us started, and to set us up for the specifics that we'll deal with in subsequent sections of this lecture set.

So upon completion of this first lecture section, you will be able to extend the concept of sampling distributions to include measures of association that compare two populations.

Extend the principles of confidence interval estimation from single population quantities to measures of association comparing two populations. Appreciate that confidence interval computations for ratios need to be done on the natural log scale, with the results then transformed back to the ratio scale for presentation.

Explain the concept of the null value, the value meaning no association for such measures of association, and what its absence or presence in a confidence interval signifies. So just to remind you of something we've talked about before: frequently in public health, medicine, and science, researchers or practitioners are interested in comparing two or more outcomes between populations using data collected on samples from these populations. Such comparisons can be used to investigate questions such as: how do salaries differ between males and females, males and females being two populations of interest? How do cholesterol levels differ across weight groups? So we might have four different weight groups, and samples from the four different populations consisting of all persons in each weight group.

How does AZT impact the transmission of HIV from mother to child? So we might compare those from a population of HIV-positive mothers given AZT to the population not given AZT, based on a sample from each. Or how is a drug associated with survival among patients with a disease? We'd compare those on the drug to those who get some sort of placebo.

So it's not only important to estimate the magnitude of the difference in the outcome of interest, which we've done extensively thus far,

but also to recognize the uncertainty in this estimate when making conclusions about the populations under study.

The summary measures developed thus far, the things we've mentioned in the beginning, are all sample-based and hence sample statistics. And these are subject to sampling error just like single-sample summary statistics.

But let's talk about the types of studies we'll be looking at in this course, and that are commonly done in public health. First, we'll just talk about types of two-group comparisons for continuous outcomes. One, we've actually seen an example of already, and we'll define it and give some more examples of it in section B, but it's what's called a paired study. Here we're ostensibly comparing two populations through two samples, but the two samples, and hence the two populations we're comparing, have some sort of linkage.

So for each person or observation in population one, there is a corresponding observation in population two.

And hence our samples are constructed in the same way. For each person or observation in the first sample, there's a corresponding observation in the second sample.

There's another type of study, however, that we haven't encountered yet, except to summarize; we haven't yet encountered it in terms of doing confidence limits, and we'll look at the mechanics of creating a confidence interval for a mean difference in the unpaired situation. This is where we have two populations from which two samples were taken, or to which participants were assigned.

And there's no linkage between the two populations. So we might have patients who are randomly assigned to receive a treatment, ostensibly representing the population of all such patients given the treatment. And we might have another group randomly assigned to be in a control group, ostensibly representing the population of all such patients given a control. There's no correspondence between each observation in this treatment group and any one person or observation in the control group; there's no inherent linkage. So these groups are functionally independent. We'll be able to create a mean that summarizes the experience in each group and look at their difference, to quantify the difference in the average outcome. But when it comes to estimating the standard error and dealing with the uncertainty, we're going to have to do things a little bit differently than when we had a paired situation.

We'll do the same sort of unpaired comparisons for binary outcomes. We're not going to do paired comparisons; they exist, but they're rarely used. So we're going to focus on unpaired. Again, this is where we have two samples from two populations that we want to compare, and there's no link between the two samples and hence the two populations. And the same thing with time-to-event outcomes.

So how are we going to apply the central limit theorem and figure out how to get confidence limits on measures of association to compare two populations through two samples? Well, for differences, things that are quantified as differences, such as the mean difference between two groups:

We can actually extend the basic principles of the central limit theorem to understand and quantify the sampling variability of these two sample differences.

So it turns out that the difference of two quantities whose distributions are normal has itself a normal distribution. We've learned, with relatively large samples, that the distribution of a sample mean among all possible random samples of the same size, say n1, is approximately normal. This was the theoretical sampling distribution

centered around the true mean for the first population that's sampled. If we have another sample from another population, of size n2 (it doesn't have to be the same size as the first sample), a similar result holds. And if we were to do a study over and over again, taking an independent sample of size n1 from the first population, computing a mean on it, and then an independent sample of size n2 from the second population, we could look at the mean difference. Suppose we do this study over and over again, so the second time we took

a sample of size n1 from the first population and a sample of size n2 from the second, and we got many different estimates of the mean difference in the outcome between these two populations based on comparing samples of size n1 and n2. If we looked at the distribution of the estimated mean differences across the different iterations of our study, we'd find that this too is normally distributed and centered at the true mean difference.

And the same sort of logic applies if we were instead summarizing binary data from samples of size, say, n1 and n2. The difference in proportions across multiple iterations of the same study would also be normally distributed around the true underlying population-level difference in proportions between the two populations we're comparing. So this is really handy. This means, ultimately, if we want to create a 95% confidence interval

for a population mean difference based on a single study, where we have one sample from the first population, of size n1, and another sample from the second population, of size n2, we can do this using the same old logic, where we add and subtract two standard errors (or estimated standard errors) of this difference in sample means. And I'm going to show how to estimate this using the results from two samples from said populations. Just to give you a head start on the theoretical quantity:

what the central limit theorem tells us the real, true standard error is (and this will look very familiar) is the square root of the true variability of individual values in the first population, squared, divided by the size of the first sample from that population,

plus the true variability of the individual measures in the second population, squared, divided by the size of the second sample we took from that second population. In symbols, that's the square root of sigma1^2/n1 + sigma2^2/n2. These pieces look somewhat familiar. Think about that, and we'll show how to actually estimate these parts (I don't think it'll be a surprise), and how to create a confidence interval for the true mean difference.
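As a minimal sketch of this recipe, here is the estimated standard error and an approximate 95% interval for a mean difference, with the sample standard deviations plugged in for the unknown population values. The summary numbers are purely illustrative, not from the lecture's data:

```python
import math

# Hypothetical summary statistics from two independent samples
# (illustrative numbers only).
mean1, sd1, n1 = 132.0, 15.0, 100   # e.g., mean blood pressure, group 1
mean2, sd2, n2 = 128.0, 14.0, 120   # group 2

diff = mean1 - mean2

# Estimated standard error of the difference in sample means:
# sqrt(sd1^2/n1 + sd2^2/n2), sample SDs standing in for the true sigmas.
se_diff = math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Approximate 95% CI: estimate +/- 2 estimated standard errors
ci = (diff - 2 * se_diff, diff + 2 * se_diff)
print(diff, se_diff, ci)
```

Note that the two variance terms simply add; there is no covariance term because the samples are independent.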

For a difference in proportions, we'll take our observed difference in proportions and add and subtract two estimated standard errors of the difference in proportions, which we will again be able to estimate from the two samples we have. Just for a heads up, the theoretical standard error, the true standard error which we can't observe (just as we can't observe the true proportions): for proportions from samples of size n1 and n2, the standard error of the difference is the square root of p1(1 - p1)/n1 + p2(1 - p2)/n2. I imagine at least the pieces of this look familiar.
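The analogous computation for a difference in proportions, again with made-up sample results, can be sketched as:

```python
import math

# Hypothetical sample proportions (illustrative numbers only).
p1, n1 = 0.30, 200   # e.g., 60 of 200 with the outcome in group 1
p2, n2 = 0.20, 250   # e.g., 50 of 250 in group 2

diff = p1 - p2

# Estimated SE of the difference in sample proportions:
# sqrt(p1(1-p1)/n1 + p2(1-p2)/n2), with sample proportions
# plugged in for the unknown true proportions.
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Approximate 95% CI: estimate +/- 2 estimated standard errors
ci = (diff - 2 * se_diff, diff + 2 * se_diff)
print(diff, se_diff, ci)
```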

I want you to notice, though, that in both cases, when we're comparing independent groups, the uncertainty in the difference in our sample estimates (and we'll talk about this in detail in the lecture sections) is an additive function of the uncertainty in each piece: the proportion in each group, or the mean in each group. So this is really nice. This extension to the central limit theorem is natural; it gives us an easy way to estimate the uncertainty, and the interpretation of the confidence interval is conceptually exactly the same as when we were doing confidence intervals for single summary measures like a single mean or proportion. This means that for most of the studies, roughly 95% of the studies, for 95% of the combinations of samples we could get just by chance from the first and second populations, if we were to employ this method, take our estimated difference, either in means or proportions, and add and subtract two estimated standard errors, the resulting interval would include the truth that we're trying to estimate, and 5% of the time it would miss it.

So again, sampling distributions of differences are roughly normally distributed and centered at the true difference for large samples, and there are corrections we can make for small samples which the computer can handle. The important thing is that the concept is exactly the same.
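The "95% of studies" claim above can be checked by simulation. This sketch (with arbitrary, assumed population parameters) repeats a two-sample study many times and counts how often the interval "observed difference plus or minus two estimated standard errors" captures the true mean difference:

```python
import math
import random
import statistics

random.seed(0)

# Assumed (not from the lecture) population parameters and sample sizes
mu1, sigma1, n1 = 10.0, 3.0, 50
mu2, sigma2, n2 = 8.0, 4.0, 60
true_diff = mu1 - mu2

covered = 0
trials = 2000
for _ in range(trials):
    # Draw independent samples from each population
    s1 = [random.gauss(mu1, sigma1) for _ in range(n1)]
    s2 = [random.gauss(mu2, sigma2) for _ in range(n2)]
    diff = statistics.mean(s1) - statistics.mean(s2)
    # Estimated SE of the difference, using sample variances
    se = math.sqrt(statistics.variance(s1) / n1 + statistics.variance(s2) / n2)
    if diff - 2 * se <= true_diff <= diff + 2 * se:
        covered += 1

print(covered / trials)  # should land close to 0.95
```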

Ratios are a bit different. But once we get over a little hurdle (and we've already talked about some of the quirks of ratios and their scaling), this is going to play into how we compute the confidence intervals as well.

Once we get over that minor quirk, it's relatively easy to handle in terms of sampling distribution.

So just to remind you, the thing with ratios is that ratios have to be positive the way we've defined them, because we're comparing positive quantities, be they probabilities and risks in two groups or incidence rates, both of which are positive. So the ratios will always be greater than zero, and the range of possible values for a ratio is, theoretically, between zero and positive infinity. Ratios can't be negative. But we've seen that when we're comparing two groups by a ratio, group one to group two, if the group on top has a lower value of the outcome measure (whether it be a proportion or incidence rate) than the group on the bottom, the range of possible values for that association on the ratio scale is between zero and one. If the group on top has a greater value than the group on the bottom, the range of possible values on the ratio scale is one all the way up to, theoretically, positive infinity. So there's an imbalance in the ranges; it's a much more compressed range for associations where the first group has a smaller value than the second. It turns out that when we take things to the log scale, if we take the natural log of values between zero and one, the theoretical natural log of zero is all the way out at negative infinity.

And the natural log of one is zero. So we take something that on the original scale is tightly constrained between zero and one, and on the log scale it can range over basically the entire number line below zero.

If we do the same thing for those associations in which the first group has a larger value than the second group, and we map this to the log scale,

then one becomes zero, and the log of infinity is still infinity. So the range of possible values on the log scale, for a ratio where the first group has a larger value than the second, is zero to positive infinity. By taking things to the log scale, we've made the ranges for the two types of association equal.

Additionally, if you think about it, on the natural log scale, ratios are expressed as differences. So, for example, if I have something like p1 hat over p2 hat and I take the log of that, this is equivalent to taking the log of p1 hat minus the log of p2 hat. And it turns out these differences are what we were shown before to have an equal range of possibilities when the first is smaller than the second versus when the second is smaller than the first.
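Both properties, the ratio-becomes-a-difference identity and the symmetry around the null, are easy to verify numerically (the sample proportions here are arbitrary placeholders):

```python
import math

# Hypothetical sample proportions (placeholders)
p1_hat, p2_hat = 0.10, 0.25

# log of a ratio equals the difference of the logs
lhs = math.log(p1_hat / p2_hat)
rhs = math.log(p1_hat) - math.log(p2_hat)
assert abs(lhs - rhs) < 1e-12

# The log scale makes the two directions of association symmetric:
# a ratio of 2 and a ratio of 1/2 sit equally far from the null (log 1 = 0)
assert math.isclose(math.log(2), -math.log(0.5))
print(lhs)
```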

So what does this all mean? Well, the ultimate thing this tells us is the following. Suppose we're doing a study with binary outcomes and sample sizes n1 and n2, and we were to repeatedly do the study over and over again, randomly and independently drawing samples of size n1 and n2 from population one and population two, getting estimates of the proportions (or, we could also say, incidence rates here)

over and over again, and then computing a relative risk from each of these repeated samples. If we then plotted the log of these ratios, these relative risks from the different studies, the histogram of these estimates would be normally distributed, and on average would equal the log of the true ratio we were trying to estimate. So it's business as usual, as long as we put these things on the log scale. The same holds if we replaced these proportions with incidence rates, and replaced the relative risk with the incidence rate ratio; the same idea holds true.

So as it turns out, the sampling distribution for the natural log of a ratio is normally distributed and centered at the natural log of the true population value of the ratio being estimated. So by all the same logic we used before, to take the theoretical result and, when we're only dealing with results from one study, quantify something about the uncertainty of our estimate: the story is that it's business as usual, the same principles apply. Most of the studies we could do would yield an estimated log ratio that fell within plus or minus two standard errors of the log of the true ratio we wanted to estimate. And so if we take an interval where we add and subtract two standard errors of our estimated log ratio, then for 95% of the studies we could do, taking samples from the populations randomly, this interval would include the true value of the log of the ratio we're trying to estimate. So let's think about two things here. As before, we won't know the true standard error, so we're going to have to learn how to estimate it from a single study. And we'll show that there are slightly different formulas for the log of a relative risk

and the log of an odds ratio when we're dealing with binary data, each based on the counts in the respective two-by-two table representing our data; and then there's a separate formula for the log of an incidence rate ratio. But they're all pretty straightforward to compute, and we'll delve into that in lecture sections C, D, and E.

So what we're going to end up getting is endpoints, I'll just call them a and b, that will be the confidence interval for the log of a ratio. And you'll say, that's no good, John, because nobody thinks on the log scale. And I agree. But let's just think about some things here for the moment.

If we have the endpoints of the confidence interval for the log ratio, and it's a natural log, we can get things back to the ratio scale by antilogging, or what we call exponentiating. So the computations will be done on the log scale, and then the results will be exponentiated back to the ratio scale. This works because the sampling behavior on the log scale is normally distributed, and we can use the standard procedure to get a 95% confidence interval by adding and subtracting 2 standard errors on the log scale. If we wanted a 99% confidence interval, we could go slightly further in each direction; if we wanted a 90% confidence interval, we could go 1.65 standard errors in either direction. The exact same idea applies.
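The whole log-then-exponentiate workflow can be sketched end to end. The 2x2 counts here are invented, and the standard error uses the common large-sample formula for a log relative risk, sqrt(1/a - 1/n1 + 1/b - 1/n2), which the lecture defers to sections C through E; treat it as an assumed ingredient, not the lecture's own derivation:

```python
import math

# Hypothetical 2x2 results: a events out of n1 in group 1,
# b events out of n2 in group 2 (illustrative numbers only).
a, n1 = 15, 200
b, n2 = 30, 180

rr = (a / n1) / (b / n2)          # estimated relative risk
log_rr = math.log(rr)             # move to the log scale

# Assumed large-sample SE of the log relative risk
se_log_rr = math.sqrt(1/a - 1/n1 + 1/b - 1/n2)

# 95% CI on the log scale, then exponentiate ("antilog") each endpoint
lo = math.exp(log_rr - 2 * se_log_rr)
hi = math.exp(log_rr + 2 * se_log_rr)
print(rr, (lo, hi))
```

Note that the resulting interval on the ratio scale is not symmetric around the estimate, which is expected: symmetry lives on the log scale.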

So let's just talk a little bit about null values. The null value for a measure of association comparing two populations is the value that measure takes if the two population outcome quantities being compared are equal, and hence there is no association between the outcome and the populations. So, for example, suppose I'm comparing continuous measures between two populations by their means, and there's no population-level difference in the means.

Say the mean blood pressure is exactly the same for those who received the drug versus those who received the placebo: the true means are equal, and the difference in means at the population level is zero. This indicates there's no association between average blood pressure and treatment, because the averages are the same. If I'm comparing proportions between two populations and the proportions are equal, then the difference in proportions would be zero. So, for example, if I was comparing HIV infant transmission among mothers who got AZT during pregnancy and mothers who got a placebo, and there was no effect of AZT, good or bad, then at the population level the proportions of infants contracting HIV born to those two groups of mothers would be equal, and the difference in proportions would be zero.

For ratios of proportions or incidence rates: if at the population level the two populations have the same value, there's no difference and hence no association; these ratios will have the same numerator and denominator, and the null value would be one. On the log scale, and we'll just use a relative risk as an example: if the true population proportions are equal, so p1 equals p2, the true ratio is one, and the log of one is zero. So if the true proportions are equal at the population level, then the null value for the log of the ratio is zero.

And remember, we can express this as a difference in the logs of the numerator and denominator; if the numerator and denominator are equal, then their logs are equal and the difference is zero, so this makes sense.

So what we're going to see and think about is this: if the null value appears in the confidence interval for a measure of association, then (and excuse the phrasing here, but this is how it's done) no association between the outcome and the populations being compared is a plausible conclusion; it can't be ruled out. So, for example, suppose we look at the average blood pressure difference between, say, low-fat minus low-carb diets, and the mean difference is negative four millimeters of mercury, indicating that in our study those on the low-fat diet had lower blood pressures. But the confidence interval goes from negative ten millimeters of mercury up to a positive value.

Then, after accounting for the sampling variability in our estimate, we get both plausible negative and positive values for the true mean difference, and we can't reach a conclusion statistically. Included in this interval is 0, the possibility of no difference. So we would say that, after accounting for the uncertainty, we found no statistical difference in the average blood pressures; we may call this a non-statistically-significant result.

We'll get into the language of statistical significance in the next set of lectures, but we're just laying the groundwork here. If the null value does not appear in the confidence interval for a measure of association, then no association is not within the range of possible population-level associations. And hence we can "rule out" (and that sounds weird verbally) no association as a possibility, and say there is evidence of an association. So suppose we estimate, for example, the relative risk of HIV transmission to infants from mothers who get AZT versus mothers who get a placebo, and the estimated relative risk is 0.32. Then we put confidence limits on that, and it goes from 0.18 to 0.58 when we're done with our computations.

Notice that this interval only includes values less than one; it does not include one. What we've found here is a range of possible values for the true association between AZT and infant transmission, and after accounting for the uncertainty in our estimate, all possibilities favor AZT. We've ostensibly ruled out one as a possibility for the true value of the association, and hence ruled out the idea that AZT has no association with transmission. We'd say this is a statistically significant finding, indicating that after accounting for the sampling variability in our data, we found evidence of a real protective association between AZT and HIV transmission to infants.
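Mechanically, the check being described is just whether the null value falls inside the interval. Using the lecture's AZT numbers (RR = 0.32, 95% CI 0.18 to 0.58) and the null value of 1 for a ratio:

```python
# The lecture's AZT example: estimated RR and its 95% CI
rr_hat = 0.32
ci = (0.18, 0.58)
null_value = 1.0   # null value for a ratio measure

# Statistically significant at this level if the CI excludes the null
excludes_null = not (ci[0] <= null_value <= ci[1])
print(excludes_null)  # True: the whole interval lies below 1
```

The same check works for difference measures by swapping in a null value of 0.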

So again, if we do not include the null value in our confidence interval for a measure of association, be it a mean difference, a difference in proportions, or a ratio-based measure, the finding is called statistically significant, and we're going to get into much more detail on that terminology in lectures nine and ten. So, in summary, what we'll be exploring in this next set of lectures is that we can easily compute confidence intervals, at 95% and other levels, for things like

mean differences between two populations, differences in proportions between two populations, relative risks and odds ratios, and incidence rate ratios. And we can do this using data from two samples from the two populations being compared. The general rule for differences, be it a difference in means or in proportions, is to take our observed difference and add and subtract two estimated standard errors of that difference. We'll show you how to estimate those standard errors in subsequent sections.
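That general rule fits in one small helper function. The function name and example numbers are mine, not the lecture's:

```python
def ci_for_difference(estimate, se, n_se=2):
    """Approximate confidence interval for a difference measure:
    observed estimate +/- n_se estimated standard errors
    (n_se=2 gives the usual approximate 95% interval)."""
    return (estimate - n_se * se, estimate + n_se * se)

# Hypothetical example: mean difference 4.0 with estimated SE 1.5
lo, hi = ci_for_difference(4.0, 1.5)
print(lo, hi)  # 1.0 7.0
```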
