Hi, my name is Brian Caffo. And this is Mathematical Biostatistics Boot Camp,

Lecture thirteen on Binomial Proportions. In this lecture, we're going to talk about

confidence intervals for binomial proportions using frequentist techniques.

We're going to talk about why the standard frequentist interval has some

problems, and we're going to give you a simple fix for it.

And then, we're going to use confidence intervals for binomial proportions to

motivate Bayesian analysis, and we'll go through a very simple conjugate Bayesian

analysis as it relates to estimating the success probability of a coin flip.

So, when a random variable X is binomial with n trials and success probability p,

we've learned several things in the class so far.

The first is that the MLE for the success probability of the coin from the Bernoulli

trials is p hat, the sample proportion of successes.

So, X over n is the MLE for p. We know that the MLE is unbiased in this

case: the expected value of p hat is p.

And we know that the variance of p hat is p times one minus p, over n.
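
The two facts just stated can be checked exactly for any particular n and p by summing over the binomial probabilities. The following is a minimal sketch; the function name `p_hat_moments` and the values n = 20, p = 0.3 are illustrative assumptions, not from the lecture.

```python
from math import comb

def p_hat_moments(n, p):
    """Exact mean and variance of p_hat = X / n when X ~ Binomial(n, p)."""
    # Binomial probabilities P(X = x) for x = 0, ..., n
    pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean = sum((x / n) * w for x, w in enumerate(pmf))
    var = sum((x / n - mean) ** 2 * w for x, w in enumerate(pmf))
    return mean, var

n, p = 20, 0.3  # illustrative values only
mean, var = p_hat_moments(n, p)
print("E[p_hat]   =", mean, " vs p          =", p)
print("Var[p_hat] =", var, " vs p(1-p)/n   =", p * (1 - p) / n)
```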

We also know from the Central Limit Theorem,

because p hat is simply an average of Bernoulli trials, that p hat minus its

mean, p, divided by its standard error, the square root of p hat times one minus p hat divided

by n, approximately follows a standard normal distribution for

large values of n. In a previous lecture, we inverted this

statement. We said that the probability that this test

statistic lies between the lower and upper alpha over two quantiles leads to a

confidence interval for p that is of the form p hat plus or minus the relevant

normal quantile times the standard error. This interval is the so-called Wald interval,

named after a mathematical statistician named Wald, of course.
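
As a rough sketch of the Wald interval just described, the function below computes p hat plus or minus the normal quantile times the estimated standard error. The function name `wald_interval` and the counts in the usage line are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(x, n, alpha=0.05):
    """Wald interval for a binomial proportion: p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 standard normal quantile
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Made-up data: 7 successes in 50 trials
lower, upper = wald_interval(x=7, n=50)
print(lower, upper)
```

Note that nothing in this construction keeps the endpoints inside zero and one, and when x is 0 or n the interval collapses to a single point.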

The one problem with this interval is that it performs very badly.

So remember, the fact that we grab the appropriate normal quantile means that

asymptotically, via the Central Limit Theorem, the coverage of the interval is,

say, 95% when alpha equals 0.05. But coverage varies wildly for finite

sample sizes and it can be very low for certain values of n, especially when p is

near the boundaries. And it can even happen when p is near 0.5,

though the coverage is generally best when p is exactly 0.5 or quite close to

0.5. And, in fact, I give an example here. When

p is 0.5 and n is 40, the actual coverage of a 95% interval is only 92%.
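
The coverage figure in this example can be reproduced by summing the binomial probabilities of every outcome x whose Wald interval contains the true p. This is a sketch under that definition of coverage; the function name `wald_coverage` is an assumption made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def wald_coverage(n, p, alpha=0.05):
    """Exact probability that the Wald interval contains p when X ~ Binomial(n, p)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    coverage = 0.0
    for x in range(n + 1):
        p_hat = x / n
        half_width = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            # This outcome yields an interval that covers p; add its probability
            coverage += comb(n, x) * p**x * (1 - p) ** (n - x)
    return coverage

cov = wald_coverage(n=40, p=0.5)
print(round(cov, 3))
```

For much larger n, the probabilities would be better computed on the log scale, since the binomial coefficients become too large to convert to floating point.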

Then, it's also true when p is small or large,

meaning small near zero or large near one. The coverage can be very, very poor even

for extremely large values of n, and I give an example here.

When p is 0.005 and n is 1,876, the actual coverage rate of a 95% interval

is 90%. I got these numbers from a paper in the

journal Statistical Science by Brown, Cai, and DasGupta, and you can see the

reference here. So, because the Wald interval performs

poorly, we need some sort of fix. There are quite a few fixes you can actually

use. I'm going to present a particularly easy

one. And the idea is to add two successes and

two failures, and then simply create the interval as if that were the data.

So, specifically, let p tilde be X plus two over n plus four. Then the interval I'm

proposing is p tilde, plus or minus the standard normal quantile, times the square

root of p tilde times one minus p tilde, over n plus four. And I call this the Agresti-Coull

interval; some textbooks call it the Wilson score interval.

But I like to call it Agresti and Coull because those guys are my friends.

Though, I'm sure Wilson is quite nice, too.
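
A minimal sketch of the "add two successes and two failures" interval follows, together with an exact coverage comparison against the Wald interval at n = 40 and p = 0.5. It assumes the standard error uses the adjusted sample size n + 4 in the denominator. The function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # quantile for a 95% interval

def wald_interval(x, n):
    p_hat = x / n
    hw = Z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - hw, p_hat + hw

def plus_four_interval(x, n):
    # Add two successes and two failures, then proceed as if that were the data
    n_tilde = n + 4
    p_tilde = (x + 2) / n_tilde
    hw = Z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - hw, p_tilde + hw

def coverage(interval, n, p):
    """Exact probability that interval(X, n) contains p when X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total

wald_cov = coverage(wald_interval, 40, 0.5)
plus_four_cov = coverage(plus_four_interval, 40, 0.5)
print(round(wald_cov, 3), round(plus_four_cov, 3))
```

Passing the interval function into `coverage` keeps the comparison fair, since both intervals are scored against exactly the same binomial probabilities.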

So, why do we do the interval this way? It turns out that there is a reason

related to hypothesis testing as to why we can construct the interval this way.

It turns out that this interval is very nearly the so-called inverse of a score test.

But I don't think in this class we're going to get a chance to cover hypothesis

testing or score tests in enough depth to cover that. A more heuristic motivation is this: when p is

large or small, we saw an example in the lecture on the

Central Limit Theorem where the distribution of p hat winds up being very

skewed. It's not a symmetric distribution.

The distribution is very symmetrical when p is 0.5, but as p heads towards zero or

one, the distribution is skewed. And then, at that point, it doesn't

necessarily make sense anymore to center the interval right at the MLE.

We might want to pull it towards 0.5, because if p is close to zero or close to

one, then the distribution will be skewed: skewed towards

high values if p is near zero, or skewed towards low values if p is high.

So it makes sense in either case to shrink it towards 0.5.
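
One way to see the shrinkage is to rewrite p tilde as a weighted average of p hat and 0.5. This is only algebra on the definition of p tilde given earlier, offered as a worked step rather than something stated in the lecture:

```latex
\tilde{p} = \frac{X + 2}{n + 4}
          = \frac{n}{n + 4} \cdot \frac{X}{n} + \frac{4}{n + 4} \cdot \frac{1}{2}
          = \frac{n}{n + 4}\,\hat{p} + \frac{4}{n + 4} \cdot 0.5
```

The weight on 0.5 is 4 over n plus 4, so the pull towards 0.5 is noticeable for small n and fades as n grows.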

So, later on, we'll show that this interval is related to something we do

with Bayesian posterior intervals, and it's often the case that Bayesian

procedures perform very well in terms of their frequentist performance.

This confidence interval is an example of Bayesian thinking that has

reasonable frequentist properties. So, let's go through an example.

Suppose that in a random sample of an at-risk population, thirteen out of twenty

subjects had hypertension, and you wanted to estimate the prevalence of hypertension

in this population. In this case, p hat is 0.65, thirteen

divided by twenty. n is, of course, twenty.

P tilde from the Agresti-Coull interval is 0.63, so that's fifteen divided by 24, and

n tilde is 24. The 97.5th percentile of the normal

distribution is 1.96, of course. Then you wind up with a Wald interval of

0.44 to 0.86. We wind up with an Agresti-Coull interval

of 0.43 to 0.82, and you wind up with a likelihood interval of 0.42 to 0.84.

So, in this case, all of the techniques seem to give roughly the same

answer, which is good. And here on the next slide, I show you the

likelihood for p and draw the 1/8, 1/16, and 1/32 likelihood reference lines,

and we see the likelihood is of course peaked right at the MLE of 0.65, and

we get a rough sense of the spread in our uncertainty in estimating p.
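
The numbers in this example can be recomputed from the data, thirteen successes in twenty trials. The sketch below does so for the Wald interval, the Agresti-Coull interval (with n plus four in the standard error), and a likelihood interval found on a grid. It assumes the quoted likelihood interval corresponds to the 1/8 reference line; the grid spacing of 0.001 is also an illustrative choice.

```python
from math import log, sqrt
from statistics import NormalDist

x, n = 13, 20
z = NormalDist().inv_cdf(0.975)

# Wald interval, centered at the MLE
p_hat = x / n
wald_hw = z * sqrt(p_hat * (1 - p_hat) / n)
wald = (p_hat - wald_hw, p_hat + wald_hw)

# Agresti-Coull interval: add two successes and two failures
n_tilde = n + 4
p_tilde = (x + 2) / n_tilde
ac_hw = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
agresti_coull = (p_tilde - ac_hw, p_tilde + ac_hw)

def log_lik(p):
    """Binomial log-likelihood for p, up to an additive constant."""
    return x * log(p) + (n - x) * log(1 - p)

# Likelihood interval: all p whose likelihood is at least 1/8 of the maximum
# (assumed threshold), searched over a grid of candidate values
grid = [i / 1000 for i in range(1, 1000)]
inside = [p for p in grid if log_lik(p) - log_lik(p_hat) >= log(1 / 8)]
lik_interval = (min(inside), max(inside))

for name, (lo, hi) in [("Wald", wald), ("Agresti-Coull", agresti_coull),
                       ("1/8 likelihood", lik_interval)]:
    print(f"{name}: {lo:.2f} to {hi:.2f}")
```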