Now the stratified sampling approach has, as we've seen, some advantages.
I mentioned one about credibility.
It just makes more sense.
It's more believable to people to say that when I drew my sample,
I made sure that I had the right distribution,
that the distribution across the frame, the population, got replicated in the sample as well.
That's more acceptable, more credible.
But we've also seen how we can get gains in precision,
depending on the allocation, though, and
that's part of what we're doing in these two lectures on allocations.
There are also other advantages here.
We've talked about guaranteed representation of important domains.
And there are some things about flexibility and administrative convenience
that we'll see here in allocations and that we can take advantage of.
But understanding these potential benefits of stratified sampling
depends on understanding something about the allocation as well.
We've been looking at a problem in which we have a basic sample size,
in our illustration with 400 faculty, we were drawing a sample of 80, and
we chose a particular allocation, but many allocations are possible.
Many different sampling fractions, et cetera, are possible across those strata.
So for example, for our six strata, as shown in the last line on this display,
the allocation could have been to take one from stratum 1,
one from stratum 2, one from stratum 3, one from stratum 4,
one from stratum 5, and 75, the balance, from stratum 6.
Or we could have taken two from stratum 1, and one from each of strata 2,
3, 4, and 5, and then the remaining 74 from stratum 6.
And so on.
There are lots and lots of possible allocations there.
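
To get a feel for how many allocations there are, here is a minimal Python sketch. It assumes every one of the six strata gets at least one sampled element and, as a simplification that the lecture does not state, it ignores the limit that a stratum cannot contribute more elements than it contains. The function name count_allocations is hypothetical.

```python
from math import comb

def count_allocations(n: int, num_strata: int) -> int:
    """Count the ways to split a sample of n across num_strata strata
    with at least one element per stratum (a stars-and-bars count).
    Assumption: stratum sizes are ignored, so this is an upper bound."""
    if n < num_strata:
        return 0
    return comb(n - 1, num_strata - 1)

# The two allocations mentioned in the lecture both use the full sample of 80.
example_a = [1, 1, 1, 1, 1, 75]
example_b = [2, 1, 1, 1, 1, 74]
assert sum(example_a) == 80 and sum(example_b) == 80

total = count_allocations(80, 6)
print(total)
```

Under those assumptions the count is a little over 22 million, which is why the choice among allocations deserves some thought.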
It turns out that some of these allocations can be beneficial, and
some can be harmful to our overall estimates.
Some can be beneficial when we want to do domain estimation.
Some can be beneficial when we want to combine
estimates across the strata.
And the allocation that we did was related to the drawing in the lower left.
This is sort of a population-sized
distribution of the states in the United States.
California is a large share of the US population,
10% among the 50 states, and Florida and New York are fairly large,
as is Texas, in terms of their relative population sizes.
And others are very tiny, they become very small on that map.
That population size varies, which it will,
and it's outside of our control. It's something that's going to be
there in the auxiliary data that we have available.
We can take advantage of that in order to get gains in precision.
One way to do that,
that we've already seen, was to use the allocation that we already did.
The one that was the proportionate distribution.
This particular distribution, we had a reason for using.
And that was because we actually got a gain in precision.
It takes the same percent or fraction of the elements in each of the six strata, and
it has a nice property too.
That distribution, 8, 5, 4, 15, 10, 38, across the six strata,
is achieved by taking the same sampling fraction in each of the strata.
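
The arithmetic behind that statement can be sketched in Python. The sample of 80 from 400 faculty and the allocation 8, 5, 4, 15, 10, 38 come from the lecture. The stratum sizes are not given in this passage, so the sketch backs them out from the common sampling fraction, which is an inference and not the lecture's own data. The names below are hypothetical.

```python
from fractions import Fraction

N = 400                              # population size (from the lecture)
n = 80                               # total sample size (from the lecture)
allocation = [8, 5, 4, 15, 10, 38]   # sample sizes per stratum (from the lecture)

# Overall sampling fraction: 80 / 400 = 1/5.
f = Fraction(n, N)

# Assumption: under proportionate allocation each stratum uses the same
# fraction f, so the implied stratum sizes are n_h / f.
stratum_sizes = [int(n_h / f) for n_h in allocation]

def proportionate_allocation(sizes, sample_size):
    """Give each stratum the same fraction of its elements.
    Returns exact fractions; rounding is not handled in this sketch."""
    fraction = Fraction(sample_size, sum(sizes))
    return [fraction * size for size in sizes]

print(stratum_sizes)
print(proportionate_allocation(stratum_sizes, n) == allocation)
```

Going forward from the inferred stratum sizes recovers the same allocation, and each stratum's share of the sample matches its share of the population.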
Well that seems straightforward.
It has nice properties with respect to the sample design.
You're making the sample designers very happy.
But does that have any other payoff for us?
Well, actually, we saw that this kind of thing has a benefit that goes beyond that.
It gives us a balance, then, between the sample and the strata.
That is, when we draw the sample in such a way that we have the same sampling
fraction, lowercase f sub h,