even larger, because the background has a high probability for the word, and

the coefficient in front of 0.9, which is now 0.5, would be even larger.

When this coefficient is larger, the overall result would be larger.

And that also makes it less important for

theta sub d to increase the probability of "the",

because it's already very large.

So the impact here of increasing the probability of "the" is somewhat

regulated by this coefficient, the probability of choosing the background model.

If the word's probability is already large under the background,

then it becomes less important for theta sub d to increase its value.
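This regulating effect can be sketched numerically. A minimal sketch, with hypothetical numbers (a mixing weight of 0.5 and a small bump delta to the document model's probability): because the likelihood is a product, what matters is the gain in log probability, and that gain is small when the background already covers the word.

```python
import math

# Mixture of a fixed background model and a document model:
#   p(w) = lam * p(w|theta_B) + (1 - lam) * p(w|theta_d)
lam = 0.5      # probability of choosing the background model
delta = 0.05   # hypothetical bump to p(w|theta_d)

def log_gain(p_bg, p_d):
    """Increase in log-likelihood of one occurrence of w when
    p(w|theta_d) is raised from p_d to p_d + delta."""
    before = lam * p_bg + (1 - lam) * p_d
    after = lam * p_bg + (1 - lam) * (p_d + delta)
    return math.log(after) - math.log(before)

# Same bump to theta_d, for a high- vs low-probability background word:
high_bg = log_gain(p_bg=0.9, p_d=0.1)   # "the": background already covers it
low_bg = log_gain(p_bg=0.01, p_d=0.1)   # rare word: background barely helps
print(high_bg, low_bg)
```

The gain is several times smaller for the word the background already assigns 0.9 to, so theta sub d has little incentive to spend probability mass on it.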

So this means the behavior here,

which is that high-frequency words tend to get high probabilities, is affected, or

regularized somewhat, by the probability of choosing each component.

The more likely a component is to be chosen,

the more important it is for that component to have high values for these frequent words.

If a component has a very small probability of being chosen, then the incentive is less.
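The same kind of sketch, again with hypothetical numbers, shows how this incentive scales with the probability of choosing the component: the log-likelihood gain from raising a component's word probability shrinks as that component's selection probability shrinks.

```python
import math

# Hypothetical setup: a two-component mixture where we vary how
# likely the document model theta_d is to be chosen.
p_bg = 0.1   # word's probability under the other (background) component
p_d = 0.1    # word's current probability under theta_d
delta = 0.05 # bump to p(w|theta_d)

def log_gain(choose_d):
    """Log-likelihood gain from the bump, given p(choosing theta_d)."""
    before = (1 - choose_d) * p_bg + choose_d * p_d
    after = (1 - choose_d) * p_bg + choose_d * (p_d + delta)
    return math.log(after) - math.log(before)

gains = [log_gain(c) for c in (0.9, 0.5, 0.1)]
print(gains)
```

The gains decrease monotonically: a component that is rarely chosen gains little from assigning frequent words high probability, which is exactly the reduced incentive described above.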

So to summarize, we have just discussed the mixture model.

We discussed the estimation problem of the mixture model, and

in particular we discussed some general behaviors of the estimator,

which means we can expect our estimator to capture these intuitions.

First, every component model

attempts to assign high probabilities to highly frequent words in the data.

And this is to collaboratively maximize the likelihood.

Second, different component models tend to bet high probabilities on different words.

And this is to avoid competition, or a waste of probability.

This allows them to collaborate more efficiently to maximize

the likelihood.
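Both behaviors can be seen in one tiny example. A sketch under hypothetical assumptions: a two-word vocabulary, a fixed background that heavily favors "the", a fixed mixing weight of 0.5, and equal counts of the two words; a grid search over the maximum-likelihood theta sub d shows it betting on the word the background neglects.

```python
import math

lam = 0.5                            # probability of choosing the background
p_bg = {"the": 0.9, "text": 0.1}     # fixed background model
counts = {"the": 4, "text": 4}       # hypothetical word counts in the document

def log_likelihood(x):
    """Log-likelihood of the counts when p("the"|theta_d) = x."""
    p_d = {"the": x, "text": 1 - x}
    return sum(c * math.log(lam * p_bg[w] + (1 - lam) * p_d[w])
               for w, c in counts.items())

# Grid search for the maximum-likelihood estimate of p("the"|theta_d).
best = max((i / 1000 for i in range(1, 1000)), key=log_likelihood)
print(best)
```

The optimum puts most of theta sub d's mass on "text" rather than "the": the two components avoid competing for the same word, which is the collaboration described in the second point.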