And so, we have nu plus 2^{-n(epsilon/4 - tau)}.

Now let us take a close look at this term, 2^{-n(epsilon/4 - tau)}.

Recall that epsilon is fixed.

Since tau tends to 0 as delta tends to 0, we can choose delta to be sufficiently small so that epsilon/4 - tau is greater than 0.

If so, 2^{-n(epsilon/4 - tau)} would tend to 0 as n tends to infinity.

Now let nu be less than epsilon/3, which holds for a sufficiently small delta, and let n be sufficiently large so that 2^{-n(epsilon/4 - tau)} is less than epsilon/6.

Then we obtain that the probability of the error event is less than epsilon/3 plus epsilon/6, which is equal to epsilon/2.

Thus, we have shown that with a suitable choice of parameters for the random coding scheme, the probability of the error event is less than epsilon/2 for any arbitrarily small epsilon, when n is sufficiently large.
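The choice of parameters above can be checked numerically. The following is a minimal sketch with illustrative values for epsilon, tau, and nu (these particular numbers are assumptions, not from the lecture): once tau is below epsilon/4, the exponent epsilon/4 - tau is positive, so 2^{-n(epsilon/4 - tau)} drops below epsilon/6 for large enough n.

```python
# Illustrative values only; any epsilon > 0 works the same way.
epsilon = 0.1
tau = 0.01           # assumed small, since tau -> 0 as delta -> 0
nu = epsilon / 4     # assumed: nu < epsilon/3 for sufficiently small delta

exponent = epsilon / 4 - tau   # positive by the choice of delta
assert exponent > 0

# Find n large enough that 2^{-n(epsilon/4 - tau)} < epsilon/6.
n = 1
while 2 ** (-n * exponent) >= epsilon / 6:
    n += 1

# Total bound on the probability of the error event:
bound = nu + 2 ** (-n * exponent)
assert bound < epsilon / 2     # since epsilon/3 + epsilon/6 = epsilon/2
```

The loop simply makes the phrase "let n be sufficiently large" concrete for the chosen values.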

The idea of the analysis is the following.

First of all, let the block length n be large.

Then by the joint AEP, the probability that tilde{X}(1), that is, the codeword being sent, is jointly typical with the received sequence Y would tend to 1.

At the same time, the probability that any other codeword is jointly typical with the received sequence Y is approximately equal to 2^{-n I(X;Y)}, where I(X;Y) is the mutual information between X and Y.

If the size of the codebook, that is M, grows at a rate (1/n) log M strictly less than the mutual information between X and Y, then the probability that some codeword tilde{X}(w), with w not equal to 1, is jointly typical with the sequence Y can be made arbitrarily small.

Then the probability of the error event, that is, W hat not equal to W, can be made arbitrarily small.
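The exponent comparison behind this argument can be sketched numerically. In the following, the values of I(X;Y) and the rate R are illustrative assumptions, not from the lecture: with M = 2^{nR} codewords, the union bound gives roughly M times 2^{-n I(X;Y)}, which equals 2^{-n(I(X;Y) - R)} and vanishes as n grows whenever R is strictly below I(X;Y).

```python
I_XY = 0.5   # assumed mutual information I(X;Y), in bits
R = 0.4      # assumed rate (1/n) log M, strictly below I_XY

bounds = []
for n in (10, 100, 1000):
    M = 2 ** (R * n)                     # codebook size at rate R
    union_bound = M * 2 ** (-n * I_XY)   # = 2^{-n (I_XY - R)}
    bounds.append(union_bound)
    print(n, union_bound)

# The bound shrinks exponentially in n, so the probability that some
# wrong codeword is jointly typical with Y can be made arbitrarily small.
```

If instead R were above I_XY, the same expression would grow with n, which is why the rate must stay strictly below the mutual information.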

We have already shown that for the random coding scheme,

the probability of the error event can be made arbitrarily small.

We now show the existence of a deterministic coding scheme that can do the same.

According to the random coding scheme, the probability of the error event can be written as the summation, over all codebooks C, of the probability of the codebook C being chosen times the probability of the error event for that codebook C.

In other words, the probability of the error event is a weighted average of the probabilities of the error event over all the codebooks.

Then there exists at least one codebook C star such that the probability of error P_e is less than or equal to the probability of error averaged over all the possible codebooks, which we have shown is less than epsilon/2.

And by construction, this codebook has rate (1/n) log M greater than I(X;Y) minus epsilon/2.

By letting p(x) be the input distribution that achieves the channel capacity, the rate of this code can be made arbitrarily close to the channel capacity.
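The existence argument above is just the fact that a minimum never exceeds a weighted average. A minimal sketch with made-up codebook-selection probabilities and error probabilities (all values are assumptions for illustration):

```python
# Pr{codebook C is chosen} under the random coding scheme (assumed values).
weights = [0.2, 0.5, 0.3]
# P_e(C), the probability of the error event for each codebook (assumed).
errors = [0.08, 0.02, 0.05]

# The overall probability of error is the weighted average over codebooks.
P_e = sum(w * e for w, e in zip(weights, errors))

# At least one codebook C* does no worse than the average.
best = min(errors)
assert best <= P_e
```

If the weighted average is below epsilon/2, then the best codebook's error probability is below epsilon/2 as well, which is exactly the existence claim.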

To complete the proof of the achievability part of the channel coding theorem, we are now going to show that from this deterministic code, with the average probability of error P_e less than epsilon/2, we can construct another deterministic code whose maximal conditional probability of error, lambda_max, is less than epsilon.

Without loss of generality, assume that the conditional probabilities of error are in ascending order, that is, lambda_1 is less than or equal to lambda_2, and so on, up to lambda_M.

From P_e less than epsilon/2, we have (1/M) times the summation of lambda_w, for w from 1 up to M, less than epsilon/2. This implies that the summation of all the lambda_w's is less than (M/2) times epsilon.

Since M is even, M/2 is an integer. Then the second half of the summation, namely the summation of lambda_w for w from M/2 + 1 to M, is upper bounded by the summation of lambda_w for w from 1 up to M, which is less than (M/2) times epsilon.

Thus we have shown that the second half of the summation is less than (M/2) times epsilon.

From this, we have that 1 over (M/2) times the second half of the summation is less than epsilon. As there are exactly M/2 terms in the second half of the summation, this means that the average of the lambda_w's, for w from M/2 + 1 to M, is less than epsilon.

Since we assume that the lambda_w's are in ascending order, the smallest value in the second half of the summation, namely lambda_{M/2 + 1}, is less than epsilon.

And this implies that lambda_{M/2}, which is less than or equal to lambda_{M/2 + 1}, is also less than epsilon.

The conclusion is that if the average probability of error P_e is less than epsilon/2, then lambda_1, lambda_2, up to lambda_{M/2} are all less than epsilon.

Thus, by discarding the worst half of the codewords in the codebook C star, namely those codewords that correspond to lambda_{M/2 + 1} up to lambda_M, we obtain a new codebook with lambda_max less than epsilon.
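This expurgation step can be sketched directly. The lambda values below are made up for illustration; the only requirement is that their average is below epsilon/2:

```python
epsilon = 0.1
# Assumed conditional error probabilities lambda_1, ..., lambda_M (M = 8).
lambdas = [0.01, 0.09, 0.03, 0.02, 0.12, 0.04, 0.02, 0.05]
M = len(lambdas)

assert sum(lambdas) / M < epsilon / 2   # P_e < epsilon/2

lambdas.sort()              # ascending: lambda_1 <= ... <= lambda_M
kept = lambdas[: M // 2]    # discard the worst half of the codewords

# The smallest discarded value, lambda_{M/2 + 1}, is at most the average
# of the second half, which is below epsilon; every kept value is no larger.
assert max(kept) < epsilon
```

Sorting plays the role of the "without loss of generality" relabeling of the codewords in ascending order of lambda_w.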

After discarding the worst half of C star, the rate of the code becomes (1/n) log(M/2), because the size of the message set, instead of M, is now equal to M/2.

Now this is equal to (1/n) log M, minus 1/n,