
source coding theorem

The task of compression: convert a sequence of source symbols into a sequence of encoded symbols (typically bits). It's desirable for the encoded sequence to be as short as possible.
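
As a concrete illustration of what "encoding" means here, the sketch below maps a hypothetical four-symbol alphabet to bits with a prefix code of my own choosing, giving shorter codewords to symbols we expect to see more often; none of the specifics are fixed by the theorem.

  # A minimal sketch of symbol-to-bits encoding (the alphabet and code are
  # illustrative choices, not part of the theorem itself).
  code = {"a": "0", "b": "10", "c": "110", "d": "111"}   # a prefix code

  def encode(symbols):
      """Concatenate the codeword for each source symbol."""
      return "".join(code[s] for s in symbols)

  msg = "aaabacad"
  bits = encode(msg)
  print(bits, len(bits) / len(msg), "bits/symbol")   # 1.625 bits/symbol
  # A fixed-length code for 4 symbols would spend 2 bits/symbol; skewing the
  # codeword lengths toward frequent symbols is what makes compression possible.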

1 informal statement

Given a random variable \(X\) with entropy \(H(X)\), no encoding of \(X\) can compress it to a rate below \(H(X)\) without loss: at any rate smaller than \(H(X)\), loss becomes all but certain. Conversely, the rate can be brought arbitrarily close to \(H(X)\) with arbitrarily small probability of error, in the limit as the sequence length \(n\rightarrow \infty\).
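
To make \(H(X)\) concrete, here is a small sketch (my own, with made-up distributions) computing the entropy the theorem refers to:

  # A quick sketch computing H(X) in bits for a finite distribution
  # (the distributions below are illustrative examples).
  from math import log2

  def entropy(probs):
      """Shannon entropy in bits: H(X) = -sum p log2 p."""
      return -sum(p * log2(p) for p in probs if p > 0)

  print(entropy([0.5, 0.5]))   # 1.0   -- a fair coin: no compression possible
  print(entropy([0.1, 0.9]))   # ~0.47 -- a biased coin compresses below half a bit/symbol
  print(entropy([0.25] * 4))   # 2.0   -- four equally likely symbols need 2 bits each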

1.1 intuitive statement

I am trying to find an encoding scheme that compresses a stream of symbols drawn from \(X\). The rate is the number of coded bits needed per source symbol. This theorem tells us that we can't compress below a certain rate, namely \(H(X)\), without incurring the penalty of likely loss.
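
Continuing the earlier sketch (the same hypothetical four-symbol prefix code, with codeword lengths 1, 2, 3, 3), the rate of a symbol code is its expected codeword length per symbol, and it can be checked against \(H(X)\) for a couple of illustrative distributions:

  # Sketch: the rate (expected bits per source symbol) of a fixed prefix code,
  # compared with H(X) for two illustrative distributions. The code never
  # beats H(X), matching it exactly when the codeword lengths equal -log2 p.
  from math import log2

  lengths = {"a": 1, "b": 2, "c": 3, "d": 3}   # codeword lengths of the prefix code above

  def entropy(p):
      return -sum(q * log2(q) for q in p.values() if q > 0)

  def rate(p):
      return sum(p[s] * lengths[s] for s in p)

  dyadic = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
  skewed = {"a": 0.7, "b": 0.1, "c": 0.1, "d": 0.1}
  print(rate(dyadic), entropy(dyadic))   # 1.75 vs 1.75 -- rate meets H(X) exactly
  print(rate(skewed), entropy(skewed))   # 1.5  vs ~1.36 -- rate exceeds H(X)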

2 making the rate arbitrarily close to \(H(X)\)

Shannon's source coding theorem says the following. Given a random variable \(X\) with entropy \(H(X)\), write \(X^{1:n} = X_1,\ldots,X_n\) for \(n\) i.i.d. repetitions of \(X\). Then for any \(\epsilon > 0\), at rate \(H(X)+\epsilon\) there exist:

  • a large enough \(n\), and
  • a code that encodes any \(X^{1:n}\) in \(n(H(X)+\epsilon)\) bits, such that the original \(X^{1:n}\) can be recovered with probability at least \(1-\epsilon\).

Here, rate means the average number of bits per symbol used by the encoder.
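
A rough way I find helpful for seeing the achievability direction (my own sketch, with arbitrary choices of \(p\) and \(\epsilon\)): for a Bernoulli(\(p\)) source, encode only the "typical" length-\(n\) sequences, those whose per-symbol log-probability is within \(\epsilon\) of \(H(X)\). Indexing that set costs roughly \(n(H(X)+\epsilon)\) bits, and the scheme only fails when an atypical sequence shows up, which happens with probability that shrinks as \(n\) grows.

  # Sketch of typical-set coding for a Bernoulli(p) source (p and eps are
  # arbitrary illustrative values). Sequences whose per-symbol surprisal is
  # within eps of H are "typical"; we spend ceil(log2 |typical set|) bits to
  # index them and simply fail on everything else.
  from math import log2, log, exp, comb, lgamma, ceil

  p, eps = 0.1, 0.05
  H = -(p * log2(p) + (1 - p) * log2(1 - p))            # entropy of one symbol

  for n in (100, 1000, 10000):
      typical_count = 0        # number of sequences in the typical set
      failure_prob = 0.0       # total probability of the atypical sequences
      for k in range(n + 1):   # k = number of ones; P(x) depends only on k
          log2_prob = k * log2(p) + (n - k) * log2(1 - p)
          if abs(-log2_prob / n - H) <= eps:
              typical_count += comb(n, k)
          else:
              # mass of all length-n sequences with k ones, in log space to avoid underflow
              log_mass = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) + log2_prob * log(2)
              failure_prob += exp(log_mass)
      bits_per_symbol = ceil(log2(typical_count)) / n    # cost of indexing the typical set
      print(n, round(bits_per_symbol, 3), round(failure_prob, 4))
  # The rate stays at or below roughly H + eps (~0.52 here), while the failure
  # probability falls toward 0 as n grows -- the "large enough n" in the theorem.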

A personal confusion I have with the above: why is it phrased so that the error bound improves as you decrease \(\epsilon\)? Wouldn't I expect the theorem to say the error bound improves as I increase the rate above \(H(X)\)?

3 helpful links

Created: 2021-09-14 Tue 21:43