In communication theory, the statement that the output of any information source having entropy H units per symbol can be encoded into an alphabet having N symbols in such a way that the source symbols are represented by codewords having a weighted average length not less than H/log N (where the base of the logarithm is consistent with the entropy units). Also, that this limit can be approached arbitrarily closely, for any source, by suitable choice of a variable-length code and the use of a sufficiently long extension of the source (
see source coding).
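As an illustration (a minimal sketch, not part of the entry; the four-symbol source and its probabilities are invented for the example), the bound can be computed directly. For a binary code alphabet (N = 2) with entropy measured in bits, log N = 1, so the weighted average codeword length cannot fall below H itself:

    import math

    def entropy(probs, base=2):
        # Shannon entropy; units follow the log base (bits for base 2).
        return -sum(p * math.log(p, base) for p in probs if p > 0)

    # Hypothetical source with four symbols (probabilities chosen for illustration)
    probs = [0.5, 0.25, 0.125, 0.125]
    H = entropy(probs)                  # 1.75 bits per source symbol

    N = 2                               # binary code alphabet
    bound = H / math.log(N, 2)          # lower bound on weighted average codeword length
    print(bound)                        # 1.75 code symbols per source symbol

For this particular (dyadic) distribution a Huffman code meets the bound exactly: codeword lengths 1, 2, 3, and 3 give a weighted average of 1.75, showing how a suitable variable-length code can approach the limit.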
The theorem was first expounded and proved by Claude Elwood Shannon in 1948.