
Stephen Wolfram's A New Kind of Science | Online

Notes

Chapter 10: Processes of Perception and Analysis

Section 5: Data Compression


Maximal block compression

If one has data that consists of a long sequence of blocks, each of length b, and each independently chosen with probability p[i] to be of type i, then as argued by Claude Shannon in the late 1940s, it turns out that the minimum number of base 2 bits needed on average to represent each block in such a sequence is h = -Sum[p[i] Log[2, p[i]], {i, 2^b}]. If all blocks occur with an equal probability of 2^-b, then h takes on its maximum possible value of b. If only one block occurs with nonzero probability then h = 0. Following Shannon, the quantity h (whose form is analogous to entropy in physics, as discussed on page 1020) is often referred to as "information content". This name, however, is very misleading. For certainly h does not in general give the length of the shortest possible description of the data; all it does is give the shortest length of description obtained by treating successive blocks as if they occur with independent probabilities. With this assumption one then finds that maximal compression occurs if a block of probability p[i] is represented by a codeword of length -Log[2, p[i]]. Huffman coding with a large number of codewords will approach this if all the p[i] are powers of 1/2. (The self-delimiting of codewords leads to deviations for small numbers of codewords.) For p[i] that are not powers of 1/2, codewords of non-integer length would be required. The method of arithmetic coding provides an alternative in which the output does not consist of separate codewords concatenated together. (Compare algorithmic information content discussed on pages 554 and 1067.)
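
As an illustration (not from the book), the following Wolfram Language sketch computes h for a set of block probabilities and checks that Huffman codeword lengths coincide with -Log[2, p[i]] when all the p[i] are powers of 1/2; the names blockEntropy, huffman and leafDepths are chosen here for illustration.

(* Sketch: h = -Sum[p[i] Log[2, p[i]]], dropping zero-probability blocks
   (0 Log[2, 0] is Indeterminate in Wolfram Language) *)
blockEntropy[p_List] := -Total[# Log[2, #] & /@ DeleteCases[p, 0]]

(* Illustrative Huffman construction: repeatedly merge the two least
   probable entries; leaves hold block indices, so the depth of a leaf
   in the resulting binary tree is that block's codeword length *)
huffman[p_List] := Module[{q = Sort[Transpose[{p, Range[Length[p]]}]]},
  While[Length[q] > 1,
    q = Sort[Append[Drop[q, 2],
      {q[[1, 1]] + q[[2, 1]], {q[[1, 2]], q[[2, 2]]}}]]];
  q[[1, 2]]]

(* {block index, codeword length} pairs read off from leaf depths *)
leafDepths[tree_] := Sort[{Extract[tree, #], Length[#]} & /@ Position[tree, _Integer]]

p = {1/2, 1/4, 1/8, 1/8};  (* b = 2: four block types, probabilities powers of 1/2 *)
blockEntropy[p]            (* 7/4 bits per block, below the maximum b = 2 *)
leafDepths[huffman[p]]     (* {{1, 1}, {2, 2}, {3, 3}, {4, 3}}: lengths equal -Log[2, p[i]] *)

For probabilities that are not powers of 1/2, the average codeword length produced this way strictly exceeds h, which is the gap the arithmetic coding mentioned above is designed to close.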

