
Minimum-Entropy Differencing

Before a sampled data set is encoded using a tailored Huffman code, it is often desirable to put it in a form which will result in as compact a code as possible. A typical way of doing this is called "minimum-entropy differencing," and is outlined below in two sections. The first section is a very brief overview of the concept of "entropy" as applied to information theory, while the second section gives an example of "minimum-entropy differencing."

Entropy of a Communication Channel

In his 1949 book, "The Mathematical Theory of Communication," Claude Shannon used the concept of entropy as gleaned from statistical thermodynamics to define the entropy of a communications channel as follows:

  H = ( P1I1 + P2I2 + ... + PnIn ) / ( P1 + P2 + ... + Pn )

where In is the information contained in the nth element of the set, whose probability of occurrence is Pn, and is defined as:

  In = log2(1/Pn) Shannons    (or, alternatively, In = log10(1/Pn) Hartleys, or In = loge(1/Pn) nats)

Since, by definition, the sum of the probabilities is unity, (P1 + P2 + ... + Pn = 1), then

  H  =  P1I1 + P2I2 + ... + PnIn   or   H  =  P1log(1/P1) + P2log(1/P2) + ... + Pnlog(1/Pn)

In the worst case, all events are equally probable. The entropy then reaches its maximum and becomes

     H = log2(n) 

as, for instance, in a binary channel where the values of 0 and 1 are equally probable, and

     H = log2(2) = 1 Shannon 

or alternatively, in a channel coded as 10 equally probable decimal digits:

     H = log2(10) = 3.32 Shannons = 1 Hartley 
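These figures are easy to verify numerically. The short Python sketch below (the helper name entropy_bits is mine, not from the text) computes H directly from the defining sum:

```python
import math

def entropy_bits(probs):
    # H = P1*log2(1/P1) + ... + Pn*log2(1/Pn), in Shannons (bits)
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Fair binary channel: H = log2(2) = 1 Shannon
print(entropy_bits([0.5, 0.5]))

# Ten equally probable decimal digits: H = log2(10), about 3.32 Shannons
print(entropy_bits([0.1] * 10))
```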

The goal, of course, is to minimize the entropy prior to applying the Huffman encoding algorithm. In this sense, the entropy of communication theory is much like the entropy of thermodynamics -- an increase in entropy corresponds to an increase in disorder, and the more disordered (or random) a data set is, the less structure there is for taking advantage of redundancy. If there is no redundancy to take advantage of, then compression gains nothing. Notice also that if a given signal distribution is unimodal and approximately Gaussian, then the entropy increases monotonically with the variance.
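For the Gaussian case this last remark can be made precise: the standard result for the differential entropy of a Gaussian with variance sigma^2 is

```latex
h = \tfrac{1}{2}\,\log_2\!\left(2\pi e\,\sigma^{2}\right)\ \text{Shannons}
```

which increases monotonically with the variance, so any preprocessing step that reduces the variance of the signal directly reduces its entropy.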

An interesting footnote to the publication of Shannon's paper in 1948 is the change in attitude it provoked regarding the role of noise in communication. Before 1948, noise was thought to set an inescapable limit on the accuracy of information transfer. After 1948, noise set an inescapable limit not on the accuracy of the information, but on the rate at which accurate information can be transmitted. In theory, one could obtain an efficiency of 99.9% at a -200 dB SNR, provided one used a long enough code.

Example of Minimum-Entropy Differencing


If the nth sample in a series of sampled data points is denoted by fn, first form an average-value estimate:

  fn-av = ( fn-1 + fn+1 ) / 2

then form the difference

  deltan = fn - fn-av = 0.5 ( -fn-1 + 2fn - fn+1 )

Transmit f1 and f2, followed by the values of deltan for succeeding points.
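As a sketch of the transmit side (the function name difference_encode is illustrative, not from the text), the differencing step can be written:

```python
def difference_encode(samples):
    # Send the first two samples verbatim, then for each interior point
    # deltan = fn - ( fn-1 + fn+1 ) / 2   (0-based indices here)
    deltas = [samples[n] - (samples[n - 1] + samples[n + 1]) / 2
              for n in range(1, len(samples) - 1)]
    return samples[0], samples[1], deltas

f0, f1, deltas = difference_encode([3, 7, 4, 9, 2])
print(f0, f1, deltas)
```

With integer samples the deltas are at worst half-integers, which is where the "one extra bit of precision" mentioned below comes from.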


The data can be reconstructed at the receiver using the same equations used to generate the differences:

  fn+1 = 2fn - 2 deltan - fn-1

where the values of fn-1 and fn are generated recursively.

If the arithmetic carries one more bit of precision than the sample word size, this process involves no truncation or round-off error, and the filter will not drift. The reconstruction process at the receiver end is, therefore, perfect, provided that the channel is quiet (has no noise).
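The receive side can be sketched the same way (the function name difference_decode is mine): given the first two samples and the delta stream, the recursion fn+1 = 2fn - 2 deltan - fn-1 reproduces the original data exactly.

```python
def difference_decode(f0, f1, deltas):
    out = [f0, f1]
    for d in deltas:
        # fn+1 = 2 fn - 2 deltan - fn-1
        out.append(2 * out[-1] - 2 * d - out[-2])
    return out

# Deltas computed by hand for the sample series 3, 7, 4, 9, 2:
print(difference_decode(3, 7, [3.5, -4.0, 6.0]))
```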

Information on how to encode delta-modulated data using Huffman codes can be found under Huffman Coding.
