Information theory

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 195.149.37.177 (talk) at 07:02, 28 April 2002 (*fix entropy link). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Information theory is a branch of probability theory and mathematical statistics that deals with communication systems, data transmission, cryptography, signal-to-noise ratios, data compression, and related topics.

(This is quite different from Library and Information Science.)

Claude E. Shannon (1916–2001) has been called "the father of information theory". His theory "considered the transmission of information as a statistical phenomenon" and gave communications engineers a way to determine the capacity of a communication channel in terms of the common currency of bits. The transmission part of the theory is not "concerned with the content of information or the message itself," though the complementary wing of information theory concerns itself with content through lossy compression of messages subject to a fidelity criterion. These two wings of information theory are joined together and mutually justified by the information transmission theorems, or source-channel separation theorems, which justify the use of bits as the universal currency for information in many contexts.

It is generally accepted that the modern discipline of information theory began with the publication by Claude E. Shannon of his article "A Mathematical Theory of Communication" in the Bell System Technical Journal in July and October of 1948. In the process of working out a theory of communications that could be applied by electrical engineers to design better telecommunications systems, Shannon defined a measure of entropy, $H = -\sum_i p_i \log p_i$, where $p_i$ is the probability of the i-th symbol. When applied to an information source, this measure could determine the capacity of the channel required to transmit the source as encoded binary digits. Shannon's measure of entropy came to be taken as a measure of the information contained in a message, as opposed to the portion of the message that is strictly determined (hence predictable) by inherent structures, such as redundancy in the structure of languages or the statistical properties of a language relating to the frequencies of occurrence of different letter or word pairs, triplets, etc. See Markov chains.
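
As a minimal sketch of the entropy measure above, the following Python computes H in bits (logarithm base 2) for a given probability distribution and for the empirical symbol frequencies of a message. The function names `entropy` and `empirical_entropy` are illustrative choices, not part of any library, and treating each symbol as independent ignores the pair and triplet statistics mentioned above.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy H = -sum(p * log2 p), in bits.

    Terms with p == 0 are skipped, following the convention 0 * log 0 = 0.
    """
    return sum(-p * math.log2(p) for p in probs if p > 0)


def empirical_entropy(message):
    """Entropy of the symbol frequencies observed in `message` (assumed non-empty)."""
    counts = Counter(message)
    total = len(message)
    return entropy(c / total for c in counts.values())


fair_coin = entropy([0.5, 0.5])      # two equally likely outcomes
biased_coin = entropy([0.9, 0.1])    # a more predictable source has lower entropy
```

A fair coin yields exactly one bit per toss, while the biased coin yields less, which is the sense in which predictable structure reduces the information carried by a message.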

Entropy as defined by Shannon is closely related to entropy as defined by physicists. Boltzmann and Gibbs did considerable work on statistical thermodynamics, and this work was the inspiration for adopting the term entropy in information theory. There are deep relationships between entropy in the thermodynamic and informational senses. For instance, Maxwell's Demon needs information to reverse thermodynamic entropy, and the cost of getting that information exactly balances out the thermodynamic gain that the demon would otherwise achieve.

Other useful measures of information include mutual information, which is a measure of the statistical dependence between two sets of events. Mutual information is defined for two random variables X and Y as

$M(X, Y) = H(X) + H(Y) - H(X, Y)$

where H(X,Y) is the joint entropy,

$H(X, Y) = -\sum_{x, y} p(x, y) \log p(x, y)$
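
The definitions above can be sketched in Python as follows. This is an illustrative example under stated assumptions: it uses the sign convention under which mutual information is non-negative, H(X) + H(Y) - H(X,Y), measures entropy in bits, and takes the joint distribution as a nested list `joint[i][j] = p(x_i, y_j)`. The function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """M(X, Y) = H(X) + H(Y) - H(X, Y) for a joint probability table.

    `joint` is a list of rows whose entries sum to 1.
    """
    px = [sum(row) for row in joint]            # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]      # marginal distribution of Y
    pxy = [p for row in joint for p in row]     # flattened joint distribution
    return entropy(px) + entropy(py) - entropy(pxy)


# Two independent fair bits share no information.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Two identical fair bits share one full bit.
identical = [[0.5, 0.0], [0.0, 0.5]]
```

For the independent table the result is zero, and for the identical table it equals the entropy of either variable, which are the two extremes of the measure.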

Mutual information is closely related to the log-likelihood ratio test for multinomials and to Pearson's χ² test.
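
One way to see this relationship, sketched here for a contingency table of observed counts: the log-likelihood ratio statistic (often written G) equals 2N times the mutual information of the empirical distribution, when the mutual information is measured in nats (natural logarithm) and N is the total count. The function names and the example table below are assumptions made for illustration.

```python
import math


def _totals(counts):
    """Return (grand total, row totals, column totals) of a count table."""
    rows = [sum(row) for row in counts]
    cols = [sum(col) for col in zip(*counts)]
    return sum(rows), rows, cols


def mutual_information_nats(counts):
    """Mutual information of the empirical joint distribution, in nats."""
    n, rows, cols = _totals(counts)
    return sum(
        (c / n) * math.log(c * n / (rows[i] * cols[j]))
        for i, row in enumerate(counts)
        for j, c in enumerate(row)
        if c > 0
    )


def g_statistic(counts):
    """Log-likelihood ratio statistic G = 2 * sum(O * ln(O / E))."""
    n, rows, cols = _totals(counts)
    return sum(
        2 * c * math.log(c / (rows[i] * cols[j] / n))
        for i, row in enumerate(counts)
        for j, c in enumerate(row)
        if c > 0
    )


def pearson_chi2(counts):
    """Pearson's statistic: sum((O - E)^2 / E) over all cells."""
    n, rows, cols = _totals(counts)
    return sum(
        (c - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i, row in enumerate(counts)
        for j, c in enumerate(row)
    )


table = [[30, 10], [10, 30]]   # hypothetical observed counts
n_total = 80
```

For this table, `g_statistic(table)` agrees with `2 * n_total * mutual_information_nats(table)` up to rounding, and Pearson's statistic is close to it in value, since the two test statistics agree to second order when observed counts are near their expected values.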

The field of Information Science has since expanded to cover the full range of techniques and abstract descriptions for the storage, retrieval and transmittal of information. It has little to do with the organization of information, except in the narrow sense of how databases are designed to organize information into data records.


Claude E. Shannon's original paper is available at http://galaxy.ucsd.edu/new/external/shannon.pdf