
Entropy in thermodynamics and information theory


Introduction

"Gain in entropy always means loss of information, and nothing more" (G. N. Lewis, 1930).

There are close links between the information-theoretic entropy of Shannon and Hartley, usually expressed as H, and the thermodynamic entropy of Clausius and Carnot, usually denoted by S, of a physical system — in particular between the Shannon entropy and the statistical interpretation of thermodynamic entropy, established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s.

A quantity defined by an entropy formula of this general kind was first introduced by Boltzmann in 1872, in the context of his H-theorem. Boltzmann's definition, based on the frequency distribution of a single particle in a gas of like particles, was subsequently reworked by Gibbs into a general formula for the statistical-mechanical entropy (or "mixedupness"), based on the probability distribution $p_i$ of a complete microstate $i$ of the total system:

$$S = -k_B \sum_i p_i \ln p_i .$$

The relation between Gibbs's statistical mechanical definition of entropy and Clausius's classical thermodynamical definition is explored further in the article: Thermodynamic entropy.

It is evident that

$$S = k_B H ,$$

where the Shannon entropy H is measured in nats, and the constant of proportionality $k_B$ is Boltzmann's constant. Boltzmann's constant appears here due to the conventional definition of the units of temperature; beyond that, it has no particular fundamental physical significance in the definition of statistical-mechanical entropy.
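
As a rough numerical illustration of this proportionality (a minimal sketch; the four-state distribution below is arbitrary and not from the article), one can compute the Shannon entropy of a microstate distribution in nats and rescale it by Boltzmann's constant:

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, in J/K

def shannon_entropy_nats(probs):
    """Shannon entropy H = -sum(p * ln p), measured in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Arbitrary illustrative distribution over four microstates
p = [0.5, 0.25, 0.125, 0.125]

H = shannon_entropy_nats(p)   # entropy in nats (~1.21)
S = K_B * H                   # corresponding thermodynamic entropy, in J/K
print(f"H = {H:.4f} nats, S = {S:.3e} J/K")
```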

In fact, in the view of Jaynes (1957), statistical thermodynamics should be seen as an application of Shannon's information theory: the thermodynamic entropy is interpreted as being an estimate of the amount of further Shannon information needed to define the detailed microscopic state of the system, that remains uncommunicated by a description solely in terms of the macroscopic variables of classical thermodynamics. (See article: MaxEnt thermodynamics).

Equilibrium statistical mechanics gives the prescription that the probability distribution which should be assigned to the unknown microstate of a thermodynamic system is the one which has maximum Shannon entropy, subject to the condition that it also satisfies the macroscopic description of the system. But this is just an application of a quite general rule in information theory, if one wishes to find a maximally uninformative distribution.

The thermodynamic entropy, measuring the phase-space spread of this equilibrium distribution, is just this maximum Shannon entropy, multiplied by Boltzmann's constant for historical reasons.
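
One standard illustration of this maximum-entropy prescription: if the only macroscopic datum is the mean energy, maximizing the Shannon entropy subject to normalization and that single constraint recovers the canonical (Boltzmann) distribution,

$$\max_{\{p_i\}} \Bigl( -\sum_i p_i \ln p_i \Bigr) \quad \text{subject to} \quad \sum_i p_i = 1, \qquad \sum_i p_i E_i = \langle E \rangle ,$$

whose stationarity conditions (one Lagrange multiplier per constraint) give

$$p_i = \frac{e^{-\beta E_i}}{Z}, \qquad Z = \sum_i e^{-\beta E_i} ,$$

with the multiplier $\beta$ fixed by the mean-energy constraint and identified in equilibrium with $1/(k_B T)$.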

A neat physical implication was established by Szilard in 1929, in a refinement of the famous Maxwell's demon thought-experiment. Consider Maxwell's set-up, but with only a single gas particle in a box. If the supernatural demon knows which half of the box the particle is in, it can close a shutter between the two halves of the box, slide a piston unopposed into the empty half, and then extract $k_B T \ln 2$ joules of useful work when the shutter is opened again and the particle isothermally expands back to its original equilibrium volume. In just the right circumstances, therefore, the possession of a single bit of Shannon information (a single bit of negentropy, in Brillouin's term) really does correspond to a reduction in physical entropy, which theoretically can indeed be parlayed into useful physical work.
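
The figure $k_B T \ln 2$ follows from the isothermal expansion of a one-particle ideal gas (for which $pV = k_B T$) from half the box back to the full volume:

$$W = \int_{V/2}^{V} p \, \mathrm{d}V' = \int_{V/2}^{V} \frac{k_B T}{V'} \, \mathrm{d}V' = k_B T \ln 2 .$$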

A corollary is that, in storing one bit of previously unstored information in a system, one potentially reduces the system's entropy by $k_B \ln 2$ J K⁻¹. This is only thermodynamically possible if the storage process releases at least $k_B T \ln 2$ joules of energy into the system's surroundings. Rolf Landauer (1961) showed that the counterpart of this process can occur as well: an array of ordered bits of memory can become "thermalized", or populated with random data, and in the process cool its surroundings. N bits would then increase the entropy of the system by $N k_B \ln 2$ J K⁻¹ as they thermalize.

Heat generation is one of the banes of computer hardware design, so Landauer's principle is interesting as a fundamental physical limit on computation: it is impossible to physically erase one bit of stored information without dissipating at least $k_B T \ln 2$ joules of heat into the surroundings. This fundamental limit was one of the original spurs to research into reversible computing, which in turn proved essential for research into quantum computers.

The relation between information entropy and thermodynamic entropy has become common currency in physics. Thus Stephen Hawking often speaks of the thermodynamic entropy of black holes in terms of their information content; and it is not surprising that computers must obey the same physical laws that steam engines do, even though they are radically different devices.

But it should also be remembered that Gibbs's statistical mechanical entropy is only one application of information theory to physical systems, relevant when the particular 'message' not yet communicated is the underlying microstate of the physical system.

Other physical 'messages' will have their own information entropies. For example, the information rate of a macroscopic physical system exhibiting stochastic or chaotic behavior can be equal to the information rate of an equivalent Markov process. This entropy is quite likely negligibly small, and practically irrelevant as a contribution to the overall thermodynamic entropy. But if this is the message of interest, then it is the thermodynamic entropy which is irrelevant, and this Shannon information which is everything.

Equivalence of form of defining equations

Discrete case

The defining equation for entropy in the theory of statistical mechanics established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s is of the form

$$S = -k_B \sum_i p_i \ln p_i ,$$

where $p_i$ is the probability of the microstate $i$ taken from an equilibrium ensemble; this reduces, for the special case of the microcanonical ensemble, to

$$S = k_B \ln W ,$$

where W is the number of microstates, given the fundamental postulate that all the microstates are equiprobable.
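
A quick numerical check (a minimal sketch; the value of W is arbitrary) confirms that the general formula reduces to the microcanonical form when all microstates are equally probable:

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, in J/K

W = 1000                   # number of equiprobable microstates (arbitrary choice)
p = [1.0 / W] * W          # uniform distribution over the W microstates

S_gibbs = -K_B * sum(p_i * math.log(p_i) for p_i in p)   # -k_B * sum(p ln p)
S_micro = K_B * math.log(W)                              # k_B * ln W

print(S_gibbs, S_micro)    # both ~9.54e-23 J/K; the two expressions agree
```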

The defining equation for entropy in the theory of information established by Claude E. Shannon in 1948 is of the form

$$H = -\sum_i p_i \log p_i ,$$

where $p_i$ is the probability of the message $m_i$ taken from the message space M. This also reduces to

$$H = \log |M| ,$$

where $|M|$ is the cardinality of the message space M, under the assumption that all the messages are equiprobable.

In the former case the natural logarithm was taken, and in the latter case the logarithm can also be taken to the natural base, as long as we measure information in nats. In this case we can write

$$S = k H ,$$

where k is Boltzmann's constant, to express the formal equivalence of these two discrete notions of entropy.

See Figure 2 of Frank's paper for a striking illustration of this concept (which he calls "physical information").

This is more than just a formal resemblance of defining equations, however. As Landauer explains, any physical representation of information, such as in data processing equipment, must be somehow embedded in the statistical mechanical degrees of freedom of a physical system. Some of those degrees of freedom are simply taken to represent meaningful information according to the relation just expressed. And as Frank so clearly illustrates, it is rather arbitrary which of those degrees of freedom we take to represent "known information." For example, the conversion of general thermodynamic entropy (if it is indeed possible) to "known information" would be a very good source of random numbers, which are quite useful for many computational purposes.
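
In practice, operating systems already behave somewhat like this: physical noise (timing jitter, device interrupts, and similar sources) is harvested into an entropy pool and exposed to programs as random bytes. A minimal sketch, assuming Python's standard os.urandom interface to that pool:

```python
import os

# Draw 16 bytes from the OS entropy pool, which is typically seeded
# from physical noise sources gathered by the operating system.
random_bytes = os.urandom(16)

# Interpret them as a 128-bit unsigned integer for computational use.
random_int = int.from_bytes(random_bytes, byteorder="big")
print(random_int)
```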

Continuous case

Boltzmann's H-function likewise formally resembles Shannon's entropy in the continuous case. Basically, the H-function can be expressed, up to sign convention, as the information-theoretic joint entropy of the continuous probability distributions of the coordinates and momenta of the particles under consideration.

But this connection remains murky, and is not nearly so clear and undeniable as in the discrete case. It is complicated by the fact that the joint entropy in the continuous case fails to be invariant under linear transformations (invariance which, by contrast, is an important property of the mutual information). This makes the definition of the H-function dependent on the choice of units of measure used for the coordinates and momenta.
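
The unit dependence can be seen directly from how the continuous (differential) entropy behaves under a rescaling of the variable: for a density $f$ and a change of units $y = a x$,

$$h(X) = -\int f(x) \ln f(x) \, \mathrm{d}x , \qquad h(aX) = h(X) + \ln |a| ,$$

so a different choice of units for the coordinates and momenta simply shifts the H-function by an additive constant.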

Hirschman showed in 1957, however, that Heisenberg's uncertainty principle can be expressed as a particular lower bound on the sum of the entropies of the observable probability distributions of a particle's position and momentum, when they are expressed in Planck units. (One could speak of the "joint entropy" of these distributions by considering them independent, but since they are not jointly observable, they cannot be considered as a joint distribution.)
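
In its modern, sharpened form (the tight constant was established only after Hirschman's paper), the bound states that for the position and momentum densities $|\psi(x)|^2$ and $|\tilde{\psi}(p)|^2$ of a normalized wavefunction

$$h(x) + h(p) \;\geq\; \ln(\pi e \hbar) ,$$

which becomes a pure number when position and momentum are expressed in Planck units ($\hbar = 1$).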

The von Neumann-Landauer bound

A theoretical application of this formal equivalence of thermodynamic entropy and information-theoretic entropy in the discrete case yields a lower bound on the amount of heat generated by an irreversible computational process, known as the von Neumann-Landauer bound.

Rolf Landauer argued in a 1961 paper that computational operations that are logically irreversible are also physically irreversible, in the sense that reversing them would violate the second law of thermodynamics. This result is known as Landauer's principle. In that paper he also quantified the minimum net increase in thermodynamic entropy that must take place for an operation in which one bit of information is lost: $\Delta S \geq k \ln 2$. For an environment at absolute temperature T, this corresponds to at least $k T \ln 2$ joules of heat that must be expelled by the irreversible bit operation. The factor $\ln 2$ comes from the fact that 1 bit = (ln 2) nat.
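
As a rough numerical illustration (the temperature and memory size below are arbitrary choices, not figures from Landauer's paper), the bound can be evaluated directly:

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, in J/K

def landauer_heat(bits, temperature_kelvin):
    """Minimum heat (in joules) dissipated when erasing `bits` bits at the given temperature."""
    return bits * K_B * temperature_kelvin * math.log(2)

print(landauer_heat(1, 300.0))     # one bit at 300 K: ~2.9e-21 J
print(landauer_heat(8e9, 300.0))   # one gigabyte (8e9 bits) at 300 K: ~2.3e-11 J
```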

This principle is important because it establishes physical limits to computation. There are ideas to implement schemes of reversible computing, but interestingly, Landauer argued in his paper that such schemes would be impractical because of a large increase in the amount of memory that would be required, and moreover the heat that would be generated by the irreversible step of initializing this memory would offset any heat savings realized by the implementation of reversibility. Nevertheless, much research has been devoted to the theory of reversible computing.

This has been applied to the paradox of Maxwell's demon, which would need to process information in order to reverse thermodynamic entropy; but erasing that information, so as to begin again, exactly balances out the thermodynamic gain that the demon would otherwise achieve.

Black holes

Stephen Hawking often speaks of the thermodynamic entropy of black holes in terms of their information content. Whether black holes actually destroy this information, and the bet that Hawking conceded on the question, are discussed in the article Black hole information paradox.

The Fluctuation Theorem

The fluctuation theorem provides a mathematical justification of the second law of thermodynamics under these principles, and precisely defines the limitations of the applicability of that law to the microscopic realm of individual particle movements.

Topics of recent research

Is information quantized?

In 1995, Tim Palmer pointed out two unwritten assumptions behind Shannon's definition of information that may make it inapplicable as such to quantum mechanics:

  • The supposition that there is such a thing as an observable state (for instance the upper face of a die or a coin) before the observation begins
  • The fact that knowing this state does not depend on the order in which observations are made (commutativity)

The article Conceptual inadequacy of the Shannon information in quantum measurement [1], published in 2001 by Anton Zeilinger [2] and Caslav Brukner, synthesized and developed these remarks. The so-called Zeilinger's principle suggests that the quantization observed in quantum mechanics could be bound up with information quantization (one cannot observe less than one bit, and what is not observed is by definition "random").

But these claims remain highly controversial. For a detailed discussion of the applicability of the Shannon information in quantum mechanics, and an argument that Zeilinger's principle cannot explain quantization, see Timpson [3] 2003 [4], and also Hall 2000 [5] and Mana 2004 [6].

For a tutorial on quantum information see [7].

See also

References

  • Leon Brillouin, Science and Information Theory, Mineola, N.Y.: Dover, [1956, 1962] 2004. ISBN 0486439186
  • Michael P. Frank, "Physical Limits of Computing", Computing in Science and Engineering, 4(3):16-25, May/June 2002.
  • Andreas Greven, Gerhard Keller, and Gerald Warnecke, editors. Entropy, Princeton University Press, 2003. ISBN 0691113386. (A highly technical collection of writings giving an overview of the concept of entropy as it appears in various disciplines.)
  • I. Hirschman, A Note on Entropy, American Journal of Mathematics, 1957.
  • R. Landauer, Information is Physical, Proc. Workshop on Physics and Computation PhysComp '92, IEEE Comp. Sci. Press, Los Alamitos, 1993, pp. 1-4.
  • R. Landauer, Irreversibility and Heat Generation in the Computing Process, IBM J. Res. Develop., Vol. 5, No. 3, 1961.
  • H. S. Leff and A. F. Rex, Editors, Maxwell's Demon: Entropy, Information, Computing, Princeton University Press, Princeton, NJ (1990). ISBN 069108727X
  • Claude E. Shannon. A Mathematical Theory of Communication. Bell System Technical Journal, July/October 1948.