How fundamental is information?

This web page is a critical essay that considers how well-defined and fundamental the concept of information is. The essay was inspired by the article "What Brings a World into Being?" by David Berlinski [Berlinski].

The thrust of Berlinski's article is to caution against an information-centric view of the material world. It has become popular to regard information as the fundamental essence of all things. As the article says,

The thesis that the human mind is nothing more than an information-processing device is now widely regarded as a fact. 'Viewed at the most abstract level,' the science writer George Johnson remarked recently in the New York Times, 'both brains and computers operate the same way by translating phenomena -- sounds, images, and so forth -- into a code that can be stored and manipulated.' More generally, the evolutionary biologist Richard Dawkins has argued that life is itself fundamentally a river of information, an idea that has in large part also motivated the successful effort to decipher the human genome. Information is even said to encompass the elementary particles. 'All the quarks and electrons in the cosmic wilds,' Johnson writes, 'are exchanging information each time they interact.'

Information in Shannon communication theory

It is instructive to review the origins of the notion of information: Shannon's mathematical theory of communication. Remarkably, the classical derivation of entropy as the measure of information (see below) has nothing to do with meaning. Shannon himself cautioned against confusing a signal with what it signifies:

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is, they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages [Shannon].

As Chris Hillman pointed out [Hillman]:

... "Information" appears in Shannon's theory only to the degree that the successful communication of information may lead to statistical correlations between the behavior of two systems. Any such correlation must presumably reflect some common causal influence upon these systems, but communication theory is emphatically not a theory of causal influence; nor is it a theory of knowledge or meaning. Rather, it is a purely probabilistic theory concerned with those statistical phenomena which are relevant to the following two fundamental problems, [of data storage and of transmission of messages over noisy communication channels.]
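The point that the measure of information is blind to meaning can be illustrated with a small sketch. It assumes the usual empirical entropy of symbol frequencies; the helper name entropy_bits, the sample sentence, and the relabelling table are made up for this illustration and do not come from Shannon's or Hillman's texts. Consistently replacing every symbol with another one makes the sentence unreadable, yet leaves the entropy unchanged.

```python
from collections import Counter
from math import log2

def entropy_bits(message):
    """Empirical Shannon entropy of the symbol frequencies, in bits per symbol."""
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * log2(c / total) for c in counts.values())

text = "the cat sat on the mat"

# Hypothetical relabelling: map each symbol to another one, one-to-one.
alphabet = sorted(set(text))
relabel = dict(zip(alphabet, reversed(alphabet)))
scrambled = "".join(relabel[ch] for ch in text)

h_text = entropy_bits(text)
h_scrambled = entropy_bits(scrambled)

print(repr(text), h_text)
print(repr(scrambled), h_scrambled)
```

The two printed entropy values agree, although only one of the two strings means anything to an English reader.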

Semantic information is ill-defined and subjective

Reading a book requires an understanding of how words weave into sentences. The sentences, furthermore, must relate to the reader's prior experience to create a narrative and a meaning. Language understanding and prior experience are prerequisites of reading comprehension. Yet they are not a part of the book. In fact, the understanding of a language is not explained in any book, as it is largely unknown. According to Chomsky and recent experiments, language understanding is an innate ability, which must be tuned in the first few years of a person's life.

The same conclusion -- the meaning of a book depends on its reader -- holds for another book, DNA. DNA is read by the cell's machinery, which is far more complex than the DNA itself. Without a competent reader, DNA means nothing. A magnetic tape with the code for a tape driver is meaningless without a driver to read it. It is often said that DNA encodes all that is in a cell. That is not correct [Feitelson][Bethell]. It is the cell that encodes in its own machinery what is in the cell. Combining the parents' DNA does not make an embryo by itself, in a flash of green light. The combined DNA must be a part of an egg cell, whose division and specialization make an embryo. The first cell that emerged, was created, or was brought into a primordial soup some four billion years ago is still alive -- in each of us. A Jurassic-Park-like experiment to "resurrect" an extinct species can only succeed when there is a close living relative whose eggs are able to interpret the DNA of the extinct species. If a Mars explorer finds something that looks like DNA but is not DNA, and the explorer finds no live Martian cell, the chance of resurrecting a Martian is nil. Without a reader, a book has no meaning.

It has been determined that the human genome is only twice as complex as that of a fruit fly, and about as complex as the genome of a common weed. Are humans themselves only twice as complex as a fruit fly? The size, in CPU instructions, of a driver for a modern network interface card can be comparable to the size of a tape driver of an old 8-bit computer. Does this mean that a modern computer is merely comparable to an old C-64?

Can "information" act by itself, without a material agent? The end of Berlinski's article discusses this question. We have seen that a message in isolation from a messenger and a receiver means nothing. If understanding a message always requires a material agent, then information does not have an independent existence, let alone an independent operation.

Logic encoded in a medium (DNA, a book, a sequence of amino acids) cannot be understood given only that same logic -- extra logic and an agent are required. Information is not everything, because it is subjective.

A simple derivation of Shannon entropy

Given below is an elementary derivation of Shannon entropy, the measure of information. This derivation is different from the one given in Theorem 2 and Appendix 2 of Shannon's paper [Shannon]. The present derivation is somewhat close to the one hinted at in Theorem 4. I did it one evening while doing laundry.

Consider a source and a sink connected by a communication channel. Two events, A and B, happen at the source at a particular rate Rs and with probabilities p and q (=1-p), respectively. A communication channel transmits a symbol X or a symbol Y at a rate Rc. We want to modulate the stream of X and Y in such a way as to let the sink know precisely what is happening at the source.

Consider a sufficiently large time interval T. During this time, the channel will carry N symbols from the source to the sink (N=T*Rc). At the source, M events will happen (M=T*Rs): on average, pM of these events will be of type A and the rest, qM, will be of type B. During each time interval T, a particular combination of pM events A and qM events B happens at the source, and a particular sequence of X and Y symbols of length N is transmitted to the sink. There are at most 2^N distinct N-element sequences of two symbols. There are choose(pM out of M) distinct combinations of pM events A and qM events B:

choose(pM out of M) = M!/(pM)!/(qM)! = (pM+qM)!/(pM)!/(qM)! ~= { using the approximation n! ~= (n/e)^n, which is good enough for big n because only the logarithm of the count matters below } ((pM+qM)/pM)^(pM) * ((pM+qM)/qM)^(qM) = (1/p)^(pM) * (1/q)^(qM).
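The quality of this approximation can be checked numerically. The sketch below is only an illustration: the function names and the choice p = 0.25 are assumptions of the example, not part of the derivation. It compares the exact count, taken as log2 and divided by M, with -p*log2(p) - q*log2(q), which is log2 of the approximate count (1/p)^(pM) * (1/q)^(qM) divided by M.

```python
from math import comb, log2

def binary_entropy(p):
    """-p*log2(p) - q*log2(q) with q = 1 - p, in bits per event."""
    q = 1.0 - p
    return -p * log2(p) - q * log2(q)

def log2_count_per_event(M, p):
    """log2 of the exact choose(pM out of M), divided by M."""
    k = round(p * M)  # number of events of type A
    return log2(comb(M, k)) / M

p = 0.25
for M in (8, 80, 800, 8000):
    exact = log2_count_per_event(M, p)
    approx = binary_entropy(p)
    print(M, exact, approx, approx - exact)
```

The gap in the last column shrinks as M grows, which is all the derivation needs: the argument applies to a sufficiently large interval T.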

If we want to accurately convey to the sink what happens at the source during the interval T, we need to send distinct sequences of X and Y symbols for distinct sequences of events A and B. Hence it must hold

(1/p)^(pM) * (1/q)^(qM) <= 2^N

Thus, to guarantee accurate transmission, the communication channel must at the very least have the capacity

Rc/Rs = N/M >= -p*log2(p) - q*log2(q)

which is the channel capacity required for the most efficient coding, given in Theorem 9 of Shannon's paper.
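As a worked example of the capacity bound, the following sketch computes the smallest channel rate Rc for a given event rate Rs and probability p. The function name min_channel_rate and the sample rates are hypothetical, chosen only for this illustration.

```python
from math import log2

def min_channel_rate(Rs, p):
    """Smallest rate Rc of a two-symbol channel with Rc/Rs >= -p*log2(p) - q*log2(q)."""
    if p in (0.0, 1.0):
        return 0.0  # a source that always emits the same event needs no channel
    q = 1.0 - p
    return Rs * (-p * log2(p) - q * log2(q))

# 1000 events per second, for several probabilities of event A
for p in (0.5, 0.25, 0.1, 0.01):
    print(p, min_channel_rate(1000, p))
```

With equally likely events the channel has to carry one symbol per event. The more lopsided the probabilities, the fewer symbols per event suffice.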

Now we can start drawing over-arching conclusions about information, the measure of information and uncertainty, etc. But we should remember that at the heart of Shannon's derivation of entropy is a trivial counting argument, the pigeonhole principle in disguise: to code events without the loss of "information" we need to have at least as many distinct codes as there are distinct event sequences.
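The counting argument is small enough to be carried out by brute force. The sketch below assumes M = 8 events, of which 2 are of type A; all names in it are made up for the illustration. It enumerates the event sequences, finds the shortest codeword length N that gives every sequence a codeword of its own, and builds one such code.

```python
from itertools import combinations
from math import comb

def event_sequences(M, k):
    """All sequences of M events with exactly k events A and M - k events B."""
    for positions in combinations(range(M), k):
        yield "".join("A" if i in positions else "B" for i in range(M))

def min_code_length(count):
    """Smallest N such that 2**N >= count."""
    N = 0
    while 2 ** N < count:
        N += 1
    return N

M, k = 8, 2
sequences = list(event_sequences(M, k))
N = min_code_length(len(sequences))

# One possible code: the index of a sequence, written with N symbols X and Y.
code = {s: format(i, f"0{N}b").replace("0", "X").replace("1", "Y")
        for i, s in enumerate(sequences)}

print(len(sequences), "event sequences need codewords of length", N)
print(sequences[0], "->", code[sequences[0]])
```

With one symbol fewer there are only 2^(N-1) codewords, fewer than the number of event sequences, so at least two sequences would have to share a codeword.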

References

Commentary

Feitelson and Treinin's article shows that DNA is a rather incomplete code for life. DNA does not even completely specify a protein. Special proteins, chaperones, are needed to help fold a newly synthesized protein into the correct form. Furthermore, DNA has "multiple readings". A particular transcription is selected based on the mix of the proteins in the cytoplasm -- the current state of a cell. "Thus, DNA is only meaningful in a cellular context in which it can express itself and in which there is an iterative, cyclic relationship between the DNA and the context."

Lastly, not all inheritance is accomplished via DNA. Prions, infamous for mad cow disease, are one example of DNA-less inheritance. Inheritance of major cell structures such as membranes and mitochondria occurs without direct DNA participation. Membranes, for example, include lipids, which are not coded in DNA. Lipids are absorbed from food intake. Even when the building material is protein, protein molecules do not automatically come together to make a cell organelle. Protein molecules are assimilated into the correct locations of an existing structure, e.g., a mitochondrion, which thus grows and eventually divides. This fact implies that the organelles must exist to begin with.

Feitelson and Treinin's article concludes,

A major difficulty with the notion that the DNA contains all the required information is that this information seems useless without the surrounding cellular machinery. While the DNA contains basic instructions on how to prepare many components of the machinery -- namely, proteins -- it is unlikely to contain full instructions on how to assemble them into supermolecular structures to create a functional cell. Thus, we are left with the chicken-and-egg problem: How did this process start? This question is actually quite complicated because the feedback cycle is very tight. Proteins replicate, maintain, and protect the DNA. We can't create proteins from DNA if we don't already have proteins to assist in the transcription process as polymerases and to assist the new proteins in folding correctly into their intended 3D configurations as chaperones.... Much work remains in biology before we will understand these and related issues.

[O]ur understanding of the human genome has changed in the most fundamental ways. The small number of genes -- some 30,000 -- supports the notion that we are not hard wired. We now know the notion that one gene leads to one protein, and perhaps one disease, is false.

One gene leads to many different protein products that can change dramatically once they are produced. We know that some of the regions that are not genes may be some of the keys to the complexity that we see in ourselves. We now know that the environment acting on our biological steps may be as important in making us what we are as our genetic code.

Last updated August 1, 2004

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
