
The conversion factor from physical entropy to information entropy (in random bits) uses Landauer's limit: (physical entropy) = (information bits)*kb*ln(2). The number of yes/no questions that must be asked to determine which state a physical system is in equals Shannon's entropy in bits; not Shannon's intensive, specific entropy H, but his extensive, total entropy of a data-generating source: S = N*H, where H = 1 if the N bits are mutually independent.
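As a quick numeric sketch of that conversion factor (my own illustration, not part of the argument above), here is the round trip between a physical entropy in J/K and a count of random bits:

```python
import math

kB = 1.380649e-23  # Boltzmann constant in J/K (2019 SI exact value)

def physical_entropy_to_bits(S_physical):
    """Physical entropy in J/K -> information entropy in bits, via kb*ln(2)."""
    return S_physical / (kB * math.log(2))

def bits_to_physical_entropy(n_bits):
    """Information entropy in bits -> physical entropy in J/K."""
    return n_bits * kB * math.log(2)

# One bit corresponds to about 9.57e-24 J/K of physical entropy,
# and the round trip recovers the bit count.
S_one_bit = bits_to_physical_entropy(1)
assert abs(physical_entropy_to_bits(S_one_bit) - 1) < 1e-12
```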

Landauer's limit states that 1 bit of information irreversibly changing state releases entropy kb*ln(2), which is a heat energy release for a given T: Q = T*kb*ln(2), implying there was a stored potential energy that was the bit. This shows that entropy is information entropy: the ln(2) converts from ln() to log2(). kb is a simple conversion factor from average kinetic energy per particle (the definition of temperature) to heat in joules, which has units of joules/joules, i.e. it is unitless. If our T were defined in terms of joules of kinetic energy (average 1/2 mv^2 of the particles) instead of kelvins, then kb = 1. So kb is unitless joules/joules. It's not a fundamental constant like h. c also does not have fundamental units if you accept time = i*distance, as Einstein mentioned in appendix 2 of his book, allowing use of the simpler Euclidean space instead of Minkowski space without error or qualification, and in keeping with Occam's razor.

Shannon's "entropy" (specific, intensive) is H = sum(-p*log(p)), and he stated 13 times in his paper that H has units of bits, entropy, or information PER SYMBOL, not bits (total entropy) as most people assume. An information source generates entropy S = N*H, where N is the number of symbols emitted. H is a "specific entropy" based on the probabilities of the "n" unique symbols among the N total symbols. H is not a "total entropy" as is usually believed; its physical parallel is So = entropy/mole. Physical S = N*So and information S = N*H. It is rare to find texts that explain this.
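To make the intensive/extensive distinction concrete, a minimal sketch (my own toy example): H is computed from symbol frequencies and is per symbol; the total entropy of the emitted message is N*H.

```python
import math
from collections import Counter

def shannon_H(symbols):
    """Specific (per-symbol) entropy H = sum(-p*log2(p)), in bits/symbol."""
    N = len(symbols)
    return -sum((c / N) * math.log2(c / N) for c in Counter(symbols).values())

msg = "AABB"            # two equiprobable symbols
H = shannon_H(msg)      # intensive: 1 bit per symbol
S_total = len(msg) * H  # extensive: S = N*H = 4 bits for the whole message
```

Note that doubling the message to "AABBAABB" leaves H unchanged at 1 bit/symbol but doubles the total entropy S, just as doubling the moles of a bulk substance doubles S while So stays fixed.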

An ideal monoatomic gas (Sackur-Tetrode equation) has an entropy from N mutually independent gas particles of S = kb*sum(ln(total states/i^(5/2))), where the sum is over i = 1 to N. This is approximated by Stirling's formula to be S = kb*N*[ln(states/particle)+5/2]. I can't derive that from Shannon's total entropy S = N*H, even though I showed in the first paragraph that the final entropies are exactly the same. I am unable to identify an informatic "symbol" in a physical system. The primary problem seems to be that physical entropy is constrained by total energy, which gives it more possible ways to use the N particles. One particle carrying the total energy is a possible macrostate (not counting the minimal QM state for the others), but information entropy does not have a "checksum" like this to use fewer symbols. Physical entropy seems to always(?) be S = kb*N*[ln(states/particle)+c], and the difference from information entropy is the c. But in bulk matter where the energy is spread equally between bulks, physical entropy is S = N*So. Information entropy is perfectly like this (S = N*H), but I can't derive So from H. Again, S in bits = S/(kb*ln(2)).
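The Stirling step above can be checked numerically. A sketch under an assumed, purely illustrative "total states" count A (any large value works; only the relative error of the approximation matters):

```python
import math

N = 100_000    # number of particles
A = 1e30       # hypothetical "total states" count, for illustration only

# Exact sum: S/kb = sum_{i=1}^{N} ln(A / i^(5/2))
exact = sum(math.log(A) - 2.5 * math.log(i) for i in range(1, N + 1))

# Stirling form: S/kb = N * [ln(A / N^(5/2)) + 5/2]
approx = N * (math.log(A) - 2.5 * math.log(N) + 2.5)

rel_err = abs(exact - approx) / abs(exact)
# relative error shrinks as N grows; tiny already at N = 100,000
```

The leftover discrepancy is the usual O(ln N) correction to Stirling's formula, which vanishes relative to the extensive N-proportional terms.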

So Shannon's entropy is a lot simpler and comes out to LESS entropy if you try to make N particles in a physical system equivalent to N unique symbols. The simplest physical entropy, that of independent harmonic oscillators in 1D sharing a total energy but not necessarily evenly, is S = kb*ln[(states/oscillator)^N / N!], which is S = kb*N*[ln(states/particle)+1] for large N. So even in the simplest case, the c remains. Shannon entropy is of a fundamentally different form: S ~ log((states/symbol)^N) = N*log(states/symbol) when each symbol is mutually independent (no patterns in the data and equal symbol probabilities). For example, for random binary data S = log2(2^N) = N bits. So it is hard to see the precise connection even in the simplest case (the +1 is not a minor difference), even as they are immediately shown by true/false questions to be identical quantities with a simple conversion factor. Stirling's approximation is exact in the limit of large N, and Shannon's H depends in a way on an infinite N to get exact p's, so the approximation is not a problem to me.
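The oscillator case can be checked the same way. A sketch with an assumed, illustrative states-per-oscillator count x: the exact ln[x^N/N!] matches the N*[ln(x/N)+1] form, including the +1 that has no Shannon counterpart:

```python
import math

N = 100_000   # number of oscillators
x = 1e9       # hypothetical states per oscillator, for illustration only

# Exact: S/kb = ln(x^N / N!), using lgamma(N+1) = ln(N!)
exact = N * math.log(x) - math.lgamma(N + 1)

# Stirling form with the extra +1: S/kb = N * [ln(x/N) + 1]
approx = N * (math.log(x / N) + 1)

rel_err = abs(exact - approx) / abs(exact)
```

Dropping the +1 term instead (the pure Shannon-like form N*ln(x/N)) misses the exact value by a fraction of roughly 1/[ln(x/N)+1], which is nowhere near negligible; that is the sense in which the +1 is not a minor difference.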

I have not contradicted anything user346 has said, but I wanted to show why the connection is not trivial except in the case of looking at the specific entropy of bulk matter. QM uses S = sum(-p*log(p)), but Shannon entropy is S = N*sum(-p*log(p)). They come out the same because calculating the p's is different. The physical p = (microstates in a certain macrostate)/(total microstates), but the numerator and denominator are not simply determined by counting. The informational p = (count of a distinct symbol)/(total symbols) for a given source. And yet, they both require the same number of bits (yes/no questions) to identify the exact microstate (after applying the kb*ln(2) conversion).

But there's a problem which was mentioned in the comments to his answer. In an information system we require the bits to be reliable. We can never get 100% reliability because of thermal fluctuations. At this limit of 1 bit = kb*ln(2) we have a 49.9999% probability of any particular bit not being in the state we expected. The Landauer limit is definitely a limit. The energy required to break a bond that is holding one of these bits in a potential memory system is "just below" (actually equal to) the average kinetic energy of the thermal agitations. Landauer's limit assumes the energy required to break our memory-bond is E = T*kb*ln(2), which is slightly weaker than a van der Waals bond, which is about the weakest thing you can call a "bond" in the presence of thermal agitations.

So we have to decide what level of reliability we want for our bits. Using the black hole limit also seems to add a problem of "accessibility": it is the information content of the system, but it is not an information storage system.