Shannon mutual information doesn’t really capture my intuitions either. Take a random number X, and a cryptographically strong hash function. Calculate hash(X) and hash(X+1).
Now these variables share lots of mutual information. But if I just delete X, there is no way an agent with limited compute can find or exploit the link. I think mutual information gives false positives, where Pearson correlation gave false negatives.
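To make this concrete, here's a minimal sketch (my own illustration, not from the original discussion) using SHA-256. Since hash(X) and hash(X+1) are both deterministic functions of X, their mutual information equals H(X) in principle; but empirically the two digests look statistically independent, with roughly half of their bits differing, just as for two unrelated random strings:

```python
import hashlib
import random

def bit_hamming(a: bytes, b: bytes) -> int:
    # Count the number of differing bits between two equal-length byte strings.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

random.seed(0)
n = 200
total = 0
for _ in range(n):
    x = random.getrandbits(128)
    h1 = hashlib.sha256(str(x).encode()).digest()
    h2 = hashlib.sha256(str(x + 1).encode()).digest()
    total += bit_hamming(h1, h2)

avg = total / n
# In principle I(hash(X); hash(X+1)) = H(X), since both are functions of X.
# Yet with X deleted, the digests look independent: about 128 of 256 bits differ.
print(avg)
```

No bounded agent looking only at the two digests can detect the link, even though an unbounded one (which could invert the hash) trivially can.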
So: Pearson correlation ⇒ actually exploitable info ⇒ Shannon mutual info.
So one potential lesson is to keep track of which direction your formalisms deviate from reality. Are they intended to have no false positives, or no false negatives? Some mathematical approximations, like polynomial time = runnable in practice, fail in both directions but are still useful as long as they aren't Goodharted too hard.
This is particularly relevant to the secret messages example, since we do in fact use computational-difficulty-based tricks for sending secret messages these days.
Actually, mutual information has a well-defined operational meaning. For example, the maximum rate at which we can reliably transmit a signal through a noisy channel is given by the mutual information between the input and the output of the channel (maximized over input distributions). So it depends on which task you are interested in.
A “channel” that hashes the input has perfect mutual info, but is still fairly useless for transmitting messages. The point about mutual info is that it's the maximum, given unlimited compute. It serves as an upper bound that isn't always achievable in practice. If you restrict to channels that just add noise, then yes, mutual info is the right quantity.
Yes, it is the relevant quantity in the limit of an infinite number of uses of the channel. If you can use it just once, it does not tell you much.