Here’s an old puzzle:
Alice: How can we formalize the idea of “surprise”?
Bob: I think surprise is seeing an event of low probability.
Alice: This morning I saw a car whose license plate said 3817, and that didn’t surprise me at all!
Bob: Huh.
For everyone still wondering about that, here’s the correct answer! The numerical measure of surprise is information gain (Kullback-Leibler divergence) from your prior to your posterior over models after updating on the data. That gives the intuitive answer to the above puzzle, as long as none of your models assigned high probability to 3817 in advance. It also works for the opposite case: if you expected an ordered string but got a random one, or one ordered in a different way.
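A minimal sketch of this measure, with made-up numbers. The two “models” of plate generation and their likelihoods are hypothetical, chosen only to illustrate the point: 3817 is explained about equally well by every model, so the posterior barely moves and the information gain is near zero, whereas an observation one model specially predicts produces a real update.

```python
import math

def kl(p, q):
    """KL(p || q) in bits, over the same finite support."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def surprise(prior, likelihoods):
    """Information gain from prior to posterior over models after one observation.

    prior[i] = P(model_i); likelihoods[i] = P(data | model_i)."""
    evidence = sum(p * l for p, l in zip(prior, likelihoods))
    posterior = [p * l / evidence for p, l in zip(prior, likelihoods)]
    return kl(posterior, prior)

# Hypothetical models of how 4-digit plates are generated:
#   M0: uniformly random digits            -> P(any plate) = 1/10000
#   M1: "someone is feeding me patterns"   -> P(1111) = 0.1, P(3817) tiny
prior = [0.999, 0.001]

print(surprise(prior, [1e-4, 1e-8]))  # seeing 3817: near zero bits, no surprise
print(surprise(prior, [1e-4, 0.1]))   # seeing 1111: several bits, real surprise
```

Note that the raw probability of the data is about the same in both calls; what differs is how much the observation discriminates between models.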
This is actually well known, I just wanted to put it on LW.
Just to make sure I understand prior and posterior over models, is the following about right?
Alice starts with a prior of 0.999 that non-vanity plates are generated basically randomly (according to some rule of “N letters followed by M digits” or whatever, and with rules e.g. preventing swear words).
Alice sees “3817” (having seen many other 4-digit plates previously).
Alice’s posterior probability over models is still about 0.999 on the same model.
Yeah.
Wait. If you’re talking about surprise because you said “update your model based on how surprised you are”, you can’t then turn around and define surprise as “how much you should update your model”. “Update your model based on how much you should update your model” isn’t very helpful.
The intuitive sense of what surprise is corresponds well to the rules for updating your probability distribution over models, which we can therefore take as a formal definition of surprise.
Hmm, I thought about it some more and maybe it’s not that simple. If we formalize surprise like that, it’s easy to come up with situations where you expect to be very “surprised” no matter what data you see. That doesn’t seem right. Does anyone have better ideas?
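To make the worry concrete, here is a toy construction (all numbers invented for illustration): the expected information gain, averaged over the prior predictive, is the mutual information between model and data, and it can be close to its maximum no matter what you observe. With two equally likely models making opposite near-certain predictions, every possible outcome produces nearly a full bit of “surprise”.

```python
import math

def kl(p, q):
    """KL(p || q) in bits, over the same finite support."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def expected_surprise(prior, likelihood_rows):
    """E[KL(posterior || prior)] under the prior predictive distribution.

    likelihood_rows[i][x] = P(data = x | model_i)."""
    total = 0.0
    for x in range(len(likelihood_rows[0])):
        evidence = sum(prior[i] * likelihood_rows[i][x] for i in range(len(prior)))
        if evidence == 0:
            continue
        posterior = [prior[i] * likelihood_rows[i][x] / evidence
                     for i in range(len(prior))]
        total += evidence * kl(posterior, prior)
    return total

# Two models making opposite near-certain predictions about a binary outcome:
# whichever outcome occurs, one model is confirmed and the other nearly ruled out.
prior = [0.5, 0.5]
rows = [[0.99, 0.01],   # model A: outcome 0 almost surely
        [0.01, 0.99]]   # model B: outcome 1 almost surely
print(expected_surprise(prior, rows))  # close to 1 bit, whatever you see
```

So under this definition you can be certain in advance that you will be “surprised”, which is the tension the comment points at.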
How is a Frequentist surprised?
I don’t have the background to answer that. Can you?
Presumably, frequentist folks talk about how “surprised” an element of a statistical model is relative to observed data (maximum likelihood as minimizing surprise in the KL sense). That’s about all I can think of.
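The parenthetical claim can be checked numerically: for i.i.d. data, maximizing the likelihood over a model family is equivalent to minimizing KL(empirical distribution || model). A toy Bernoulli check with a made-up sample, using a grid search purely for illustration:

```python
import math

# Hypothetical sample: 7 ones and 3 zeros.
data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
emp = [data.count(0) / len(data), data.count(1) / len(data)]  # empirical distribution

def kl(p, q):
    """KL(p || q) in nats, over the same finite support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def log_likelihood(theta):
    """Log-likelihood of the sample under Bernoulli(theta)."""
    return sum(math.log(theta if x == 1 else 1 - theta) for x in data)

grid = [i / 100 for i in range(1, 100)]
best_by_ml = max(grid, key=log_likelihood)
best_by_kl = min(grid, key=lambda t: kl(emp, [1 - t, t]))
print(best_by_ml, best_by_kl)  # both land on the empirical frequency, 0.7
```

Both criteria pick out the empirical frequency, which is the sense in which maximum likelihood “minimizes surprise” against the observed data.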