tailcalled comments on Six (and a half) intuitions for KL divergence

tailcalled 10 Oct 2022 9:40 UTC
13 points
0
I also saw a good intuitive example of the asymmetry once. If you’ve got a bimodal distribution and a monomodal distribution that sits on one of the peaks of the bimodal distribution, then the KL-divergence D_KL(P‖Q) will be low when P is the monomodal distribution and Q is the bimodal distribution, while it will be high when P is the bimodal distribution and Q is the monomodal distribution.
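For anyone who wants to check this numerically, here is a minimal sketch. The specific choices (unit-variance Gaussians with peaks at -3 and 3, an equal-weight mixture, and a grid on [-10, 10]) are illustrative assumptions, not something taken from the post.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(p, q):
    """Discrete D_KL(p || q) in nats; terms with p[i] == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Assumed setup: discretise both densities on a grid over [-10, 10].
xs = [-10 + 0.01 * i for i in range(2001)]

# Bimodal: equal mixture of N(-3, 1) and N(3, 1).
bimodal = normalize(
    [0.5 * normal_pdf(x, -3, 1) + 0.5 * normal_pdf(x, 3, 1) for x in xs]
)
# Monomodal: N(3, 1), sitting on the right-hand peak of the bimodal one.
monomodal = normalize([normal_pdf(x, 3, 1) for x in xs])

kl_mono_bi = kl_divergence(monomodal, bimodal)  # P monomodal, Q bimodal
kl_bi_mono = kl_divergence(bimodal, monomodal)  # P bimodal, Q monomodal

print("D_KL(monomodal || bimodal) =", kl_mono_bi)
print("D_KL(bimodal || monomodal) =", kl_bi_mono)
```

With P monomodal, the result can't exceed log 2 ≈ 0.69 nats, because the bimodal Q puts at least half of P's density everywhere. With P bimodal it should come out more than ten times larger, since Q assigns almost no probability to the peak at -3.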
- CallumMcDougall 10 Oct 2022 9:51 UTC
  20 points
  1
  Oh yeah, I really like this one, thanks! The intuition here is again that a monomodal distribution is a bad model for a bimodal one because it misses out on an entire class of events, but the other way around is much less bad because there’s no large class of events that happen in reality but that your model fails to represent.
  For people reading here, this post discusses this idea in more detail. The image to have in mind is this one: