[Question] Where can one learn deep intuitions about information theory?

I’m currently going through Brilliant’s course on “Knowledge and Uncertainty”. I just got through the part where it explains what Shannon entropy is. I’m now watching a wave of realizations cascade in my mind. For instance, I now strongly suspect that the “deep law” I’ve been intuiting for years that makes evolution, economics, and thermodynamics somehow instances of the same thing is actually an application of information theory.

(I’m honestly kind of amazed I was able to follow as much of rationalist thought and Eliezer’s writings as I did without any clue what the formal definition of information was. It looks to me like information theory is more central than Bayes’ Theorem, and that it provides essential context for why and how that theorem is relevant to rationality.)

I’m ravenous to grok more. Sadly, though, I’m bumping into a familiar wall I’ve seen in basically all other technical subjects: There’s something of a desert of obvious resources between “Here’s an article offering a quick introduction to the general idea using some fuzzy metaphors” and “Here’s a textbook that gives the formal definitions and proofs.”

For instance, the book “Thinking Physics” by Lewis Carroll Epstein massively helps to fill this gap for classical physics, especially classical mechanics. By way of contrast, most introductory physics textbooks are awful at this. (“Here we derive the kinematic equation for an object’s movement under uniform acceleration. Now calculate how far this object goes when thrown at this angle at this velocity.” Why? Is this really a pathway optimized for helping me grok how the physical world works? No? So why are you asking me to do this? Oh, because it’s easy to measure whether students get those answers right? Thank you, Goodhart.)

Another excellent non-example is the Wikipedia article on how entropy in thermodynamics is a special case of Shannon entropy. Its brevity makes it fine as a quick overview, but it’s too short to really develop intuitions, and it leans too heavily on formalism instead of lived experience.

(For instance, it references shannons (= bits of information), but it gives no hint that what a shannon measures is the average number of yes/no questions, each with probability 1/2 per answer, that you have to ask to remove your uncertainty. Knowing that’s what a shannon is (courtesy of Brilliant’s course) gives me some hint about what a hartley (= the base-ten version instead of base two) probably is: I’m guessing it’s the average number of questions with ten possible answers each, where the prior on each answer is 1/10, that you’d have to ask to remove your uncertainty. But then what’s a nat (= the base-e version)? What does it mean for a question to have an irrational number of equally likely possible answers? I’m guessing you’d have to take a limit of some kind to make sense of this, but it’s not immediately obvious to me what that limit is, let alone how to intuitively interpret what it’s saying. The Wikipedia article doesn’t even hint at this question, let alone start to answer it. It’s quite happy just to show that the algebra works out.)
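Here’s the extent of my own tinkering so far: a minimal Python sketch (standard library only, with a made-up distribution `p`) checking that shannons, hartleys, and nats differ only by constant factors, and that the “number of questions” picture is exact for uniform distributions over powers of the base. The “questions” readings in the comments are my guesses from above, not anything the article says:

```python
import math

# A made-up distribution, just to sanity-check the unit conversions.
p = [0.5, 0.25, 0.125, 0.125]

def entropy(probs, base):
    """Shannon entropy of a distribution, in the given log base."""
    return -sum(q * math.log(q, base) for q in probs if q > 0)

h_bits = entropy(p, 2)        # shannons: average number of yes/no questions?
h_hartleys = entropy(p, 10)   # hartleys: average number of ten-way questions?
h_nats = entropy(p, math.e)   # nats: no obvious "question" reading

# The units differ only by constant factors, e.g. 1 shannon = ln(2) nats:
assert math.isclose(h_nats, h_bits * math.log(2))
assert math.isclose(h_hartleys, h_bits * math.log10(2))

# The "questions" picture is exact for a uniform distribution over 2**k
# outcomes: 8 equally likely outcomes take exactly 3 yes/no questions.
uniform8 = [1 / 8] * 8
assert math.isclose(entropy(uniform8, 2), 3.0)

print(f"{h_bits:.4f} shannons = {h_nats:.4f} nats = {h_hartleys:.4f} hartleys")
```

So a nat looks like just a rescaled shannon (1 shannon = ln 2 ≈ 0.693 nats), which at least tells me the unit is well-defined; it’s the “question with e equally likely answers” picture that needs the limit story, and that’s exactly the part I can’t yet interpret.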

I want to learn to see information theory in my lived experience. I’m fine with technical details, but I want them tied to intuitions. I want to grok this. I don’t care about being able to calculate detailed probabilities or whatever except inasmuch as my doing those exercises actually helps with grokking this.

Even a good intuitive explanation of thermodynamics as seen through the lens of information theory would be helpful.

Any suggestions?