I think it’s different because entropy is the expectation of a quantity that itself depends on the probability distribution you’re using to weight things.
Like, for other quantities: if A is the number of apples, then the sum of p×A is the expected number of apples under distribution p, and the sum of q×A is the expected number of apples under distribution q.
But entropy is… -log(p) is a thing, and sum of p × -log(p) is the entropy.
And the sum of q × -log(p) is… not entropy! (It’s “cross-entropy”)
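Here’s a minimal numeric sketch of that distinction, with made-up distributions p and q and a made-up apple count A (none of these numbers come from the discussion): for the apple count, swapping the weighting distribution just gives a different expectation, but for entropy the quantity being averaged, -log(p), is tied to p itself, so reweighting it by q gives cross-entropy rather than entropy.

```python
# Hypothetical example distributions and outcome values, for illustration only.
import math

p = [0.5, 0.25, 0.25]   # distribution p
q = [0.8, 0.1, 0.1]     # distribution q
A = [3, 1, 4]           # "number of apples" in each outcome

# Expected apples: the quantity A doesn't depend on the weighting distribution.
expected_apples_p = sum(pi * a for pi, a in zip(p, A))
expected_apples_q = sum(qi * a for qi, a in zip(q, A))

# Entropy: the quantity being averaged, -log2(p), depends on p itself.
entropy_p = sum(pi * -math.log2(pi) for pi in p)

# Weighting -log2(p) by q instead gives the cross-entropy H(q, p), not entropy.
cross_entropy_q_p = sum(qi * -math.log2(pi) for qi, pi in zip(q, p))

print(expected_apples_p, expected_apples_q)   # 2.75, 2.9
print(entropy_p)                              # 1.5 bits
print(cross_entropy_q_p)                      # 0.8*1 + 0.1*2 + 0.1*2 = 1.2 bits
```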
That makes sense. In my post I’m framing entropy in terms of whatever binary string assignment you want, which doesn’t depend on the probability distribution you’re using to weight things. The distribution only comes in once you ask for the minimum average string length.
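A similarly hedged sketch of that framing (the outcomes, probabilities, and code below are all made up for illustration, not taken from the post): the binary string assignment can be chosen without reference to any distribution, and the distribution only shows up when you average the string lengths. For this particular p, lengths of 1, 2, 2 happen to match -log2(p), so the average length equals the entropy.

```python
# Made-up outcomes, distribution, and prefix-free code, for illustration only.
import math

outcomes = ["a", "b", "c"]
p = {"a": 0.5, "b": 0.25, "c": 0.25}

# Any prefix-free assignment of binary strings is allowed; the assignment
# itself makes no reference to a probability distribution.
code = {"a": "0", "b": "10", "c": "11"}

# The distribution only enters when you ask for the *average* string length.
avg_len_p = sum(p[x] * len(code[x]) for x in outcomes)

# Here the code lengths 1, 2, 2 equal -log2(p) exactly, so the average length
# under p coincides with the entropy; other prefix-free codes do at least as badly.
entropy_p = sum(p[x] * -math.log2(p[x]) for x in outcomes)

print(avg_len_p, entropy_p)   # both 1.5 bits
```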
Ah, I missed this on a first skim and only got it recently, so some of my comments are probably missing this context in important ways. Sorry, that’s on me.