Extremely interesting — thanks for posting. Obviously there are a number of caveats which you carefully point out, but this seems like a very reasonable methodology and the actual date ranges look compelling to me. (Though they also align with my bias in favor of shorter timelines, so I might not be impartial on that.)
The expected number of bits in original encoding per bits in the compression equals the entropy of that language.
Wouldn’t this be the other way around? If your language has low entropy it should be more predictable, and therefore more compressible. So the entropy would be the number of bits in the compression for each expected bit of the original.
Extremely interesting — thanks for posting. Obviously there are a number of caveats which you carefully point out, but this seems like a very reasonable methodology and the actual date ranges look compelling to me. (Though they also align with my bias in favor of shorter timelines, so I might not be impartial on that.)
One quick question about the end of this section:
Wouldn’t this be the other way around? If your language has low entropy it should be more predictable, and therefore more compressible. So the entropy would be the number of bits in the compression for each expected bit of the original.