When rereading [0 and 1 Are Not Probabilities], I thought: can we ever quantify our amount of information in infinite domains, perhaps with something resembling the hyperreals?
Suppose a uniformly random rational number is drawn from Q∩[0;1]. There are infinitely many options, meaning the prior probabilities are all zero (∀k∈Q∩[0;1]: P(k)=0), so we need an infinite amount of evidence to single out any particular number. (It's worth noting that there are codes which encode any specific rational number as a finite word: for instance, first apply a bijection from the rationals to the natural numbers, then use Fibonacci coding. But in expectation we would need to receive infinitely many bits to learn an arbitrary number.)
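A minimal sketch of such a code, assuming the Calkin-Wilf bijection between the positive rationals and the positive integers and the standard Zeckendorf-based Fibonacci code (the function names are mine, and 0 would need a special case):

```python
from fractions import Fraction

def calkin_wilf_index(q: Fraction) -> int:
    """Index of a positive rational in the Calkin-Wilf enumeration (1/1 -> 1).

    Walk back up the Calkin-Wilf tree: a node a/b is either the left
    child a/(a+b) or the right child (a+b)/b of its parent.
    """
    p, r = q.numerator, q.denominator
    path = []
    while (p, r) != (1, 1):
        if p < r:
            path.append(0)   # left child: parent is p/(r-p)
            r -= p
        else:
            path.append(1)   # right child: parent is (p-r)/r
            p -= r
    n = 1
    for bit in reversed(path):  # replay the path down from the root
        n = 2 * n + bit
    return n

def fibonacci_code(n: int) -> str:
    """Self-delimiting Fibonacci code of n >= 1 (Zeckendorf representation)."""
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()                # the last Fibonacci number now exceeds n
    bits = []
    for f in reversed(fibs):  # greedy Zeckendorf decomposition
        if f <= n:
            bits.append('1')
            n -= f
        else:
            bits.append('0')
    return ''.join(reversed(bits)) + '1'  # trailing '1' terminates the word

for q in (Fraction(1, 2), Fraction(2, 3), Fraction(113, 355)):
    word = fibonacci_code(calkin_wilf_index(q))
    print(q, '->', word, f'({len(word)} bits)')
```

Every rational gets a finite, prefix-free codeword, but the codeword lengths over Q∩[0;1] are unbounded, which is why no finite expected length survives a "uniform" prior.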
Since the ∞ symbol doesn't behave nicely under addition and subtraction, we might define a symbol Λ_N meaning "the information needed to single out one natural number from their full set". Then the uniform prior over Q would have the form [0 : ⋯ : 0 : 2^{−Λ_N} : 2^{−Λ_N} : ⋯ : 0] (the prefix and suffix standing for values outside the [0;1] segment), while a communication "the number is k" would carry Λ_N bits of evidence on average, making the posterior [⋯ : 0 : 2^0 : 2^{−Λ_N} : 2^{−Λ_N} : …] ∼ [⋯ : 0 : 1 : 0 : 0 : …].
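In log-probability form, the bookkeeping of this update, treating Λ_N as an ordinary quantity of bits, is just:

```latex
\begin{align*}
  \log_2 P(k) &= -\Lambda_{\mathbb N}
      && \text{(uniform prior)}\\
  \log_2 P(k \mid \text{``the number is } k\text{''})
      &= -\Lambda_{\mathbb N} + \Lambda_{\mathbb N} = 0
      && \text{(add } \Lambda_{\mathbb N} \text{ bits of evidence)}\\
  P(k \mid \cdot) &= 2^{0} = 1, \qquad
  P(j \mid \cdot) = 2^{-\Lambda_{\mathbb N}} \sim 0 \;\text{ for } j \neq k.
\end{align*}
```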
The previous approach suffers from a problem, though. What if two uniformly random rationals (x, y) are taken, forming a point in a square on the coordinate grid? If we've been sent Λ_N bits of information about x, we've clearly learned nothing about y, and thus cannot pinpoint the specific point; it seems we'd require Λ_N more bits, 2Λ_N in total.
However, there's a bijection between Q² and N, so we can assign a unique natural number to every point in the square, and therefore communicate it in Λ_N bits in expectation, without any coefficient 2.
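A minimal sketch of that encoding, reusing the hypothetical helpers above and assuming the Cantor pairing function as the N×N→N bijection:

```python
def cantor_pair(a: int, b: int) -> int:
    """Cantor pairing function: a bijection N x N -> N."""
    return (a + b) * (a + b + 1) // 2 + b

# Encode the pair (x, y) as a single natural number, then as one codeword:
x, y = Fraction(1, 2), Fraction(2, 3)
n = cantor_pair(calkin_wilf_index(x), calkin_wilf_index(y))
word = fibonacci_code(n)
print((x, y), '->', word, f'({len(word)} bits)')
```

On the Λ_N accounting, the pair is just another natural number, so communicating it still costs Λ_N rather than 2Λ_N.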
When I tried exploring some more, I found that a greater uncertainty (Λ_R, the cost of communicating one real number) makes smaller ones (Λ_N) negligible, and that the evidence for a natural number can presumably be squeezed into the communication of a real value. That also makes the direction look unpromising.
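My reconstruction of why the absorption works, using the self-delimiting code from above: a specific natural number's codeword can be prepended to a real number's binary expansion, so the pair (n, r) costs essentially no more than r alone:

```latex
\begin{align*}
  \text{encode}(n, r)
    &= \text{FibCode}(n) \,\Vert\, b_1 b_2 b_3 \ldots
    && \text{(self-delimiting prefix for } n\text{, then the bits of } r\text{)}\\
  \Lambda_{\mathbb N} + \Lambda_{\mathbb R}
    &\sim \Lambda_{\mathbb R}
    && \text{(the prefix is negligible next to the infinite expansion).}
\end{align*}
```

More generally, N×R bijects with R, so the pair is "one real number's worth" of information, just as Q² bijecting with N made the pair of rationals cost Λ_N.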
Still, there may be a continuation: are there books or articles on how information is quantified given a distribution function?
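(For what it's worth, the standard keyword here seems to be differential entropy, h(X) = −∫ f(x) log₂ f(x) dx for a density f, covered in e.g. Cover & Thomas's Elements of Information Theory. A quick numerical check of two textbook values, with an illustrative helper:)

```python
import numpy as np

def differential_entropy_bits(density, lo, hi, n=200_000):
    """Riemann-sum approximation of -integral of f(x) * log2(f(x)) over [lo, hi]."""
    x = np.linspace(lo, hi, n)
    f = density(x)
    integrand = np.where(f > 0, -f * np.log2(np.clip(f, 1e-300, None)), 0.0)
    return float(np.sum(integrand) * (x[1] - x[0]))

uniform = lambda x: np.ones_like(x)                        # density of Uniform[0, 1]
gauss = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density

print(differential_entropy_bits(uniform, 0.0, 1.0))        # ~0.0 bits
print(differential_entropy_bits(gauss, -10.0, 10.0))       # ~0.5 * log2(2*pi*e) ≈ 2.047
```

Note that differential entropy can be negative and is not the limit of discrete entropies: discretizing to bins of width Δ adds log₂(1/Δ) bits, which diverges as Δ → 0, which looks like exactly the infinite-evidence phenomenon above.)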