How would you answer this without looking at the csv?
I wrote a post on my prior over Bernoulli distributions, called “Rethinking Laplace’s Law of Succession”. Laplace’s Law of Succession is based on a uniform prior over [0,1], whereas my prior is based on the following mixture distribution:
The first term captures logistic transformations of normal variables (weight w1), addressing the intuition that probability mass should be spread out in log-odds space
The second term captures deterministic programs (weight w2), allowing for probabilities of exactly zero and one
The third term captures rational probabilities (weight w3), placing extra mass on simple fractions like 1/2 and 1/3
The fourth term is a uniform distribution over [0,1] (weight w4), corresponding to Laplace’s original prior
The default parameters (w1=0.3, w2=0.1, w3=0.3, w4=0.3, sigma=5, alpha=2) reflect my intuition about the relative frequency of these different types of programs in practice.
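The mixture density itself doesn’t appear above, so here is one plausible reconstruction from the four terms. The exact component forms are my assumption (in particular the logit-normal parameterization, the even split between the two point masses, and the b^(-alpha) weighting of fractions); only the weights and parameters come from the post:

$$
p(\theta) = w_1\,\mathrm{LogitNormal}(\theta;\,0,\sigma^2)
\;+\; \frac{w_2}{2}\big(\delta_0(\theta) + \delta_1(\theta)\big)
\;+\; w_3 \sum_{a/b} \frac{b^{-\alpha}}{Z_\alpha}\,\delta_{a/b}(\theta)
\;+\; w_4\,\mathbb{1}[0 \le \theta \le 1]
$$

where $\delta_x$ denotes a point mass at $x$, the sum runs over fractions $a/b$ in $(0,1)$, and $Z_\alpha$ normalizes the fraction weights.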
Using this prior, we get the result [0.106, 0.348, 0.500, 0.652, 0.894]
The numbers are predictions for P(5th trial = R | k Rs observed in first 4 trials):
If you see 0 Rs in the first 4 trials (all Ls), there’s a 10.6% chance the 5th is R
If you see 1 R in the first 4 trials, there’s a 34.8% chance the 5th is R
If you see 2 Rs in the first 4 trials, there’s a 50% chance the 5th is R
If you see 3 Rs in the first 4 trials, there’s a 65.2% chance the 5th is R
If you see 4 Rs in the first 4 trials (all Rs), there’s an 89.4% chance the 5th is R
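The predictive probabilities above can be sketched numerically. This is not the post’s actual code: the component densities, the denominator cutoff for the fractions, and the grid discretization are all my assumptions, so the resulting numbers need not match [0.106, 0.348, 0.500, 0.652, 0.894] exactly, though the qualitative shape (symmetric, more extreme than Laplace) should.

```python
import numpy as np

# Parameters as stated in the post.
w1, w2, w3, w4 = 0.3, 0.1, 0.3, 0.3
sigma, alpha = 5.0, 2.0
n = 4  # observed trials

# Grid for the continuous components (endpoints clipped to avoid log(0)).
grid = np.linspace(1e-6, 1 - 1e-6, 200001)
dx = grid[1] - grid[0]

# Component 1 (assumed form): logit-normal, i.e. logit(theta) ~ Normal(0, sigma^2).
logit = np.log(grid / (1 - grid))
dens1 = np.exp(-logit**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
dens1 /= grid * (1 - grid)        # Jacobian of the logit transform
wts1 = dens1 * dx
wts1 /= wts1.sum()                # renormalize after discretization

# Component 4: uniform on [0, 1].
wts4 = np.full_like(grid, dx)
wts4 /= wts4.sum()

# Component 3 (assumed form): fractions a/b weighted by b^(-alpha), denominators up to 20.
frac_theta, frac_w = [], []
for b in range(2, 21):
    for a in range(1, b):
        frac_theta.append(a / b)
        frac_w.append(b ** (-alpha))
frac_theta = np.array(frac_theta)
frac_w = np.array(frac_w) / sum(frac_w)

# Component 2 (assumed form): point masses at exactly 0 and 1, split evenly.
det_theta = np.array([0.0, 1.0])
det_w = np.array([0.5, 0.5])

# Combine everything into one weighted atomic support.
theta = np.concatenate([grid, grid, frac_theta, det_theta])
w = np.concatenate([w1 * wts1, w4 * wts4, w3 * frac_w, w2 * det_w])

def predictive(k):
    """P(5th trial = R | k Rs in first n trials) under the mixture prior."""
    lik = theta**k * (1 - theta)**(n - k)   # Bernoulli likelihood of the data
    return float((w * lik * theta).sum() / (w * lik).sum())

probs = [round(predictive(k), 3) for k in range(n + 1)]
print(probs)
```

Because the assumed mixture is symmetric under theta → 1 − theta, the sketch gives exactly 0.5 for k = 2 and complementary predictions for k and 4 − k, matching the symmetry of the post’s five numbers.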
The corresponding five numbers under Laplace’s Rule of Succession are [0.167, 0.333, 0.500, 0.667, 0.833], but I think this is too conservative because it underestimates the likelihood of near-deterministic processes.
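For reference, Laplace’s Rule of Succession predicts P(next = R | k Rs in n trials) = (k + 1) / (n + 2), which reproduces those five numbers directly:

```python
# Laplace's Rule of Succession: P(next = R | k Rs in n trials) = (k + 1) / (n + 2)
n = 4
laplace = [round((k + 1) / (n + 2), 3) for k in range(n + 1)]
print(laplace)  # → [0.167, 0.333, 0.5, 0.667, 0.833]
```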