While the model is interesting, it is almost irremediably ruined by this line: “since by definition P(A) = Ta/n”, which substantially conflates probability with frequency.
Think of P(A) merely as the output of a noiseless version of the same algorithm. Obviously this depends on the prior, but I think this one is not unreasonable in most cases.
I’m not sure I’ve understood the sentence.
Because P(A) is the noiseless parameter.
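To make the distinction concrete, here is a minimal sketch (my own construction, not the paper’s code) of a counting estimator in which each stored outcome can be mis-remembered with some flip probability; the noiseless run of the same algorithm is then exactly the P(A) being pointed at, while the noisy run shows how memory noise turns into a structured bias:

```python
import random

def counting_estimate(outcomes, flip_prob=0.0, rng=random):
    """Estimate P(A) as T_a / n from a list of booleans, where each
    stored outcome may be mis-remembered (flipped) with probability
    flip_prob. flip_prob=0.0 is the 'noiseless version' of the
    algorithm, whose output is the underlying P(A)."""
    flipped = [(not o) if rng.random() < flip_prob else o for o in outcomes]
    return sum(flipped) / len(flipped)

rng = random.Random(0)
true_p = 0.1
outcomes = [rng.random() < true_p for _ in range(100_000)]

noiseless = counting_estimate(outcomes, flip_prob=0.0)        # ~ P(A)
noisy = counting_estimate(outcomes, flip_prob=0.05, rng=rng)
# With symmetric flips the expected estimate is (1 - 2e)*p + e,
# so rare events (p < 1/2) are systematically over-estimated:
# noise alone produces a structured bias.
```

The symmetric-flip noise model is an assumption made here for illustration; the qualitative point (noise pushing rare-event frequencies upward) does not depend on its details.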
Anyway, the entire paper relies on the counting algorithm to establish that random noise can give rise to structured bias, and that this is a problem for a Bayesian AI.
But while the mechanism may be an interesting, and perhaps even correct, way to unify the biases mentioned for the human mind, it can hardly be posed as a problem for such an artificial intelligence. A counting algorithm for establishing probabilities denies basically everything Bayesian updating is designed for (the most trivial example: drawing from a finite urn without replacement).
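To illustrate the urn remark (reading “extraction from a finite urn” as drawing without replacement; the composition and numbers below are mine), compare the raw frequency estimate with the answer you get by conditioning on the known contents of the urn:

```python
from fractions import Fraction

# Urn known to contain 2 red and 1 black ball; draws are without
# replacement. Suppose the first draw came up red.
red, black = 2, 1

# Naive counting estimate of P(next draw is red): the observed
# frequency so far, T_a / n = 1/1.
counting = Fraction(1, 1)

# Bayesian answer: condition on the known composition. One red
# ball is gone, so 1 red and 1 black remain.
bayesian = Fraction(red - 1, (red + black) - 1)  # 1/2
```

The counting estimator confidently outputs 1 after a single draw, while conditioning on what is actually known gives 1/2; this is the sense in which pure counting throws away exactly what Bayesian updating is for.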
Well, yes, the prior that yields counting algorithms is not universal. But in many cases it’s a good idea! And if you decide to use, for example, some rule-of-succession-style modification, the same situation arises.
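For concreteness, the rule-of-succession variant just means replacing the raw frequency T_a / n with (T_a + 1) / (n + 2), the posterior mean under a uniform prior over P(A); a tally corrupted by noise biases it in the same direction as it biases the plain count (the tallies below are illustrative numbers of my choosing):

```python
def rule_of_succession(t_a, n):
    """Posterior mean of P(A) under a uniform (Laplace) prior:
    (T_a + 1) / (n + 2) instead of the raw frequency T_a / n."""
    return (t_a + 1) / (n + 2)

n = 1000
clean = rule_of_succession(100, n)  # true tally of 100 occurrences of A
noisy = rule_of_succession(140, n)  # tally inflated by mis-remembered outcomes
# The +1/+2 smoothing shrinks both estimates toward 1/2, but the
# noise-induced overcount still shifts the output upward, just as
# it shifts T_a / n: the same situation appears.
```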
In the case of a finite urn, you might see different biases (or none at all, if your algorithm stubbornly refuses to update because you chose a silly prior).
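The “silly prior” case is easy to exhibit: a point-mass prior makes Bayes’ rule a no-op, so the algorithm never updates no matter what it observes (the two-point hypothesis space and numbers here are a toy example of mine):

```python
def bayes_update(prior, likelihoods):
    """One step of Bayes' rule over a discrete hypothesis space:
    posterior(h) is proportional to prior(h) * likelihood(h)."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    z = sum(unnorm.values())
    return {h: w / z for h, w in unnorm.items()}

# A point-mass ("silly") prior: all weight on the hypothesis p = 0.9.
prior = {0.1: 0.0, 0.9: 1.0}
# Observe a non-A outcome; its likelihood under each hypothesis:
lik = {0.1: 0.9, 0.9: 0.1}

post = bayes_update(prior, lik)  # still {0.1: 0.0, 0.9: 1.0}
# The zero-probability hypothesis stays at zero forever, so the
# estimate never moves: no bias from noise, and no learning either.
```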