Out of curiosity, suppose you record every datapoint used to generate these priors (and every subsequent datapoint). How do you make AI systems that don’t fall into this trap?
My first guess is it’s a problem in the same class as where when training neural networks, the starting random values are the prior. And therefore some networks will never converge on a good answer simply because they start with incorrect priors. So you have to roll the dice many more times on the initialization.
Out of curiosity, suppose you record every datapoint used to generate these priors (and every subsequent datapoint). How do you make AI systems that don’t fall into this trap?
My first guess is it’s a problem in the same class as where when training neural networks, the starting random values are the prior. And therefore some networks will never converge on a good answer simply because they start with incorrect priors. So you have to roll the dice many more times on the initialization.