… priors are not actually necessary when working with Bayesian updates specifically. You can just work entirely with likelihood ratios...
I think that here you’re missing the most important use of priors.
Your prior probabilities for various models may not be too important, partly because it’s very easy to look at the likelihood ratios for models and see what influence those priors have on the final posterior probabilities of the various models.
The much more important, and difficult, issue is what priors to use on parameters within each model.
Almost all models are not going to fix every aspect of reality that could affect what you observe. So there are unknowns within each model. Some unknown parameters may be common to all models; some may be unique to a particular model (making no sense in the context of a different model). For parameters of both types, you need to specify prior distributions in order to be able to compute the probability of the observations given the model, and hence the model likelihood ratios.
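In symbols (my notation, not anything from the original post): write $D$ for the observations, $M_k$ for a model, and $\theta_k$ for all the unknown parameters within $M_k$, both those shared with other models and those specific to it. The quantity that enters the likelihood ratio is then the marginal likelihood, which averages over the within-model prior $p(\theta_k \mid M_k)$:

$$
P(D \mid M_k) = \int P(D \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, d\theta_k,
\qquad
\frac{P(M_i \mid D)}{P(M_j \mid D)} = \frac{P(D \mid M_i)}{P(D \mid M_j)} \cdot \frac{P(M_i)}{P(M_j)} .
$$

The model priors $P(M_i)$ and $P(M_j)$ appear only as a final multiplicative factor, which is why their influence is easy to see. The parameter priors sit inside the integral, where their influence is much harder to see.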
Here’s a made-up example (about a subject of which I know nothing, so it may be laughably unrealistic). Suppose you have three models about how US intelligence agencies are trying to influence AI development. M0 is that these agencies are not doing anything to influence AI development. M1 is that they are trying to speed it up. M2 is that they are trying to slow it down. Your observations are about how fast AI development is proceeding at some organizations such as OpenAI and Meta.
For all three models, there are common unknown parameters describing how fast AI progresses at an average organization without intelligence agency intervention, and how much variation there is between organizations in their rate of progress. For M1 and M2, there are also parameters describing how much the agencies can influence progress (e.g., via secret subsidies, or covert cyber attacks on AI compute infrastructure), and how much variation there is in the agencies’ ability to influence different organizations.
Suppose you see that AI progress at OpenAI is swift, but progress at Meta is slow. How does that affect the likelihood ratios among M0, M1, and M2?
It depends on your priors for the unknown model parameters. If you think it unlikely that such large variation in progress would happen with no intelligence agency intervention, but that there could easily be large variation in how much these agencies can affect development at different organizations, then you should update to giving higher probability to M1 or M2, and lower probability to M0. If you also thought the slow progress at Meta was normal, you should furthermore update to giving M1 higher probability relative to M2, explaining the fast progress at OpenAI by assistance from the agencies. On the other hand, if you think that large variation in progress at different organizations is likely even without intelligence agency intervention, then your observations don’t tell you much about whether M0, M1, or M2 is true.
Actually, of course, you are uncertain about all these parameters, so you have prior distributions for them rather than definite beliefs, with the likelihoods for M0, M1, and M2 being obtained by integrating over these priors. These likelihoods can be very sensitive to what your priors for these model parameters are, in ways that may not be obvious.
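To make the sensitivity concrete, here is a minimal toy sketch of the example above. Everything in it is an assumption of mine for illustration: the progress numbers, the normal noise, the half-normal form for the agency effect, the crude grid priors, and all the function and variable names. Each organization's progress is modelled as `mu + noise + sign * |effect|`, where `noise` has standard deviation `sigma` (natural variation between organizations), `effect` has scale `tau` (how much the agencies can shift progress), and `sign` is 0, +1, or -1 for M0, M1, or M2. The marginal likelihood of each model is a weighted sum over the prior grid.

```python
import math
from itertools import product

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def density(y, mu, sigma, tau, sign):
    """Density of y = mu + noise + sign*|effect|, with noise ~ N(0, sigma^2)
    and effect ~ N(0, tau^2). sign = 0 is M0, +1 is M1, -1 is M2."""
    if sign == 0:
        return normal_pdf((y - mu) / sigma) / sigma
    # A normal plus an independent half-normal has a skew-normal density
    # with scale omega = sqrt(sigma^2 + tau^2) and shape tau / sigma.
    omega = math.hypot(sigma, tau)
    z = (y - mu) / omega
    return 2.0 / omega * normal_pdf(z) * normal_cdf(sign * (tau / sigma) * z)

def marginal_likelihood(ys, sign, prior):
    """Average the likelihood over a discrete grid prior on (mu, sigma, tau)."""
    total = 0.0
    for (mu, w_mu), (sigma, w_s), (tau, w_t) in product(
        prior["mu"], prior["sigma"], prior["tau"]
    ):
        lik = 1.0
        for y in ys:  # organizations are independent given the parameters
            lik *= density(y, mu, sigma, tau, sign)
        total += w_mu * w_s * w_t * lik
    return total

def bayes_factors_vs_m0(ys, prior):
    m0 = marginal_likelihood(ys, 0, prior)
    return {
        "M1": marginal_likelihood(ys, +1, prior) / m0,
        "M2": marginal_likelihood(ys, -1, prior) / m0,
    }

# Made-up progress rates in arbitrary units: OpenAI swift, Meta about typical.
ys = [3.0, 0.8]

# Each prior is a list of (value, weight) pairs. For simplicity mu is fixed
# near Meta's rate; a fuller version would put a real prior on it too.
small_natural_variation = {
    "mu": [(1.0, 1.0)],
    "sigma": [(0.2, 0.5), (0.4, 0.5)],
    "tau": [(1.0, 0.5), (2.0, 0.5)],
}
large_natural_variation = {
    "mu": [(1.0, 1.0)],
    "sigma": [(1.5, 0.5), (2.5, 0.5)],
    "tau": [(1.0, 0.5), (2.0, 0.5)],
}

bf_a = bayes_factors_vs_m0(ys, small_natural_variation)
bf_b = bayes_factors_vs_m0(ys, large_natural_variation)
print("small natural variation:", bf_a)
print("large natural variation:", bf_b)
```

With these toy numbers, the prior that expects little natural variation gives a very large likelihood ratio for M1 over M0 and favours M1 over M2, while the prior that expects large natural variation leaves both ratios within a small factor of 1. The observations and the models are the same in both runs; only the prior on `sigma` changed.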