Should I think of it as: two parameters having similarly high Bayesian posterior probability, with the brain not explicitly representing this posterior but instead using something like local hill climbing to find a local MAP solution, so that bistable perception corresponds to the two different solutions this process can converge to?
Yup, sounds right.
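To make the hill-climbing picture concrete, here is a toy sketch. Everything in it is an illustrative assumption rather than a claim about the brain: the one-dimensional double-well log-posterior, the step size, and the names `log_posterior_grad` and `hill_climb` are all made up for this example. It only shows that plain gradient ascent on a bimodal posterior lands on one of two local MAP solutions depending on where it starts.

```python
def log_posterior_grad(x):
    # Gradient of the assumed toy log-posterior log p(x) = -(x**2 - 1)**2,
    # which has two equally high modes, at x = -1 and x = +1.
    return -4.0 * x * (x * x - 1.0)


def hill_climb(x0, step=0.05, iters=200):
    # Plain gradient ascent: follow the local slope uphill and never
    # represent the full posterior explicitly.
    x = x0
    for _ in range(iters):
        x += step * log_posterior_grad(x)
    return x


# Two nearby starting points settle on different local MAP solutions.
map_from_right = hill_climb(0.3)
map_from_left = hill_climb(-0.3)
print(round(map_from_right, 3), round(map_from_left, 3))
```

The sketch settles once and stays there. Getting switches between the two solutions over time would need something extra, such as noise or adaptation, which is not modeled here.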
to what extent should I interpret the brain as finding a single solution (MLE/MAP) versus representing a superposition or distribution over multiple solutions (fully Bayesian)?
I think it can represent multiple possibilities to a nonzero but quite limited extent; I think the superposition can only be kinda local to a particular subregion of the cortex and a fraction of a second. I talk about that a bit in §2.3.
In which context should I interpret the phrase “the brain settling on two different generative models”?
I wrote “your brain can wind up settling on either of [the two generative models]”, not both at once.
Ah, that makes sense. So the picture I should have is: whatever local algorithm the brain runs, it oscillates over time between multiple local MAP solutions that correspond to qualitatively different high-level information (e.g., clockwise vs. counterclockwise). Concretely, something like the metastable states of a Hopfield network, or the update steps of predictive coding (literally gradient updates to find a MAP solution for perception!!) oscillating between multiple local minima?
Yup, sounds right.
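For the Hopfield analogy, a minimal sketch, assuming a tiny 8-unit network with Hebbian weights and two stored patterns standing in for the two percepts. The patterns, the sizes, and the helper names `train` and `recall` are hypothetical choices for illustration, not anything from the post. Each stored pattern is a local energy minimum, and which one the network settles into depends on the starting state.

```python
def train(patterns):
    # Hebbian rule: w[i][j] accumulates p[i] * p[j], with no self-connections.
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, sweeps=10):
    # Asynchronous updates: each unit takes the sign of its input field.
    # On a tie (zero field) the unit keeps its current value.
    s = list(state)
    for _ in range(sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new = s[i] if field == 0 else (1 if field > 0 else -1)
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:  # reached a fixed point (a local energy minimum)
            break
    return s


# Two orthogonal patterns, e.g. "clockwise" vs "counterclockwise".
pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([pattern_a, pattern_b])

# Corrupt one unit of each pattern and let the network settle.
noisy_a = [-1] + pattern_a[1:]
noisy_b = [-1] + pattern_b[1:]
print(recall(weights, noisy_a) == pattern_a)
print(recall(weights, noisy_b) == pattern_b)
```

As written, the network falls into one attractor and stays there; alternating between the two over time would again require an added ingredient such as noise or fatigue of the active units.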
…Not sure if I answered your question.