There aren’t just two possibilities, “ideal bayesian reasoning” and “useless rubbish”. There is a huge range of heuristics, ad-hoc models, evolved instincts, and everything else in the mix. These all sit ‘outside bayesianism’, and while the collection is almost certainly worse than ideal bayesian reasoning, it is not useless.
That also doesn’t mean that the only way to improve is to have more people actively do bayesian reasoning, though there are certainly many cases where people would be better off doing exactly that.
There are many ways to improve incredibly complex systems such as human minds and their interactions, and it’s far from certain that applying more bayesian reasoning is the best one. We are definitely not capable of reaching the ideal, and will have to settle for something imperfect. Maybe there is a better approximation than “try bayesian reasoning as far as our limited human brains can handle”, and maybe there is not.
The main point is that we don’t know anything better, and pretty much everything else that we do know looks worse. However, there is a lot that we don’t know, and far more that we don’t even know that we don’t know.
One thing that is pretty clear is that an ideal bayesian reasoner would distribute some probability across all non-self-contradictory hypotheses that you can express in text of bounded length. There are only finitely many of them, so failing to include any of them would be a pretty major departure from idealness.
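As a minimal sketch of why this is a finite set, here is some Python that enumerates every string up to a bounded length over a toy alphabet and spreads prior probability over whichever ones pass a consistency check. The alphabet, the length bound, and the `is_consistent` placeholder are all my own illustrative assumptions, not anything specified above:

```python
from itertools import product

ALPHABET = "01"   # assumed toy alphabet; any finite alphabet works
MAX_LEN = 12      # assumed bound on hypothesis length

def is_consistent(hypothesis: str) -> bool:
    # Hypothetical stand-in for whatever test rules out self-contradictory hypotheses.
    return True

# Every expressible hypothesis up to the length bound: a finite list.
hypotheses = [
    "".join(chars)
    for length in range(1, MAX_LEN + 1)
    for chars in product(ALPHABET, repeat=length)
    if is_consistent("".join(chars))
]

# A uniform prior, just to show the whole set can be covered; an ideal reasoner
# could weight hypotheses differently (e.g. by length), but none gets probability zero.
prior = {h: 1 / len(hypotheses) for h in hypotheses}
assert abs(sum(prior.values()) - 1.0) < 1e-9
```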
The problem isn’t in the simulation part, but in the “supports” part.
You can certainly write a simulation in which an agent decides to take both boxes. By the conditions of the scenario, they get $1000. Does this simulation “support” taking both boxes? No, unless you’re only comparing against alternatives such as not taking a box at all, or burning box B and taking the ashes, or other things that are worse than getting $1000.
However, the scenario states that the agent could take just one box, and it is a logical consequence of the scenario setup that in the situations where they do, they get $1000000. That’s better than getting $1000 under the assumptions of the scenario, and so a simulation that actually follows the rules of the scenario cannot support taking two boxes.
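As an illustration, here is a minimal sketch of a simulation that does follow the scenario’s rules. The function name and the way the perfect-predictor stipulation is encoded are mine, added only to show that the one-box payoff comes out strictly higher once box B’s contents are allowed to track the choice:

```python
def payoff(one_boxes: bool) -> int:
    # The predictor is stipulated to be accurate, so box B's contents track the agent's choice.
    box_a = 1_000
    box_b = 1_000_000 if one_boxes else 0
    return box_b if one_boxes else box_a + box_b

print(payoff(one_boxes=False))  # 1000     -- taking both boxes
print(payoff(one_boxes=True))   # 1000000  -- taking only box B
```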