James Stephen Brown comments on Motivated reasoning, confirmation bias, and AI risk theory

James Stephen Brown 15 May 2026 22:31 UTC
3 points
Hey Seth, this was fascinating, a really beautifully thought out piece which I learned a lot from, and which also fired up a lot of associations. I hope you don’t mind but I wrote the thoughts that came to mind while listening and where it crosses over with ideas I’ve explored (less rigorously than you). They’re not arguments, just different ways I’ve thought about similar things, often in a way that ends up aligning with you. You’ve mentioned that you haven’t written for lay-people yet, but I found this quite accessible, and I’m sort of a layperson.
One short isolated association: your idea about infinite cognitive ability allowing us to believe true things but profess convenient falsehoods reminds me of Thrasymachus in Plato’s Republic, who proposes that we would do best to profess moral selflessness while acting selfishly in secret—Plato unfortunately doesn’t clock “massive cognitive dissonance” as a problematic factor, lol.
When it comes to human idiosyncrasies like bias, I’m always asking how this bug might be a feature; I’ve done this with cognitive bias while simulating political alignment. I often find myself arguing in a way that validates the status quo, because I think there are often hidden or taken-for-granted elements or practices involved in the status quo that give it a sort of logic it doesn’t appear to have on a superficial level (and while I’m not dogmatically attached to the status quo, I find it’s counterintuitively underrepresented in arguments).
When a study in isolation finds a bias towards one view (like the studies you mention early in the piece) I ask “have they taken into account the path to that bias?”. Perhaps the subject got to that “bias” through reason and, having done that work already, is loath to do it again (so it’s really an efficiency bias). If someone has already found 100 arguments against their position wanting, it makes sense to reduce the weight of new arguments. This is a sort of approximation of Bayesian reasoning (you acknowledge this around 24 minutes, calling it an “inference machine” but only in a narrow domain, and then you mention something similar at 30 minutes; sorry, I’m listening, obviously), and nature has cleverly done this to stop us flip-flopping constantly whenever we are faced with a seemingly deductive argument (you mention later that memory isn’t relevant when weighing evidence, but it sort of is, if you’ve remembered evidence you’ve pre-processed). The weight of experience and our bias towards cognitive coherence protect against being duped on the reg.
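One rough way to put numbers on the “100 arguments found wanting” point is Laplace’s rule of succession. This is only a sketch under strong assumptions: it treats every argument against a position as an exchangeable draw with the same unknown chance of being sound, and starts from a uniform prior. The function name `prob_next_argument_sound` is made up for illustration and is not from the original piece.

```python
from fractions import Fraction


def prob_next_argument_sound(sound_seen: int, unsound_seen: int) -> Fraction:
    """Posterior predictive chance that the next argument is sound.

    Assumes a uniform Beta(1, 1) prior over the unknown rate of sound
    arguments, and that arguments are exchangeable (a strong assumption).
    """
    return Fraction(sound_seen + 1, sound_seen + unsound_seen + 2)


# Someone with no track record on the topic.
novice = prob_next_argument_sound(0, 0)

# Someone who has already examined 100 arguments and found all of them wanting.
veteran = prob_next_argument_sound(0, 100)

print(novice, veteran)
```

Under these assumptions the veteran gives a fresh argument far less initial weight than the novice does, without ever dropping to zero, so a genuinely strong argument can still move them.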
I really felt like many of the issues I thought of, you addressed soon after they occurred to me (a sign of good writing). When thinking about bias toward the weight of personal experience, I was thinking, if this is valid then it’s also rational to take into account the quality and quantity of your interlocutor’s experience, and to weight that accordingly. You address this relationship when talking about the bias inherent in deferring to experts (which assumes the process I just mentioned).
The outsourcing to experts idea reminded me of an idea for digital democracy my mate proposed to me a couple of decades ago that, rather than politicians, we could have a range of political issues on a sort of perpetual referendum, but to avoid the overwhelm of having to constantly vote on every issue, we would nominate experts (who share some common moral compass) to vote on our behalf in relation to topics that are within their domain of expertise. Which I thought was a pretty clever idea—when the technology gets up to the task.
While I think deferring to experts is rational, I see your point with double-counting. This can be seen where one study, like that famous autism study, gets distributed widely before it is debunked and then the truth spends the next decades chasing the falsehood.
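The double-counting worry can be illustrated with a toy Bayesian update in odds form. This is a sketch with made-up numbers: the prior of 0.1, the likelihood ratio of 3, and the five repeats are all assumptions chosen for illustration, and `posterior_prob` is a hypothetical helper.

```python
from fractions import Fraction


def posterior_prob(prior: Fraction, likelihood_ratio: int, n_independent: int) -> Fraction:
    """Update a prior on n pieces of evidence that are treated as independent.

    Works in odds form: posterior odds = prior odds * LR ** n.
    """
    odds = prior / (1 - prior) * likelihood_ratio ** n_independent
    return odds / (1 + odds)


prior = Fraction(1, 10)  # assumed starting credence in the claim
lr = 3                   # assumed strength of the single original study

# Five outlets repeat the same study: it is still only one piece of evidence.
correct = posterior_prob(prior, lr, 1)

# Double-counting: each repeat is wrongly treated as a fresh, independent study.
naive = posterior_prob(prior, lr, 5)

print(float(correct), float(naive))
```

With these invented inputs, counting the repeats as independent pushes the credence from a quarter to nearly certain, even though no new evidence arrived.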
Where my intuitions about feature / bug break down (and they must, because problematic cognitive bias is a bug in the world, for sure) is this first-mover advantage when it comes to chaotic systems, leading me on a path of adjacent possibles that are only available due to initial accidents of exposure (because chaotic systems are characteristically sensitive to “initial conditions”). I’m not sure how to protect against this, and don’t trust that first principles thinking is going to help all that much, as it’s pretty prone to bias itself. I think perhaps a sort of “comparative religion” approach could help, where you step back and look at the field of possibilities (the “raw distribution of beliefs” you mention) from time to time (I see you go on to suggest something very similar: naming the views). The scout mindset you mention later also adds a layer of redundancy to initial conditions in a similar way.
I think, bearing all this in mind, your assertion that epistemic humility leads to clustering makes a lot of sense, and it’s an entirely new idea in my head, thanks. Come to think of it, it’s the sort of dynamic you’d expect to see with the digital democracy idea above, which might be an argument against that, I guess.
When you mentioned the “most irritating arguments” in relation to the “strongest arguments” I couldn’t help think… um, for me, those are almost always the same! But then I realised they are subtly different. Strong arguments might actually change my mind, so I actually see value in them (though they obviously make me feel uneasy, I’m only human). Irritating arguments, on the other hand, are, to me, those that I can see will be convincing to others who don’t have the weight of my experience telling me they are obviously incorrect—giving me the obligation to unpack that vague weight of experience and put it into words that will immunise those (gullible) people, without that experience, against the irritating argument. Such arguments include the ontological argument, and arguments for libertarianism.
Even though you’ve used estimates, I love that you’ve done actual Bayesian calculations in an accessible way, the chart tells the story nicely.
On the steelmanning point, an extension of this is the reverse argument, where you and your opponent, after arguing for a bit, switch roles. I’ve done this in my misspent youth arguing with religious apologists, and the reverse argument was the only thing that ever influenced the other person to change (and it upgraded my reasoning ability in the space too)—a friendly opponent I’d been arguing with for months, on my suggestion, switched roles, we went back and forth for a week then it trailed off. Two weeks later he informed me he was now an agnostic atheist. I didn’t ask why, but I think a week making arguments for the position (after months of exposure to pretty strong arguments for the position, in a polite and friendly exchange) had something to do with it.
I also like the idea of “identity defense” as a mindset to avoid.
Again, it was really nicely written and clear, I liked that it extended outside a strictly AI alignment realm into more general applications.
- Seth Herd 15 May 2026 23:03 UTC
  6 points
  Thanks for the extensive feedback! Agreed on most, obviously.
  
  Your belief contagion simulation looks awesome! I hadn’t seen it. I gave that subject short shrift here in the interest of wrapping this up in a finite time, but I think it might be a very strong source of bias, even within expert communities, let alone novices who defer to experts and surround themselves with like-minded people because why not. I have ideas for further work, and for expanding your simulator. I did a very similar thing thirty years ago now on an internship at Oak Ridge, but it was for tracking climate change through ecosystem contagion! I had a vast simplification of the math which would now be unnecessary with modern computers. Anyway, I’d like to talk about it; those types of intuitively and visually appealing simulations seem high-value in illustrating points like this one.
  
  On irritating vs. strong arguments: Exactly! I didn’t spell this out, but that’s the direction I was trying to gesture at. What’s irritating/emotionally salient varies among people.
  
  Tiny nitpick for other readers, since you clearly got the semantic distinction and terminology isn’t the point: I and others use epistemic humility for just correcting so as not to be overconfident by default. I and at least some others use epistemic modesty differently, for weighing others’ beliefs as evidence for your own.
  
  On switching perspectives: I think this sounds like an amazing epistemic move, and I wonder if there’s a way to make that more common, like “I’d like to try making your argument to see if I understand it; would you be willing to do the same for mine?”
  
  On the rest, thanks for listening in such detail; we’re in agreement, and I’m glad to hear that!