James Stephen Brown

Karma: 418

I write and talk about game theory, moral philosophy, ethical economics and artificial intelligence—focused on non-zero-sum games and their importance in solving the world’s problems.

I have admitted I am wrong at least 10 times on the internet.

James Stephen Brown 21 Jun 2026 21:40 UTC
1 point
−2
on: James Stephen Brown’s Shortform
Empower AIs to regulate each other by limiting their individual power (consumption).

Listening to Emmett Shear on multi-agent systems made me think that, rather than having amorphous rules determining “morality” to achieve AI alignment, we could take his multi-agent system seriously and instead place a restriction on the size of the models, so any model that is drawing too much processing power needs to be regulated. Doing this would lead bigger companies to develop more models that are less individually powerful.

Regulating this could be partly possible, because the big players are, by necessity, very visible (because they are often commercially driven) so once a certain threshold is reached, transparency of power / processing usage becomes mandatory. Like anti-monopoly policies, this would stop any one player from becoming dominant and would mean that other models would be able to take part in regulating each other (because there is some parity in their power). This could lead to cartels or factions, but that might be a feature rather than a bug—a cartel might be necessary to take down a powerful player. The fact remains that the majority will always want the infinite game, so the majority will outnumber individual defectors.
This restriction has a number of benefits:
- Environmental (cuts down on power usage)
- Restricts run-away AI
- It’s a controllable extension of something natural—computational complexity limitation (it gives visibility into a natural defence, like detecting T-cells to identify the presence of cancer)
- Empowers self-regulation by competitors through greater parity (turns an arms race into open-source monitoring / peer-review)
- Slows AI progress to a less disruptive level, so society can adapt
- Doesn’t require a codification of “morality” to work
It has a few downsides too:
- Slows technological progress on potentially life-saving / improving tech
- Very difficult to get global agreement
- Very difficult to regulate powerful tech companies
- Potentially easy to hide processing in distributed / cloud-based networks.
These are difficult problems, but the first 3, at least, are going to affect any regulatory approach. Here I want to identify a particular approach that has some unique benefits, including avenues for more robust self-regulation, which could complement a wider regulatory system.

James Stephen Brown 15 May 2026 22:31 UTC
3 points
0
on: Motivated reasoning, confirmation bias, and AI risk theory
Hey Seth, this was fascinating, a really beautifully thought out piece which I learned a lot from, and which also fired up a lot of associations. I hope you don’t mind but I wrote the thoughts that came to mind while listening and where it crosses over with ideas I’ve explored (less rigorously than you). They’re not arguments, just different ways I’ve thought about similar things, often in a way that ends up aligning with you. You’ve mentioned that you haven’t written for lay-people yet, but I found this quite accessible, and I’m sort of a layperson.
One short isolated association: your idea about infinite cognitive ability allowing us to believe true things but profess convenient falsehoods reminds me of Thrasymachus in Plato’s Republic, who proposes that we would do best to profess moral selflessness while acting selfishly in secret—Plato unfortunately doesn’t clock “massive cognitive dissonance” as a problematic factor, lol.
When it comes to human idiosyncrasies like bias, I’m always asking how this bug might be a feature—I’ve done this with cognitive bias while simulating political alignment. I often find myself arguing in a way that validates the status quo, because I think often there are hidden or taken-for-granted elements or practices involved in the status quo that give it a sort of logic it doesn’t appear to have on a superficial level (and while I’m not dogmatically attached to the status quo, I find it’s counterintuitively underrepresented in arguments).
When a study in isolation finds a bias towards one view (like the studies you mention early in the piece) I ask “have they taken into account the path to that bias?”. Perhaps the subject got to that “bias” through reason and, having done that work already, is loath to do it again (so it’s really an efficiency bias). If someone has already found 100 arguments against their position wanting, it makes sense to reduce the weight of new arguments. This is a sort of approximation of Bayesian reasoning (you acknowledge this around 24 minutes calling it an “inference machine” but only in a narrow domain, and then you mention something similar at 30 minutes—sorry, I’m listening obviously) and nature has cleverly done this to avoid us flip-flopping constantly whenever we are faced with a seemingly deductive argument (you mention later that memory isn’t relevant when weighing evidence, but it sort of is, if you’ve remembered evidence you’ve pre-processed). The weight of experience and our bias towards cognitive coherence protects against being duped on the reg.
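To put rough numbers on the "100 arguments found wanting" intuition, here is a minimal sketch using Laplace's rule of succession. It assumes a uniform Beta(1, 1) prior over the fraction of counter-arguments that hold up and treats each past argument as an independent trial. The function name and the modelling choices are mine, purely for illustration, and are not taken from the piece.

```python
from fractions import Fraction


def prob_next_argument_holds(held_up: int, found_wanting: int,
                             prior_a: int = 1, prior_b: int = 1) -> Fraction:
    """Posterior predictive probability that the next counter-argument holds up.

    Assumes a Beta(prior_a, prior_b) prior over the fraction of sound
    counter-arguments (a hypothetical modelling choice), updated on past results.
    """
    return Fraction(prior_a + held_up,
                    prior_a + prior_b + held_up + found_wanting)


# Before examining any arguments, the uniform prior gives even odds.
fresh = prob_next_argument_holds(held_up=0, found_wanting=0)

# After 100 counter-arguments have all been found wanting.
seasoned = prob_next_argument_holds(held_up=0, found_wanting=100)

print(f"fresh: {float(fresh):.3f}, seasoned: {float(seasoned):.3f}")
```

Under these assumptions the expected weight of the next argument drops from 1/2 to 1/102, which is the sense in which the "bias" can be read as remembered, pre-processed evidence rather than stubbornness.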
I really felt like many of the issues I thought of, you addressed soon after they occurred to me (a sign of good writing). When thinking about bias toward the weight of personal experience, I was thinking, if this is valid then it’s also rational to take into account the quality and quantity of your interlocutor’s experience, and to weight that accordingly. You address this relationship when talking about the bias inherent in deferring to experts (which assumes the process I just mentioned).
The outsourcing to experts idea reminded me of an idea for digital democracy my mate proposed to me a couple of decades ago that, rather than politicians, we could have a range of political issues on a sort of perpetual referendum, but to avoid the overwhelm of having to constantly vote on every issue, we would nominate experts (who share some common moral compass) to vote on our behalf in relation to topics that are within their domain of expertise. Which I thought was a pretty clever idea—when the technology gets up to the task.
While I think deferring to experts is rational, I see your point with double-counting. This can be seen where one study, like that famous autism study, gets distributed widely before it is debunked and then the truth spends the next decades chasing the falsehood.
Where my intuitions about feature / bug break down (and they must, because problematic cognitive bias is a bug in the world, for sure) is this first-mover advantage when it comes to chaotic systems, leading me on a path of adjacent possibles, that are only available due to initial accidents of exposure (because chaotic systems are characteristically sensitive to “initial conditions”). I’m not sure how to protect against this, and don’t trust first principles thinking is going to help all that much, as it’s pretty prone to bias itself. I think perhaps a sort of “comparative religion” approach could help, where you step back and look at the field of possibilities (the “raw distribution of beliefs” you mention) from time to time (I see you go on to suggest something very similar—naming the views). The scout mindset you mention later also adds a layer of redundancy to initial conditions in a similar way.
I think, bearing all this in mind, your assertion that epistemic humility leads to clustering makes a lot of sense—and is an entirely new idea in my head, thanks. Come to think of it, it’s the sort of dynamic you’d expect to see with the digital democracy idea above, which might be an argument against that, I guess.
When you mentioned the “most irritating arguments” in relation to the “strongest arguments” I couldn’t help thinking… um, for me, those are almost always the same! But then I realised they are subtly different. Strong arguments might actually change my mind, so I actually see value in them (though they obviously make me feel uneasy, I’m only human). Irritating arguments, on the other hand, are, to me, those that I can see will be convincing to others who don’t have the weight of my experience telling me they are obviously incorrect—giving me the obligation to unpack that vague weight of experience and put it into words that will immunise those (gullible) people, without that experience, against the irritating argument. Such arguments include the ontological argument, and arguments for libertarianism.
Even though you’ve used estimates, I love that you’ve done actual Bayesian calculations in an accessible way, the chart tells the story nicely.
On the steelmanning point, an extension of this is the reverse argument, where you and your opponent, after arguing for a bit, switch roles. I’ve done this in my misspent youth arguing with religious apologists, and the reverse argument was the only thing that ever influenced the other person to change (and it upgraded my reasoning ability in the space too)—a friendly opponent I’d been arguing with for months, on my suggestion, switched roles, we went back and forth for a week then it trailed off. Two weeks later he informed me he was now an agnostic atheist. I didn’t ask why, but I think a week making arguments for the position (after months of exposure to pretty strong arguments for the position, in a polite and friendly exchange) had something to do with it.
I also like the idea of “identity defense” as a mindset to avoid.
Again, it was really nicely written and clear, I liked that it extended outside a strictly AI alignment realm into more general applications.

James Stephen Brown 4 May 2026 21:35 UTC
1 point
0
on: James Stephen Brown’s Shortform
There is a version of the Trolley Problem, which extends to a hospital scenario, where instead of switching tracks from 5 people to kill one, you’re killing one healthy person to save the lives of 5 people who need transplants.
Two points:

1. To my mind this is not even a problem: if doctors killed people with no fatal illnesses when they went to the hospital… no one would go to the hospital, and many more people would die. Trust is required for an ordered society.
Now, we can play the game where we stipulate that this is entirely isolated, and then keep taking real-world variables away from a situation until it’s identical to the other… sure… if you take away society… (even though we’re a social species and have moral responsibilities precisely because we live in a society) then I guess, if I mention all the nurses, doctors, surgeons, anaesthetists etc involved in these extensive surgeries, and the recipients who receive the organs who now have to live their lives with this dark secret, we’ll just magic away that with magic surgery, and memory wiping… sure, if you keep eliminating all the factors that make a hospital a hospital… then sure, you can literally recreate the trolley scenario, and at some point in this transformation the moral calculus will return to the switching tracks answer.
Moral concerns take on multiple factors. These “paradoxes” assume if you can’t reduce an equation to a single factor, you can’t get a valid answer. But that’s not how equations work.
Like the answer to bad science is better science, the answer to problematically framed utilitarianism is a more complex and accurate utilitarian framing.

2. Reflecting on this question, I tried plugging this into my Shapley Value Calculator, which measures marginal contributions to a group. Taking each coalition’s total years lived as the utilitarian goal, the “donor” (in this case Meredith) scores higher than all the others combined. Unfortunately I couldn’t do it with one donor and 5 patients because I was limited to 5 people total in the calculator (my bad).
Bob: 23.00
Allie: 23.00
Tom: 23.00
Helen: 23.00
Meredith: 108.00
So Shapley agrees that Meredith’s life is worth more… ethically. This might make no sense to you, but Lloyd Shapley did win a Nobel Prize for his work in game theory so I’m happy to extend a little authority to him.
Having said this, Meredith’s “worth” could be seen as a justification to sacrifice her, precisely because her marginal contribution is greater than the total of all the others, but I don’t think that would be reading the marginal gain correctly—it is usually used to justify the share of the “profits” or the “decision making power” meaning she would have a more than 50% stake in the decision, and should be allowed to opt out of the deal.
This is all purely theoretical of course… but it was an interesting experiment and could have gone either way.
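For anyone who wants to try the calculation, here is a minimal Python sketch of a Shapley value computation over "total years lived". The inputs to the original calculator are not given above, so the characteristic function here is an assumption: every member of a coalition that includes Meredith lives 40 years, and every member of a coalition without her lives 6 years. Those two numbers are hypothetical, picked only because they happen to reproduce the figures listed above.

```python
from itertools import permutations
from math import factorial

PEOPLE = ["Bob", "Allie", "Tom", "Helen", "Meredith"]
YEARS_WITH_DONOR = 40     # hypothetical: years lived per person if Meredith is in the coalition
YEARS_WITHOUT_DONOR = 6   # hypothetical: years lived per person otherwise


def coalition_value(members: frozenset) -> int:
    """Total years lived by a coalition (assumed characteristic function)."""
    per_person = YEARS_WITH_DONOR if "Meredith" in members else YEARS_WITHOUT_DONOR
    return per_person * len(members)


def shapley_values(players):
    """Average each player's marginal contribution over every join order."""
    totals = {p: 0 for p in players}
    for order in permutations(players):
        joined = frozenset()
        for player in order:
            with_player = joined | {player}
            totals[player] += coalition_value(with_player) - coalition_value(joined)
            joined = with_player
    n_orders = factorial(len(players))
    return {p: totals[p] / n_orders for p in players}


values = shapley_values(PEOPLE)
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```

The brute-force loop over all join orders is fine for 5 people (120 orderings) but grows factorially, so a larger group would need sampling or a closed-form shortcut.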

James Stephen Brown 30 Nov 2025 17:44 UTC
1 point
0
in reply to: Mr Frege’s comment on: The NPC → MC Spectrum
Right, nice catch. I’ll add that as a note, as I think it’s still fair to say that the term came from the world of video games, as that’s the route it took into common parlance, even if it’s not the origin.

James Stephen Brown 14 Nov 2025 18:55 UTC
3 points
0
on: Human Values ≠ Goodness
It is quite possible to hyperoptimize for that one particular yumminess, then burn out and later realize that one values other things too—as many a parent learns when the midlife crisis hits.

So true, this reminds me of Jung’s emphasis on “the shadow”—it’s important to acknowledge (and not discount) “values” you hold that are selfish or otherwise not ostensibly pro-social.
… your actual Values long term (which usually involves other people)
This is also important to note. We are often torn between selfish wants and the wants and needs of others. This can be framed as selfishness = bad, concern for others = good. But I think it’s better interpreted as you say, that “goodness” is usually aligned with our own long-term interests which are often also aligned with the interests of others. So your values need not be a zero-sum contest between your interests and the interests of others.

James Stephen Brown 14 Sep 2025 19:25 UTC
3 points
0
on: James Stephen Brown’s Shortform
I’ve been running into something I think of as “The Narrow Band Dilemma” where a moderate ethical position is fragile because it is in a battle on two fronts between a popular, purer ethical position and a popular unethical or amoral position.
The first example is ethically produced meat / free-range farming, where a slightly more expensive product tries to find a market that loses patrons on both sides, either people who don’t care about where their meat comes from, or care more about the expense than the ethics on the one hand, and the people who forgo meat altogether (vegans / vegetarians) on the other. As messaging successfully advertises the benefits of ethical meat production, it gains market share from one end of the spectrum (the “I care about ethics but can’t justify the additional expense”) but loses them on the other because the heightened awareness of animal cruelty drives ethical meat eaters away from eating meat altogether.
I noticed another example today in moderate Christianity, where it is flanked by fundamentalism and atheism. The more that Christians want to align their beliefs with modern secular ethics (moving away from fundamentalism), the more they are likely to leave the faith altogether.

Now, I’m not actually saying this is a problem, just a phenomenon I’ve noticed. I’m an atheist, so I don’t mind if more people come to think as I do, but I’d also like more Christians to be more moderate, and I can see it’s difficult to build critical mass when they exist in a narrow band. I am in the narrow band when it comes to ethical meat, and I have seen how long it has taken for ethical meat products to become ubiquitous and accessible, and truly ethical real meat (lab grown) is still a way off. I imagine, if a much larger market had emerged (without the pressures of the narrow band), it’s possible this could be available already.

James Stephen Brown 31 Jul 2025 1:00 UTC
1 point
0
in reply to: CronoDAS’s comment on: Replicators—Pandora’s dangerous children
You’re right, but the term teme covers much more than that, for instance it’s also relevant to the development of AI agents, and AI self-editing / self-improvement. Although, identifying these systems as virus-like (because of the replication mechanism) might be instructive (as a red-flag).

James Stephen Brown 20 Jul 2025 20:55 UTC
3 points
0
in reply to: AnthonyC’s comment on: Emergent Gravity—Order out of Chaos
I agree, it’s not coming across at all well at present, needs a rewrite, give me a couple of weeks :)

James Stephen Brown 20 Jul 2025 20:54 UTC
1 point
0
in reply to: brambleboy’s comment on: Emergent Gravity—Order out of Chaos
I take your point, I think it needs a rewrite, I have not been nearly clear enough, and your notes are helpful in pointing me to areas I need to clarify. I have replies to your points here, but I should get my ducks-in-a-row before making them, so I don’t end up contradicting myself. Thanks for your comment.

James Stephen Brown 19 Jul 2025 23:50 UTC
3 points
0
in reply to: Adele Lopez’s comment on: Emergent Gravity—Order out of Chaos
Thanks Adele,
I appreciate your comment, and will take some time to process it and read the links. This is definitely not an area I have any expertise in and I’m not meaning to propose that this is how gravity actually works in reality—it’s more an illustration that something gravity-like, and elements that are like atoms or systems etc can arise out of very simple and random rules without the need for fine-tuning, and that constants (or regularities) can be arrived at by means of natural equilibria rather than being lucked upon, or designed.
But I probably haven’t made this clear. It was something I actually wrote a while ago and have only recently published here, so it may require a re-write, clarifying my intention and incorporating the points you’ve raised. You’re the first to provide a rigorous rebuttal for it so far, so I appreciate you lending your expertise in this respect.

James Stephen Brown 17 Jul 2025 3:47 UTC
1 point
0
in reply to: quanticle’s comment on: Moloch’s Demise—solving the original problem
Ah, yes now you’ve jogged my memory about all the attempted expansionism in between. You make a solid case that they didn’t step outside of the expand-or-die dichotomy willingly.
The point I’m trying to make is that the third option was there (perhaps it wasn’t feasible before WWII, I’m not sure), but the third option (mutually beneficial trade and cooperation) ended up sustaining Japan from WWII to the present without the need for expansion.
The point of the post is that there is often a third option outside of expand-or-die, and it’s worth questioning what that could be in any given problem. But thanks for all the very good points—I absolutely agree with you that there have been civilisations that have had to, or have seemed to have to, expand in order to survive. Thanks for your well-considered points, and the spot-on history (apologies for the patchiness of mine).

James Stephen Brown 15 Jul 2025 18:14 UTC
1 point
0
in reply to: quanticle’s comment on: Moloch’s Demise—solving the original problem
I take your point, it requires everyone to behave themselves (I’m actually familiar with this history, I went down a Japanese history rabbit hole about this time last year, fascinating), but if we continue with Japan we find that due to a third option of trading-with-other-nations (beginning with the Meiji Restoration I think...) Japan continues to operate as a sovereign nation without the need to expand (with the exception of its ill-fated and frankly bonkers attempt to expand in WWII...).
So, again it’s good to look for answers outside the paradigm of expand or die. Cooperation and trade are non-zero-sum options that are available in the messy and therefore less theoretically-bound real world, as opposed to a formal game theory scenario.
But I think the expansionist trap you describe is a real thing and an important cautionary tale, which could perhaps be applied to our modern perpetual growth model of economics and its attendant consumerism (here I am sounding like a first year sociology major).

James Stephen Brown 14 Jul 2025 21:10 UTC
1 point
0
in reply to: quanticle’s comment on: Moloch’s Demise—solving the original problem
True, though there are many examples of conquerors who expanded for the sake of an expansionist philosophy or glory: Alexander the Great, The Mongols, The Assyrians, The Crusades… off the top of my head. The Germans in WWII definitely justified expansion for the sake of living space (Lebensraum), so there are examples of expansion at least being justified in the way you mention. And of course colonialism is justified in the same way.
I think what you’re saying is logical, but the example, being metaphorical, is more to illustrate that we should question critically what it is we actually want before conceding a price to pay for it. As you say, it might be necessary, but it also might not.

James Stephen Brown 14 Jul 2025 2:51 UTC
10 points
0
in reply to: Seth Herd’s comment on: Moloch’s Demise—solving the original problem
Humans profess to care about everyone a lot more than they really do, because doing that (and even thinking that) is strategically useful.
A bit bleak… but yes, your logic checks out, and hence why coordination problems are so sticky (I did sort of claim to solve the problem didn’t I? Oops, back to the drawing board).

James Stephen Brown 13 Jul 2025 7:16 UTC
3 points
0
in reply to: Richard_Kennaway’s comment on: Win-Win-Win Ethics—Reconciling Consequentialism, Virtue Ethics and Deontology
I love this silly side of Yudkowsky.

James Stephen Brown 11 Jun 2025 6:03 UTC
1 point
0
in reply to: lesswronguser123’s comment on: Emergence Spirals—what Yudkowsky gets wrong
Thanks for your comment, I appreciate your points, and see that Yudkowsky appreciates some use of higher-level abstractions as a pragmatic tool that is not erased by reductionism. But I still feel like you’re being a bit too charitable. I re-read the “it’s okay to use ‘emerge’” parts several times, and as I understand it, he’s not meaning to refer to a higher-level abstraction, he’s using it in the general sense “whatever byproduct comes from this”, in which case it would be just as meaningful to say “heat emerges from the body” which does not reflect any definition of emergence as a higher-level abstraction. I think the issue comes into focus with your final point:
The phrase “intelligence is emergent” as what intelligence is doesn’t predict anything and is a blank phrase, this is what he was opposed to.

But it is not correct to say that acknowledging intelligence as emergent doesn’t help us predict anything. If emergence can be described as a pattern that happens across different realms then it can help to predict things, through the use of analogy. If for instance we can see that neurones are selected and strengthened based on use, we can transfer some of our knowledge about natural selection in biological evolution to provide fruitful questions to ask, and research to do, on neural evolution. If we understand that an emergent system has reached equilibrium, it can help us to ask useful questions about what new systems might emerge on top of that system, questions we might not otherwise ask if we were not to recognise the shared pattern.

A question I often ask myself is “If the world itself is to become increasingly organised, at some point do we cease to be autonomous entities on a floating rock, and become instead like automatic cells within a new vector of autonomy (the planet as super-organism)?” This question only comes about if we acknowledge that the world itself is subject to the same sorts of emergent processes that humans and other animals are (although not exactly, a planet doesn’t have much of a social life, and that could be essential to autonomy). I find these predictions based on principles of emergence interesting and potentially consequential.

James Stephen Brown 11 Jun 2025 5:29 UTC
1 point
0
in reply to: Ape in the coat’s comment on: Emergence Spirals—what Yudkowsky gets wrong
Sorry about my lack of clarity: By “complex” I mean “intricately ordered” rather than the simple disorder generally expected of an entropic process. To taboo both this and alignment as “following the same pattern as”:
I’d like to make the case that emergent complexity is where…
- a whole system is more intricately ordered than the sum of its parts
- a system follows more closely the pattern of a macroscopic phenomenon than it follows the pattern of any of its component parts.
By a macroscopic phenomenon, I mean any (or all) of the following:

1. Another physical feature of the world which it fits to, like roads aligning with a map and its terrain (and obstacles).
2. Another instance of what appears to fulfil a similar purpose despite entirely different paths to get there or materials (like with convergence)
3. A conceptual feature of the world, like a purpose or function.

So, we can more readily understand an emergent phenomenon in relation to some other macroscopic phenomenon than we can were we to merely inspect the cells in isolation. In other words, there is usefulness in identifying the 20+ varieties of eyes as “eyes” (2) even though they are not the same at all, on a cellular level. It is also meaningful to understand that they perform a function or purpose (3), and that they fit the physical world (by reflecting it relatively accurately) (1).

James Stephen Brown 11 Jun 2025 4:58 UTC
3 points
0
in reply to: Richard_Kennaway’s comment on: Emergence Spirals—what Yudkowsky gets wrong
This is an error I see people making over and over… That different theory may be a useful new development! But that is what it is, not a defence of the original theory.
I think this is the crux of our disagreement. Yudkowsky was denying the usefulness of a term entirely because some people use it vaguely. I am trying to provide a less vague and more useful definition of the term—not to say Yudkowsky is unjustified in criticising the use of the term, but that he is unjustified in writing it off completely because of some superficial flaws in presentation, or some unrefined aspects of the concept.

An error that I see happening often is throwing out the baby with the bathwater, and I’ve read people on Less Wrong (even Yudkowsky I think, though I can’t remember where, sorry) write in support of ideas like “Error Correction” as a virtue and Bayesian updating whereby we take criticisms as an opportunity to refine a concept rather than writing it off completely.

I am trying to take part in that process, and I think Yudkowsky would have been better served had he done the same—suggested a better definition that is useful.

James Stephen Brown 10 Jun 2025 10:47 UTC
1 point
0
in reply to: Richard_Kennaway’s comment on: Emergence Spirals—what Yudkowsky gets wrong
Thanks for your comment, but I think it misses the mark somewhat.
While googling to find someone who expresses a straw-man position in the real-world is a form of straw-manning itself, this comment goes further to misrepresent a colloquial use of the word “magical” to mean literal (supernatural) “magic”.
While I haven’t read the book referenced, the quotes provided do not give enough context to claim that the author doesn’t mean what he obviously means (to me at least) that the development of an emergent phenomenon seems magical… does it not seem magical? Seeming magical is not a claim that something is not reducible to its component parts, it just means it’s not immediately reducible without some thorough investigation into the mechanisms at work. Part and parcel of the definition of emergence is that it is a non-magical (bottom-up) way of understanding phenomena that seem remarkable (magical), which is why he uses a clearly non-supernatural system like an anthill to illustrate it.

Despite all this, the purpose of the post was to give a clear definition of emergence that doesn’t fall into Yudkowsky’s strawman—not a claim that no one has ever used the word loosely in the past. As conceded in the preamble (paraphrasing) I don’t expect something written 18 years ago to perfectly reflect the conceptual landscape of today.

James Stephen Brown 10 Jun 2025 10:17 UTC
1 point
0
in reply to: sunwillrise’s comment on: Emergence Spirals—what Yudkowsky gets wrong
Thanks, and yes, I did scan over the comments when I first read the article, and noted many good points, but when I decided to write I wanted to focus on this particular angle and not get lost in an encyclopaedia of defences. I’m very much in the same camp as the first comment you quote.
I appreciate your take on Yudkowsky’s overreach, and the historical context. That helps me understand his position better.
The semantic stop-sign is interesting, I do appreciate Yudkowsky coming up with these handy handles for ideas that often crop up in discussion. Your two examples make me think of the fallacy of composition, in that emergence seems to be a key feature of reality that, at least in part, makes the fallacy of composition a fallacy.