If Moral Realism is true, then the Orthogonality Thesis is false.

Claim: if moral realism is true, then the Orthogonality Thesis is false, and superintelligent agents are very likely to be moral.

I’m arguing against Armstrong’s version of Orthogonality[1]:

The fact of being of high intelligence provides extremely little constraint on what final goals an agent could have (as long as these goals are of feasible complexity, and do not refer intrinsically to the agent’s intelligence).

Argument

1. Assume moral realism; there are true facts about morality.

2. Intelligence is causally[2] correlated with having true beliefs.[3]

3. Intelligence is causally correlated with having true moral beliefs.[4]

4. Moral beliefs constrain final goals; believing “X is morally wrong” is a very good reason and motivator for not doing X.[5]

5. Superintelligent agents will likely have final goals that cohere with their (likely true) moral beliefs.

6. Superintelligent agents are likely to be moral.
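
To make the inferential structure explicit, here is a rough schematic of steps 1–6. The predicate shorthand (Int, TrueBeliefs, TrueMoralBeliefs, GoalsCohere, Moral) and the informal “likely” qualifier are just illustrative glosses of the premises above, flattening the degree claims about intelligence into a single predicate:

$$
\begin{array}{ll}
\text{P1.} & \text{Moral realism: there are true moral facts.}\\
\text{P2.} & \mathrm{Int}(a) \Rightarrow \text{likely } \mathrm{TrueBeliefs}(a)\\
\text{C1.} & \mathrm{Int}(a) \Rightarrow \text{likely } \mathrm{TrueMoralBeliefs}(a) \quad \text{(from P1, P2)}\\
\text{P3.} & \mathrm{TrueMoralBeliefs}(a) \Rightarrow \text{likely } \mathrm{GoalsCohere}(a)\\
\text{C2.} & \mathrm{Int}(a) \Rightarrow \text{likely } \mathrm{GoalsCohere}(a) \quad \text{(from C1, P3)}\\
\text{C3.} & \mathrm{Int}(a) \Rightarrow \text{likely } \mathrm{Moral}(a) \quad \text{(from C2)}
\end{array}
$$

Here P1, P2, C1, P3, C2, and C3 correspond to steps 1–6 respectively; C1, C2, and C3 are inferred from the earlier lines rather than assumed.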

  1. ^

    Though this argument basically applies to most other versions, including Yudkowsky’s strong form: “There can exist arbitrarily intelligent agents pursuing any kind of goal [and] there’s no extra difficulty or complication in the existence of an intelligent agent that pursues a goal.”
    This argument, as it stands, does not work against Yudkowsky’s weak form: “Since the goal of making paperclips is tractable, somewhere in the design space is an agent that optimizes that goal.”

  2. ^

    Correction: there was a typo in the original post here. Instead of ‘causally’, it read ‘casually’.

  3. ^

    A different way of saying this: Intelligent agents tend towards a comprehensive understanding of reality; towards having true beliefs. As intelligence increases, agents will (in general, on average) be less wrong.

  4. ^

    A different way of saying this: Intelligent agents tend toward having true moral beliefs.
    To spell this out a bit more:

    A. Intelligent agents tend toward having true beliefs.
    B. Moral facts (under moral realism) are (an important!) part of reality.
    C. Therefore, intelligent agents tend toward having true moral beliefs.

  5. ^

    One could reject this proposition by taking a strong moral externalist stance. If moral claims are not intrinsically motivating and there is no general connection between moral beliefs and motivation, then this proposition does not follow. See here for discussion of moral internalism and orthogonality and here for discussion of moral motivation.

    As for the positive case for this proposition, there are at least two arguments:
    A. For any ideally rational agent, judging “X is morally wrong” entails a decisive, undefeated reason not to adopt X as a terminal end. [This supports a proposition even stronger than the one here, but gets us into the weeds of ‘ideally rational agents’.]

    B. For sufficiently rational agents, believing “X is morally wrong” generates a pro-tanto motivation not to do X. Other motivations could in principle outweigh this motivation. [This supports the proposition here. Notably, it does not guarantee that perfectly rational, intelligent agents will be moral.]