Against Messianic AI: Why Optimizing the Environment Doesn’t Optimize the Agent

Epistemic Status: Conceptual exploration. I am using formal notation below as a toy model to map structural boundaries, not as a computable algorithm.

Sometimes I worry that in our rush to develop helpful, safe AI models, we’re making a dangerous philosophical leap.

AI will make us better as humans.

It’s a subtext, at most, buried under the claims of radical scientific breakthroughs, efficiency boosts, and integration with other emerging technologies that will launch us into a new utopia.

But it’s there.

Great inventors have always hoped for utopia, and why shouldn’t they? Since well before the Industrial Revolution, when scalable technological diffusion across societies began to drive notable economic transformation, there has been a vision of bringing forth technology that will solve society’s problems.

I was talking with a friend yesterday about this, and we marveled at the technological achievements of just the past century in which the United States has been a leader. Space exploration. The da Vinci surgical robot. The internet. Vaccines.

All these are marvels, yes. But do they make us better humans?

I think some in the broader, corporate AI industry (like OpenAI’s mainstream public messaging) are implying this heavily, even if the alignment community here knows better.

Are they right?

Let’s break it down logically.

Given the following:

  • G(x) = “person x possesses internal moral virtue (independent of environmental constraints)”

  • E(x, t) = “person x has access to or uses thing t”

  • U(t) = “thing t is beneficial”

We first analyze Claim 1:

∀x ∀t: (U(t) ∧ E(x, t)) → G(x)

For all people and all things: if the thing is beneficial and you have access to it, you become good.

Stay with me, because I realize this is trivially false by counterexample. I’m more interested in the weaker claim that people might actually believe, or, at the very least, in the possibility that AI leaders slip into some sort of valuation-induced Aristotelian fervor.

Claim 2:

Access to a beneficial thing makes you more good than you were before. Reading G as graded rather than binary:

∀x ∀t: (U(t) ∧ E(x, t)) → ΔG(x) > 0

Tricky, right? And in the case of AI, we even have sycophantic technology telling us we are incredible people, the likes of which the world has never seen before. But that’s for another post.

So why is the above false?

If you track G(x), the moral character of person x, as a function over time, having access to a beneficial tool doesn’t constrain it in any one direction:

A person uses the internet to plan a scam, and G decreases.

A person uses the internet to organize mutual aid. G increases.

A person uses the internet to watch cat videos. G unchanged.

With the same tool, you can induce three different moral outcomes. So:

∃x ∃t: U(t) ∧ E(x, t) ∧ ΔG(x) ≤ 0

Knowing U(t) and E(x, t) tells you nothing about the sign of ΔG(x), and Claim 2 fails. As you can see, the counterexamples are easy to generate, but it’s a weak result.
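To make the structure concrete, here’s a minimal Python sketch of the toy model (the choice names and values are hypothetical illustrations, not a real measurement of moral character). The key feature is the function signature: the sign of ΔG depends only on the agent’s choice, and U(t) never appears as an input.

```python
# Toy model of the counterexamples above. All names and values are
# hypothetical stand-ins, not a way to measure moral character.

def delta_g(choice: str) -> int:
    """Change in G(x) as a function of the agent's choice alone.
    The tool's benefit U(t) is not a parameter: there is no
    input channel from U(t) to G(x)."""
    outcomes = {
        "plan_scam": -1,            # G decreases
        "organize_mutual_aid": +1,  # G increases
        "watch_cat_videos": 0,      # G unchanged
    }
    return outcomes[choice]

# One beneficial tool (U(t) holds in every case), three different signs:
for choice in ("plan_scam", "organize_mutual_aid", "watch_cat_videos"):
    print(f"{choice}: ΔG = {delta_g(choice):+d}")
```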

What I’m most interested in is whether this is accidental or structural. Have we just not found a tool able to improve humanity’s goodness yet, or is it impossible that a tool could do this?

My gut is that it’s structural.

G(x), however we view a person’s moral goodness, is a function of their dispositions, intentions, and choice patterns over time. It’s a holistic reflection of their personhood. U(t) is a single property of an object in the world, whether a light bulb or a microchip. Claiming that (U(t) ∧ E(x, t)) → ΔG(x) > 0 requires information about a thing to determine the trajectory of a complex property of an agent, but there’s no input channel from U(t) to G(x). The only possible connection runs through x’s choices about t, and those choices are themselves expressions of G(x).

How a person is affected by a tool is determined by who they already were and what they choose to do with it, NOT by the tool itself being good. Technologies don’t transmit goodness.


The Choice Architecture Argument

An obvious objection: what about tech that seems to make people worse? One might think of social media, outrage algorithms, and the like. These are widely held to degrade human behavior, so don’t tool-properties push G(x) down after all, and couldn’t they push it up just as well?

Well, not quite. These technologies modify the choice environment for humans: with social media and recommendation algorithms, bad choices become easier, more “tempting” even. But humans still possess the moral agency responsible for the outcome; even if I’m doomscrolling for hours, I could also delete the app. Ergo, the distribution of outcomes shifts, but it shifts differently depending on the person.

And thus G(x) is still doing the explanatory work here.

If we want to be super-precise, then:

ΔG(x) = f(G(x), C(t))

where technology t creates the choice environment C for x. The tool enters only through C, and what x does inside C is itself an expression of G(x).
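As a sketch of that last formula (made-up numbers; a one-dimensional “disposition” score is a hypothetical stand-in for G(x)): the tool contributes only a temptation level inside the choice environment C, and the resulting outcome distribution still depends on who the agent already is.

```python
import random

# Sketch of the choice-architecture point. The "disposition" score and
# the temptation weighting are hypothetical stand-ins, not real measures.

def choose(disposition: float, temptation: float) -> str:
    """One draw from the choice environment C. The tool sets only the
    temptation level; the agent's disposition does the rest."""
    p_bad = max(0.0, min(1.0, (1.0 - disposition) * temptation))
    return "bad_choice" if random.random() < p_bad else "good_choice"

# Same outrage-optimized feed (high temptation), three different people:
random.seed(0)
for disposition in (0.9, 0.5, 0.1):  # who the person already is
    picks = [choose(disposition, temptation=0.8) for _ in range(1000)]
    bad_rate = picks.count("bad_choice") / len(picks)
    print(f"disposition={disposition}: bad-choice rate ≈ {bad_rate:.2f}")
```

Same tool, same temptation level, three different outcome distributions: the shift in the environment is real, but its moral meaning still routes through the agent.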


An Environmental Patch is Not an Agent Upgrade

We as humans are still navigating the terrain of technology, no matter how “good” or “bad” it seems. To be clear, if a Messianic AI simply constrains the environment so tightly that humans can no longer cause massive harm, that is a monumental consequentialist victory. I am not arguing against that utility. I am arguing against the specific idea that this environmental constraint equates to an upgrade of the human agent.

I hope this doesn’t come across as another “technology is neutral” argument. My angle here is that we need to dissect the structural logic that makes us slip into these fallacies.

What I’ve shown here is that the properties of a technology cannot reach the structure of moral character as a matter of logic.

I think this matters for AI builders, because if U(AI) being true has no implications about ΔG(humanity), then any subtle promises that sufficiently beneficial AI will make us better are not only unproven, but could not be proven.

Sure, we might get richer, smarter, healthier. But morally improved?

That’s a human problem, not a machine one.
