Would we even want AI to solve all our problems?

Status: more of the basics that I find myself regularly repeating.

When people from the wider population talk to me about politics or global warming or various other large global issues (as they’re prone to do), I am prone to noting that most problems would be easier to fix if we had superintelligent friends at our backs.

A common objection I get goes something like:

So you think you can get AI to solve all the world’s problems? Wouldn’t that kinda suck, though? Wouldn’t a problem-free life be kinda terrible? Isn’t overcoming obstacles part of the fun of life?

To which my answer is: that’s completely right. If we aren’t careful, artificial intelligence could easily destroy all the value in the future, e.g. by aggressively solving everyone’s problems and giving everyone everything they ask for, in a way that sucks all the joy and meaning out of life.

Value is fragile. The goal is not to alleviate every ounce of discomfort; the goal is to make the future awesome. My guess is that this involves leaving people with real decisions that have real consequences, giving them the opportunity to screw up, and allowing the universe to continue to be a place of obstacles that people must overcome.

But it also probably involves putting an end to a bunch of the senseless death, and putting an end to bad things happening to good people for no reason, and putting an end to the way that reality can punish a brief lapse of judgment with disproportionate, horrific, and irreversible consequences.

And it probably involves filling the universe with much more fun and joy and laughter. (And not just by building a bunch of speakers that play laugh tracks on a loop, but by filling the universe with the sorts of things that cause joyful laughter among healthy people, along with healthy people to experience those things and laugh joyfully.)

The problem of making the future awesome is tricky and subtle. The way to get AIs to help in the long term is not to hand them our current best guess at what “good” consists in, as a long list of modern philosophers’ best guesses, and then hope that those philosophers got it exactly right. Because modern philosophers haven’t got it exactly right, and that whole approach is doomed to failure.

The hope is that we can somehow face our AI and point at all the things we care about, and say “we don’t know how to describe this in detail, and we don’t know what the limits are, but we care about it; can you look at us and look at all this lovely stuff and care about it too, and help us in the great endeavor to have a wonderful future?”. The win-state is ultimately to make friends, to make allies that help us fight the good fight.

The field of “what is the good stuff, even (given that it’s not something simple, like mere cessation of suffering or understanding of the universe)?” is called fun theory. It was invented by people who took the AI problem seriously, once it became clear that our species is going to face a question like this as we grow up.

The word I use for this concept is Fun (with a capital F), and I don’t pretend to know exactly what it consists of, and the long-term goal is to build superintelligences that also want the future to be Fun.

And make no mistake, this is a long-term goal. None of this is something humanity should be shooting for on its first try at building AIs. The issue of AI alignment today is less about making the AIs care about fun, and more about making them perform some minimal pivotal act that ends the acute risk period and buys us time to figure out how to make the superintelligent friends.

Because it’s important not to screw that task up, and we shouldn’t attempt it under time pressure before we know what we’re doing.

See also: don’t leave your fingerprints on the future, coherent extrapolated volition, and indirect normativity.