What would adults in the room know about AI risk?
A prompt I like for thinking about what AI macrostrategy needs to achieve is “what would the adults in the room know about our situation, prior to the transition to advanced AI?”
The hypothetical I’m imagining is something like:
We’re on the cusp of an epochal shift to an AI-powered future
But it’s ok! Sane, competent, sensible adults are navigating the transition
I don’t mean “relatively sane, competent, sensible people drawn from the literal population of human adults”, but something more poetic — the version of adults that we imagine as kids, who are actually sane, competent, and sensible by any objective standards, even on very hard questions in unprecedented situations
They are paying attention to the things they should be, answering the questions that need answering (and leaving those that don’t), and taking measured and appropriate actions to navigate safely through
What questions do they answer?[1]
Part of why I like this prompt is that it draws on quite rich connotations about the emotional state of idealised adults.[2] Things like: adults are calm, they’re not reactive, they’re not particularly concerned about how others view them or whether they get to be a hero, they’re sensible and level-headed. These connotations help me to route around my own fear/panic in the face of AI risk (and my psychological need to personally make a contribution), such that I generate slightly different answers to this prompt than to things like ‘what’s critical path’ or ‘what do we need to know before AI can automate our work’.
Some pictures that vaguely capture what I mean by ‘adult’: [images]
Below are my first pass responses to the prompt. I’d be interested in reactions, and especially in other people’s responses.
Ok, some things I think that adults in the room would know, in the order they occurred to me:
What are the likely selection effects, good and bad, on both AI systems and humans?
I imagine adults having a fairly mechanistic understanding of where the broad basins of attraction are, and using that to determine which risks to focus on—rather than more brittle threat modelling or predictions.
How are other humans likely to act during this transition?
I think adults would have a pretty grounded sense of what other humans are likely to do. I’m thinking of things like ‘the USG is un/likely to panic’, ‘China is un/likely to follow through on threats’, etc. Adults would do a bunch of talking to the relevant people and have good internal models of how they’re likely to behave, as well as knowing any fairly robust social science results that pertain here.
Which actions are hard to reverse?
My model of adults would be tracking that a lot of value could be lost through irreversible actions, which implies that it’s important to a) identify which actions are hard to reverse, and then b) for the most part, avoid taking those actions. I think adults would also be tracking that some irreversible actions might be necessary to prevent other harms, and would be willing to take them soberly and advisedly.
When to take leaps of faith
Sometimes, the right move is to jump to the equilibrium you think is right, and just have faith that others will follow suit. I think adults would be good at discerning when we need analytical, rational answers, and when we need faith. Leaps of faith that seem plausible to me are:
In some scenarios, a leap of faith that if you slow down, others will follow suit
A leap of faith that questions about acausal trade, simulations, and most of space can and should be handled later by more intelligent beings (with the exception of hard-to-reverse actions in those domains, which I think adults would be tracking)
Which questions can we safely defer to AI systems, and which do we need to answer via human reasoning?
In the status quo world, I’m worried about errors in both directions here. I think adults wouldn’t be governed by either excitement or fear on this, and would have a principled take on which sorts of questions normatively should be answered by humans, and which empirically can be answered well and safely by AIs.
This question originated from a retreat session run by Lizka Vaintrob, which I really liked, on which questions we actually need to know the answers to in the near future for the transition to advanced AI to go well. Thanks to Lizka and Owen Cotton-Barratt for that session.
This post is just my responses to one of several prompts; I hope that Lizka, Owen and I will write more about our collective takes on the overall question soon.
“What do they actually do” is another very important question that I should think about more, but haven’t so far. ↩︎
I know people who don’t really like treating the abstraction ‘adult’ in this idealised way, I think because they’re worried that the abstraction will be used to violently repress the child parts of them. I don’t intend the abstraction in that way, and am more trying to channel inspiration than judgement here. ↩︎
Adults will pre-mortem their plans by assuming most of them will fail. They will therefore have a dozen layers of backups and action plans prepared in advance. This is also so that other people can feel safe, because someone has already planned for this. (“Yeah, I knew this would happen” type of vibe.)
The question would then be “How will my first 10 layers of plans go wrong and how can I take this into account?”
A quick example might look like this:
Race dynamics will happen, therefore we need controls and coordination (90%)
Coordination will break down, therefore we need better systems (90%)
We will find these better systems, yet they will not be implemented by default (85%)
We will therefore have to develop new coordination systems, yet they can’t be too ambitious, since overly big systems won’t be implemented (80%)
We should therefore work on developing proven coordination systems where they empirically work, and then scale them
(Other plan maybe around tracking compute or other governance measures)
And I’ve now coincidentally arrived at the same place as Audrey Tang...
But we need more layers than this, because systems will fail in still more ways as well! The sketch below shows why.
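To make that concrete, here is a minimal back-of-the-envelope sketch in Python. The per-layer numbers are the ones from the list above; treating the layers as independent is my simplifying assumption, not something the plan itself claims.

```python
# Back-of-the-envelope only: compound the rough per-layer confidence
# estimates from the list above, assuming (simplistically) that the
# layers succeed or fail independently of each other.

layer_probs = [0.90, 0.90, 0.85, 0.80]  # per-layer "this step holds" estimates

p_all_hold = 1.0
for p in layer_probs:
    p_all_hold *= p  # probability that every layer so far holds

print(f"P(all four layers hold)     ~ {p_all_hold:.2f}")      # ~0.55
print(f"P(at least one layer fails) ~ {1 - p_all_hold:.2f}")  # ~0.45
```

Even four fairly confident layers leave roughly a coin-flip chance that something in the stack breaks, which is exactly why the adult move is to have the fifth, sixth, and further backup layers prepared in advance.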
When you’re creating something with the potential to replace the entire human race, you don’t do it at all until you are as sure as you can be that you are doing it right. That’s the adult attitude, in my opinion.
Unfortunately we are no longer in that world. The precursors of superintelligence have been commercialized and multiple factions are racing to make it more and more powerful, trusting that they can figure out what needs to be done along the way. The question now is, what is the adult thing to do, in a world that is creating superintelligence in this reckless fashion?
But I’m sure that thinking about what would have been the adult way to do it remains valuable.