We Choose To Align AI

Epistemic status: poetry

“We choose to go to the moon! We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard. Because that goal will serve to organize the best of our skills and energies. Because that challenge is one we are willing to accept, one we are unwilling to postpone, and one we intend to win.”—John F. Kennedy

WE CHOOSE TO ALIGN AI IN THIS DECADE AND DO THE OTHER THINGS

JFK gave his “We choose to go to the moon!” speech in 1962. And when he said “in this decade”, he did not mean that we’d go to the moon before 1972. He meant we’d go to the moon before 1970.

Happy 2022! When I say we choose to align AI in this decade, I don’t mean before 2032. I mean before 2030. Maybe sooner if things go well. Do I think that’s actually doable? Yes. Also fuck you.

… and some other things! As long as we’re shooting for the metaphorical moon, might as well throw aging in the mix too. That seems doable by 2030.

NOT BECAUSE THEY ARE EASY, BUT BECAUSE THEY ARE HARD

Effective altruists talk a lot about “importance, neglectedness, and tractability”. The more important, neglected, and tractable a problem is, the more we should expect a high impact per unit of effort invested in it. The alignment problem scores through the roof on importance, and is still relatively neglected, but tractability is… um… not.

I’m not really an EA, at heart. When there’s low hanging fruit, I might pick it quickly and move on, or these days I might point it out to someone else and move on. Point is, the low hanging fruit is not what I’m here for. I’m here for the challenge. I study alignment and agency and the other things not because they are easy, but because they are hard.

The more EAs I meet, the more I realize that wanting the challenge is a load-bearing pillar of sanity when working on alignment.

When people first seriously think about alignment, a majority freak out. Existential threats are terrifying. And when people first seriously look at their own capabilities, or the capabilities of the world, to deal with the problem, a majority despair. This is not one of those things where someone says “terrible things will happen, but we have a solution ready to go, all we need is your help!”. Terrible things will happen, we don’t have a solution ready to go, and even figuring out how to help is a nontrivial problem. When people really come to grips with that, tears are a common response.

… but for someone who wants the challenge, the emotional response is different. The problem is terrifying? Our current capabilities seem woefully inadequate? Good; this problem is worthy. The part of me which looks at a rickety ladder 30 feet down into a dark tunnel and says “let’s go!” wants this. The part of me which looks at a cliff face with no clear path up and cracks its knuckles wants this. The part of me which looks at a problem with no clear solution and smiles wants this. The response isn’t tears, it’s “let’s fucking do this”.

BECAUSE THAT GOAL WILL SERVE TO ORGANIZE THE BEST OF OUR SKILLS AND ENERGIES

Why align an AI, rather than prove the Riemann hypothesis? Or calculate bits of Chaitin’s constant—we know that’s hard.

When faced with a hard problem, there’s this tendency to substitute easier problems, solve those instead, and call it progress. The Riemann hypothesis is too hard, so we pick some other function which looks kinda similar, and prove things about it instead. And sometimes that is progress! But other times, people just end up Goodharting on the new problem instead.

Alignment is a problem which needs to be solved. One day, reality will test us, and if we fail then it’s game over. Substitute an easier problem instead, and reality will ignore our easier solution and wipe us all out anyway.

That’s a core part of the appeal: we don’t have the option of just walking away, we don’t have the option of solving some easier problem instead.

(We still look for shortcuts and loopholes, of course. Those who despair look for shortcuts and loopholes because they want some hope to cling to. Those who seek challenge look for shortcuts and loopholes because if the problem does turn out to be easy, we want to solve it and move on.)

The alignment problem will serve to organize the best of our skills and energies because we can’t just substitute some other problem. It is a Schelling point in problem space, a problem around which I can organize my efforts and expect others to do the same, without everyone spontaneously sliding off to some other problem.

BECAUSE THAT CHALLENGE IS ONE WE ARE WILLING TO ACCEPT

Damn straight.

ONE WE ARE UNWILLING TO POSTPONE

Did I mention we’re on a timer, and we’re not sure when it will run out?

AND ONE WE INTEND TO WIN