Convincing All Capability Researchers

This post is heavily based on this excellent comment (the author’s name is not relevant).

Give the world’s thousand most respected AI researchers $1M each to spend 3 months working on AI alignment, with an extra $100M if by the end they can propose a solution alignment researchers can’t shoot down. I promise you that other than like 20 industry researchers who are paid silly amounts, every one of them would take the million. They probably won’t make any progress, but from then on when others ask them whether they think alignment is a real unsolved problem, they will be way more likely to say yes. That only costs you a billion dollars! I literally think I could get someone reading this the money to do this (at least at an initially moderate scale) - all it needs is a competent person to step up.

Usually people object to paying people large sums of money to work on alignment because they don’t expect them to produce any good work (mostly because it’s very hard to specify alignment, see below). This is a feature, not a bug.

Being able to say “the 1000 smartest people working in AI couldn’t make headway on alignment in 3 months, even when each was paid $1 million and a solution would have been awarded $100 million” is very good for persuading existing researchers that this is a very hard problem.

Would this really go a long way toward convincing those at major AI labs that alignment is hard? We could actually ask people working at these labs whether a lack of progress in those 3 months would change their minds.

Another problem is looooong timelines. They can agree it’s hard, but their timelines may be too long for it to seem worth working on now. I can think of a couple of counter-arguments that may work (but it would be better to actually test them with real people):

  • 1. To predict AGI (or just AI capable of destroying the world), you could argue “it needs these 10 capabilities, and I predict it will gain each capability around these different times”. If someone actually did that and their track record held up historically, even to within 5-10 years, I’d be impressed and trust their opinion. Have you successfully predicted the AI capabilities of the past 10 years? (A rough sketch of what such a decomposed forecast could look like follows this list.)

    • possibly mention the most recent advances (Transformers, some RL, neural nets in the 2010s)

  • 2. If there’s a 10% chance that it happens during your or your kids’ lifetimes, then that’s extremely important. For example, if I thought there was a 10% chance that I would die taking the train home tonight, I wouldn’t take the train. Even at 0.1%, I wouldn’t. (The arithmetic behind this is sketched after this list.)
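
As a concrete illustration of point 1, here is a minimal sketch of how a decomposed prediction could be turned into an aggregate timeline. The capability names, date windows, and the assumption that AGI needs every capability on the list are made up for illustration, not taken from anyone’s actual forecast:

```python
# Illustrative sketch only: the capability names, date windows, and the
# assumption that AGI requires every capability on the list are made up
# for illustration, not taken from any actual forecast.
import random

# Hypothetical required capabilities with (earliest, latest) plausible arrival years.
capability_windows = {
    "long-horizon planning": (2030, 2045),
    "robust real-world tool use": (2026, 2035),
    "autonomous research ability": (2032, 2055),
}

def sample_agi_year() -> int:
    """AGI arrives once the *last* required capability has arrived."""
    return max(random.randint(lo, hi) for lo, hi in capability_windows.values())

samples = sorted(sample_agi_year() for _ in range(100_000))
print("10th percentile:", samples[len(samples) // 10])
print("median forecast:", samples[len(samples) // 2])
```

The value of writing a forecast down in this form is that each per-capability prediction can be checked against the historical record later, which is exactly what gives the forecaster credibility.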

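To make the arithmetic in point 2 explicit, here is a minimal sketch of the small-probability / catastrophic-loss comparison; the loss values are illustrative assumptions, not estimates from this post:

```python
# Minimal sketch of the risk-threshold argument: when the downside is
# catastrophic, even a 0.1% probability dominates the decision. The loss
# values below are illustrative assumptions, not estimates from this post.

def expected_loss(p_catastrophe: float, catastrophic_loss: float, certain_cost: float) -> float:
    """Expected loss of an option: chance of catastrophe plus any fixed cost."""
    return p_catastrophe * catastrophic_loss + certain_cost

CATASTROPHE = 1.0   # value the rest of your life at 1.0
WALK_COST = 0.0001  # small, certain inconvenience of not taking the train

print(expected_loss(0.10, CATASTROPHE, 0.0))       # 0.1    -> clearly worse than walking
print(expected_loss(0.001, CATASTROPHE, 0.0))      # 0.001  -> still 10x worse than walking
print(expected_loss(0.0, CATASTROPHE, WALK_COST))  # 0.0001 -> the alternative
```

The same structure applies to the lifetime-risk framing: multiplying even a small probability by an enormous loss can swamp the ordinary costs of acting on it.
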
Though, generally, having a set of people working on apologetics aimed at both US and Chinese AI researchers is potentially extremely high-impact. If you’re interested and might be able to donate, DM me for a call.

Next Steps to Seriously Consider This Proposal

Specifying “Solving Alignment”

Part of the problem of alignment is that we don’t know the correct framework for specifying it. The quoted text suggests the criterion for a solution as “a solution alignment researchers can’t shoot down”, which side-steps this issue; however, specifying the problem in as fine-grained detail as possible would be extremely useful for communicating it to these researchers.

One failure mode would be them taking the money, not getting any work done, and then arguing that the problem wasn’t specified well enough to make meaningful progress, which would limit how persuasive this stunt could be. Documents like ELK are more useful specifications that capture the problem to varying degrees, and I wish we had more problem statements like that.

Listing 1000 AI Researcher Intellectuals

The initial idea is to hire the 1000 best AI researchers to work on the problem, not because we expect them to solve it, but because all of them failing would itself be strong evidence that alignment is a very hard problem.

To select them, there are a few different proxies we can use, such as citation counts and senior researchers at the top AI labs. So far I’ve got:

Convincing the CCP and CCP-backed researchers is a blank spot on my map; if anyone knows anything, please comment or message me for a video call.

Actually Hiring People / Creating a Company to Do This

We would need competent people to work on specifying the problem, doing outreach and selecting who to pay, keeping up with the researchers (if you’re paying someone $1 million, you can also have 1-on-1 calls with them, which would be useful for making the most of the arrangement), and reviewing the actual work produced (which could be done by the community / independent researchers / orgs).

Timelines Argument

It was argued that these plans are only relevant on 15+ year timelines, but huge social changes and shifts in cultural norms have happened within one-year periods. I’m not giving examples, but they went from, say, 50 to 100 to 70 within a few months, which may be significantly different from alignment.

This Plan May Backfire and Increase Capabilities

One way this could backfire is by intensifying race dynamics, such that more people want to create AGI first, or at least more than are already trying to right now.

I think this is a real possibility, and it should be taken into account by whoever reaches out to these top researchers to offer them the large sums of money.