Most pivotal acts I can easily think of that can be accomplished without magic ASI help amount to “massively hurt human civilization so that it won’t be able to build large data centers for a long time to come.” I don’t know if that’s a failure of imagination, though. (An alternative might be some kind of way to demonstrate that AI existential risk is real in a way that’s as convincing as the use of nuclear weapons at the end of World War II was for making people consider nuclear war an existential risk, so the world gets at least as paranoid about AI as it is about things like genetic engineering of human germlines. I don’t actually know how to do that, though.)
Perhaps a more useful prompt for you: suppose something indeed convinces the bulk of the population that AI existential risk is real in a way that’s as convincing as the use of nuclear weapons at the end of World War II. Presumably the government steps in with measures sufficient to constitute a pivotal act. What are those measures? What happens, physically, when some rogue actor tries to build an AGI? What happens, physically, when some rogue actor tries to build an AGI 20 or 40 years in the future, when algorithmic efficiency and Moore’s law have lowered the requisite resources dramatically? How do those physical things happen? Who’s involved, what specifically does each of the people involved do, and what ensures that they continue to actually do their job across several decades? What physical infrastructure do they need, where does that infrastructure come from, how much would it cost, what maintenance would it need? What’s the annual budget and headcount for this project?
And then, once you’ve thought through that, ask: what’s the minimum intervention required to make those same things physically happen when a rogue actor tries to build an AGI?
To be clear, I think we at Redwood (and people at spiritually similar places like the AI Futures Project) do think about this kind of question (though I’d quibble about the importance of some of the specific questions you mention here).
Some sort of “coordination takeoff” seems not-impossible to me: set up some sort of platform that’s simultaneously massively profitable/addictive/viral and optimizes for, e.g., approximating the ground truth.
Prediction markets were supposed to be that, and some sufficiently clever wrapper on them might yet get there.
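To gesture at the mechanism concretely: the “optimizes for approximating the ground truth” part of a prediction market can be as simple as an automated market maker whose prices are probabilities. A minimal sketch of Hanson’s logarithmic market scoring rule (the class name and liquidity parameter below are my own illustrative choices):

```python
import math

class LMSRMarket:
    """Toy logarithmic market scoring rule (LMSR) market maker for a binary question.

    Cost function: C(q) = b * ln(exp(q_yes/b) + exp(q_no/b)).
    The instantaneous price of each outcome is a softmax of outstanding shares,
    and can be read as the market's aggregate probability estimate.
    """

    def __init__(self, b: float = 100.0):
        self.b = b                            # liquidity: higher = prices move more slowly
        self.q = {"yes": 0.0, "no": 0.0}      # outstanding shares per outcome

    def _cost(self, q: dict) -> float:
        return self.b * math.log(sum(math.exp(v / self.b) for v in q.values()))

    def price(self, outcome: str) -> float:
        z = sum(math.exp(v / self.b) for v in self.q.values())
        return math.exp(self.q[outcome] / self.b) / z

    def buy(self, outcome: str, shares: float) -> float:
        """Buy `shares` of `outcome`; returns the cost charged to the trader."""
        new_q = dict(self.q)
        new_q[outcome] += shares
        cost = self._cost(new_q) - self._cost(self.q)
        self.q = new_q
        return cost


market = LMSRMarket(b=100.0)
print(market.price("yes"))            # 0.5 before any trades
market.buy("yes", 50.0)               # a trader who thinks "yes" is underpriced
print(round(market.price("yes"), 3))  # aggregate probability moves toward their belief
```

Traders profit exactly to the extent they move prices toward what actually happens; the “clever wrapper” problem is keeping that property while making the thing liquid and addictive enough to go viral.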
Twitter’s community notes are another case study, where good, sufficiently cynical incentive design leads to unsupervised selection of truth-ish statements.
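For anyone who hasn’t looked under the hood: the core of Community Notes’ published ranking algorithm is a bridging-based matrix factorization, where a note only counts as “helpful” to the extent its appeal isn’t explained by which camp its raters belong to. A toy sketch in that spirit (the dimensions, regularization weights, and synthetic data are my illustrative assumptions, not the production values):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit(ratings, n_users, n_notes, dim=1, lam_factor=0.03, lam_intercept=0.15,
        lr=0.05, epochs=500):
    """Model each rating as mu + user_intercept + note_intercept + user_factor . note_factor.

    The factor term soaks up "raters from this camp like this note"; only the note
    intercept (appeal left over after accounting for camp) counts as helpfulness,
    and intercepts are regularized harder than factors so they're the last resort.
    ratings: list of (user_id, note_id, value), value in [0, 1].
    """
    mu = 0.0
    iu, inote = np.zeros(n_users), np.zeros(n_notes)
    fu = rng.normal(0, 0.1, (n_users, dim))
    fn = rng.normal(0, 0.1, (n_notes, dim))
    for _ in range(epochs):
        for u, n, r in ratings:
            err = r - (mu + iu[u] + inote[n] + fu[u] @ fn[n])
            mu += lr * err
            iu[u] += lr * (err - lam_intercept * iu[u])
            inote[n] += lr * (err - lam_intercept * inote[n])
            fu_old = fu[u].copy()
            fu[u] += lr * (err * fn[n] - lam_factor * fu[u])
            fn[n] += lr * (err * fu_old - lam_factor * fn[n])
    return inote

# Two polarized camps: users 0-2 (camp A), users 3-5 (camp B).
# Note 0 is partisan for A, note 1 is partisan for B, note 2 appeals to both camps.
ratings = []
for u in range(3):
    ratings += [(u, 0, 1.0), (u, 1, 0.0), (u, 2, 1.0)]
for u in range(3, 6):
    ratings += [(u, 0, 0.0), (u, 1, 1.0), (u, 2, 1.0)]

intercepts = fit(ratings, n_users=6, n_notes=3)
print(intercepts)  # the bridging note (note 2) should end up with the highest intercept
```

The cynical part is that nobody has to be honest: the raters’ partisan structure is itself the instrument that isolates the cross-camp, truth-ish signal.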
This post has been sitting in my head for years. If scaled up, it might produce a sort of white-box “superpersuasion engine” that could then be tuned for raising the sanity waterline.
Intuitively, I think it’s possible there’s some sort of idea from this reference class that would take off explosively if properly implemented, and then fix our civilization. But I haven’t gone beyond idle thinking about it.