Thanks for this post. Some thoughts:

I really appreciate the basic vibe of this post. In particular, I think it’s great to have a distinction between wizard power and king power, and to note that king power is often fake, and that lots of people are very tempted (including by insidious social pressure) to focus on gunning for king power without being sufficiently thoughtful about whether they’re actually achieving what they wanted. And I think that for a lot of people, it’s an underrated strategy to focus hard on wizard power (especially when you’re young). E.g. I spent a lot of my twenties learning computer science and science, and I think this was quite helpful for me.
A big theme of Redwood Research’s work is the question “If you are in charge of deploying a powerful AI and you have limited resources (e.g. cash, manpower, acceptable service degradation) to mitigate misalignment risks, how should you spend your resources?”. (E.g. see here.) This is in contrast to e.g. thinking about what safety measures are most in the Overton window, or which ones are easiest to explain. I think it’s healthy to spend a lot of your time thinking about techniques that are objectively better, because such thinking is less tied up in social realities. That attitude reminds me of your post.
I share your desire to know about all those things you talk about. One of my friends has huge amounts of “wizard power”, and I find this extremely charming/impressive/attractive. I would personally enjoy the LessWrong community more if the people here knew more of this stuff.
I’m very skeptical that focusing on wizard power is universally the right strategy; I’m even more skeptical that learning the random stuff you list in this post is typically a good strategy for people. For example, I think that it would be clearly bad for my effect on existential safety for me to redirect a bunch of my time towards learning about the things you described (making vaccines, using CAD software, etc), because those topics aren’t as relevant to the main strategies that I’m interested in for mitigating existential risk.
You write “And if one wants a cure for aging, or weekend trips to the moon, or tiny genetically-engineered dragons… then the bottleneck is wizard power, not king power.” I think this is true in a collective sense—these problems require technological advancement—but it is absurd to say that the best way to improve the probability of getting to those things is to try to personally learn all of the scientific fields relevant to making those advancements happen. At the very least, surely there should be specialization! And beyond that, I think the biggest threat to eventual weekend trips to the moon is probably AI risk; on my beliefs, we should dedicate way more effort to mitigating AI risk than to tiny-dragon-R&D. Some people should try to have very general knowledge of these things, but IMO the main use case for having such broad knowledge is helping with the prioritization between them, not contributing to any particular one of them!
Glad you liked it!

I’m very skeptical that focusing on wizard power is universally the right strategy. For example, I think that it would be clearly bad for my effect on existential safety for me to redirect a bunch of my time towards learning about the things you described (making vaccines, using CAD software, etc)...
Fair as stated, but I do think you’d have more (positive) effect on existential safety if you focused more narrowly on wizard-power-esque approaches to the safety problem. In particular, outsourcing the bulk of alignment work (or a pivotal act, or...) to AI is a prototypical king-power strategy; it’s just using (king power over AI) in place of (king power over humans). And that strategy has the usual king-power problems—in particular, there’s a very high risk that one’s supposed king-power over the AI ends up being fake. Plus it has new king-power problems from AIs not thinking like humans—e.g. AI probably won’t be governed by dominance instincts to nearly the same degree as humans, so humans’ instincts about e.g. how employer-employee relationships work in practice will not carry over at all.
More wizard-power-esque directions include ambitious interp and agent foundations, but also less obvious things like “make a pivotal act happen without using an AI” (which is a very valuable thing to think through at least as an exercise), or “be a bureaucracy wizard and make some actually-effective regulations happen”, or whole brain emulation, or genetically engineering smarter humans.
You write “And if one wants a cure for aging, or weekend trips to the moon, or tiny genetically-engineered dragons… then the bottleneck is wizard power, not king power.” I think this is true in a collective sense—these problems require technological advancement—but it is absurd to say that the best way to improve the probability of getting to those things is to try to personally learn all of the scientific fields relevant to making those advancements happen.
It’s not central to this post, but… I’ve read up on aging research a fair bit, and I do actually think that the best way to improve the probability of a cure for aging at this point is to personally learn all of the scientific fields relevant to making it happen. I would say the same (though somewhat lower confidence) about weekend trips to the moon and tiny genetically-engineered dragons.
I think that if you wanted to contribute maximally to a cure for aging (and let’s ignore the possibility that AI changes the situation), it would probably make sense for you to have a lot of general knowledge. But that’s substantially because you’re personally good at and very motivated by being generally knowledgeable, and you’d end up in a weird niche where little of your contribution comes from actually pushing any of the technical frontiers. Most of the credit for solving aging will probably go to people who narrowly specialized in a particular domain; much of the rest will go to people who applied their general knowledge to improving the overall strategy or allocation of effort among the people working on curing aging (while leaving most of the technical contributions to specialists). This latter strategy crucially relies on management and coordination rather than being fully in the weeds everywhere.
Most pivotal acts I can easily think of that can be accomplished without magic ASI help amount to “massively hurt human civilization so that it won’t be able to build large data centers for a long time to come.” I don’t know if that’s a failure of imagination, though. (An alternative might be some kind of way to demonstrate that AI existential risk is real in a way that’s as convincing as the use of nuclear weapons at the end of World War II was for making people consider nuclear war an existential risk, so the world gets at least as paranoid about AI as it is about things like genetic engineering of human germlines. I don’t actually know how to do that, though.)
Perhaps a more useful prompt for you: suppose something indeed convinces the bulk of the population that AI existential risk is real in a way that’s as convincing as the use of nuclear weapons at the end of World War II. Presumably the government steps in with measures sufficient to constitute a pivotal act. What are those measures? What happens, physically, when some rogue actor tries to build an AGI? What happens, physically, when some rogue actor tries to build an AGI 20 or 40 years in the future when algorithmic efficiency and Moore’s law have lowered the requisite resources dramatically? How do those physical things happen? Who’s involved, what specifically does each of the people involved do, and what ensures that they continue to actually do their job across several decades? What physical infrastructure do they need, where does that infrastructure come from, how much would it cost, what maintenance would it need? What’s the annual budget and headcount for this project?
And then, once you’ve thought through that, ask: what’s the minimum intervention required to make those same things physically happen when a rogue actor tries to build an AGI?
To be clear, I think we at Redwood (and people at spiritually similar places like the AI Futures Project) do think about this kind of question (though I’d quibble about the importance of some of the specific questions you mention here).
Some sort of “coordination takeoff” seems not-impossible to me: set up a platform that’s simultaneously massively profitable/addictive/viral and optimizes for, e.g., approximating the ground truth.
Prediction markets were supposed to be that, and some sufficiently clever wrapper on them might yet get there.
Twitter’s Community Notes is another case study, where good, sufficiently cynical incentive design leads to unsupervised selection of truth-ish statements.
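Since that case study rests on a fairly concrete mechanism, here is a minimal sketch of the bridging-style matrix factorization Community Notes has publicly documented: each rating is modeled as a global offset plus a user intercept, a note intercept, and a user-factor times note-factor term, and a note only keeps a high intercept when raters from across the learned viewpoint axis agree it is helpful. The data, hyperparameters, and setup below are made up for illustration; they are not the production model or its thresholds.

```python
# Toy sketch of a bridging-based rating model (in the spirit of the one
# Community Notes has open-sourced): rating_un ~ mu + b_u + b_n + f_u * f_n.
# All data and hyperparameters here are illustrative, not the real system's.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_notes = 40, 6

# Made-up world: users sit on one viewpoint axis; note 0 is "bridging"
# (both sides find it helpful), notes 1-2 are partisan, notes 3-5 are noise.
user_view = np.where(np.arange(n_users) < n_users // 2, -1.0, 1.0)
ratings = []  # (user, note, rating in {0, 1})
for u in range(n_users):
    for n in range(n_notes):
        if n == 0:
            p = 0.9
        elif n in (1, 2):
            p = 0.9 if user_view[u] < 0 else 0.1
        else:
            p = 0.5
        ratings.append((u, n, float(rng.random() < p)))

mu = 0.0
b_u, b_n = np.zeros(n_users), np.zeros(n_notes)
f_u, f_n = rng.normal(0, 0.1, n_users), rng.normal(0, 0.1, n_notes)
lr, lam_f, lam_b = 0.05, 0.03, 0.15  # intercepts penalized harder than factors

for _ in range(1000):  # plain per-rating SGD is enough for a toy example
    for u, n, r in ratings:
        err = r - (mu + b_u[u] + b_n[n] + f_u[u] * f_n[n])
        mu += lr * err
        b_u[u] += lr * (err - lam_b * b_u[u])
        b_n[n] += lr * (err - lam_b * b_n[n])
        f_u[u], f_n[n] = (f_u[u] + lr * (err * f_n[n] - lam_f * f_u[u]),
                          f_n[n] + lr * (err * f_u[u] - lam_f * f_n[n]))

# The bridging note (index 0) should end up with a clearly higher intercept
# than the partisan notes, whose apparent helpfulness gets soaked up by the
# factor term; ranking notes by intercept is the "truth-ish selection" step.
print("note intercepts:", np.round(b_n, 2))
print("note factors:   ", np.round(f_n, 2))
```

The design choice doing the work here is that intercepts are regularized harder than factors, so the model prefers to explain one-sided agreement as polarization rather than quality; that is roughly the "sufficiently cynical" part of the incentive design.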
This post has been sitting in my head for years. If scaled up, it might produce a sort of white-box “superpersuasion engine” that could then be tuned for raising the sanity waterline.
Intuitively, I think it’s possible there’s some sort of idea from this reference class that would take off explosively if properly implemented, and then fix our civilization. But I haven’t gone beyond idle thinking regarding it.