Your steps sound pretty reasonable to me. A key missing step is that there’s basically zero chance that good people will win a power struggle over ASI. Rather, power-hungry people will win the power struggle. In other words, if we end up in a situation with extreme power imbalances where the future will be decided by the winners of a short-term struggle, there’s basically no chance of a good outcome. (The outcome might be better than extinction, but still not good.*) So it seems critically important to ensure that things don’t go that way, and I have no idea how to ensure that other than by not building ASI.
I think that’s a real sense in which all these post-alignment problems are still problems. I do acknowledge that “be a good person and then acquire absolute power” is an answer to all post-alignment problems simultaneously, which is something I missed in my original post. But it doesn’t seem like a viable solution to me. It might even be true that seeking absolute power is fundamentally incompatible with being a good person, although I’m not sure about that.
*It could also be worse than extinction if vindictive power-hungry people decide to torture their enemies for eternity, or similar.
Yeah. To be clear, I didn’t intend for my comment to make it sound like I think things are easy if we’ve solved alignment. It might be difficult enough that pausing AI is required to solve it (a position I’m sympathetic to anyway).
I just meant to communicate that if we solve alignment, the remaining problem is more like a very high-stakes version of getting the person you want elected president. It’s a very difficult task, but not one where the difficulty lies in conceptual confusion or in theoretical questions we don’t have answers to. Discussions about these post-ASI topics, though, usually treat it as if it were.