I feel a bit sad that the alignment community is so focused on intelligence enhancement. The chance of getting enough time for that route seems so low that pinning our hopes on it amounts to accepting a low chance of survival.
What has convinced you that the technical problems are unsolvable? I’ve been trying to track the arguments on both sides rather closely, and the discussion just seems unfinished. My shortform on cruxes of disagreement on alignment difficulty is still mostly my current summary of the state of disagreements.
It seems like we have very little idea how technically difficult alignment will be. The Simplicia/Doomimir debates sum up the logic very nicely, but the distribution of expert opinion seems more telling: people who think about alignment don’t know to what extent techniques for aligning LLMs will generalize to transformative AI, AGI, or ASI.
There’s a lot of pessimism about the people and organizations that will likely be in charge of building and aligning our first AGIs. I share this pessimism. But it seems quite plausible to me that those people and orgs will take the whole thing slightly more seriously by the time we get there, and actual technical alignment will turn out to be easy enough that even highly flawed humans and orgs can accomplish it.
That seems like a much better out to play for, or at least investigate, than unstated plans or the good fortune of roadblocks pausing AI progress long enough for intelligence enhancement to get a chance.