In general, I’m happy to see people work to be more explicit about their viewpoints.
It seems like the team behind this disagrees with much of the rest of the AI safety field, in that you think other safety approaches and strategies are unlikely to succeed. Most of that argument seems to be in the AI Safety section. Arguably that section provides a basic summary for readers not very familiar with the area, but for those who disagree with you on these points, I suspect it isn't close to long or detailed enough to be convincing.
I find this section very hand-wavy and metaphorical. I currently believe that AI oversight mechanisms, control, and careful scaling have a decent chance of maintaining reasonable alignment, if handled reasonably intelligently.
For example, this piece says,
> The appropriate analogy is not one researcher reviewing another, but rather a group of preschoolers reviewing the work of a million Einsteins. It might be easier and faster than doing the research itself, but it will still take years and years of effort and verification to check any single breakthrough.
I think it's likely we won't get a discrete step like that. It would be more like smart scientists reviewing the work of somewhat smarter scientists, but in a situation where the latter team is forced to reveal all of its in-progress thinking, is evaluated across a very large range of extreme situations, and is subject to other clever strategies for incentive setting and oversight.
There also seems to be an implicit assumption that scaling will happen incredibly quickly once rough benchmarks of AGI are achieved, amounting to a major discontinuous jump. I think this is possible, but I'm very unsure.
Separately, I'd flag that I'm not a huge fan of bold names like “The Compendium.” I get that the team might think the document justifies the grandiosity, but from my perspective, I'm more skeptical. (I feel similarly about other names like “Situational Awareness.”)