It does not inspire confidence in the AI safety field that the only problem (among many that would appear at the superintelligent level) that has been deliberately materialized ahead of time is alignment faking, which Yudkowsky explicitly mentioned in AGI Ruin.
(Am I missing other problems that would appear at the superintelligent level that have already been demonstrated? Do we know people who can predict those problems on their own (without requiring Eliezer Yudkowsky to point them out) and then materialize them now?)
Are you asking about problems that would by default only appear at the SI level that have been demonstrated at a sub-SI level via some sort of elicitation?