List of known discrepancies:
DeepMind categorizes Cohen’s scenario as specification gaming (instead of crystallized proxies).
DeepMind seems to consider Carlsmith’s scenario to be about outer misalignment?
Value lock-in, persuasive AI and Clippy are on my TODO list and will be added shortly. Please let me know if there is anything else you’d like to see in my cheat sheet!