Recently, AGISF revised its syllabus, demoting Risks from Learned Optimization to a recommended reading and replacing it with Goal Misgeneralization. I think this change is a mistake, but I don’t know why they made it, and Chesterton’s Fence suggests I should understand their reasoning before objecting.
Does anybody have any ideas for why they did this?
Are Goal Misgeneralization and inner misalignment describing the same phenomenon?
What’s the best existing critique of Risks from Learned Optimization? (besides it probably being pedagogically worse than Rob Miles’s treatment, IMO)