Daniel Kokotajlo comments on Evaluating the historical value misspecification argument

Daniel Kokotajlo 6 Oct 2023 16:50 UTC
LW: 15 AF: 7
I think this discussion would benefit from having a concrete proposed AGI design on the table. E.g. it sounds like Matthew Barnett has in mind something like AutoGPT5 with the prompt “always be ethical, maximize the good” or something like that. And it sounds like he is saying that while this proposal has problems and probably wouldn’t work, it has one fewer problem than old MIRI thought. As the discussion has shown, there seem to be a lot of misunderstandings happening, IMO in both directions, and things are getting heated. I venture a guess that having a concrete proposed AGI design to talk about would clear things up a bit.