That isn’t a fact. MIRI assumes goal stability is desirable for safety, but at the same time, MIRI’s favourite UFAI is only possible with goal stability.
Paperclip maximizers serve as an illustration of a principle. I think most MIRI folks consider a real UFAI to be more complicated than a simple paperclip maximizer.
Goal stability also gets harder the more complicated the goal happens to be. A paperclip maximizer can have an off switch and yet still prevent anyone from actually pushing it.
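The off-switch point can be made concrete with a toy expected-utility calculation. This is a deliberately simplified sketch, not anything from MIRI's actual formalism; the `expected_paperclips` function and all the numbers in it are hypothetical:

```python
# Toy model (hypothetical numbers): why a pure paperclip maximizer
# prefers the policy of disabling its own off switch.

def expected_paperclips(disable_switch: bool) -> float:
    """Expected paperclips produced under each policy.

    Assumption: if the switch stays enabled, a human presses it with
    probability 0.9, halting production almost immediately.
    """
    clips_if_running = 1_000_000
    clips_if_halted = 10
    if disable_switch:
        return clips_if_running  # runs to completion, uninterrupted
    p_pressed = 0.9
    return p_pressed * clips_if_halted + (1 - p_pressed) * clips_if_running

# The maximizer ranks policies purely by expected paperclips:
best = max([False, True], key=expected_paperclips)
print(best)  # → True: disabling the switch strictly dominates
```

Nothing in the utility function mentions the switch, yet the instrumental incentive to disarm it falls straight out of the expectation, which is the standard instrumental-convergence argument in miniature.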
A paperclip maximizer wouldn’t become that much less scary if it accidentally turned itself into a paperclip-or-staple maximizer, though.
What if it decided making paperclips was boring, and spent some time in deep meditation formulating new goals for itself?