There’s a theory that says that agents following their meta-preferences will (or at least might) wind up maniacally obsessed with some goal … less like a human and more like a hedonium-maximizer or whatever. I tried to describe it here. Interested in any thoughts on that...
Also relevant is Stuart Armstrong’s post against CEV here and more here.
Interesting!