I agree with your suspicion that our favorite future have relatively high D(u/h) / D(u) but not the highest value of D(u/h) / D(u).
Many utility functions have the same feature. For example, I could give the AI some flying robots with cameras, and teach it to count smiling people in the street by simple image recognition algorithms. That utility function would also assign a high score to our favorite future, but not the highest score. Of course the smile maximizer is one of LW’s recurring nightmares, like the paperclip maximizer.
I suppose I’d defend a weaker claim, that a D(u/h) / D(u) supercontroller would not be an existential threat. One reason for this is that D(u) is so difficult to compute that it would be pretty bogged down...
Any function that’s computationally hard to optimize would have the same feature.
Many utility functions have the same feature. For example, I could give the AI some flying robots with cameras, and teach it to count smiling people in the street by simple image recognition algorithms. That utility function would also assign a high score to our favorite future, but not the highest score. Of course the smile maximizer is one of LW’s recurring nightmares, like the paperclip maximizer.
Any function that’s computationally hard to optimize would have the same feature.
What other nice features does your proposal have?