Again, I think “provably friendly thing” mischaracterizes what MIRI thinks will be possible.
From what I can gather, there’s still supposed to be some kind of proof, even if it’s only a mathematical one where you can’t be fully certain because the proof itself might contain an error. The intent is to build a program that provably maximizes some utility function U, and then to explicitly write U as something along the lines of “do what I mean”.
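To make the structure concrete, here’s a minimal toy sketch in Python, with all names (make_indirect_utility, choose_action, toy_judgment) invented purely for illustration, not anything MIRI has actually proposed: the agent itself is just an expected-utility maximizer, and the only thing that makes it “do what I mean” is that U is defined by deferring to some idealized judgment process rather than by a hand-written list of values.

```python
# Toy sketch (hypothetical names throughout): an agent that maximizes a
# utility function U, where U is specified indirectly ("do what I mean")
# rather than enumerated directly.

from typing import Callable, Iterable

Action = str
Outcome = str


def make_indirect_utility(
    idealized_judgment: Callable[[Outcome], float]
) -> Callable[[Outcome], float]:
    """U defers to an idealized evaluation process; we never write out
    what we value directly."""
    def U(outcome: Outcome) -> float:
        return idealized_judgment(outcome)
    return U


def choose_action(
    actions: Iterable[Action],
    predict_outcome: Callable[[Action], Outcome],
    U: Callable[[Outcome], float],
) -> Action:
    """The agent is simple: pick the action whose predicted outcome
    scores highest under U."""
    return max(actions, key=lambda a: U(predict_outcome(a)))


# Stand-in for "what the principal would actually want on reflection".
# In the real proposal, specifying this is the hard part; here it's a
# trivial scorer so the sketch runs.
def toy_judgment(outcome: Outcome) -> float:
    return 1.0 if outcome == "intended result" else 0.0


U = make_indirect_utility(toy_judgment)
action = choose_action(
    actions=["literal interpretation", "intended interpretation"],
    predict_outcome=lambda a: "intended result" if "intended" in a
    else "side effect",
    U=U,
)
print(action)  # -> "intended interpretation"
```

The point of the sketch is just the division of labor: the maximizer is the part you might hope to prove things about, while all the difficulty is pushed into defining the judgment process that U defers to.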
Have you read the section on indirect normativity in Superintelligence? I’d start there.
I’m not sure what you’re referring to. Can you give me a link?
Fixed.
Superintelligence is a recent book by Nick Bostrom.