One obvious example is playing chess from a significantly better position. There, no superintelligence has any chance, even against a merely good human player.
Can you prove that the board position is significantly better, even against superintelligences, for anything other than trivial endgames?
And what is the superintelligence allowed to do? Trick you into making a mistake? Manipulate you into making the particular moves it wants you to? Use clever rules-lawyering to expose elements of the game that humans haven’t noticed yet?
If it eats its opponent, does that cause a forfeit? Did you think it might try that?
As I said, there are circumstances in which the dumber player can win.
The philosophy of FAI is essentially the same thing: searching for the circumstances where the smarter will serve the dumber.
Always expecting a superintelligence to pull a rabbit out of a hat is not justified. A superintelligence is not omnipotent; it can’t always eat you. Sometimes it can’t even develop ill will toward you.
“It doesn’t hate you. It’s just that you happen to be made of atoms, and it needs those atoms to make paperclips.”
Change that to: searching for circumstances where the smarter will provably serve the dumber. (Then you’re closer). Your description of what superintelligences will do, above, doesn’t rise to anything resembling a formal proof. FAI assumes that AI is Unfriendly until proven otherwise.
Can you prove anything about FAI, uFAI and so on?
I don’t think there are any proven theorems about this topic at all.
Even if there were, how reliable are the axioms, and how good are the definitions?
So, you raise a valid point here. This area is still at a very early stage. There are theorems that may prove to be relevant. See, for example, this recent work. And yes, in any area where mathematical models are used, the gap between having a theorem with a set of definitions and having those definitions reflect what you actually care about can be a major problem (you see this all the time in cryptography, with side-channel attacks for example). But all of that said, I’m not sure what the point of your argument is. Sure, the field is young. But if the MIRI people are correct that AGI is a real worry, then this looks like one of the very few possible responses that has any chance of working. And if there isn’t much theory now, that’s a reason to put in more resources so that we actually have a theory that works by the time AI shows up.