We assume AI learning timescales vastly outstrip human learning timescales as a way of keeping our definition tractable. So the only way to structure this problem in our framework would be to imagine a human playing chess against a superintelligent AI — a highly distorted situation compared to the case of two roughly equal opponents.
I think this is probably true in the long term (the classical-to-quantum/reversible computing transition is very large, and humans can't easily modify their brains, unlike a virtual human). But this may not be true in the short term.
Agreed. We think our human-AI setting is a useful model of alignment in the limit case, but not really so in the transient case. (For the reason you point out.)