If humans are taking over one time in a thousand, then it'll think (more or less) there's a 1/1000 chance that they'll remember the last action.
But there's a model/TM that thinks there's a 100% chance the human will remember the last action (because that's hard-coded into the TM), and that model will do really well in the next update. So we know that any time a human steps in, no matter when, it will cause a big update (at the next update step), because it will raise models like this from obscurity to prominence. If the AI "knows" this, it will call in the human at every time step; but maybe it doesn't "know" this? (I haven't thought this through formally and will leave it to you.)
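To make the size of that update concrete, here's a minimal sketch (the prior weights and the two-model mixture are my own illustration, not part of the proposal): model A assigns probability 1/1000 to a human takeover at each step, and model B is hard-coded to predict a takeover with probability 1. A single observed takeover moves B from obscurity to prominence.

```python
# Illustrative only: Bayesian update over two "models" of human takeover.
# Model A: human takes over with probability 1/1000 each step.
# Model B: hard-coded belief that the human takes over with probability 1.
prior = {"A": 0.99, "B": 0.01}          # B starts out obscure
likelihood = {"A": 1 / 1000, "B": 1.0}  # P(takeover | model)

def update(prior, likelihood, takeover: bool):
    """One step of Bayes' rule on the observation 'did the human take over?'."""
    post = {
        m: prior[m] * (likelihood[m] if takeover else 1 - likelihood[m])
        for m in prior
    }
    z = sum(post.values())
    return {m: p / z for m, p in post.items()}

# The human steps in once: B jumps from 1% to ~91% posterior weight.
posterior = update(prior, likelihood, takeover=True)
print(posterior)  # ~{'A': 0.090, 'B': 0.910}
```

Of course, B predicts a takeover at every subsequent step too, so it gets wiped out the first time the human doesn't step in; the point is just how large the one-step update is.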
With the edit that the model may as well be allowed to depend on the whole history of which actions were human-selected, not just whether the last one was.
I was assuming the models would save that input on their work tapes for future use.
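A sketch of what that history-dependence could look like (the representation and the particular prediction rule are my own, standing in for the TM's work tape): the model records the full bit-string of which past actions were human-selected and conditions its prediction on it, rather than on just the last bit.

```python
# Illustrative only: a model that conditions on the whole history of
# human-selected actions. The stored list plays the role of the work tape.
from typing import List

class HistoryModel:
    def __init__(self) -> None:
        self.history: List[bool] = []  # True = that action was human-selected

    def observe(self, human_selected: bool) -> None:
        """Save each step's takeover bit for future use."""
        self.history.append(human_selected)

    def predict_takeover(self) -> float:
        """One arbitrary history-dependent rule among many: predict a
        certain takeover iff any past action was human-selected."""
        return 1.0 if any(self.history) else 1 / 1000
```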
In any case, I think I understand your proposal well enough now that we can go back to some of the other questions.