Noumero comments on REPL’s and ELK

Noumero 25 Feb 2022 14:00 UTC
1 point
Yeah, this is the part I’m confused about as well. I think this proposal involves training a neural network that emulates a human? Otherwise I’m not sure how $Eval_{H}(F(s_{m}), o_{h})$ is supposed to work. It requires a human to predict the next step from their observations and the direct translation of the machine state, which in turn requires some way to describe the full state so that the “human” we’re using can understand it. This precludes using actual humans to label the data, because I don’t think we have any way to provide such a description. We’d need to train a human simulator specifically adapted to parsing this sort of output.
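
To make the shape of the problem concrete, here is a rough sketch of how I read the evaluation step. It is only an illustration under my own assumptions, not the proposal's actual setup: the types `MachineState`, `HumanObservation`, and `Translation`, and the helpers `eval_h`, `toy_translate`, and `toy_human_simulator`, are all hypothetical names I made up. The point is just that the predictor's first argument is whatever the translator emits, so the predictor has to be able to read that format.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Assumption: the machine state s_m is some vector of numbers.
MachineState = Sequence[float]
# Assumption: the human observation o_h is a piece of text.
HumanObservation = str


@dataclass
class Translation:
    """Stands in for F(s_m): a description of the full machine state."""
    description: str


def eval_h(
    translate: Callable[[MachineState], Translation],
    predictor: Callable[[Translation, HumanObservation], str],
    s_m: MachineState,
    o_h: HumanObservation,
) -> str:
    """Sketch of Eval_H(F(s_m), o_h): predict the next step from the
    translated machine state together with the human observation."""
    return predictor(translate(s_m), o_h)


def toy_translate(s_m: MachineState) -> Translation:
    """Hypothetical translator: summarizes the state instead of really describing it."""
    return Translation(description=f"machine state with {len(s_m)} components")


def toy_human_simulator(translation: Translation, o_h: HumanObservation) -> str:
    """Hypothetical stand-in for the 'human': it only works because it was
    written to consume exactly the format toy_translate produces."""
    return f"prediction from [{translation.description}] and [{o_h}]"
```

Calling `eval_h(toy_translate, toy_human_simulator, [0.1, 0.2, 0.3], "door is open")` runs the whole pipeline. An actual human could be dropped in as `predictor` only if they could make sense of `Translation`, which is the step I don't see how to provide.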