In this shortform, I explain my main confusion with this alignment proposal. The main thing that’s unclear to me: how is the agent supposed to remain motivated by diamonds even while doing very non-diamond-related things, like “solving mazes”, that are required for general intelligence?
More details in the shortform itself.
I think that was supposed to be answered by this line: