The thesis that values are fragile isn’t about how easy it is to create a system that models them implicitly; it’s about how easy it is to get an arbitrarily intelligent agent to behave in ways that preserve those values. The difference between the two is analogous to the difference between a prediction task and a reinforcement learning task, and your argument (as far as I can tell) addresses the former, not the latter. Insofar as my reading of your argument is correct, there is no point to concede.
If you can solve the prediction task, you can probably use the solution to create a reward function for your reinforcement learner.
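To make that move concrete, here is a minimal Python sketch of the conversion, assuming the solved prediction task gives you a model that scores states by predicted human approval. The names `predicted_value_score` and `reward_fn`, and the toy target state, are hypothetical stand-ins for illustration, not anyone’s actual method:

```python
import numpy as np

def predicted_value_score(state: np.ndarray) -> float:
    # Hypothetical stand-in for the solved prediction task: in practice
    # this would be a trained model; here "values" are satisfied when
    # the state is near an arbitrary target, purely as a toy example.
    target = np.zeros_like(state)
    return float(-np.linalg.norm(state - target))

def reward_fn(state: np.ndarray, action: np.ndarray,
              next_state: np.ndarray) -> float:
    """Reward derived from the predictive model: the learner is
    rewarded for reaching states the model scores highly."""
    return predicted_value_score(next_state)

# Sketch of use inside a generic RL loop (env and agent assumed):
#   next_state = env.step(action)
#   r = reward_fn(state, action, next_state)
#   agent.update(state, action, r, next_state)
```

Note that this only shows the conversion itself. Whether an arbitrarily intelligent learner optimizing `reward_fn` actually behaves in ways that preserve the values the predictor encodes is exactly the question the fragility thesis is about.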