Take a big language model like GPT-3, then train it via RL on tasks where it is given a language instruction from a human, and it gets a reward if the human judges that it has done the task successfully.
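That loop can be sketched in miniature. This is a toy illustration, not the actual training setup: the two canned responses, the `human_reward` stand-in for a human judgment, and the REINFORCE-style weight update are all hypothetical simplifications of "RL with reward from human approval."

```python
import random

random.seed(0)

# Hypothetical toy "policy": unnormalized preference weights over
# two canned responses (a stand-in for a language model's outputs).
responses = ["ignore the instruction", "follow the instruction"]
weights = [1.0, 1.0]

def human_reward(response):
    # Stand-in for a human judging task success: reward 1 if the
    # model followed the instruction, else 0.
    return 1.0 if response == "follow the instruction" else 0.0

def sample(weights):
    # Sample an index in proportion to its weight.
    total = sum(weights)
    r = random.uniform(0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

# RL loop: sample a response, get the human's reward, and reinforce
# the sampled response in proportion to that reward.
for _ in range(200):
    i = sample(weights)
    weights[i] += 0.1 * human_reward(responses[i])

best = responses[max(range(len(responses)), key=lambda i: weights[i])]
print(best)
```

Only the rewarded response's weight ever grows, so over many rounds the policy shifts toward behavior the human approves of, which is the core of the setup described above.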
Makes sense, thanks!