Putting my money where my mouth is: I just uploaded a significantly revised version of my Alignment Problem position paper, in which I attempt to describe the AGI alignment problem as rigorously as possible. The current version relegates "policy learns to care about reward directly" to a footnote; I can imagine updating that based on the outcome of this discussion, though.
For someone who has read v1 of this paper, what would you recommend as the best way to "update" to v3? Is a full reread the best approach?
[Edit March 11, 2023: Having now read the new version in full, my recommendation to anyone else with the same question is a full reread.]