John Schulman

Karma: 484

Scaling Laws for Reward Model Overoptimization

20 Oct 2022 0:20 UTC
103 points
13 comments
1 min read
LW link
(arxiv.org)

Frequent arguments about alignment

John Schulman
23 Jun 2021 0:46 UTC
104 points
17 comments
5 min read
LW link