Tomek Korbak
Karma: 601
Aligning language models at Anthropic
https://tomekkorbak.com/
RL with KL penalties is better seen as Bayesian inference
Tomek Korbak and Ethan Perez · 25 May 2022 9:23 UTC · 114 points · 17 comments · 12 min read · LW link
Pretraining Language Models with Human Preferences
Tomek Korbak, Sam Bowman and Ethan Perez · 21 Feb 2023 17:57 UTC · 133 points · 18 comments · 11 min read · LW link
Compositional preference models for aligning LMs
Tomek Korbak · 25 Oct 2023 12:17 UTC · 18 points · 2 comments · 5 min read · LW link