Tomek Korbak

Karma: 393

I’m a PhD student at the University of Sussex and a visiting researcher at NYU working on aligning language models with human preferences. I’m particularly interested in RL from human feedback (RLHF) and probabilistic programming with language models.


Paper: On measuring situational awareness in LLMs

4 Sep 2023 12:54 UTC
89 points
15 comments · 5 min read

Imitation Learning from Language Feedback

30 Mar 2023 14:11 UTC
71 points
3 comments · 10 min read

Pretraining Language Models with Human Preferences

21 Feb 2023 17:57 UTC
133 points
18 comments · 11 min read

RL with KL penalties is better seen as Bayesian inference

25 May 2022 9:23 UTC
106 points
17 comments · 12 min read