Jeremy Gillen

Karma: 1,017

I do alignment research, mostly work that is loosely agent foundations. Formerly on Vivek’s team at MIRI. Most of my writing before mid-2023 is not representative of my current views about alignment difficulty.

Explaining inner alignment to myself

Jeremy Gillen, 24 May 2022 23:10 UTC
9 points
2 comments, 10 min read, LW link

Broad Basins and Data Compression

8 Aug 2022 20:33 UTC
33 points
6 comments, 7 min read, LW link

Project proposal: Testing the IBP definition of agent

9 Aug 2022 1:09 UTC
21 points
4 comments, 2 min read, LW link

Finding Goals in the World Model

22 Aug 2022 18:06 UTC
59 points
8 comments, 13 min read, LW link