Jeremy Gillen

Karma: 1,016

I do alignment research, mostly stuff that is vaguely agent foundations. Formerly on Vivek’s team at MIRI. Most of my writing before mid 2023 are not representative of my current views about alignment difficulty.

Without fun­da­men­tal ad­vances, mis­al­ign­ment and catas­tro­phe are the de­fault out­comes of train­ing pow­er­ful AI

26 Jan 2024 7:22 UTC
159 points
59 comments57 min readLW link