
Seth Herd

Karma: 2,641

I did computational cognitive neuroscience research from the time I got my PhD in 2006 until the end of 2022. I’ve worked on computational theories of vision, executive function, episodic memory, and decision-making. I’ve focused on the emergent interactions that are needed to explain complex thought. I was increasingly concerned with AGI applications of the research, and reluctant to publish my best ideas. I’m incredibly excited to now be working directly on alignment, currently with generous funding from the Astera Institute. More info and publication list here.

Agentized LLMs will change the alignment landscape

Seth Herd 9 Apr 2023 2:29 UTC
153 points
95 comments 3 min read LW link

Capabilities and alignment of LLM cognitive architectures

Seth Herd 18 Apr 2023 16:29 UTC
80 points
18 comments 20 min read LW link

Shane Legg interview on alignment

Seth Herd 28 Oct 2023 19:28 UTC
65 points
20 comments 2 min read LW link
(www.youtube.com)

Internal independent review for language model agent alignment

Seth Herd 7 Jul 2023 6:54 UTC
53 points
26 comments 11 min read LW link

OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns

Seth Herd 20 Nov 2023 14:20 UTC
52 points
29 comments 1 min read LW link
(www.wired.com)

AI scares and changing public beliefs

Seth Herd 6 Apr 2023 18:51 UTC
45 points
21 comments 6 min read LW link

Goals selected from learned knowledge: an alternative to RL alignment

Seth Herd 15 Jan 2024 21:52 UTC
39 points
17 comments 7 min read LW link

We have promising alignment plans with low taxes

Seth Herd 10 Nov 2023 18:51 UTC
30 points
9 comments 5 min read LW link