I think most every aspiring conceptual alignment researcher should read basically all of the work in Arbital’s AI alignment section. Not all of it is right, but you’ll avoid some obvious-in-retrospect pitfalls you’d likely have fallen into otherwise. So I’d count that corpus as a big achievement.
They have a big paper on logical induction. It doesn’t have any applications yet, but it may serve as theoretical grounding for later work. And I think the more general idea of viewing inexploitable systems as markets has a good chance of being broadly applicable.
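To make the market framing concrete, here’s a toy sketch of my own, not the paper’s construction: if a market prices a sentence and its negation, exactly one of the two eventually pays out, so any prices that don’t sum to 1 hand a trader risk-free profit, and an inexploitable market is one that closes such gaps. (The actual logical induction criterion is far more general: it quantifies over all polynomial-time traders and all patterns of logical inconsistency, not just φ versus ¬φ.)

```python
# Toy illustration of the "inexploitable market" framing -- NOT the
# logical induction algorithm from the paper. The market prices a
# sentence phi and its negation; exactly one of them pays out 1, so
# prices that don't sum to 1 create an arbitrage.

def arbitrage(price_phi: float, price_not_phi: float) -> float:
    """Risk-free profit available per unit traded.

    Buying both sides costs price_phi + price_not_phi and always pays 1,
    so a trader buys both when the sum is below 1 and shorts both when
    it is above 1; either way the profit is |1 - sum|.
    """
    return abs(1.0 - (price_phi + price_not_phi))

def market_update(price_phi: float, price_not_phi: float) -> tuple[float, float]:
    """Split the pricing inconsistency evenly so the prices sum to 1."""
    gap = 1.0 - (price_phi + price_not_phi)
    return price_phi + gap / 2, price_not_phi + gap / 2

p, q = 0.7, 0.5                  # inconsistent: prices sum to 1.2
print(f"{arbitrage(p, q):.2f}")  # 0.20 -- a trader shorts both sides
p, q = market_update(p, q)
print(f"{arbitrage(p, q):.2f}")  # 0.00 -- no arbitrage remains
```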
Scott Garrabrant has done a lot of work in the public eye (e.g. logical induction and finite factored sets), and so has Vanessa Kosoy (e.g. infra-Bayesianism).
Risks From Learned Optimization, as others have mentioned, explained the idea of “mesa-optimizers” and made it palatable to skeptics.