Seth Herd comments on Shane Legg interview on alignment

Seth Herd 29 Oct 2023 18:17 UTC
4 points
0
You could be right, but I didn’t hear any hints that he intends to kick those problems down the road to an aligned agent. That’s Conjecture’s CoEm plan, but I read OpenAI’s Superalignment plan as even more vague: make AI better so it can help with alignment, prior to being AGI. Theirs was sort of a plan to create a plan. I like Shane’s better, in part because it’s closer to being an actual plan.

He did explicitly note that choosing the particular ethics for the AGI is an outstanding problem, but I don’t think he proposed solutions, either AI or human. I think corrigibility as the central value gives as much time to solve the outer alignment problem as you want (a “long contemplation”), after the inner alignment problem is solved, but I have no idea if his thinking is similar.

I also don’t think he addressed misuse, proliferation, or competition. I can think of multiple reasons for keeping them offstage, but I suspect they just didn’t happen to make the top priority list for this relatively short interview.