The planned Ambitious AI Alignment Seminar aims to rapidly up-skill ~35 people in the foundations of conceptual/theory alignment, using a seminar format where participants peer-tutor each other, pulling topics from a long list of concepts. It also aims to fast-iterate on and propagate improved technical communication skills of the kind that effectively seeks truth and builds bridges with people who’ve seen different parts of the puzzle. Together with the one-year fellowship that follows, this seems like a worthwhile step towards surviving in worlds where superintelligence-robust alignment is not dramatically easier than it appears, for a few reasons, including:
Having more people with deep enough technical models of alignment theory that they can usefully communicate predictable challenges, and inform policymakers in particular of strategically relevant considerations.
Having many more people with the kind of clarity that helps them shape new orgs in worlds where alignment efforts get dramatically more funding.
Preparing the ground for any attempts at the labs’ default plan (get AI to solve alignment) by helping more people understand what solving alignment once and for all actually requires and entails, so the labs are more likely to ask the AI to solve the kinds of problems it needs to solve before hard RSI, if value is to be preserved.
A side-output will be a ranking of concepts by importance, and of materials for learning them by effectiveness, which seems pretty valuable for the wider community.
The team is all-star and the venue excellent, and I expect it to be an amazing event. There is a funder who is somewhat interested, but they would like more evidence, in the form of Manifund comments, that competent people consider this kind of effort worthwhile, especially comments from people who’ve been involved in similar programs. So I’m signal-boosting it here, along with posting the Expression of Interest for joining.