It does seem like a bad sign to me if the insights generated by these visionaries don’t seem to be part of the same thing enough that they build off each other at least a little bit. Which makes me wonder what an elementary textbook coauthored by all the major AI alignment researchers would look like… what are the core things they all agree it’s important to know?
Not a textbook (more for a general audience) but The Alignment Problem by Brian Christian is a pretty good introduction that I reckon most people interested in this would get behind.
Building a tunnel from two sides is still building the same tunnel, even if the two sides can't see each other at first. I believe some, though not all, of these approaches will eventually meet in the middle, so it's not a bad sign that we aren't there yet.
Since we don't seem to have time to build two "tunnels" (independent solutions to alignment), a bad sign would be if we could prove that the approaches are mutually incompatible, which I hope is not the case.
In this analogy, the trouble is that we don't know whether we're digging tunnels in parallel, in opposite directions, or in a zigzag. The reason is a lack of clarity about which approaches will turn out to be fundamentally important for building a safe AGI. So it seems to me that, for now, exploring different approaches is a good thing, so that the next generation of researchers has to do less digging and can stack more on the existing work.
AGI Safety Fundamentals is trying to do something that is somewhat similar I think.