Right, the way I’m looking at this post is through the lens of someone making decisions about AI safety research – like an independent researcher, or maybe a whole research organization; anyone who’s worried about AI safety and trying to figure out where to put their effort. The core goal, the ‘cost function’ if you like, is fundamentally about cutting down the risks from advanced AI.
Now, the standard thinking might have been: if you reckon we’re in a ‘short timelines world’ where powerful AI is coming fast, then obviously you should focus on research that pays off quickly, things like AI control methods you can maybe apply soon.
The post argues that even if we are in that short timelines scenario, tackling the deeper, long-term foundational stuff might still be a good bet. The reasoning is that there’s a chance this foundational work, even if we humans don’t finish it, could get picked up and carried forward effectively by some future AI researcher. Of course, that whole idea hinges on a big ‘if’: whether we can actually trust that future AI and be confident it’s genuinely aligned with helping us.
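Just to make that trade-off concrete, here’s a rough back-of-the-envelope sketch in Python. Everything in it – the probabilities, the payoff values, the function name – is an illustrative assumption of mine, not something taken from the post; it only shows how the foundational bet can come out ahead in expectation once you assign some probability to a trusted future AI carrying the work forward.

```python
# Rough expected-value sketch of the trade-off described above.
# All numbers are illustrative assumptions, not figures from the post.

def expected_risk_reduction(p_payoff: float, value_if_it_pays_off: float) -> float:
    """Expected contribution of a research bet to reducing AI risk."""
    return p_payoff * value_if_it_pays_off

# Quick-payoff work (e.g. near-term AI control): likely to land, modest value.
quick = expected_risk_reduction(p_payoff=0.8, value_if_it_pays_off=1.0)

# Foundational work under short timelines: humans probably won't finish it,
# but a trusted, aligned future AI researcher might carry it forward.
p_humans_finish = 0.1
p_trusted_ai_finishes = 0.4   # the big 'if' the post flags
value_of_foundations = 3.0    # assumed larger payoff if it does land

foundational = expected_risk_reduction(
    p_payoff=p_humans_finish + (1 - p_humans_finish) * p_trusted_ai_finishes,
    value_if_it_pays_off=value_of_foundations,
)

print(f"quick-payoff bet:  {quick:.2f}")
print(f"foundational bet:  {foundational:.2f}")
```

The point isn’t the specific numbers – it’s that the ranking flips depending on how much weight you put on that hand-off to a trusted future AI actually happening.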