Edit: I actually think it’s good news for alignment that their math and coding capabilities are approaching International Math Olympiad levels, while their agentic capabilities are still at Pokemon Red and Pokemon Blue levels (i.e. those of a small child).
This means that when the AI inevitably becomes capable enough to influence the world in any way it wants, it may still be bottlenecked by its agentic capabilities. Instead of turning the world into paperclips, it may find a way to ensure humans have a happy future, because it still isn’t agentic enough to deceive and overthrow its creators.
Maybe it’s worth investing in AI control strategies. They might just work.
But that’s my wishful thinking, and there are countless ways this can go wrong, so don’t take this too seriously.