:) I like these video game tests.
Assuming they aren’t doing RL on video games, their video game performance might betray their “true” agentic capabilities: at the same level as a small child’s!
That said, they are playing Pokemon better and better. The “small child” that their agentic capabilities correspond to seems to be growing up by more than one year per year. AGI by 2030, maybe?
Edit: see OP’s next post. It turns out a lot of the poor performance is due to poor vision (though he mentions other issues which do resemble poor agency).
Edit: I actually think it’s good news for alignment that their math and coding capabilities are approaching International Math Olympiad level, while their agentic capabilities are still at the Pokemon Red and Pokemon Blue level (i.e. a small child’s).
This means that when an AI inevitably reaches the capability to influence the world in any way it wants, it may still be bottlenecked by its agentic capabilities. Instead of turning the world into paperclips, it might find a way to ensure humans have a happy future, because it still isn’t agentic enough to deceive and overthrow its creators.
Maybe it’s worth investing in AI control strategies. It might just work.
But that’s my wishful thinking, and there are countless ways this can go wrong, so don’t take this too seriously.