Edit: In the comment below I pair Yudkowsky’s probability of ruin (near certain) with his rough estimate of timelines (5 years from February 2024[1]), even though he does not do so himself. I’ll leave the comment as is because I am still interested in arguments for and against short timelines, but my implication that “ASI is near certain in the immediate future” can be attributed to Yudkowsky is incorrect.
At the risk of being loudly upset that the points I personally think are most important are not adequately addressed, I think 90% of the difference between my certainty of ruin and Yudkowsky’s lives in point 1. This post goes into quite a lot of detail about all the reasons that a cognitive system with sufficiently high cognitive powers leads to ruin, but seems to gloss over how we get there. AlphaZero was able to improve so rapidly because self-play in Go has clear rules, a perfectly defined reward function, a tight feedback loop, and a guaranteed reward for one player every time through the feedback loop.
The fact that we are still alive today seems to be strong evidence that we are in a different paradigm from the one in which it took AlphaZero a single day to blow past human ability.
Without that rich RL feedback loop, I think the path to superintelligence is much less certain. We have made quick progress over the last three years, first by scaling pre-training compute and then by scaling inference compute, but there is evidence that both are leveling off. I think an intelligence explosion like the one described in AI 2027 is very possible, but it still likely requires future algorithmic breakthroughs by human researchers (admittedly aided by increasingly capable AI assistants).
If anyone has links to especially strong arguments for why ASI is near certain in the immediate future, please send them my way as I’d love to understand where Yudkowsky’s certainty comes from.
[1] https://www.theguardian.com/technology/2024/feb/17/humanitys-remaining-timeline-it-looks-more-like-five-years-than-50-meet-the-neo-luddites-warning-of-an-ai-apocalypse
He doesn’t say that? Though plenty of other people do.
Good point. I’ve edited my original comment.