The ARC-AGI leaderboard got an update. IIRC, the base LLM Qwen3-235b-a22b Instruct (25/07) is the first Chinese model to land on the Pareto frontier. Or is it likely to be closely matched by the West, as happened with DeepSeek R1 (released on January 20) and o3-mini (January 31)? And is China likely to cheaply create higher-level models, like an analogue of o3, BEFORE the West? If China does, then how are the two countries to reach the Slowdown Ending?
The two main problems with the slowdown ending of the AI-2027 scenario are the two optimistic assumptions, which I plan to cover in two different posts.
If China invades Taiwan in March 2026 and steals Agent-2 in January 2027, then OpenBrain no longer has the absolute lead necessary for a unilateral slowdown.
What if any sufficiently powerful AI either takes over or becomes a protective god, but not a servant, as I conjectured here? Then it could be the slowdown ending that has the greater chance of leading to doom, since OpenBrain would be stuck with an insoluble problem.
Why does the Race Ending of the AI-2027 Forecast claim that “there are compelling theoretical reasons to expect no aliens for another fifty million light years beyond that”? If that claim is false, then sapient alien lifeforms should also be moral patients in some sense. For example, this implies that all or almost all resources in their home system (and, apparently, some part of the space around it) should belong to them, not to humans or a human-aligned AI. And that’s ignoring the possibility that humans encounter a planet with the potential to generate a sapient lifeform...
If we were to view raising humans from birth to adulthood and training AI agents from birth to deployment as similar processes, then what human analogues do the six goal types from the AI-2027 forecast have? The analogues of developers are, obviously, the adults who have at least partial control over the human’s life. The analogues of written Specs and developer-intended goals are the adults’ intentions; the analogues of reward/reinforcement seem to be short-term stimuli and the morals of one’s communities. I also think that the best analogue for proxies and/or convergent goals is the possession of resources (and knowledge, but the latter can be acquired without ethical issues), while the ‘other goals’ are, well, ideologies, morality[1] and tropes absorbed from the most concentrated form of training data available to humans, which is speech in all its forms.
What exactly do the analogies above tell us about the prospects for alignment? The possession of resources is the goal behind aggressive wars, colonialism and related evils[2]. If human culture managed to make these unacceptable, does that imply that an AI will likewise refrain from attempting takeover?
I also think that humans rarely develop their own moral codes or ideologies; instead, they usually adopt some moral code or ideology close to the one existing in the “training data”. Could anyone comment on this?
And crimes, but criminals, unlike colonizers, also try to avoid conflict with law enforcers of at least comparable power.