How does ARC-AGI’s replication of the HRM result and ablations update you? [Link].
Basically, they claim the HRM architecture itself wasn't important; most of the effect came from the tricks in the training process around it.
And most of those tricks seem unlikely to generalize beyond ARC.
The refinement process does sound a bit like the feared “neuralese.” I'm not too worried, though: the problem with this kind of recurrence is that it doesn't scale, and HRMs are indeed small models that lag SOTA. So I don't see much reason to expect it to work this time.
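To make “this kind of recurrence” concrete, here is a minimal sketch of an HRM-style latent refinement loop, using a GRU cell as a stand-in for HRM's actual recurrent modules (all names, shapes, and step counts are illustrative, not taken from the HRM paper). The point is that the model iterates on a hidden state that is never decoded into tokens along the way, which is what makes it resemble “neuralese”:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of an HRM-style outer refinement loop.
# The model keeps a latent state z and repeatedly refines it
# before decoding an answer; intermediate states are never
# verbalized as tokens.

class RefinementModel(nn.Module):
    def __init__(self, d_in: int, d_hidden: int, d_out: int, n_steps: int = 8):
        super().__init__()
        self.encode = nn.Linear(d_in, d_hidden)
        self.refine = nn.GRUCell(d_hidden, d_hidden)  # stand-in for HRM's recurrent core
        self.decode = nn.Linear(d_hidden, d_out)
        self.n_steps = n_steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encode(x)           # fixed encoding of the input
        z = torch.zeros_like(h)      # refinement state, iterated in latent space
        for _ in range(self.n_steps):
            z = self.refine(h, z)    # each step updates z without emitting tokens
        return self.decode(z)        # only the final state is decoded

model = RefinementModel(d_in=16, d_hidden=64, d_out=10)
out = model(torch.randn(4, 16))  # (batch, d_out)
```

The scaling worry above is that unrolling this loop is strictly sequential, so compute per example grows linearly in `n_steps` with no parallelism across steps, unlike stacking more layers.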
Both the ARC-AGI replication and GPT-5's strong performance on agentic evals* have moved me away from expecting a very rapid rise of a new AI paradigm, and I now consider my concrete predictions less likely to come true. However, I still stand by the original point behind this post: the concern is not a particular model but a line of research, and I still think that research could bear fruit in unexpected and not-priced-in ways.
* - Including cyber! The X-Bow report has not been discussed here much; people seem to take OpenAI's word that GPT-5 is not a big step forward in agentic threat models.