anaguma comments on Daniel Tan’s Shortform

anaguma 11 Oct 2025 18:23 UTC
2 points
0
Yes, but I would have expected that if the method is scalable, even on narrow tasks, they would demonstrate its performance on some task which didn’t saturate at 7M params.
- Jozdien 11 Oct 2025 18:39 UTC
  6 points
  2
  Given that the primary motivation for the author was how well the original HRM paper did on ARC-AGI and how the architecture could be improved, it seems like a reasonable choice to show how to improve the architecture to perform better on the same task.
  I agree that their not trying other tasks is a small amount of evidence, but as it stands, the story seems pretty plausible.