It’s my impression that a lot of the “promising new architectures” are indeed promising. IMO a lot of them could compete with transformers if you invest in them. It just isn’t worth the risk while the transformer gold-mine is still open. Why do you disagree?
I disagree because I have yet to see any of those “promising new architectures” outperform even something like GPT-2 345M, weight for weight, on similar tasks. Or match its performance with a radical reduction in dataset size. Or anything of the sort.
I don’t doubt that a better architecture than the LLM is possible. But if we’re talking AGI, then we need an actually general architecture: not a narrow AI that destroys one specific benchmark, but a general-purpose AI that happens to do reasonably well at a variety of benchmarks it wasn’t purposefully trained for.
We aren’t exactly swimming in that kind of thing.