The 200K token context window is a significant bottleneck.
Gemini Pro has a 2 million token context window, so I assume it would do significantly better. (I wonder why no other model has come close to the Gemini context window size. I have to assume not all algorithmic breakthroughs are replicated a few months later by other models.)
Does it really work on RULER (the benchmark from Nvidia)?
Not sure where, but I saw some controversy about this; https://arxiv.org/html/2410.18745v1#S1 is the best I could find right now...
Edit: Ah, this is what I had in mind: https://www.reddit.com/r/LocalLLaMA/comments/1io3hn2/nolima_longcontext_evaluation_beyond_literal/
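For anyone wondering what "beyond literal matching" means there: classic needle-in-a-haystack probes plant a fact whose wording overlaps the question, while NoLiMa-style needles require an associative hop with no keyword overlap. A rough sketch of both probe styles (the associative example is borrowed from the NoLiMa paper; `query_model` is a hypothetical stub, swap in whatever API you want to test):

```python
# Sketch of literal vs. non-literal long-context probes,
# in the spirit of needle-in-a-haystack and NoLiMa-style tests.
import random

FILLER = "The quick brown fox jumps over the lazy dog. "  # padding text

# Literal probe: the question shares keywords with the needle,
# so lexical matching over the context is enough to find it.
LITERAL_NEEDLE = "The secret passcode is 7441."
LITERAL_QUESTION = "What is the secret passcode?"

# Non-literal (NoLiMa-style) probe: answering needs an associative hop
# (Semper Opera House -> Dresden); no question keyword appears in the needle.
NONLITERAL_NEEDLE = "Actually, Yuki lives next to the Semper Opera House."
NONLITERAL_QUESTION = "Which character has been to Dresden?"


def build_prompt(needle: str, question: str, n_filler: int, seed: int = 0) -> str:
    """Bury the needle at a random depth inside filler text."""
    rng = random.Random(seed)
    chunks = [FILLER] * n_filler
    chunks.insert(rng.randrange(len(chunks) + 1), needle)
    return "".join(chunks) + f"\n\nQuestion: {question}\nAnswer:"


def query_model(prompt: str) -> str:
    """Hypothetical stub: replace with a real model/API call."""
    raise NotImplementedError("plug in your model here")


if __name__ == "__main__":
    probes = [
        ("literal", LITERAL_NEEDLE, LITERAL_QUESTION, "7441"),
        ("non-literal", NONLITERAL_NEEDLE, NONLITERAL_QUESTION, "Yuki"),
    ]
    for name, needle, question, expected in probes:
        prompt = build_prompt(needle, question, n_filler=5000)
        try:
            answer = query_model(prompt)
            print(name, "correct" if expected in answer else "wrong", repr(answer))
        except NotImplementedError:
            print(name, "probe built:", len(prompt), "chars (no model wired up)")
```

The NoLiMa finding, as I understood it, is that models that ace the literal version degrade sharply on the non-literal one as context length grows, which is why a big advertised window doesn't automatically mean usable recall.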
I assume that for Pokémon the model doesn't need to remember everything exactly, so recall quality may be less important than raw context length.