StanislavKrym comments on Brendan Long’s Shortform

StanislavKrym 9 Apr 2026 1:16 UTC
3 points
0
I thought I had had enough of xAI likely being 3 months behind the frontier, and now we get this… I tried to find out something about Meta’s model, and Claude Opus 4.6 concluded that it is also 3-4 months behind. There is also the issue of Meta having manipulated some benchmarks to present Llama 4 as more capable, and of Meta’s claimed results on ARC-AGI-2 and SWE-bench Verified, where the rivals’ models allegedly score differently than on the actual ARC-AGI-2 and SWE-bench Verified leaderboards, likely because of a different elicitation method. How do I lobby for a law change requiring EVERY new American model to be thoroughly evaluated by the entire Big Three?
What links here?
- StanislavKrym's comment on kaiwilliams’s Shortform by kaiwilliams (15 Apr 2026 22:07 UTC; 4 points)