Grok 3 used maybe 3x more compute than 4o or Gemini, and it topped Chatbot Arena and many benchmarks, even though xAI was playing catch-up and 3x isn’t that significant, since the gain is logarithmic.
I take Grok 3’s slight superiority as evidence for, not against, the importance of scaling hardware.
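To put the "3x isn’t that significant" point in numbers, here is a rough sketch. It assumes, as the comment does, that gains track the logarithm of compute; the function name is made up for illustration, and the 3x figure is the estimate under discussion, not a measured value.

```python
import math

def compute_ratio_in_log_units(ratio: float) -> tuple[float, float]:
    """Express a compute multiplier as orders of magnitude and as doublings."""
    return math.log10(ratio), math.log2(ratio)

# Assumed multiplier: the ~3x estimate from the comment above.
ooms, doublings = compute_ratio_in_log_units(3.0)
print(f"3x compute = {ooms:.2f} orders of magnitude = {doublings:.2f} doublings")
```

So 3x is a bit under half an order of magnitude, which is why only a slight edge would be expected if returns are logarithmic.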
How do we know it was 3x? (If true, I agree with your analysis)
Based on Vladimir_Nesov’s calculations:
https://www.lesswrong.com/posts/WNYvFCkhZvnwAPzJY/go-grok-yourself?commentId=p3nTkpshMq7SmXLjc