We actually have evidence that xAI spent about as much compute on reinforcement learning for Grok 4 (to handle the ARC-AGI-2 benchmark and to solve METR-like tasks, but not to do things like the AI Village or Vending-Bench?) as on pretraining it. What we don’t know is how they had Grok 4 instances coordinate with each other in Grok 4 Heavy, nor what they plan to do to ensure that Grok 5 ends up being AGI...