Pretraining (GPT-4.5, Grok 4, but also counterfactual large runs that weren’t done) disappointed people this year. That’s probably not because it wouldn’t work; on the margin, it was just ~30 times more efficient to do post-training instead. This should change yet again, soon, if RL scales even worse.
IMO this should be edited to say Grok 3 instead of Grok 4. Grok 3 was mostly pre-training, and Grok 4 was mostly Grok 3 with more post-training.
You’re saying they’re the same base model? Cite?
Elon changed the planned name of Grok 3.5 to Grok 4 shortly before release:
https://x.com/elonmusk/status/1936333964693885089?s=20
Then used this image during the Grok 4 release announcement:
They don’t confirm it outright, but it’s heavily implied and it was widely understood at the time to be the same pre-train.
Thanks!