Tim H comments on anaguma’s Shortform

Tim H 1 May 2026 20:33 UTC
1 point
−1
Isn’t the explanation just that an influential AI blogger called GPT-5 his “Research Goblin”?
- anaguma 1 May 2026 21:37 UTC
  3 points
  1
  I haven’t seen that. OpenAI gives the following explanation:
  As goblin and gremlin mentions increased under the Nerdy personality, they increased by nearly the same relative proportion in samples without it. Taken together, the evidence suggests that the broader behavior emerged through transfer from Nerdy personality training.
  The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.
  That creates a feedback loop:
  1. Playful style is rewarded.
  2. Some rewarded examples contain a distinctive lexical tic.
  3. The tic appears more often in rollouts.
  4. Model-generated rollouts are used for supervised fine-tuning (SFT).
  5. The model gets even more comfortable producing the tic.
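  To make the loop concrete, here is a toy sketch of how a tic's frequency could compound over rounds. It is not OpenAI's training setup: the function name and the parameters `base_rate`, `reward_boost`, and `sft_weight` are all hypothetical, and the update rule is just one simple way to model "rewarded samples over-represent the tic, then SFT pulls the model toward them".

  ```python
  def simulate_tic_feedback(rounds=5, base_rate=0.01, reward_boost=2.0, sft_weight=0.5):
      """Toy model of the feedback loop; returns the tic rate after each round.

      All parameters are illustrative assumptions, not measured values:
        base_rate    - initial fraction of rollouts containing the tic
        reward_boost - how much more likely a tic-containing rollout is to be rewarded
        sft_weight   - how far SFT moves the model toward the rewarded distribution
      """
      rate = base_rate
      history = [rate]
      for _ in range(rounds):
          # Steps 1-2: among rewarded rollouts, the tic is over-represented.
          rewarded_share = rate * reward_boost / (rate * reward_boost + (1 - rate))
          # Steps 3-5: SFT on those rollouts shifts the model's own rate toward that share.
          rate = (1 - sft_weight) * rate + sft_weight * rewarded_share
          history.append(rate)
      return history


  if __name__ == "__main__":
      for round_index, tic_rate in enumerate(simulate_tic_feedback()):
          print(f"round {round_index}: tic rate = {tic_rate:.4f}")
  ```

  In this sketch the rate rises every round whenever `reward_boost` is above 1, and stays flat when it equals 1, which matches the intuition that the loop only amplifies a tic that the reward happens to favor.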
  - Tim H 4 May 2026 17:17 UTC
    1 point
    0
    Right, I actually read that. But is it not missing an explanation of why those mentions increased under the Nerdy personality in the first place? If the Simon Willison post (which I also haven’t seen anyone else discussing) was the origin, that seems worth noting and understanding. And both its timing and Simon’s nerdiness (in a good way) seem to fit.
    Update: Never mind, apparently people were already noticing goblin mentions in April 2025, months prior to that post.