What’s some terminology or lit to discuss the takeover scenario where “AI just sort of gradually takes over and no one notices”?
Sam Altman brought it up in a talk a year or two ago. It almost seems the extreme murkiness of it is part of the danger. It’s not clear at what point AI integration becomes uniquely bad for humanity’s quest to explore the universe, and it’s also not clear at what point it’s all that different from oligarchic/feudal government, if you happen to share that cynical political view, which is trendy nowadays. (I guess I share it enough to care about it, but not enough to reduce my worldview to it, and not enough to stop thinking takeover is worse.)
I was thinking of this in the context of open-world evals, which seem to point toward this takeover case. The thing is, some open-world evals are simply AI grading other AI. For example, submitting an app for review by Apple involves an opaque process, which is good for un-gameability (an AI can’t train on public data about the process), but bad if Apple is simply using similar AI in the process.
What you don’t want is for the best benchmarks to start scraping the ceiling of humanity’s collective ability to reject AI slop, followed by us ceding the benchmarks to AI.
(Maybe these are two contrary points: open-world evals wouldn’t be trusted if they’re flawed in their reliance on AI? But what if an open-world eval earns trust on human data, and that eval is later ceded to AI as those humans automate?)
epistemic status = probably unoriginal :) but maybe it’s fine to have an unoriginal quick take, I’m learning too
Gradual disempowerment.
ty! I guess from here I’ll look into whether anyone’s connected open-world evals to gradual disempowerment; at least it’s more google-able now.
For the record, LLMs can give you the terminology too. ChatGPT one-shotted it when I just pasted in your shortform.