Roko comments on The Problem

Roko 12 Aug 2025 23:58 UTC
2 points
0

I think it’s plausible that these AIs will be instances of a few different scheming models. Scheming models are highly mutually aligned

Sure, that’s possible. But Eliezer/MIRI isn’t making that argument.

Humans have this kind of effect as well. It’s very politically incorrect to talk about, but people have claimed that humans of a certain “model subset” get into hiring positions in a tech company, hire only other humans of that same “model subset”, and take the company over, often simply extracting value and destroying it.

Since this kind of thing actually happens for real among humans, it seems very plausible that AIs will also do it. And the solution is likely the same—tag all of those scheming/correlated models and exclude them all from your economy/company. The actual tagging is not very difficult because moderately coordinated schemers will typically scheme early and often.

But again, Eliezer isn’t making that argument. And if he did, then banning AI wouldn’t solve the problem, because humans also engage in mutually-aligned correlated scheming. Both are bad; it is not clear why one would be worse than the other.
- Buck 13 Aug 2025 4:22 UTC
  20 points
  11
  Parent
  I think that the mutually-aligned correlated scheming problem is way worse with AIs than humans, especially when AIs are much smarter than humans.
  - Roko 13 Aug 2025 23:56 UTC
    2 points
    0
    Parent
    Well, you have to consider relative coordination strength, not absolute.
    In a human-only world, power is a battle for coordination between various factions.
    In a human + AI world, power will still be a battle for coordination between factions, but now those factions will be some mix of humans and AIs.
    It’s not clear to me which of these is better or worse.