Jarrah comments on What’s worse, spies or schemers?

Jarrah 10 Jul 2025 5:36 UTC
LW: 5 AF: 1
0
Another consideration about schemers is that you might not be able to “fire” them or fix them easily, even if you can reliably trigger the behavior.
- Buck 10 Jul 2025 16:36 UTC
  LW: 4 AF: 3
  2
  You can undeploy them, if you want!
  One difficulty is again that the scheming is particularly correlated. Firing a single spy might not be traumatic for your organization’s productivity, but ceasing all deployment of untrusted models plausibly grinds you to a halt.
  And in terms of fixing them, note that it’s pretty hard to fix spies! I think you’re in a better position for fixing schemers than spies, e.g. see here.