Cleo Nardo comments on Lukas Finnveden’s Shortform

Cleo Nardo 5 Feb 2026 21:41 UTC
Maybe, but the schemers could optimize their research for looking-good-to-us, and it might be hard to distinguish this from actually good work.
If you ask each variant to review the research of the other variants, then the schemers need to optimise their research for looking good to each variant. But the optimistic assumption is that at least one variant is an equally capable non-schemer.
- Lukas Finnveden 5 Feb 2026 22:27 UTC
  then the schemers need to optimise their research for looking good to each variant
  Not necessarily. The nice AIs also need to be able to win a debate against the schemers, as judged by humans. It’s not enough for the variants to be able to recognize poor research if they can’t show their work in a way that the schemers cannot refute.