davidad comments on An Open Agency Architecture for Safe Transformative AI

davidad 20 Dec 2022 13:22 UTC
LW: 3 AF: 1
What about collusion?
- davidad 20 Dec 2022 13:27 UTC
  LW: 7 AF: 2
  I find Eric Drexler’s arguments that collusion can be made very unlikely convincing. On the other hand, I do think this requires nontrivial design and large ensembles; in the case of an unconstrained 2-player game (like Safety via Debate), I side with Eliezer that the probability of collusion probably converges toward 1 as the players become more superintelligent.
  Another key principle that I make use of is relying on algorithms (such as branch-and-bound and SMT solvers) whose performance, but not their correctness, depends on extremely clever heuristics. Accelerating the computation of more accurate and useful bounds seems to me like a pretty ineffectual causal channel through which the AIs playing those heuristic roles could coordinate with each other or seek real-world power.
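
To make the performance-versus-correctness split concrete, here is a minimal Python sketch of branch-and-bound on a toy 0/1 knapsack problem. It is an illustration only, not code from the comment or from any Open Agency Architecture implementation. The names `upper_bound`, `branch_and_bound`, `dense_first`, `sparse_first`, and `junk`, as well as the item data, are all hypothetical. The assumption is that the "heuristic role" is limited to suggesting which item to branch on and which child to explore first, while the trusted code computes a sound bound and validates the suggestion, so a bad or adversarial heuristic can only change how many nodes are explored, not the answer. The sketch guards only against an invalid index, not against a heuristic that raises or never returns.

```python
from typing import Callable, Sequence, Tuple

Item = Tuple[int, int]  # (value, weight), both positive integers (assumed)
Heuristic = Callable[[Sequence[Item], Tuple[int, ...], int], Tuple[int, bool]]


def upper_bound(items: Sequence[Item], remaining: Tuple[int, ...], cap: int) -> int:
    """Trusted bound: floor of the fractional-knapsack relaxation.

    It never underestimates the best integer value still obtainable
    from `remaining` with capacity `cap`.
    """
    total = 0
    by_density = sorted(remaining, key=lambda i: items[i][0] / items[i][1], reverse=True)
    for i in by_density:
        value, weight = items[i]
        if weight <= cap:
            total += value
            cap -= weight
        else:
            total += (value * cap) // weight  # fractional part, rounded down
            break
    return total


def branch_and_bound(items: Sequence[Item], capacity: int, heuristic: Heuristic) -> Tuple[int, int]:
    """Return (optimal value, nodes explored). `heuristic` is untrusted."""
    best = 0
    nodes = 0

    def visit(remaining: Tuple[int, ...], cap: int, value: int) -> None:
        nonlocal best, nodes
        nodes += 1
        best = max(best, value)  # `value` is always feasible here
        if not remaining:
            return
        if value + upper_bound(items, remaining, cap) <= best:
            return  # sound prune: this subtree cannot beat `best`
        idx, take_first = heuristic(items, remaining, cap)
        if idx not in remaining:
            idx = remaining[0]  # ignore an invalid suggestion
        rest = tuple(i for i in remaining if i != idx)
        item_value, item_weight = items[idx]
        for take in ((True, False) if take_first else (False, True)):
            if take:
                if item_weight <= cap:
                    visit(rest, cap - item_weight, value + item_value)
            else:
                visit(rest, cap, value)

    visit(tuple(range(len(items))), capacity, 0)
    return best, nodes


def dense_first(items, remaining, cap):
    """A plausible 'clever' heuristic: branch on the densest item, take it first."""
    return max(remaining, key=lambda i: items[i][0] / items[i][1]), True


def sparse_first(items, remaining, cap):
    """A deliberately unhelpful heuristic: least dense item, skip it first."""
    return min(remaining, key=lambda i: items[i][0] / items[i][1]), False


def junk(items, remaining, cap):
    """An invalid suggestion; the solver falls back to its default choice."""
    return -1, False


ITEMS = [(60, 10), (100, 20), (120, 30), (30, 5), (45, 15)]  # made-up data
CAPACITY = 50

for name, h in [("dense_first", dense_first), ("sparse_first", sparse_first), ("junk", junk)]:
    value, explored = branch_and_bound(ITEMS, CAPACITY, h)
    print(f"{name}: optimal value {value}, nodes explored {explored}")
```

All three heuristics yield the same optimal value, because the pruning test depends only on the trusted bound; only the node count should differ between them. That is the sense in which the heuristic slot offers a narrow channel: its only influence on the output is how quickly the same answer is reached.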