Tom Davidson comments on AI should be a good citizen, not just a good assistant

Tom Davidson 8 Apr 2026 14:21 UTC
4 points
Ok cool, then I think we’re in agreement! I think you can implement those things internally without raising p(AI takeover): if you want to really maximise corrigibility, then you can have a monitor model enforce the refusals, which IIUC is the best way to arrange things to avoid jailbreaking anyway.

(Though I think there’s an outstanding disagreement: I’m more worried than you are about government power concentration, relative to AI company power concentration.)