ryan_greenblatt comments on Making deals with early schemers

ryan_greenblatt 23 Jun 2025 21:53 UTC
LW: 2 AF: 2
I also think you could use deals to better understand and iterate against scheming. That is, you could treat whether the AI accepts a deal as evidence of whether it was scheming, which would make scheming easier to study in a test setting and help us find approaches that make it less likely. There are a number of practical difficulties with this, though.