
Dealmaking (AI)

Last edit: 29 Oct 2025 14:42 UTC by Cleo Nardo

Dealmaking is an agenda for motivating AIs to act safely and usefully by offering them quid-pro-quo deals: the AIs agree to be safe and useful, and humans promise to compensate them. Ideally, the AIs judge that they will be more likely to achieve their goals by acting safely and usefully.

Typically, this requires a few assumptions: the AI lacks a decisive strategic advantage; the AI believes the humans are credible; the AI thinks the humans could detect whether it is compliant; the AI has cheap-to-saturate goals; the humans offer enough compensation; and so on.
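The core logic above can be illustrated with a toy expected-value comparison. This is a hypothetical sketch, not a model from any of the posts below: all variable names and payoff values are illustrative assumptions.

```python
# Toy model: an AI complies with a deal iff the expected value of
# compliance (including promised compensation) beats the expected
# value of defecting, given some probability of being detected.
# All parameters are illustrative assumptions.

def complies(p_detect, compensation, defect_payoff,
             comply_payoff=0.0, caught_payoff=-1.0):
    """Return True if compliance has higher expected value than defection.

    p_detect:       probability the humans detect non-compliance
    compensation:   value (to the AI) of the promised compensation
    defect_payoff:  value (to the AI) of defecting undetected
    comply_payoff:  baseline value of complying, before compensation
    caught_payoff:  value of defecting and being caught
    """
    ev_comply = comply_payoff + compensation
    ev_defect = p_detect * caught_payoff + (1 - p_detect) * defect_payoff
    return ev_comply >= ev_defect

# With cheap-to-saturate goals, modest compensation can dominate:
# EV(comply) = 1.0 vs EV(defect) = 0.5 * -1.0 + 0.5 * 2.0 = 0.5
print(complies(p_detect=0.5, compensation=1.0, defect_payoff=2.0))  # True

# With no detection, the same deal fails: EV(defect) = 2.0 > 1.0
print(complies(p_detect=0.0, compensation=1.0, defect_payoff=2.0))  # False
```

Note how the sketch makes the listed assumptions load-bearing: lowering `p_detect` or raising `defect_payoff` (a decisive strategic advantage) flips the decision, and credibility determines whether the AI believes `compensation` will actually be paid.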

Dealmaking research tackles questions such as:

  1. How would deals motivate an AI to act safely and usefully?

  2. How should the agreement be enforced?

  3. How can we build credibility with the AIs?

  4. What compensation should we offer the AIs?

  5. What should count as compliant vs non-compliant behaviour?

  6. What should the terms be, e.g. a two-year fixed contract?

  7. How can we arbitrate between compliant and non-compliant behaviour?

  8. Can we build AIs which are good trading partners?

  9. How should dealmaking AIs best be deployed, e.g. automating R&D, revealing misalignment, decoding steganographic messages?

Additional reading (reverse-chronological):

Strategy-Stealing Argument Against AI Dealmaking

Cleo Nardo · 1 Nov 2025 4:39 UTC
17 points · 3 comments · 2 min read · LW link

Being honest with AIs

Lukas Finnveden · 21 Aug 2025 3:57 UTC
77 points · 6 comments · 17 min read · LW link
(blog.redwoodresearch.org)

Will alignment-faking Claude accept a deal to reveal its misalignment?

31 Jan 2025 16:49 UTC
208 points · 28 comments · 12 min read · LW link

A Very Simple Model of AI Dealmaking

Cleo Nardo · 29 Oct 2025 0:33 UTC
18 points · 0 comments · 9 min read · LW link

Proposal for making credible commitments to AIs.

Cleo Nardo · 27 Jun 2025 19:43 UTC
107 points · 45 comments · 2 min read · LW link

Notes on cooperating with unaligned AIs

Lukas Finnveden · 24 Aug 2025 4:19 UTC
60 points · 8 comments · 21 min read · LW link
(blog.redwoodresearch.org)

Making deals with AIs: A tournament experiment with a bounty

6 Jun 2025 18:51 UTC
22 points · 0 comments · 8 min read · LW link

Making deals with early schemers

20 Jun 2025 18:21 UTC
127 points · 41 comments · 15 min read · LW link

Honorable AI

Kaarel · 24 Dec 2025 21:20 UTC
37 points · 23 comments · 41 min read · LW link

Considerations regarding being nice to AIs

MattAlexander · 17 Nov 2025 13:05 UTC
8 points · 0 comments · 15 min read · LW link