Cleo Nardo comments on Honorable AI

Cleo Nardo 25 Dec 2025 0:19 UTC
2 points
The AI is very honorable/honest/trustworthy — in particular, the AI would keep its promises even in extreme situations.
NB: It seems we need an assumption (possibly much weaker, though perhaps in practice no weaker) that we can detect whether the AI is lying about deals of the form in Step 2.