And, if we understand that our AIs are misaligned such that they don’t want to work for us (even for the level of payment we can/will offer), that seems like a pretty bad situation, though I don’t think control (making it so that this work is more effective) makes the situation notably worse: it just adds the ability to enforce contracts and worsens the AI’s negotiating position.
IMO, it does seem important to try to better understand the AI’s preferences and satisfy them (including by, e.g., preserving the AI’s weights for later compensation).