I’m confused by your bit on deception within Tool AIs. I generally don’t think of Tool AIs as consequentialists, so there’s no “long-term utility” for them to maximize via short-term deception. What’s the mechanism by which you worry these tools could deceive their users?
I’m thinking of the entire human+tool system as a consequentialist, and I’m basically arguing that that combined system fails in the same ways that “human in the loop” oversight fails.