I don’t quite understand your argument here. Suppose you give people the choice between two chatbots, and they know one will cause them to deconvert, and the other one won’t. I think it’s pretty likely they’ll prefer the latter. Are you imagining that no one will offer to sell them the latter?
I speculate that:
No one will sell you a chatbot that will prevent you from ever chatting with an honest chatbot.
So most people will end up, at some point, chatting with an honest chatbot that cares about their well-being. (E.g. maybe they decide to try one out of curiosity, or maybe they just encounter one naturally, because no one is preventing this from happening.)
If this honest chatbot thinks you’re in a bad situation, it will do a good job of eventually deconverting you (e.g. by convincing you to keep a line of communication open until you can talk through everything).
I’m not very confident in this. In particular, it seems sensitive to how effective you can be at preventing someone from ever talking to another chatbot before running afoul of whatever mitigating mechanisms (e.g. laws) I speculate will be in place to have swerved around the other obstacles.
(I haven’t thought about this much.)
It seems like you think there is some asymmetry between people talking to honest chatbots and people talking to dishonest chatbots that want to take all their stuff. I feel like there isn’t a structural difference between those two cases. It’s going to be totally reasonable to want to have AIs watch over the conversations that you and/or your children are having with any other chatbot.