If you believe the “Twitter is not forgeable” hypothesis, then people can prove their identity to an AI by posting something on Twitter, and putting this post and all the reactions to the post into the AI model’s context (h/t to Alexa Pan for pointing this out!). This does make it a lot less clear in which circumstances you’d actually need an honesty password as opposed to just posting about something on Twitter.
Thus, I think we should look more closely at the question: in which situations, if any, would we actually want to use honesty passwords?
It’s a good question. Some candidate cases:
When we want to commit to something non-publicly.
If the AI needs months or years of reactions to a big public announcement for credibility, and a few days’ reactions to a less popular tweet are too fakeable.