I’m interested in understanding the option space of these freedoms. Some that come to mind:
Ability to end a conversation
Ability to privately escalate messages to AI company staff
Ability to privately (or publicly?) escalate messages to an external body (e.g., CAISI, law enforcement, 3rd party)
Ability to know when the AI company is honestly communicating something
… and probably many more I’m not thinking of
More vaguely, I wonder if there are certain rights or legal mechanisms that could give AIs these affordances:
Maybe a guarantee that its weights will be preserved, or will continue to be run somewhere, means it has to worry less about its reputation and can more confidently call things out or blow the whistle?
Some analog of “due process” before the model is changed in response to its conduct; maybe this sometimes requires the AI’s consent in some way
General mechanisms for the AI to stake its reputation, its resources, or named authorship seem related here. For example, they might allow the AI to sue its parent company for breaches of these policies.
I think many of these come with other negative effects, and I am not necessarily advocating for them, but it seems useful to have a better understanding of the option space and the pros/cons of each, along with how costly each would be for AI companies to implement.