Ack. Despite the fact that we’ve been having the AI boxing/infohazards conversation for like a decade, I still don’t feel like I have a robust sense of how to decide whether a source is going to feed me or hack me. The criterion I’ve been operating on is something like “if it’s too much smarter than me, assume it can get me to do things that aren’t in my own interest”, but most egregores/epistemic networks, which I’m completely reliant upon, are much smarter than me, so that can’t be right.
Does this message also contain an ulterior motive?
Either way, how could we conclusively determine which it is?
Maybe, instead of being shared publicly over the internet, this should have been debated extensively by an expert committee?
*Egregore smiles*
The wisest know nothing.