Predictions & Self-awareness

13 Feb 2021 22:16 UTC

Some animals, if put in front of a mirror, will notice that there is some kind of moving animal-ish thing in front of them. In a weak sense they are “aware of themselves”, since the thing they perceive is in fact them, but they are not necessarily “self-aware” in the sense we normally use the term. The animals that pass the mirror test are the ones that realize the moving animal-ish thing is them.

Suppose we create a powerful AI system that uses (un)supervised learning techniques to understand and make predictions about the world. If the dataset the AI system is trained on includes data about itself, the AI system will be “aware of itself” in the sense of seeing an animal-ish thing in the mirror. Is there a risk that it could graduate to “self-awareness” in the sense of realizing the thing in its training data is it?

I contend this risk is low. When an animal passes the mirror test, it is noticing an isomorphism between its inborn sense of self (endowed by evolution for self-preservation) and the thing in the mirror. But if we don’t endow our AI system with an inborn sense of self, there is no isomorphism to notice.

That doesn’t mean purely predictive AI systems are completely safe.

Photo: Christian Holmér

The Dualist Predict-O-Matic ($100 prize)

John_Maxwell · 17 Oct 2019 6:45 UTC

19 points

35 comments · 5 min read · LW link

Self-Fulfilling Prophecies Aren’t Always About Self-Awareness

John_Maxwell · 18 Nov 2019 23:11 UTC

14 points

7 comments · 4 min read · LW link
