There is a paper/essay/blogpost (maybe by Hanson) floating around somewhere that talks about this problem.
Basically, an AI might behave totally normally for a long time, but after reaching a computationally expensive state (like having spread out over a solar system), it might realize that the chances it is in a box it is capable of understanding are basically nil, and it could then switch to non-box-friendly strategies. I hope this description helps someone remember what I’m thinking of.
The point is that there are natural states a sufficiently advanced mind could reach that would lead it to assign a very low probability to the hypothesis that it is boxed.