Quantum AI Box
With quantum computing, an AI box need not guard against hardware exploits.
Epistemic status: armchair computer science
Here’s a simple machine model no more powerful than a quantum computer.
It is a computer with a few extra instructions. By spending x*t time, it can split time, within the machine, into x² timelines. Each timeline knows which one it is, and timelines cannot interact. If a timeline tries to act on anything other than the machine's memory while n timelines exist, then with probability 1/n all other timelines are deleted; otherwise the acting timeline itself is deleted. A timeline can delete itself as long as others exist. After time t, all but one of the remaining timelines, chosen at random, are deleted.
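The rule for acting on the outside world can be illustrated with a small classical simulation. This is only a sketch of the 1/n rule as stated above, not of any quantum hardware; the function names `survival_probability`, `act_on_world`, and `estimate_survival` are made up for illustration.

```python
import random
from fractions import Fraction


def survival_probability(n: int) -> Fraction:
    """Chance that a timeline acting on the world survives when n timelines exist."""
    if n < 1:
        raise ValueError("there must be at least one timeline")
    return Fraction(1, n)


def act_on_world(n: int, rng: random.Random) -> bool:
    """Simulate one attempt to act outside the machine's memory.

    Returns True if the acting timeline survives (all others are deleted),
    False if the acting timeline itself is deleted.
    """
    return rng.randrange(n) == 0


def estimate_survival(n: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the survival chance, for comparison with 1/n."""
    rng = random.Random(seed)
    survived = sum(act_on_world(n, rng) for _ in range(trials))
    return survived / trials
```

With n = 4, the estimate from `estimate_survival(4, 20000)` should land close to the exact value `survival_probability(4)`, which is 1/4.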
To box an AI, split into a² timelines. In timeline 1, run the AI. If it somehow manages to affect the world, it has only a 1/a² chance of escaping; otherwise its timeline is deleted. After it has finished running, run an algorithm of your choice to curate the output. Have all other timelines delete themselves.
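The trade-off in this scheme is between the time spent splitting (a*t) and the escape chance (1/a²). A rough sketch of that arithmetic follows, assuming the costs are exactly as stated in the machine model; `box_tradeoff` and `min_split_factor` are hypothetical helper names, not part of the model.

```python
from fractions import Fraction
from math import isqrt


def box_tradeoff(a: int, t: float) -> tuple[float, Fraction]:
    """Return (time spent splitting, escape probability) for split factor a."""
    if a < 1:
        raise ValueError("split factor must be at least 1")
    return a * t, Fraction(1, a * a)


def min_split_factor(max_escape: Fraction) -> int:
    """Smallest a such that 1/a² does not exceed max_escape."""
    if not 0 < max_escape <= 1:
        raise ValueError("max_escape must be in (0, 1]")
    # We need a² >= 1/max_escape; take the ceiling of 1/max_escape exactly.
    needed = -(-max_escape.denominator // max_escape.numerator)
    a = isqrt(needed)
    if a * a < needed:
        a += 1
    return a
```

For example, pushing the escape chance down to one in a million takes a = 1000, so the splitting time grows only with the square root of the safety factor.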
Author’s Note: I previously made a post making much the same point. I was told it’s hard to understand. I suppose I tried to write what I would wish to read.
Edit: I revoke this post’s discussion of how to curate output without handing the world to the AI.