Cooling the computer doesn’t let you get around the Landauer limit! The savings in energy you get by erasing bits at low temperature are offset by the energy you need to dissipate to keep your computer cold. (Erasing a bit at low temperature still generates some heat, and when you work out how much energy your refrigerator has to use to get rid of that heat, it turns out that you must dissipate the same amount as the Landauer limit says you’d have to if you just erased the bit at ambient temperatures.) To get real savings, you have to actually put your computer in an environment that is naturally colder. For example, if you could put a computer in deep space, that would work.
On the other hand, there might also be other good reasons to keep a computer cold, for example if you want to lower the voltage needed to represent a bit, then keeping your computer cold would plausibly help with that. It just won’t reduce your Landauer-limit-imposed power bill.
None of this is to say that I agree with the rest of Jacob’s analysis of thermodynamic efficiency, I believe he’s made a couple of shaky assumptions and one actual mistake. Since this is getting a lot of attention, I might write a post on it.
Okay, I will try to name a strong-but-checkable pivotal act.
(Having a strong-but-checkable pivotal act doesn’t necessarily translate into having a weak pivotal act. Checkability allows us to tell the difference between a good plan and a trapped plan with high probability, but the AI has no reason to give us a good plan. It will just produce output like “I have insufficient computing power to solve this problem” regardless of whether that’s actually true. If we’re unusually successful at convincing the AI our checking process is bad when it’s actually good, then that AI may give us a trapped plan, which we can then determine is trapped. Of course, one should not risk executing a trapped plan, even if one thinks one has identified and removed all the traps. So even if #30 is false, we are still default-doomed. (I’m not fully certain that we couldn’t create some kind of satisficing AI that gets reward 1 if it generates a safe plan, reward 0 if its output is neither helpful nor dangerous, and reward −1 if it generates a trapped plan that gets caught by our checking process. The AI may then decide that it has a higher chance of success if it just submits a safe plan. But I don’t know how one would train such a satisficer with current deep learning techniques.))
The premise of this pivotal act is that even mere humans would be capable of designing very complex nanomachines, if only they could see the atoms in front of them, and observe the dynamics as they bounce and move around on various timescales. Thus, the one and only output of the AI will be the code for fast and accurate simulation of atomic-level physics. Being able to get quick feedback on what would happen if you designed such-and-such a thing not only helps with being able to check and iterate designs quickly, it means that you can actually do lots of quick experiments to help you intuitively grok the dynamics of how atoms move and bond.
This is kind of a long comment, and I predict the next few paragraphs will be review for many LW readers, so feel free to skip to the paragraph starting with “SO HOW ARE YOU ACTUALLY GOING TO CHECK THE MOLECULAR DYNAMICS SIMULATION CODE?”.
Picture a team of nano-engineers designing some kind of large and complicated nanomachine. Each engineer wears a VR headset so they can view the atomic structure they’re working on in 3d, and each has VR gloves with touch-feedback so they can manipulate the atoms around. The engineers all work on various components of the larger nanomachine that must be built. Often there are standardized interfaces for transfer of information, energy, or charge. Other times, interfaces must be custom-designed for a particular purpose. Each component might connect to several of these interfaces, as well as being physically connected to the larger structure.
The hardest part of nanomachines is probably going to be the process of actually manufacturing them. The easiest route from current technology is to take advantage of our existing DNA synthesis tech to program ribosomes to produce the machines we want. The first stage of machines would be made from amino acids, but from there we could build machines that built other machines and bootstrap our way up to being able to build just about anything. This bootstrapping process would be more difficult than the mere design process for the final machine, and the first stage where we have to build things entirely out of amino acids sounds particularly brutal. But just as people could write the first compilers in machine code, it should be possible to figure out how to get things done even in the insanely constrained domain of amino acids. And this step only has to be done once before a whole world opens up.
The first obvious question is “what about quantum mechanics?”. The answer is that we don’t care too much about it. It makes computing the dynamics harder of course, but most systems interact frequently with their environment, and most nanomachines will probably interact with their environment frequently enough that they behave nearly classically. QM is important for determining what local chemical reactions take place, but there’s no long-range entanglement to worry about. That’s also helpful for allowing the human engineers to get a handle on the dynamics. The main effect of the frequent interaction with the environment is that the dynamics becomes somewhat stochastic. Often you will see engineers run the same simulation several times, so they can understand the distribution of possible outcomes. As the final design is approached, the engineers run it through thousands of simulations so that even rare thermal fluctuations are accounted for.
The zeroth obvious question is “what will you build with it?”. The answer is that a variety of things could be built, and there might be many side-projects produced to raise funds for the main project, but the main project is a scanner that can read a cryonics patient’s brain one layer at a time in sufficient detail to faithfully emulate them. This is a pretty standard pivotal act, and has been written about elsewhere, but the main idea is that we can freeze, scan, and maybe copy a bunch of alignment researchers and other smart folks to solve alignment at a much faster speed than the outside world is working on their unaligned AIs. Or, the researchers might be able to foom while maintaining their values by making careful changes to their own minds. Besides the scanner itself, this project would probably need to design extremely fast special-purpose computing hardware for running the ems, along with various kinds of tiny probes and other experimental equipment to collect data on how a scan translates to brain-function. (An ordinary dumb neural network should be sufficient for this kind of inference problem.)
SO HOW ARE YOU ACTUALLY GOING TO CHECK THE MOLECULAR DYNAMICS SIMULATION CODE? There are two possible attack modes. One is to exploit any possible vulnerabilities in the hardware running the code to achieve unwanted side effects, and use those side effects to escape from containment. The second is to somehow make the output of the simulation subtly flawed, such that the humans, working off of flawed data, accidentally design a nanomachine that lets the AI out of the box instead of doing what we were designing that machine to do. Obviously the first attack mode is much more desirable for the AI than the second. Actually, the second attack mode may be so hard to pull off that we don’t need to check for it at all, but not checking for things when you have the capability to check for them is silly.
Checks for false physics that might trick the humans into building trapped nanomachines:
Compare with the real dynamics, which we know, but is slower for us to simulate. In real physics, we can break down the world into little cubes and the dynamics are symmetrical around translating those cubes in space and time. So if we request a simulation method that has those same symmetries, we can check the simulation on small length and time scales, which are feasible for us to simulate. This doesn’t guarantee that a larger simulation would be correct, since small errors can multiply out of control as the simulation size gets larger, but it does mean that the dynamics of the large simulation have to bear a pretty-close resemblance to real physics.
We can also request that the code provably obeys conservation of momentum, energy, charge, etc.
Use diff fuzzing techniques: This is where you use a fuzzer (implemented with techniques like GOFAI and dumb neural networks) to try and find inputs to the simulation such that the output of the AI’s code differs from the output of our slow-but-known-correct code. The stronger the optimizer we use here, the harder it is for the AI to avoid detection.
Request that the AI give us error bounds on the simulation, along with machine-checkable proofs that they hold.
Side channel attack and its mitigation: The AI can steer us away from certain designs and towards others by making the simulation take different amounts of time depending on the configuration of atoms we submit. Therefore we request code that takes a constant amount of time to run, plus proof that it does so. In order to make this easier, the code will be run on a GPU architecture with cache levels explicitly managed in the code, and where none of the cores have branch prediction.
The checks for the first attack method are pretty much just normal computer security, but at extremely high stakes. Just so we’re clear on what the picture is like here: Generally the AI is not going to know in perfect detail what the hardware is like that its simulation code will be running on, though we at least have to give it the instruction set, or some high-level language that compiles to that instruction set. (Probably the high level language is the better option. That way we can design the language so that properties like constant run-time and space-time symmetry are gauranteed-true of all valid programs in that language.) So the best technique for the AI will be to submit simulation code that has a hidden optimizer embedded into it. This optimizer then searches for possible escape methods while it runs. Even though different runs of the code don’t share memory, they’ll have different inputs, which the optimizer can use as a random seed to determine which part of the escape-strategy search space it should be trying. If there are any holes at all in our security, the AI will be able to find and exploit them. All those engineers in VR headsets running simulations? In each case, the computer running the simulation has a hostile AI on it, trying everything it can think of to get out. In general, every aspect of the system must be proved correct, from the hardware, to the high-level language we give to the AI. In addition, everything must be thoroughly unit tested, integration tested, fuzz tested, and given to the best security researchers who will try to find side-channel attacks, etc. Other people will write a proof that’s as close as humanly possible to an actual mathematical proof that no side channel attacks are possible, and then everyone will try everything they can to poke holes in that proof.
Okay, yeah, we get it, you’re going to try really really hard. Don’t you know that the universe doesn’t owe you success just because of how much you tried? No matter what security you put in place as a mere human, the AI is going to get around it just because it’s so much smarter. Even a superintelligent AI would not be able to break a one time pad. Depending on the details of computational complexity, it probably wouldn’t be able to crack our best symmetric ciphers today either. For a superintelligent AI to succeed at a task, that task must be actually possible. If we can actually design a system with no holes, which takes a large but finite amount of work and intelligence, then a superintelligence will not be able to break it, no matter how much smarter than us it is.