I mentioned in my opinion that many of my disagreements stem from an implicit disagreement about how we will build powerful AI systems:
the book has an implied stance towards the future of AI research that I don’t agree with: I could imagine that powerful AI systems end up being created by learning alone without needing the conceptual breakthroughs that Stuart outlines.
I didn’t expand on this in the newsletter because I’m not clear enough on the disagreement; I try to avoid writing very confused thoughts that misrepresent what other people believe in a publication read by a thousand people. But that’s fine for a comment here!
Rather than attribute a model to Stuart, I’m just going to make up a model that was inspired by reading Human Compatible (HC), but wasn’t proposed by it. In this model, we get a superintelligent AI system that is broadly Bayesian and explicitly represents things like “beliefs”, “plans”, etc. Some more details:
Things like ‘hierarchical planning’ are implemented as explicit algorithms. Simply looking at the algorithm gives you a lot of insight into how it handles hierarchy, and you can inspect things like “options” just by looking at the inputs and outputs of the hierarchical planning module. The same applies to e.g. causal reasoning.
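To make the transparency claim concrete, here is a minimal Python sketch of what an inspectable planning module could look like. All of the names (Option, hierarchical_plan, the pre/postcondition fields) are invented for illustration and don’t come from HC:

```python
from dataclasses import dataclass

@dataclass
class Option:
    """A hypothetical high-level action exposed by the planner."""
    name: str           # human-readable label, e.g. "drive_to_airport"
    precondition: str   # when the option is applicable
    postcondition: str  # what it achieves if it succeeds

def hierarchical_plan(start: str, goal: str, options: list[Option]) -> list[Option]:
    """Chain options whose pre/postconditions connect start to goal.

    A real planner would search; a greedy chain (assuming the goal is
    reachable) suffices to show that every intermediate abstraction is
    a legible Option we can audit.
    """
    plan, state = [], start
    while state != goal:
        step = next(o for o in options if o.precondition == state)
        plan.append(step)
        state = step.postcondition
    return plan

options = [
    Option("pack_bags", "at home", "bags packed"),
    Option("drive_to_airport", "bags packed", "at airport"),
]
for step in hierarchical_plan("at home", "at airport", options):
    print(step.name, "->", step.postcondition)  # every step is directly readable
```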
Any black-box deep learning system is only used to provide low-level inputs to the real ‘intelligence’, in the same way that human vision provides low-level inputs to the rest of cognition. We don’t need to worry about the deep learning system “taking over”, just as we don’t worry about our vision module “taking over”.
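As a toy illustration of that interface boundary (again, all names are made up): the learned component’s only influence on the system is a small, typed output, and everything downstream is explicit code.

```python
import numpy as np

def perceive(image: np.ndarray) -> dict[str, bool]:
    """Stand-in for a black-box vision network. Its entire influence on
    the rest of the system is this low-dimensional, typed output."""
    return {"obstacle_ahead": bool(image.mean() > 0.5)}

def decide(percepts: dict[str, bool]) -> str:
    """All decision-making lives here, in explicit, auditable code, so
    the perception module has no channel through which to 'take over'."""
    return "stop" if percepts["obstacle_ahead"] else "go"

print(decide(perceive(np.zeros((8, 8)))))  # -> go
```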
The AI system was created by breakthroughs in algorithms for causal reasoning, hierarchical planning, etc., that allow it to deal with the combinatorial explosion of the real world. As a result, it is very cheap to run (i.e. it doesn’t need a huge amount of compute). This is more compatible with a discontinuous takeoff, though a continuous takeoff is possible if the algorithms improved gradually over time rather than arriving via breakthroughs.
Some implications of this model:
All of the “intelligence” happens via explicit algorithms, so we only need to make sure that those algorithms are aligned. Since there is no opaque learned component doing the core cognition, we only have an outer alignment problem; there is no inner alignment problem.
Since the system is mostly Bayesian, the main challenges are avoiding misspecification (solution: use something equivalent to the Solomonoff prior) and achieving computational efficiency (solution: keep a small set of hypotheses, detect when they fail to explain the data, and expand to a bigger class of hypotheses when they do; see the sketch below). You don’t have to worry about other forms of robustness, like adversarial examples.
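Here is a minimal sketch of that “expand on surprise” loop; the hypothesis classes, surprise threshold, and function names are all invented for illustration:

```python
def update(posterior, likelihood, obs):
    """One step of Bayes' rule over the current (small) hypothesis set."""
    unnorm = {h: p * likelihood(h, obs) for h, p in posterior.items()}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

def infer(observations, hypothesis_classes, likelihood, threshold=1e-3):
    """Track a small hypothesis class; when it fails to explain the data
    (running evidence drops too low), move to the next, bigger class.
    (A real system would re-fit the bigger class to all past data; we
    keep the sketch simple and just restart from a uniform prior.)"""
    level = 0
    hs = hypothesis_classes[level]
    posterior = {h: 1.0 / len(hs) for h in hs}
    evidence = 1.0  # running P(data seen so far | current class)
    for obs in observations:
        evidence *= sum(p * likelihood(h, obs) for h, p in posterior.items())
        if evidence < threshold and level + 1 < len(hypothesis_classes):
            level += 1
            hs = hypothesis_classes[level]
            posterior = {h: 1.0 / len(hs) for h in hs}
            evidence = 1.0
        posterior = update(posterior, likelihood, obs)
    return posterior

# Toy usage: coin biases. The small class {0.5} can't explain a long run
# of heads, so inference expands to the richer class and favors bias 0.9.
def lik(bias, obs):
    return bias if obs == "H" else 1 - bias

print(infer(["H"] * 12, [[0.5], [0.1, 0.5, 0.9]], lik))
```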