Technical Predictions Related to AI Safety
If you want to create a policy of nuclear safety then you need to understand:
- Manufacturing nuclear weapons requires access to uranium. 
- Uranium enrichment is a big, slow activity that’s difficult to hide from enemy nation-states. 
- Nuclear weapons are so expensive only major governments can afford them. 
- Missiles are the best way to deploy nuclear weapons. 
If you want to figure out a strategy for bioweapon safety then you need to understand that infectious biological agents are an asymmetric weapon. They are most likely to be intentionally deployed by non-state actors or as a last-ditch retaliation by states on the verge of collapse. A policy of bioweapon safety should defend against these attack vectors.
An analysis of AI safety must start by figuring out the capabilities and limitations of artificial intelligence.
Software: Past, Present and Future
There are three kinds of software.
- Traditional, declarative, hard-coded or procedural software is hand-coded. Though traditional software has been around for many decades, I consider it to have taken off with the proliferation of personal computers in the 1980s. 
- Machine learning (ML), including deep learning, minimizes an error function by throwing mountains of data at a high-parameter model. ML took off recently with the application of Nvidia GPUs to the training of neural networks. 
- Bayesian reasoning machines and artificial general intelligence (AGI) are science fiction. They won’t be forever. 
Traditional Software
Traditional software does exactly what a human being tells it to. Traditional software is based around a single-CPU von Neumann architecture. Modern computers parallelize software across multiple CPUs via multithreading or multiprocessing. Multiprocessing works fine when you’re running separate applications like LibreOffice Writer, LibreOffice Calc and LibreOffice Impress. It works badly when you’re trying to throw more compute at a single monolithic hand-coded program. You can reduce serial dependencies by writing your software in a functional paradigm, but even the best hand-coded stateless software tends to eventually hit a serial bottleneck.
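The serial bottleneck can be made concrete with Amdahl's law, a standard result that the text above does not name. The sketch below is illustrative only: the function names and the 95% figure are assumptions chosen for the example, not measurements of any real program.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when only `parallel_fraction` of a program's runtime
    can be spread across `workers` processors (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_ceiling(parallel_fraction: float) -> float:
    """Limit of the speedup as the number of workers grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


# Hypothetical program in which 95% of the work parallelizes perfectly.
for workers in (2, 16, 1024):
    print(workers, round(amdahl_speedup(0.95, workers), 2))
print("ceiling:", round(speedup_ceiling(0.95), 2))
```

Under these assumptions the remaining 5% of serial work caps the speedup at roughly 20x no matter how many CPUs are added.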
A second problem with traditional software, related to the serial bottleneck, is that human programmers are limited by how much complexity we can juggle. If no human can hold more than seven items in working memory, then a program that requires its author to hold eight items in mind at once cannot be written by a human being. It is difficult to hard-code software that requires juggling many variables at once.
A third problem with traditional software is that computers only speak math.
Machine Learning
Machine learning starts with a high-dimensional model of the world. The possibility space is too big to search exhaustively. Instead, machine learning usually uses a gradient descent algorithm.
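As a minimal sketch of what gradient descent does, the example below minimizes a two-parameter bowl-shaped function. Real ML models have millions or billions of parameters and a data-dependent error function; the function `bowl_gradient`, the learning rate and the step count here are all assumptions picked for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from `start`."""
    point = list(start)
    for _ in range(steps):
        g = grad(point)
        point = [p - learning_rate * gi for p, gi in zip(point, g)]
    return point


def bowl_gradient(point):
    """Gradient of the toy error function f(x, y) = (x - 3)^2 + (y + 1)^2."""
    x, y = point
    return [2 * (x - 3), 2 * (y + 1)]


# The search never enumerates the space; it only follows the local slope.
minimum = gradient_descent(bowl_gradient, start=[0.0, 0.0])
print(minimum)  # close to [3.0, -1.0], the minimum of the toy function
```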
The entropy (information content) of your training dataset must exceed the complexity of your model. A model’s complexity is measured by how many independent tunable parameters it has. The more tunable parameters a model has, the more data is required to train it. Machine learning is great for applications like self-driving cars where we have tons of data. Machine learning is bad at applications like low-frequency quantitative finance where results are long-tailed and your data is limited.
The scaling hypothesis is the idea that we’ll get superintelligence by throwing more data and compute at machine learning. I don’t doubt that scaling today’s ML systems will[1] get us good self-driving cars, computer-generated propaganda and robot armies. But scaling today’s ML systems cannot solve the small data problem because historical validation is an invalid measure of performance when long-tailed outcomes are involved. Big data approaches implicitly rely on historical validation.
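A toy calculation shows why historical validation misleads when outcomes are long-tailed. Everything below is hypothetical: the payoff table and the 50-period backtest are invented numbers, not data from any market.

```python
def historical_mean(returns):
    """Average return observed in a backtest."""
    return sum(returns) / len(returns)


def true_expected_return(outcomes):
    """Expected return given (probability, payoff) pairs."""
    return sum(probability * payoff for probability, payoff in outcomes)


# Hypothetical strategy: a small gain 99% of the time, a large loss 1% of the time.
outcomes = [(0.99, 1.0), (0.01, -200.0)]

# A 50-period backtest in which the rare loss happened not to occur.
backtest = [1.0] * 50

# Chance that a 50-period history contains no loss at all.
clean_history_probability = 0.99 ** 50

print(historical_mean(backtest))          # the backtest looks profitable
print(true_expected_return(outcomes))     # the strategy loses money on average
print(round(clean_history_probability, 3))
```

With these made-up numbers, a clean history is more likely than not, so the backtest endorses a strategy whose expected return is negative.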
Bayesian Reasoning Machines
Biological brains are proof it is possible to build an intelligent system that can learn from small data. We don’t know how biological brains accomplish this feat. If the scaling hypothesis is false then how else can we build a Bayesian reasoning machine? I see two possibilities.
- Biological brains execute an algorithm substantially different from the gradient descent used in today’s artificial neural networks (ANNs). 
- Biological brains have really good priors. 
There’s no doubt human brains have better priors than ANNs. How important are these priors? How much data would it take to teach them to a neural network? As an upper bound we can use one gigabyte, the length of the human genome[2]. One gigabyte is about two orders of magnitude smaller than both the number of parameters in GPT-3 and the size of the tokenized training dataset used to train GPT-3. We have enough data and compute to crush our evolved biological priors under a mountain of data.
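The arithmetic behind this upper bound can be checked directly. The sketch below uses the figures given in the comment thread (about 3 billion base pairs at two bits each) together with GPT-3's published parameter count of 175 billion. Comparing a byte count with a parameter count is loose on units, so treat the ratio as an order-of-magnitude estimate only.

```python
import math

BASE_PAIRS = 3_000_000_000         # approximate length of the human genome
BITS_PER_BASE_PAIR = 2             # four possible bases, so two bits each
GPT3_PARAMETERS = 175_000_000_000  # published GPT-3 parameter count

genome_bytes = BASE_PAIRS * BITS_PER_BASE_PAIR // 8
genome_gigabytes = genome_bytes / 1e9

# Parameters per genome byte: a rough, unit-mixing comparison.
ratio = GPT3_PARAMETERS / genome_bytes
orders_of_magnitude = math.log10(ratio)

print(genome_gigabytes)               # 0.75
print(round(ratio))                   # 233
print(round(orders_of_magnitude, 1))  # 2.4
```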
“Adequate data” and “adequate compute” only matter if you’re running an adequate algorithm. Biological brains seem to run a different, superior algorithm to the multilayer perceptron because they can learn from small data. I predict biological brains’ higher-level reasoning is built around a small number of simple algorithms and that most of our priors relate to stuff like who’s sexy, what kind of food tastes good and identifying snakes.
If my reasoning so far is true then we’re in an AI overhang.
Carbon vs Silicon
The most important trend in software is the end of Moore’s Law. For the last thirty years, Moore’s Law unlocked a series of disruptive innovations. Chaos favors the underdog. Traditional software is extremely capital-efficient. A kid writing software in his bedroom could take on the greatest industrial titans on Earth. The last thirty years have been unprecedentedly meritocratic toward the cognitive elite.
Machine learning is capital-intensive compared to traditional software. This will push away from meritocracy. However, if machine learning becomes commoditized via APIs then ML will push toward meritocracy and/or rent seeking instead of capital. I think machine learning will increase meritocracy in the short term but that the disruptive effects will become exhausted after a few decades.
What comes after machine learning? In the medium term, genetically-augmented intelligence is inevitable. But so too (might be) Bayesian reasoning machines.
Opt-In Eugenics
Genetic engineering is following an exponential trajectory similar to Moore’s Law. We already have genetically-modified agriculture. On the horizon are genetically-modified humans and artificial pandemics. Once biological engineering gets cheap, artificial pandemics will primarily be used by terrorists. They will kill people but they won’t be the greatest existential threat to civilization. I will instead focus on the application of genetic engineering to human beings.
Some people think it might be possible to cure aging. I’m bearish on this. The people I know who are interested in “curing aging” are mostly software developers. None of them are biologists. I hope I’m wrong. I hope curing aging is easy. But I am willing to bet it won’t happen by 2112. (I do not plan to sign up for cryonics either. I predict distant self-interested agents are more likely to resurrect me as a slave than as a master. [Post-scarcity utopia is wishful thinking too. By medieval standards, we already live in a post-scarcity society. This world is not a utopia.])
We will not be the beneficiaries of genetic engineering. That inheritance belongs to our children. We have already begun screening fetuses for genetic defects like Down syndrome. It’s just a matter of time until “a few diseases” becomes “risk factors” becomes a D&D Character Sheet. Genetic engineering is impossible to ban. Rich people will give birth on a space station if that’s what it takes to give their children the best genes. It won’t take long for human genetic engineering to go from “illegal” to “universal human right”.
Our engineered children will be taller, prettier, stronger and healthier than us. They’ll be smarter too. How much smarter? It shouldn’t be hard to bring their average up to what is now the top 1% of the population. But there is probably a limit to how hard you can feasibly push things in a single generation. An analysis of the Ashkenazi Jewish population seems to indicate there are nasty tradeoffs if you select too hard for intelligence. The technology will improve over time. Our children will be smarter than us, but they won’t be gods. Bayesian reasoning machines could easily surpass them.
Why is AI so hard?
Human beings are proof positive that a Bayesian reasoning machine can be created out of matter. Living cells are an awful substrate for computation. A Bayesian reasoning machine made out of silicon and copper would be much smarter.
Why haven’t we built one yet?
The limiting factor isn’t our ability to push atoms around. Transistors are a handful of atoms across. We are pushing the physical limits of what it is possible to manufacture. Quantum computing promises to hit the actual physical limits but that is overkill. Biological neurons compute classically. They don’t perform quantum computations. A quantum computer is not a prerequisite to building a Bayesian reasoner.
Training data isn’t the limiting factor either. We have more data available in computer-readable format than a human being can process in a thousand lifetimes.
If we aren’t limited by hardware and we aren’t limited by training data then we are limited by software. The human brain has better software than artificial neural networks.
Discussions of AI often talk about human-parity. Humans are weird animals. We seem to have suddenly evolved complex language about 50,000 years ago. Complex language set the stage for complex tools, agriculture and civilization. There’s not a big difference, genetically, between chimpanzees and humans. I think once you get an AI to nonhuman-animal-level intelligence you could push it to human-level and then beyond with just a few hacks. By this logic, if we can construct a rat-equivalent Bayesian reasoning machine out of silicon then we’re only a few small steps away from inventing a superintelligence.
How complicated is the algorithm for rat intelligence? I like to divide it into two parts:
- A flexible algorithm for general intelligence. 
- A basket of hard-coded algorithms for instinctual knowledge like “who is sexy” and “eek that is a snake”. 
A rat’s instinctual knowledge is infeasible to hand-code but I think we can devise an artificial substitute by combining several models trained via mere machine learning.
How complicated is a rat’s flexible algorithm for general intelligence? My gut instinct tells me it’s pretty simple. If this whole chain of logic holds up and the rat’s flexible algorithm for general intelligence is indeed simple then we’re in an AI overhang.
I could be wrong. If it turns out evolved biological priors are very important then we might be looking at a long takeoff where development of AGI takes centuries.
Extrapolation
A Bayesian reasoning machine’s advantage over deep learning is data efficiency. If I were building a Bayesian reasoning machine then I would optimize for data efficiency. The data efficiency of a particular algorithm is constant. Optimizing for data efficiency therefore implies architecture search.
Scaling up an individual architecture is expensive. To keep costs manageable, it is necessary to predict the Big-O performance of an algorithm as a function of its resource use. A capability predictor is part of the architecture search algorithm. If the architecture search algorithm is working then you will know when you’re coming close to superintelligence. If the architecture search is not working then you won’t have a superintelligence at all. The engineers building a superintelligence ought to have warning before the thing goes full Skynet.
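One way such a capability predictor might work is to fit a scaling curve to cheap small-scale runs and extrapolate it to a scale that has not been paid for yet. The sketch below assumes a power-law relationship between resource use and error, which is an assumption of this example rather than something the argument above establishes. The measurements are synthetic and the function names are hypothetical.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of loss = a * size**b, done in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_loss(a, b, size):
    """Extrapolate the fitted curve to a new resource budget."""
    return a * size ** b


# Synthetic small-scale measurements that follow loss = 2 * size**-0.5 exactly.
small_sizes = [1e3, 1e4, 1e5]
small_losses = [2.0 * s ** -0.5 for s in small_sizes]

a, b = fit_power_law(small_sizes, small_losses)
forecast = predict_loss(a, b, 1e8)  # a scale far beyond anything measured
print(round(a, 3), round(b, 3), forecast)
```

Real measurements are noisy and a curve can bend outside the fitted range, so a forecast like this is a warning signal rather than a guarantee.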
Architecture search is expensive. One way to cut costs is via caching. Caching works better when all your code is written functionally. Functional programming is a stateless software paradigm with noncyclic dependencies.
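Caching is safe for pure functions because the same inputs always produce the same output. A minimal sketch using Python's standard `functools.lru_cache` follows. The function `evaluate_architecture` and its scoring formula are hypothetical stand-ins for an expensive training-and-evaluation run.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def evaluate_architecture(layers: int, width: int) -> float:
    """Hypothetical stand-in for an expensive evaluation.

    The function is pure: its result depends only on its arguments,
    so a cached result can be reused without changing behavior.
    """
    return layers * width / (layers + width)


first = evaluate_architecture(4, 64)   # computed
second = evaluate_architecture(4, 64)  # served from the cache

stats = evaluate_architecture.cache_info()
print(first == second, stats.hits, stats.misses)
```

If the function read or wrote hidden state, the cached value could go stale, which is why caching pairs naturally with a stateless design.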
A Bayesian reasoning machine is fundamentally an extrapolation machine. An extrapolation machine has priors. The superintelligence’s priors depend on its internal technical details. A superintelligent rat brain emulator would behave differently from GPT-, which would in turn differ from the stateless architecture search contraption I outlined in the previous section. They would all behave similarly on simple problems. Their behavior would diverge as the questions get vaguer because underspecified prompts demand the most extrapolation.
There is no general solution to “how would a superintelligence answer an ambiguous question” because for every answer there exists a set of priors which would elicit it. Solving the control problem of a superintelligence in general is thus impossible. But we already knew this. It’s a corollary to the halting problem.
If we want to solve the AI control problem in a useful way then it is intractable to talk about superintelligences in general. We must limit our investigation to the subset of Bayesian reasoning machine architectures which are likely to actually get built.
Error-Entropy Minimizers
I think the only feasible way to build a Bayesian reasoning machine within the next few decades is via an architecture search that optimizes for data efficiency. This system is likely to be built functionally because that’s the only way to manage a project of this complexity, especially given the parallelization involved in scaling things. A stateless data compression algorithm is very different from a human being. It is neither an agent nor a world optimizer. It exists outside of time, the same way abstract mathematics does.
- A functional algorithm has no internal state. It has no concept of agency. It is an optimizer but it is not a world optimizer. A stateless algorithm cannot modify the state of the world because a stateless algorithm has no concept of state. 
- Entropy minimization maximizes interpretability. An error-entropy minimizer gives the simplest answer it can. A fundamental preference for simplicity is a powerful defense mechanism against unexpectedly complex behavior. 
While a powerful superintelligence in general is likely to turn the universe into paperclips, the superintelligence we are most likely to actually build is not. An error-entropy minimizer isn’t a genie. It’s just an extremely efficient, broadly generalizable data compression algorithm. I will use the term general compression algorithm to refer to this specific kind of Bayesian reasoning machine.
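To make the link between compression and finding structure concrete, the sketch below uses `zlib` from the Python standard library. `zlib` is a far weaker, fixed algorithm and only a stand-in here; it is not the general compression algorithm described above. The two inputs are invented for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


rng = random.Random(0)  # fixed seed so the example is repeatable

# 10,000 bytes with an obvious repeating pattern.
patterned = b"0123456789" * 1000

# 10,000 bytes of pseudorandom noise with no pattern to exploit.
noise = bytes(rng.getrandbits(8) for _ in range(10_000))

print(len(patterned), compressed_size(patterned))
print(len(noise), compressed_size(noise))
```

The patterned input shrinks dramatically because the compressor has in effect discovered its rule, while the noise barely shrinks at all. A compressor that could find far subtler regularities would, in the same sense, be modeling its data.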
A World with Powerful Data Compression
If I wrote a general compression algorithm the first thing I would do is plug it into the stock market and extract money. There is enough money available here to purchase whatever is needed to scale up the general compression algorithm.
The second thing I would do is create a public-facing API, which I would provide as cheaply as I could afford. The entropy of a dataset obeys the triangle inequality. A general compression algorithm derives its intelligence from transfer learning. Queries to the API constitute valuable training data. The result is a winner-take-all equilibrium.
If, at this point, the general compression algorithm is strictly superior to human beings then what happens next depends on the choices of a small number of individuals. Prediction is impossible beyond this event horizon.
If the general compression algorithm is not strictly superior to human beings then we’re likely to see a situation where routine jobs continue to get eliminated. Economic inequality continues to skyrocket.
[1] Semiconductor fabrication is an expensive, fragile industry. All my predictions about advances in computer technology are conditional on the global economic system continuing to exist.
[2] It is possible “DNA length” is an underestimate because much heritable information is contained outside of DNA (such as in mitochondria). On the other hand, most heritable information is related to stuff like metabolism. Most DNA does not encode priors about the world.
The scaling hypothesis says “whatever algorithm you think can solve the small data problem, something analogous will eventually be learned by a large enough neural net with enough data + compute, because solving the small data problem is useful for loss”.
Importantly, you don’t solve small data problems by running gradient descent on them. You solve them by taking your big pretrained neural network, providing the small data problem as an input to that neural network, and letting the forward passes of the neural network solve the problem, which works because those forward passes are executing similar algorithms to <the ones which actually work>.
What do you mean by “solving the small data problem is useful for loss”?
If you want to e.g. predict text on the Internet, you can do a better job of it if you can solve small data problems than if you can’t.
For example, in the following text (which I copied from here):
“Look carefully for the pattern, and then choose which pair of numbers comes next.
42 40 38 35 33 31 28
A. 25 22
B. 26 23
C. 26 24
D. 25 23
E. 26 22
Answer & Explanation:
Answer: Option”
You will do a better job at predicting the next token if you can learn the pattern from the given sequence of 7 numbers.
This is a very very small benefit in absolute terms, but once you get to very very large models that is the sort of thing you learn.
I expect a similar thing will be true for whichever small-data problems you have in mind (though they may require models that can have more context than GPT-3 can have).
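For reference, the regularity in the quoted puzzle can be written down as a hand-coded rule. This is only an illustration of the pattern a predictor would need to pick up from seven numbers; it is not a claim about how a neural network represents it. The function name and the hand-supplied period are assumptions of the sketch.

```python
def next_terms(sequence, period, count):
    """Extend `sequence`, assuming its differences repeat every `period` steps."""
    diffs = [b - a for a, b in zip(sequence, sequence[1:])]
    cycle = diffs[:period]
    # Check the assumed cycle against every observed difference.
    if any(d != cycle[i % period] for i, d in enumerate(diffs)):
        raise ValueError("differences do not repeat with the given period")
    result = list(sequence)
    for i in range(len(diffs), len(diffs) + count):
        result.append(result[-1] + cycle[i % period])
    return result[len(sequence):]


observed = [42, 40, 38, 35, 33, 31, 28]

# The differences are -2, -2, -3, repeating, so a period of 3 is assumed.
prediction = next_terms(observed, period=3, count=2)
print(prediction)  # [26, 24]
```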
Two small things: You want to refer to “cryonics”, not “cryogenics”, and the text you link is fiction (I also disagree with your assessment, but will have to take some time to type that up and make no promises of doing so :-)).
Thank you for the correction. I have replaced “cryogenics” with “cryonics”.
Nice, I wrote up some thoughts about the risk here. They possibly apply to you less due to high competence and therefore value as an emulation.
Two technical issues:
You say “As an upper bound we can use 200 gigabytes, the length of the human genome”. But the human genome actually consists of about 3 billion base pairs, each specifiable using two bits, so it’s about 0.75 gigabytes in size, even before taking account that it’s somewhat compressible, due to repeats and other redundancies.
You also say “The entropy (information content) of your training dataset must exceed the complexity of your model.” But actually it is typical for neural network models to have more parameters than there are numbers in the training data. Overfitting is avoided by methods such as “dropout” and “early stopping”. One could argue that these methods reduce the “effective” complexity of the model to less than the entropy of the training data, but if you do that, the statement verges on being tautological rather than substantive. For Bayesian learning methods, it is certainly not true that the complexity of the model must be limited to the entropy of the data set—at least in theory, a Bayesian model can be specified before you even know how much data you will have, and then doesn’t need to be modified based on how much data you actually end up with.
Thank you for correcting the size of the human genome. I have fixed the number.
My claim is indeed “that [early stopping] reduce[s] the ‘effective’ complexity of the model to less than the entropy of the training data”. I consider such big data methods to be in a data-inefficient category separate from Bayesian learning methods. Thus, “[f]or Bayesian learning methods, it is certainly not true that the complexity of the model must be limited to the entropy of the data set”.
Do you think it’s reasonable to push for rat-level AI before we can create a C. elegans-level AI?
I don’t know how to measure the intelligence of a C. elegans. If I could, it would come first.
I guess you don’t mean simulating the relevant parts of the rat brain in silico like OpenWorm, but “a rat-equivalent Bayesian reasoning machine out of silicon”, which is probably different.
Yes.