The 0.2 OOMs/year target

Cleo Nardo, 30 Mar 2023 18:15 UTC

TLDR: Humanity — which includes all nations, organisations, and individuals — should limit the growth rate of machine learning training runs from 2020 until 2050 to below 0.2 OOMs/year.

Paris Climate Accords

In the early 21st century, the climate movement converged around a “2°C target”, stated in Article 2(1)(a) of the Paris Climate Accords:

“Holding the increase in the global average temperature to well below 2°C above pre-industrial levels and pursuing efforts to limit the temperature increase to 1.5°C above pre-industrial levels, recognizing that this would significantly reduce the risks and impacts of climate change;” (source)

The 2°C target helped facilitate coordination between nations, organisations, and individuals.

It provided a clear, measurable goal.
It provided a sense of urgency and severity.
It promoted a sense of shared responsibility.
It established common knowledge of stakeholder goals.
It helped to align efforts across different stakeholders.
It signalled a technical, practical mindset for solving the problem.
It created a shared understanding of what success would look like.

The 2°C target was the first step towards coordination, not the last step.

The AI governance community should converge around a similar target.

0.2 OOMs/year target

I propose a fixed target of 0.2 OOMs/year. “OOM” stands for “order of magnitude” and corresponds to a ten-fold increase, so 0.2 OOMs/year corresponds to roughly 58% year-on-year growth. The 0.2 OOMs/year figure was recently suggested by Jaime Sevilla, which prompted me to write this article.

I do not propose any specific policy for achieving the 0.2 OOMs/year target, because the purpose of the target is to unify stakeholders even if they support different policies.
I do not propose any specific justification for the 0.2 OOMs/year target, because the purpose of the target is to unify stakeholders even if they have different justifications.

Here is the statement:

“Humanity — which includes all nations, organisations, and individuals — should limit the growth rate of machine learning training runs from 2020 until 2050 to below 0.2 OOMs/year.”

The statement is intentionally ambiguous about how to measure “the growth rate of machine learning training runs”. I suspect that a good proxy metric would be the effective training footprint (defined below) but I don’t think the proxy metric should be included in the statement of the target itself.

Effective training footprint

What is the effective training footprint?

The effective training footprint, measured in FLOPs, is one proxy metric for the size of a machine learning training run, and so for the growth rate of training runs over time. The footprint of a model is defined, with caveats, as the total number of FLOPs used to train the model since initialisation.

Caveats:

A randomly initialised model has a footprint of 0 FLOPs.
If the model is trained from a randomly initialised model using SGD or a variant, then its footprint is the total number of FLOPs used in the training process.
If a pre-trained base model is used for the initialisation of another training process (such as unsupervised learning, supervised learning, fine-tuning, or reinforcement learning), then the footprint of the resulting model will include the footprint of the pre-trained model.
If multiple models are composed to form a single cohesive model, then the footprint of the resulting model is the sum of the footprints of each component model.
If there is a major algorithmic innovation which divides by a factor of $r$ the FLOPs required to train a model to a particular score on downstream tasks, then the footprint of models trained with that innovation is multiplied by the same factor $r$.
This list of caveats to the definition of Effective Training Footprint is non-exhaustive. Future consultations may yield additional caveats, or replace Effective Training Footprint with an entirely different proxy metric.
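The caveats above can be read as bookkeeping rules. Here is a minimal Python sketch of that bookkeeping, not a definitive implementation. The names `Model`, `random_init`, `train`, `compose`, and `efficiency_factor` are hypothetical, and the sketch assumes that the factor $r$ applies only to FLOPs spent using the innovation, which the caveat does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """A model together with its effective training footprint in FLOPs."""
    name: str
    footprint_flops: float


def random_init(name: str) -> Model:
    # A randomly initialised model has a footprint of 0 FLOPs.
    return Model(name, 0.0)


def train(base: Model, name: str, raw_flops: float,
          efficiency_factor: float = 1.0) -> Model:
    """Train on top of `base` using `raw_flops` FLOPs.

    The result inherits the footprint of `base` (pre-training, fine-tuning,
    RL, and so on all accumulate). `efficiency_factor` plays the role of r:
    ASSUMPTION: it scales only the FLOPs spent in this training step.
    """
    if raw_flops < 0 or efficiency_factor <= 0:
        raise ValueError("raw_flops must be >= 0 and efficiency_factor > 0")
    return Model(name, base.footprint_flops + efficiency_factor * raw_flops)


def compose(name: str, *components: Model) -> Model:
    # A composite model's footprint is the sum of its components' footprints.
    return Model(name, sum(c.footprint_flops for c in components))


# Illustrative numbers only; none of these are real training runs.
base = train(random_init("init"), "base", raw_flops=1.0e24)
tuned = train(base, "tuned", raw_flops=2.0e22)
efficient = train(random_init("init"), "efficient",
                  raw_flops=5.0e23, efficiency_factor=2.0)
pipeline = compose("pipeline", tuned, efficient)

for m in (base, tuned, efficient, pipeline):
    print(f"{m.name}: {m.footprint_flops:.2e} FLOPs")
```

In this toy example the fine-tuned model counts the base model's FLOPs as well as its own, and the run with a 2x algorithmic improvement is charged twice its raw FLOPs.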

Fixing the y-axis

According to the 0.2 OOMs/year target, there cannot exist an ML model during the year $2022 + x$ with a footprint exceeding $f(x)$ FLOPs, where $f(x + 1) = 10^{0.2} \times f(x)$. Taking $\log_{10}$ of both sides gives $\log_{10} f(x + 1) = \log_{10} f(x) + 0.2$, so $\log_{10} f(x) = 0.2 x + a$ for some fixed constant $a$.
If we consult EpochAI’s plot of compute training runs during the large-scale era of ML, we see that footprints have been growing at approximately 0.5 OOMs/year.
We can use this trend to fix the value of $a$. In 2022 (that is, at $x = 0$), the footprint was approximately 1.0e+24 FLOPs. Therefore $a = 24$.
In other words, $\log_{10} f(x) = 0.2 x + 24$.
I have used 2022 as an anchor to fix the y-intercept $a$. If I had used an earlier date then the 0.2 OOMs/yr target would’ve been stricter, and if I had used a later date then the 0.2 OOMs/yr target would’ve been laxer. If the y-intercept for the constraint is fixed to the day of the negotiation (the default Schelling date), then stakeholders who want a laxer constraint are incentivised to delay negotiation. To avoid that hazard, I have picked “January 1st 2022” to fix the y-intercept. I declare 1/1/2022 to be the Schelling date for the 0.2 OOMs/year target.
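To see how much the choice of anchor date matters, here is a small sketch. It assumes, as an idealisation, that the frontier follows the 0.5 OOMs/year trend exactly and passes through 1.0e+24 FLOPs in 2022. The function names are illustrative.

```python
TARGET_RATE = 0.2    # OOMs/year, the proposed cap on growth
TREND_RATE = 0.5     # OOMs/year, approximate historical trend quoted above
ANCHOR_YEAR = 2022   # Schelling date used in this post
ANCHOR_LOG10 = 24.0  # log10 of the 2022 footprint in FLOPs


def trend_log10(year: float) -> float:
    """Idealised frontier footprint (log10 FLOPs).

    ASSUMPTION: the 0.5 OOMs/year trend holds exactly.
    """
    return ANCHOR_LOG10 + TREND_RATE * (year - ANCHOR_YEAR)


def limit_log10(year: float, anchor_year: float = ANCHOR_YEAR) -> float:
    """Cap (log10 FLOPs) in `year` when the 0.2 OOMs/year line is pinned
    to the trend at `anchor_year`."""
    return trend_log10(anchor_year) + TARGET_RATE * (year - anchor_year)


# Compare the cap in 2030 under three different anchor dates.
for anchor in (2020, 2022, 2024):
    print(anchor, round(limit_log10(2030, anchor), 2))
```

Under these assumptions the 2030 cap is 25.0, 25.6, and 26.2 (log10 FLOPs) for anchors of 2020, 2022, and 2024 respectively. Each year of delay loosens the cap by 0.5 − 0.2 = 0.3 OOMs, which is the incentive to delay described above.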

Year-by-year limits

In year $Y$, all models must have a log10-footprint below $24 + 0.2 \times (Y - 2022)$.

Year	log10 of maximum training footprint (FLOPs)	Maximum training footprint (FLOPs)
2020	23.6	3.98E+23
2021	23.8	6.31E+23
2022	24.0	1.00E+24
2023	24.2	1.58E+24
2024	24.4	2.51E+24
2025	24.6	3.98E+24
2026	24.8	6.31E+24
2027	25.0	1.00E+25
2028	25.2	1.58E+25
2029	25.4	2.51E+25
2030	25.6	3.98E+25
2031	25.8	6.31E+25
2032	26.0	1.00E+26
2033	26.2	1.58E+26
2034	26.4	2.51E+26
2035	26.6	3.98E+26
2036	26.8	6.31E+26
2037	27.0	1.00E+27
2038	27.2	1.58E+27
2039	27.4	2.51E+27
2040	27.6	3.98E+27
2041	27.8	6.31E+27
2042	28.0	1.00E+28
2043	28.2	1.58E+28
2044	28.4	2.51E+28
2045	28.6	3.98E+28
2046	28.8	6.31E+28
2047	29.0	1.00E+29
2048	29.2	1.58E+29
2049	29.4	2.51E+29
2050	29.6	3.98E+29
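The table can be regenerated from the formula $24 + 0.2 \times (Y - 2022)$. Here is a minimal sketch; the function names are illustrative, and the constants are the ones fixed in the previous section.

```python
ANCHOR_YEAR = 2022   # Schelling date
ANCHOR_LOG10 = 24.0  # log10 of the permitted footprint in the anchor year
RATE = 0.2           # OOMs/year


def max_log10_footprint(year: int) -> float:
    """log10 of the maximum permitted footprint (FLOPs) in `year`."""
    return ANCHOR_LOG10 + RATE * (year - ANCHOR_YEAR)


def table_rows(first_year: int = 2020, last_year: int = 2050):
    """Return (year, log10 cap, cap in FLOPs) with the table's formatting."""
    rows = []
    for year in range(first_year, last_year + 1):
        log10_cap = max_log10_footprint(year)
        rows.append((year, f"{log10_cap:.1f}", f"{10 ** log10_cap:.2E}"))
    return rows


# Print tab-separated rows in the same layout as the table above.
for year, log10_cap, cap in table_rows():
    print(f"{year}\t{log10_cap}\t{cap}")
```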

Implications of the 0.2 OOMs/year target

Because $10^{0.2} \approx 1.58$, the maximum footprint would grow by roughly 58% every year.
0.2 OOMs/year is equivalent to a doubling time of roughly 18 months.
Every decade, the maximum permissible footprint increases by a factor of 100.
0.2 OOMs/year was the pre-AlexNet growth rate in ML systems.
The current growth rate is 0.5 OOMs/year, which is 2.5 times faster than the target rate.
At the current 0.5 OOMs/year growth rate, after 10 years we would have ML training runs which are 100,000x larger than existing training runs. Under the 0.2 OOMs/year growth rate, this growth would be spread over 25 years instead.
Comparing 0.2 OOMs/year target to hardware growth-rates:
- Moore’s Law states that the number of transistors per integrated circuit doubles roughly every 2 years.
- Koomey’s Law states that the FLOPs-per-Joule doubled roughly every 1.57 years until 2000, whereupon it began doubling roughly every 2.6 years.
- Huang’s Law states that the growth-rate of GPU performance exceeds that of CPU performance. This is a somewhat dubious claim, but nonetheless I think the doubling time of GPUs is longer than 18 months.
- In general, the 0.2 OOMs/year target is faster than the current hardware growth-rate.
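The conversions in this list can be checked with a few lines of Python. This is a minimal sketch with illustrative helper names. Huang’s Law is left out because no doubling time is quoted for it.

```python
import math


def oom_rate_to_doubling_time(ooms_per_year: float) -> float:
    """Years needed to double at a growth rate given in OOMs/year."""
    return math.log10(2) / ooms_per_year


def doubling_time_to_oom_rate(years_to_double: float) -> float:
    """Growth rate in OOMs/year implied by a doubling time in years."""
    return math.log10(2) / years_to_double


target = 0.2  # OOMs/year
months = 12 * oom_rate_to_doubling_time(target)
print(f"target doubling time: {months:.1f} months")
print(f"annual growth at target: {10 ** target - 1:.0%}")
print(f"growth per decade at target: {10 ** (10 * target):.0f}x")

# Doubling times as quoted in the list above.
hardware_doubling_years = {
    "Moore's Law (2 years)": 2.0,
    "Koomey's Law before 2000 (1.57 years)": 1.57,
    "Koomey's Law after 2000 (2.6 years)": 2.6,
}
for name, years in hardware_doubling_years.items():
    print(f"{name}: {doubling_time_to_oom_rate(years):.3f} OOMs/year")
```

All three hardware rates come out below 0.2 OOMs/year (roughly 0.15, 0.19, and 0.12), which is consistent with the claim that the target is faster than hardware growth.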
On March 14 2023, OpenAI released GPT-4, which was trained with an estimated 2.8e+25 FLOPs. If OpenAI had followed the 0.2 OOMs/year target, then GPT-4 would’ve been released on March 29 2029. This is because if $Y = 2029.24$ then $24 + 0.2 \times (Y - 2022) = \log_{10} (2.8 \times 10^{25})$.
The 0.2 OOMs/year target would therefore be an effective moratorium on models exceeding GPT-4 until 2029. Nonetheless, the moratorium would still allow an AI Summer Harvest, in which the impact of ChatGPT-3.5/4 steadily diffuses across the economy until a new general equilibrium is reached where...

People have more money to spend.
The products and services are more abundant, cheaper, and of a higher quality.
People have more leisure to enjoy themselves.

Exceeding the 0.2 OOMs/year target would yield little socio-economic benefit, because 5–10 years is the timescale over which the economy (and society at large) can adapt to socio-economic shocks on the scale of ChatGPT-3.5.
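The GPT-4 date quoted earlier in this section comes from inverting the limit formula: solve $24 + 0.2 \times (Y - 2022) = \log_{10}(\text{footprint})$ for $Y$. Here is a minimal sketch with illustrative function names. It treats the cap as a continuous function of time, and the 2.8e+25 FLOPs figure is the estimate quoted above.

```python
import math
from datetime import date, timedelta

ANCHOR_YEAR = 2022
ANCHOR_LOG10 = 24.0
RATE = 0.2  # OOMs/year


def earliest_permitted_year(footprint_flops: float) -> float:
    """Fractional year at which the cap first reaches `footprint_flops`."""
    return ANCHOR_YEAR + (math.log10(footprint_flops) - ANCHOR_LOG10) / RATE


def fractional_year_to_date(year: float) -> date:
    """Convert e.g. 2029.24 to a calendar date, counting from 1 January.

    Fractions of a day are dropped when adding to a `date`.
    """
    whole = math.floor(year)
    start = date(whole, 1, 1)
    days_in_year = (date(whole + 1, 1, 1) - start).days
    return start + timedelta(days=(year - whole) * days_in_year)


gpt4_year = earliest_permitted_year(2.8e25)
print(round(gpt4_year, 2), fractional_year_to_date(gpt4_year))
```

The solution rounds to $Y = 2029.24$. Feeding the rounded value into `fractional_year_to_date` gives 29 March 2029, while the unrounded value lands a day earlier; either way the answer is late March 2029.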
