[AN #121]: Forecasting transformative AI timelines using biological anchors


Align­ment Newslet­ter is a weekly pub­li­ca­tion with re­cent con­tent rele­vant to AI al­ign­ment around the world. Find all Align­ment Newslet­ter re­sources here. In par­tic­u­lar, you can look through this spread­sheet of all sum­maries that have ever been in the newslet­ter.

Au­dio ver­sion here (may not be up yet).

Draft re­port on AI timelines (Ajeya Co­tra) (sum­ma­rized by Ro­hin): Once again, we have a piece of work so large and de­tailed that I need a whole newslet­ter to sum­ma­rize it! This time, it is a quan­ti­ta­tive model for fore­cast­ing when trans­for­ma­tive AI will hap­pen. Note that since this is still a draft re­port, the num­bers are in flux; the num­bers be­low were taken when I read it and may no longer be up to date.

The over­all framework

The key as­sump­tion be­hind this model is that if we train a neu­ral net or other ML model that uses about as much com­pu­ta­tion as a hu­man brain, that will likely re­sult in trans­for­ma­tive AI (TAI) (defined as AI that has an im­pact com­pa­rable to that of the in­dus­trial rev­olu­tion). In other words, we an­chor our es­ti­mate of the ML model’s in­fer­ence com­pu­ta­tion to that of the hu­man brain. This as­sump­tion al­lows us to es­ti­mate how much com­pute will be re­quired to train such a model us­ing 2020 al­gorithms. By in­cor­po­rat­ing a trend ex­trap­o­la­tion of how al­gorith­mic progress will re­duce the re­quired amount of com­pute, we can get a pre­dic­tion of how much com­pute would be re­quired for the fi­nal train­ing run of a trans­for­ma­tive model in any given year.

We can also get a pre­dic­tion of how much com­pute will be available by pre­dict­ing the cost of com­pute in a given year (which we have a de­cent amount of past ev­i­dence about), and pre­dict­ing the max­i­mum amount of money an ac­tor would be will­ing to spend on a sin­gle train­ing run. The prob­a­bil­ity that we can train a trans­for­ma­tive model in year Y is then just the prob­a­bil­ity that the com­pute re­quire­ment for year Y is less than the com­pute available in year Y.

The vast ma­jor­ity of the re­port is fo­cused on es­ti­mat­ing the amount of com­pute re­quired to train a trans­for­ma­tive model us­ing 2020 al­gorithms (where most of our un­cer­tainty would come from); the re­main­ing fac­tors are es­ti­mated rel­a­tively quickly with­out too much de­tail. I’ll start with those so that you can have them as back­ground knowl­edge be­fore we delve into the real meat of the re­port. Th­ese are usu­ally mod­eled as lo­gis­tic curves in log space: that is, they are mod­eled as im­prov­ing at some con­stant rate, but will level off and sat­u­rate at some max­i­mum value af­ter which they won’t im­prove.

Al­gorith­mic progress

First off, we have the impact of algorithmic progress. AI and Efficiency (AN #99) estimates that algorithms improve enough to halve the compute needed to reach a fixed level of performance every 16 months. However, this was measured on ImageNet, where researchers are directly optimizing for reduced computation costs. It seems less likely that researchers are doing as good a job at reducing computation costs for “training a transformative model”, and so the author increases the halving time to 2-3 years, with a maximum total improvement of somewhere between 1 and 5 orders of magnitude (with the assumption that the higher the “technical difficulty” of the problem, the more algorithmic progress is possible).
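As a rough illustration of the halving-time arithmetic above, the sketch below computes the factor by which algorithmic progress would shrink the 2020 compute requirement by a given year. The function name, the 2.5-year halving time (the midpoint of the 2-3 year range) and the 3 order-of-magnitude cap (one value inside the 1-5 range) are assumptions chosen for illustration, and the hard cap is a simplification of the report's smooth logistic curve.

```python
import math

def algorithmic_reduction_factor(year, halving_time_years=2.5, max_oom=3.0, base_year=2020):
    """Factor (<= 1) by which the compute requirement shrinks relative to base_year.

    Assumption: a constant halving time with a hard cap at max_oom orders of
    magnitude, which only approximates a logistic curve in log space.
    """
    halvings = max(0.0, (year - base_year) / halving_time_years)
    oom_reduction = min(max_oom, halvings * math.log10(2))
    return 10 ** (-oom_reduction)

for year in (2020, 2030, 2050, 2100):
    print(year, algorithmic_reduction_factor(year))
```

With these illustrative settings the cap binds after roughly 25 years, since 3 orders of magnitude is about 10 halvings.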

Cost of compute

Second, we need to estimate a trend for compute costs. There has been some prior work on this (summarized in AN #97). The report has some similar analyses, and ends up estimating that the amount of compute a dollar can buy doubles every 2.5 years, with a (very unstable) maximum improvement of a factor of 2 million by 2100.

Willing­ness to spend

Third, we would like to know the max­i­mum amount (in 2020 dol­lars) any ac­tor might spend on a sin­gle train­ing run. Note that we are es­ti­mat­ing the money spent on a fi­nal train­ing run, which doesn’t in­clude the cost of ini­tial ex­per­i­ments or the cost of re­searcher time. Cur­rently, the au­thor es­ti­mates that all-in pro­ject costs are 10-100x larger than the fi­nal train­ing run cost, but this will likely go down to some­thing like 2-10x, as the in­cen­tive for re­duc­ing this ra­tio be­comes much larger.

The au­thor es­ti­mates that the most ex­pen­sive run in a pub­lished pa­per was the fi­nal AlphaS­tar (AN #43) train­ing run, at ~1e23 FLOP and $1M cost. How­ever, there have prob­a­bly been un­pub­lished re­sults that are slightly more ex­pen­sive, maybe $2-8M. In line with AI and Com­pute (AN #7), this will prob­a­bly in­crease dra­mat­i­cally to about $1B in 2025.

Given that AI com­pa­nies each have around $100B cash on hand, and could po­ten­tially bor­row ad­di­tional sev­eral hun­dreds of billions of dol­lars (given their cur­rent mar­ket caps and likely growth in the wor­lds where AI still looks promis­ing), it seems likely that low hun­dreds of billions of dol­lars could be spent on a sin­gle run by 2040, cor­re­spond­ing to a dou­bling time (from $1B in 2025) of about 2 years.
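The doubling-time claim can be checked with a few lines of arithmetic. This is only a sketch of constant exponential growth from the $1B-in-2025 starting point; the function name is made up for illustration.

```python
def spend_on_final_run(year, base_year=2025, base_spend=1e9, doubling_time_years=2.0):
    """Dollars spent on the largest single training run under constant doubling."""
    return base_spend * 2 ** ((year - base_year) / doubling_time_years)

print(f"2040 spend: ${spend_on_final_run(2040):.3g}")
```

Fifteen years at a 2-year doubling time is 7.5 doublings, a factor of about 181, which lands at roughly $181B: the "low hundreds of billions" mentioned above.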

To es­ti­mate the max­i­mum here, we can com­pare to megapro­jects like the Man­hat­tan Pro­ject or the Apollo pro­gram, which sug­gests that a gov­ern­ment could spend around 0.75% of GDP for ~4 years. Since trans­for­ma­tive AI will likely be more valuable eco­nom­i­cally and strate­gi­cally than these pre­vi­ous pro­grams, we can shade that up­wards to 1% of GDP for 5 years. As­sum­ing all-in costs are 5x that of the fi­nal train­ing run, this sug­gests the max­i­mum will­ing­ness to spend should be 1% of GDP of the largest coun­try, which we as­sume grows at ~3% ev­ery year.
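To see why 1% of GDP for 5 years turns into a ceiling of 1% of one year's GDP, here is a minimal sketch. It assumes GDP is roughly flat over the 5 project years (a simplification), and both function names and the GDP input are placeholders rather than figures from the report.

```python
def max_final_run_spend(gdp_in_year, gdp_fraction=0.01, project_years=5, all_in_multiplier=5):
    """Ceiling on final-run spending, in the same currency units as gdp_in_year."""
    all_in_budget = gdp_fraction * gdp_in_year * project_years  # whole megaproject
    return all_in_budget / all_in_multiplier                    # share left for the final run

def projected_gdp(gdp_2020, year, growth_rate=0.03):
    """GDP of the largest country under the assumed ~3% yearly growth."""
    return gdp_2020 * (1 + growth_rate) ** (year - 2020)

# With GDP normalized to 100 units, the ceiling is 1 unit, i.e. 1% of yearly GDP.
print(max_final_run_spend(100.0))
```

The 5 project years and the 5x all-in multiplier cancel, which is why the ceiling is simply 1% of annual GDP.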

Strat­egy for es­ti­mat­ing train­ing com­pute for a trans­for­ma­tive model

In addition to the three factors of algorithmic progress, cost of compute, and willingness to spend, we need an estimate of how much computation would be needed to train a transformative model using 2020 algorithms (which I’ll discuss next). Then, at year Y, the compute required is given by (computation needed with 2020 algorithms) * (improvement factor from algorithmic progress), which (in this report) is a probability distribution. At year Y, the compute available is given by (FLOP per dollar, aka compute cost) * (money that can be spent), which (in this report) is a point estimate. We can then simply read off the probability that the compute required is less than the compute available.
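The comparison described above can be sketched as follows: draw samples from the distribution over required compute, compute the single number for available compute, and count how often the requirement is covered. The function name and the sample values are hypothetical; only the ~1e17 FLOP per dollar figure is derived from the AlphaStar numbers quoted earlier (~1e23 FLOP for ~$1M).

```python
import math

def prob_enough_compute(log10_required_samples, flop_per_dollar, dollars_available):
    """Fraction of sampled compute requirements covered by the available compute.

    log10_required_samples: draws from the distribution over log10(FLOP required)
    in a given year; the draws used below are made up for illustration.
    """
    log10_available = math.log10(flop_per_dollar * dollars_available)
    covered = sum(1 for r in log10_required_samples if r <= log10_available)
    return covered / len(log10_required_samples)

hypothetical_samples = [24.0, 27.0, 30.0, 36.0]
print(prob_enough_compute(hypothetical_samples, flop_per_dollar=1e17, dollars_available=1e9))
```

Repeating this for each year Y, with the requirement shrinking through algorithmic progress and the available compute growing, traces out the forecast curve.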

Okay, so the last thing we need is a distribution over the amount of computation that would be needed to train a transformative model using 2020 algorithms, which is the main focus of this report. There is a lot of detail here that I’m going to gloss over, especially in talking about the distribution as a whole (whereas I will focus primarily on the median case for simplicity). As I mentioned early on, the key hypothesis is that we will need to train a neural net or other ML model that uses about as much compute as a human brain. So the strategy will be to first translate from “compute of human brain” to “inference compute of neural net”, and then to translate from “inference compute of neural net” to “training compute of neural net”.

How much in­fer­ence com­pute would a trans­for­ma­tive model use?

We can talk about the rate at which synapses fire in the hu­man brain. How can we con­vert this to FLOP? The au­thor pro­poses the fol­low­ing hy­po­thet­i­cal: sup­pose we redo evolu­tion­ary his­tory, but in ev­ery an­i­mal we re­place each neu­ron with N float­ing-point units that each perform 1 FLOP per sec­ond. For what value of N do we still get roughly hu­man-level in­tel­li­gence over a similar evolu­tion­ary timescale? The au­thor then does some calcu­la­tions about simu­lat­ing synapses with FLOPs, draw­ing heav­ily on the re­cent re­port on brain com­pu­ta­tion (AN #118), to es­ti­mate that N would be around 1-10,000, which af­ter some more calcu­la­tions sug­gests that the hu­man brain is do­ing the equiv­a­lent of 1e13 − 1e16 FLOP per sec­ond, with a me­dian of 1e15 FLOP per sec­ond, and a long tail to the right.

Does this mean we can say that a transformative model will use 1e15 FLOP per second during inference? Such a model would have a clear flaw: even though we are assuming that algorithmic progress reduces compute costs over time, if we did the same analysis in e.g. 1980, we’d get the same estimate for the compute cost of a transformative model, which would imply that there was no algorithmic progress between 1980 and 2020! The problem is that we’d always estimate the brain as using 1e15 FLOP per second (or around there), but for our ML models there is a difference between FLOP per second using 2020 algorithms and FLOP per second using 1980 algorithms. So how do we convert from “brain FLOP per second” to “inference FLOP per second for 2020 ML algorithms”?

One ap­proach is to look at how other ma­chines we have de­signed com­pare to the cor­re­spond­ing ma­chines that evolu­tion has de­signed. An anal­y­sis by Paul Chris­ti­ano con­cluded that hu­man-de­signed ar­ti­facts tend to be 2-3 or­ders of mag­ni­tude worse than those de­signed by evolu­tion, when con­sid­er­ing en­ergy us­age. Pre­sum­ably a similar anal­y­sis done in the past would have re­sulted in higher num­bers and thus wouldn’t fall prey to the prob­lem above. Another ap­proach is to com­pare ex­ist­ing ML mod­els to an­i­mals with a similar amount of com­pu­ta­tion, and see which one is sub­jec­tively “more im­pres­sive”. For ex­am­ple, the AlphaS­tar model uses about as much com­pu­ta­tion as a bee brain, and large lan­guage mod­els use some­what more; the au­thor finds it rea­son­able to say that AlphaS­tar is “about as so­phis­ti­cated” as a bee, or that GPT-3 (AN #102) is “more so­phis­ti­cated” than a bee.

We can also look at some ab­stract con­sid­er­a­tions. Nat­u­ral se­lec­tion had a lot of time to op­ti­mize brains, and nat­u­ral ar­ti­facts are usu­ally quite im­pres­sive. On the other hand, hu­man de­sign­ers have the benefit of in­tel­li­gent de­sign and can copy the pat­terns that nat­u­ral se­lec­tion has come up with. Over­all, these con­sid­er­a­tions roughly bal­ance each other out. Another im­por­tant con­sid­er­a­tion is that we’re only pre­dict­ing what would be needed for a model that was good at most tasks that a hu­man would cur­rently be good at (think a vir­tual per­sonal as­sis­tant), whereas evolu­tion op­ti­mized for a whole bunch of other skills that were needed in the an­ces­tral en­vi­ron­ment. The au­thor sub­jec­tively guesses that this should re­duce our es­ti­mate of com­pute costs by about an or­der of mag­ni­tude.

Over­all, putting all these con­sid­er­a­tions to­gether, the au­thor in­tu­itively guesses that to con­vert from “brain FLOP per sec­ond” to “in­fer­ence FLOP per sec­ond for 2020 ML al­gorithms”, we should add an or­der of mag­ni­tude to the me­dian, and add an­other two or­ders of mag­ni­tude to the stan­dard de­vi­a­tion to ac­count for our large un­cer­tainty. This re­sults in a me­dian of 1e16 FLOP per sec­ond for the in­fer­ence-time com­pute of a trans­for­ma­tive model.

Train­ing com­pute for a trans­for­ma­tive model

We might ex­pect a trans­for­ma­tive model to run a for­ward pass 0.1 − 10 times per sec­ond (which on the high end would match hu­man re­ac­tion time of 100ms), and for each pa­ram­e­ter of the neu­ral net to con­tribute 1-100 FLOP per for­ward pass, which im­plies that if the in­fer­ence-time com­pute is 1e16 FLOP per sec­ond then the model should have 1e13 − 1e17 pa­ram­e­ters, with a me­dian of 3e14 pa­ram­e­ters.
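The parameter range follows from dividing inference compute by the FLOP each parameter costs per second. A small sketch, with a made-up function name:

```python
def parameter_count(inference_flop_per_sec, passes_per_sec, flop_per_param_per_pass):
    """Parameters implied by: FLOP/s = parameters * FLOP per parameter per pass * passes/s."""
    return inference_flop_per_sec / (passes_per_sec * flop_per_param_per_pass)

# Endpoints of the ranges quoted above, at the median 1e16 FLOP per second.
fewest = parameter_count(1e16, passes_per_sec=10, flop_per_param_per_pass=100)
most = parameter_count(1e16, passes_per_sec=0.1, flop_per_param_per_pass=1)
print(f"{fewest:.0e} to {most:.0e} parameters")
```

This reproduces only the 1e13 to 1e17 range. The 3e14 median comes from the author's distributions over these quantities, which this sketch does not model.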

We now need to es­ti­mate how much com­pute it takes to train a trans­for­ma­tive model with 3e14 pa­ram­e­ters. We as­sume this is dom­i­nated by the num­ber of times you have to run the model dur­ing train­ing, or equiv­a­lently, the num­ber of data points you train on times the num­ber of times you train on each data point. (In par­tic­u­lar, this as­sumes that the cost of ac­quiring data is neg­ligible in com­par­i­son. The re­port ar­gues for this as­sump­tion; for the sake of brevity I won’t sum­ma­rize it here.)

For this, we need a re­la­tion­ship be­tween pa­ram­e­ters and data points, which we’ll as­sume will fol­low a power law KP^α, where P is the num­ber of pa­ram­e­ters and K and α are con­stants. A large num­ber of ML the­ory re­sults im­ply that the num­ber of data points needed to reach a speci­fied level of ac­cu­racy grows lin­early with the num­ber of pa­ram­e­ters (i.e. α=1), which we can take as a weak prior. We can then up­date this with em­piri­cal ev­i­dence from pa­pers. Scal­ing Laws for Neu­ral Lan­guage Models (AN #87) sug­gests that for lan­guage mod­els, data re­quire­ments scale as α=0.37 or as α=0.74, de­pend­ing on what mea­sure you look at. Mean­while, Deep Learn­ing Scal­ing is Pre­dictable, Em­piri­cally sug­gests that α=1.39 for a wide va­ri­ety of su­per­vised learn­ing prob­lems (in­clud­ing lan­guage mod­el­ing). How­ever, the former pa­per stud­ies a more rele­vant set­ting: it in­cludes reg­u­lariza­tion, and asks about the num­ber of data points needed to reach a tar­get ac­cu­racy, whereas the lat­ter pa­per ig­nores reg­u­lariza­tion and asks about the min­i­mum num­ber of data points that the model can­not overfit to. So over­all the au­thor puts more weight on the former pa­per and es­ti­mates a me­dian of α=0.8, though with sub­stan­tial un­cer­tainty.

We also need to es­ti­mate how many epochs will be needed, i.e. how many times we train on any given data point. The au­thor de­cides not to ex­plic­itly model this fac­tor since it will likely be close to 1, and in­stead lumps in the un­cer­tainty over the num­ber of epochs with the un­cer­tainty over the con­stant fac­tor in the scal­ing law above. We can then look at lan­guage model runs to es­ti­mate a scal­ing law for them, for which the me­dian scal­ing law pre­dicts that we would need 1e13 data points for our 3e14 pa­ram­e­ter model.
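To make the power law concrete, the sketch below evaluates KP^α at the median α=0.8. The constant K here is back-solved so that a 3e14 parameter model needs 1e13 data points, matching the medians in this summary; it is not a value taken from the report, and the function name is illustrative.

```python
def data_points_needed(num_params, k, alpha):
    """Data points required under the assumed power law K * P**alpha."""
    return k * num_params ** alpha

ALPHA = 0.8
# Back-solved for illustration only: pick K so that 3e14 parameters need 1e13 points.
K = 1e13 / (3e14 ** ALPHA)

# How much more data a model with twice the parameters needs at this exponent.
growth_per_doubling = 2 ** ALPHA

print(f"K = {K:.1f}")
print(f"data multiplier per parameter doubling = {growth_per_doubling:.2f}")
```

With α below 1, data requirements grow more slowly than parameter count: doubling the parameters multiplies the data needed by about 1.74 rather than 2.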

However, this has all been for supervised learning. It seems plausible that a transformative task would have to be trained using RL, where the model acts over a sequence of timesteps, and then receives (non-differentiable) feedback at the end of those timesteps. How would scaling laws apply in this setting? One simple assumption is to say that each rollout over the effective horizon counts as one piece of “meaningful feedback” and so should count as a single data point. Here, the effective horizon is the minimum of the actual horizon and 1/(1-γ), where γ is the discount factor. We assume that the scaling law stays the same; if we instead try to estimate it from recent RL runs, it can change the results by about one order of magnitude.
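The effective horizon definition above translates directly into code. A minimal sketch, with a made-up function name and the undiscounted case (γ=1) handled separately to avoid dividing by zero:

```python
def effective_horizon(actual_horizon, discount_factor):
    """Minimum of the actual horizon and 1 / (1 - gamma)."""
    if not 0 <= discount_factor <= 1:
        raise ValueError("discount_factor must be in [0, 1]")
    if discount_factor == 1:
        # No discounting, so only the episode length limits the horizon.
        return actual_horizon
    return min(actual_horizon, 1 / (1 - discount_factor))

print(effective_horizon(1000, 0.99))  # discounting is the binding limit
print(effective_horizon(50, 0.99))    # episode length is the binding limit
```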

So we now know we need to train a 3e14 pa­ram­e­ter model with 1e13 data points for a trans­for­ma­tive task. This gets us nearly all the way to the com­pute re­quired with 2020 al­gorithms: we have a ~3e14 pa­ram­e­ter model that takes ~1e16 FLOP per for­ward pass, that is trained on ~1e13 data points with each data point tak­ing H timesteps, for a to­tal of H * 1e29 FLOP. The au­thor’s dis­tri­bu­tions are in­stead cen­tered at H * 1e30 FLOP; I sus­pect this is sim­ply be­cause the au­thor was com­put­ing with dis­tri­bu­tions whereas I’ve been di­rectly ma­nipu­lat­ing me­di­ans in this sum­mary.

The last and most un­cer­tain piece of in­for­ma­tion is the effec­tive hori­zon of a trans­for­ma­tive task. We could imag­ine some­thing as low as 1 sub­jec­tive sec­ond (for some­thing like lan­guage mod­el­ing), or some­thing as high as 1e9 sub­jec­tive sec­onds (i.e. 32 sub­jec­tive years), if we were to redo evolu­tion, or train on a task like “do effec­tive sci­en­tific R&D”. The au­thor splits this up into short, medium and long hori­zon neu­ral net paths (cor­re­spond­ing to hori­zons of 1e0-1e3, 1e3-1e6, and 1e6-1e9 re­spec­tively), and in­vites read­ers to place their own weights on each of the pos­si­ble paths.
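Following the median arithmetic in this summary (not the author's full distributions, which are centered about 10x higher), the training compute for each horizon range can be sketched as below. The constant and function names are illustrative.

```python
FLOP_PER_FORWARD_PASS = 1e16  # median inference compute from this summary
DATA_POINTS = 1e13            # median data requirement from this summary

def training_flop(horizon):
    """Total training FLOP: forward-pass cost * data points * horizon H."""
    return FLOP_PER_FORWARD_PASS * DATA_POINTS * horizon

HORIZON_RANGES = {"short": (1e0, 1e3), "medium": (1e3, 1e6), "long": (1e6, 1e9)}

for name, (low, high) in HORIZON_RANGES.items():
    print(f"{name}: {training_flop(low):.0e} to {training_flop(high):.0e} FLOP")
```

Each step up in horizon category adds three orders of magnitude, so the choice of weights across the three paths matters a great deal for the final forecast.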

There are many im­por­tant con­sid­er­a­tions here: for ex­am­ple, if you think that the dom­i­nat­ing cost will be gen­er­a­tive mod­el­ing (GPT-3 style, but maybe also for images, video etc), then you would place more weight on short hori­zons. Con­versely, if you think the hard challenge is to gain meta learn­ing abil­ities, and that we prob­a­bly need “data points” com­pa­rable to the time be­tween gen­er­a­tions in hu­man evolu­tion, then you would place more weight on longer hori­zons.

Ad­ding three more po­ten­tial anchors

We can now com­bine all these in­gre­di­ents to get a fore­cast for when com­pute will be available to de­velop a trans­for­ma­tive model! But not yet: we’ll first add a few more pos­si­ble “an­chors” for the amount of com­pu­ta­tion needed for a trans­for­ma­tive model. (All of the mod­el­ing so far has “an­chored” the in­fer­ence time com­pu­ta­tion of a trans­for­ma­tive model to the in­fer­ence time com­pu­ta­tion of the hu­man brain.)

First, we can anchor the parameter count of a transformative model to the parameter count of the human genome, which has far fewer “parameters” than the human brain. Specifically, we assume that all the scaling laws remain the same, but that a transformative model will only require 7.5e8 parameters (the amount of information in the human genome) rather than our previous estimate of ~3e14 parameters. This drastically reduces the amount of computation required, though it is still slightly above that of the short-horizon neural net, because the author assumed that the horizon for this path was somewhere between 1 and 32 years.

Se­cond, we can an­chor train­ing com­pute for a trans­for­ma­tive model to the com­pute used by the hu­man brain over a life­time. As you might imag­ine, this leads to a much smaller es­ti­mate: the brain uses ~1e24 FLOP over 32 years of life, which is only 10x the amount used for AlphaS­tar, and even af­ter ad­just­ing up­wards to ac­count for man-made ar­ti­facts be­ing worse than those made by evolu­tion, the re­sult­ing model pre­dicts a sig­nifi­cant prob­a­bil­ity that we would already have been able to build a trans­for­ma­tive model.
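The lifetime figure is just the median brain compute multiplied by the number of seconds in 32 years. A quick sketch of that arithmetic, with illustrative variable names:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

BRAIN_FLOP_PER_SEC = 1e15     # median brain estimate from earlier in this summary
ALPHASTAR_TRAINING_FLOP = 1e23

lifetime_flop = BRAIN_FLOP_PER_SEC * 32 * SECONDS_PER_YEAR
ratio_to_alphastar = lifetime_flop / ALPHASTAR_TRAINING_FLOP

print(f"lifetime compute: {lifetime_flop:.2e} FLOP")
print(f"ratio to AlphaStar: {ratio_to_alphastar:.1f}x")
```

32 years is about 1e9 seconds, so the product is about 1e24 FLOP, roughly 10x the AlphaStar training run.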

Fi­nally, we can an­chor train­ing com­pute for a trans­for­ma­tive model to the com­pute used by all an­i­mal brains over the course of evolu­tion. The ba­sic as­sump­tion here is that our op­ti­miza­tion al­gorithms and ar­chi­tec­tures are not much bet­ter than sim­ply “re­do­ing” nat­u­ral se­lec­tion from a very prim­i­tive start­ing point. This leads to an es­ti­mate of ~1e41 FLOP to train a trans­for­ma­tive model, which is more than the long hori­zon neu­ral net path (though not hugely more).

Put­ting it all together

So we now have six differ­ent paths: the three neu­ral net an­chors (short, medium and long hori­zon), the genome an­chor, the life­time an­chor, and the evolu­tion an­chor. We can now as­sign weights to each of these paths, where each weight can be in­ter­preted as the prob­a­bil­ity that that path is the cheap­est way to get a trans­for­ma­tive model, as well as a fi­nal weight that de­scribes the chance that none of the paths work out.

The long hori­zon neu­ral net path can be thought of as a con­ser­va­tive “de­fault” view: it could work out sim­ply by train­ing di­rectly on ex­am­ples of a long hori­zon task where each data point takes around a sub­jec­tive year to gen­er­ate. How­ever, there are sev­eral rea­sons to think that re­searchers will be able to do bet­ter than this. As a re­sult, the au­thor as­signs 20% to the short hori­zon neu­ral net, 30% to the medium hori­zon neu­ral net, and 15% to the long hori­zon neu­ral net.

The life­time an­chor would sug­gest that we ei­ther already could get TAI, or are very close, which seems very un­likely given the lack of ma­jor eco­nomic ap­pli­ca­tions of neu­ral nets so far, and so gets as­signed only 5%. The genome path gets 10%, the evolu­tion an­chor gets 10%, and the re­main­ing 10% is as­signed to none of the paths work­ing out.
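The weights above define a mixture over paths, so the overall probability for a given year is the weighted sum of the per-path probabilities, with the "none of the paths work" weight contributing zero. In the sketch below the dictionary keys and function name are made up, and the per-path probabilities are inputs you would have to supply from the model.

```python
WEIGHTS = {
    "short_horizon_nn": 0.20,
    "medium_horizon_nn": 0.30,
    "long_horizon_nn": 0.15,
    "genome": 0.10,
    "lifetime": 0.05,
    "evolution": 0.10,
    "none": 0.10,  # no path works, so it never adds probability
}

def mixture_probability(path_probabilities, weights=WEIGHTS):
    """Weighted probability that enough compute is available in a given year.

    path_probabilities maps a path name to P(compute required <= compute available)
    for that path; paths missing from the mapping count as probability 0.
    """
    return sum(w * path_probabilities.get(name, 0.0) for name, w in weights.items())

assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9

# Even if every named path were affordable, the total is capped at 90%.
print(mixture_probability({name: 1.0 for name in WEIGHTS if name != "none"}))
```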

This pre­dicts a me­dian of 2052 for the year in which some ac­tor would be will­ing and able to train a sin­gle trans­for­ma­tive model, with the full graphs shown be­low:

How does this re­late to TAI?

Note that what we’ve mod­eled so far is the prob­a­bil­ity that by year Y we will have enough com­pute for the fi­nal train­ing run of a trans­for­ma­tive model. This is not the same thing as the prob­a­bil­ity of de­vel­op­ing TAI. There are sev­eral rea­sons that TAI could be de­vel­oped later than the given pre­dic­tion:

1. Com­pute isn’t the only in­put re­quired: we also need data, en­vi­ron­ments, hu­man feed­back, etc. While the au­thor ex­pects that these will not be the bot­tle­neck, this is far from a cer­tainty.

2. When think­ing about any par­tic­u­lar path and mak­ing it more con­crete, a host of prob­lems tends to show up that will need to be solved and may add ex­tra time. Some ex­am­ples in­clude ro­bust­ness, re­li­a­bil­ity, pos­si­ble break­down of the scal­ing laws, the need to gen­er­ate lots of differ­ent kinds of data, etc.

3. AI re­search could stall, whether be­cause of reg­u­la­tion, a global catas­tro­phe, an AI win­ter, or some­thing else.

How­ever, there are also com­pel­ling rea­sons to ex­pect TAI to ar­rive ear­lier:

1. We may de­velop TAI through some other cheaper route, such as a ser­vices model (AN #40).

2. Our fore­casts ap­ply to a “bal­anced” model that has a similar pro­file of abil­ities as a hu­man. In prac­tice, it will likely be eas­ier and cheaper to build an “un­bal­anced” model that is su­per­hu­man in some do­mains and sub­hu­man in oth­ers, that is nonethe­less trans­for­ma­tive.

3. The curves for sev­eral fac­tors as­sume some max­i­mum af­ter which progress is not pos­si­ble; in re­al­ity it is more likely that progress slows to some lower but non-zero growth rate.

In the near future, it seems likely that it would be harder to find cheaper routes (since there is less time to do the research), so we should probably treat the near-term probabilities as overestimates. For similar reasons, the probabilities for later years should be treated as underestimates.

For the me­dian of 2052, the au­thor guesses that these con­sid­er­a­tions roughly can­cel out, and so rounds the me­dian for de­vel­op­ment of TAI to 2050. A sen­si­tivity anal­y­sis con­cludes that 2040 is the “most ag­gres­sive plau­si­ble me­dian”, while the “most con­ser­va­tive plau­si­ble me­dian” is 2080.

Ro­hin’s opinion: I re­ally liked this re­port: it’s ex­tremely thor­ough and an­ti­ci­pates and re­sponds to a large num­ber of po­ten­tial re­ac­tions. I’ve made my own timelines es­ti­mate us­ing the pro­vided spread­sheet, and have adopted the re­sult­ing graph (with a few mod­ifi­ca­tions) as my TAI timeline (which ends up with a me­dian of ~2055). This is say­ing quite a lot: it’s pretty rare that a quan­ti­ta­tive model is com­pel­ling enough that I’m in­clined to only slightly edit its out­put, as op­posed to sim­ply us­ing the quan­ti­ta­tive model to in­form my in­tu­itions.

Here are the main ways in which my model is differ­ent from the one in the re­port:

1. Ig­nor­ing the genome anchor

I ig­nore the genome an­chor be­cause I don’t buy the model: even if re­searchers did cre­ate a very pa­ram­e­ter-effi­cient model class (which seems un­likely), I would not ex­pect the same scal­ing laws to ap­ply to that model class. The re­port men­tions that you could also in­ter­pret the genome an­chor as sim­ply pro­vid­ing a con­straint on how many data points are needed to train long-hori­zon be­hav­iors (since that’s what evolu­tion was op­ti­miz­ing), but I pre­fer to take this as (fairly weak) ev­i­dence that in­forms what weights to place on short vs. medium vs. long hori­zons for neu­ral nets.

2. Plac­ing more weight on short and medium hori­zons rel­a­tive to long horizons

I place 30% on short hori­zons, 40% on medium hori­zons, and 10% on long hori­zons. The re­port already names sev­eral rea­sons why we might ex­pect the long hori­zon as­sump­tion to be too con­ser­va­tive. I agree with all of those, and have one more of my own:

If meta-learn­ing turns out to re­quire a huge amount of com­pute, we can in­stead di­rectly train on some trans­for­ma­tive task with a lower hori­zon. Even some of the hard­est tasks like sci­en­tific R&D shouldn’t have a huge hori­zon: even if we as­sume that it takes hu­man sci­en­tists a year to pro­duce the equiv­a­lent of a sin­gle data point, at 40 hours a week that comes out to a hori­zon of 2000 sub­jec­tive hours, or 7e6 sec­onds. This is near the be­gin­ning of the long hori­zon realm of 1e6-1e9 sec­onds and seems like a very con­ser­va­tive over­es­ti­mate to me.
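The horizon arithmetic above, spelled out. The 50 working weeks per year is an assumption used here to arrive at the 2000-hour figure.

```python
WORKING_WEEKS_PER_YEAR = 50  # assumption: yields the 2000 hours quoted above
HOURS_PER_WEEK = 40

horizon_hours = WORKING_WEEKS_PER_YEAR * HOURS_PER_WEEK
horizon_seconds = horizon_hours * 3600

print(horizon_hours, horizon_seconds)
```

At 7.2e6 seconds this sits less than one order of magnitude above the 1e6 lower edge of the long horizon range.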

(Note that in prac­tice I’d guess we will train some­thing like a meta-learner, be­cause I sus­pect the skill of meta-learn­ing will not re­quire such large av­er­age effec­tive hori­zons.)

3. Re­duced will­ing­ness to spend

My will­ing­ness to spend fore­casts are some­what lower: the pre­dic­tions and rea­son­ing in this re­port feel closer to up­per bounds on how much peo­ple might spend rather than pre­dic­tions of how much they will spend. As­sum­ing we re­duce the ra­tio of all-in pro­ject costs to fi­nal train­ing run costs to 10x, spend­ing $1B on a train­ing run by 2025 would im­ply all-in pro­ject costs of $10B, which is ~40% of Google’s yearly R&D bud­get of $26B, or 10% of the bud­get for a 4-year pro­ject. Pos­si­bly this wouldn’t be clas­sified as R&D, but it would also be 2% of all ex­pen­di­tures over 4 years. This feels re­mark­ably high to me for some­thing that’s sup­posed to hap­pen within 5 years; while I wouldn’t rule it out, it wouldn’t be my me­dian pre­dic­tion.
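The budget shares above can be reproduced with the figures quoted in the paragraph; the variable names are illustrative.

```python
FINAL_RUN_COST = 1e9        # $1B training run by 2025
ALL_IN_MULTIPLIER = 10      # assumed ratio of all-in project cost to final run cost
YEARLY_RND_BUDGET = 26e9    # Google's yearly R&D budget as quoted above

all_in_cost = FINAL_RUN_COST * ALL_IN_MULTIPLIER
share_of_one_year = all_in_cost / YEARLY_RND_BUDGET
share_of_four_years = all_in_cost / (4 * YEARLY_RND_BUDGET)

print(f"{share_of_one_year:.0%} of one year, {share_of_four_years:.0%} of four years")
```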

4. Ac­count­ing for challenges

While the re­port does talk about challenges in e.g. get­ting the right data and en­vi­ron­ments by the right time, I think there are a bunch of other challenges as well: for ex­am­ple, you need to en­sure that your model is al­igned, ro­bust, and re­li­able (at least if you want to de­ploy it and get eco­nomic value from it). I do ex­pect that these challenges will be eas­ier than they are to­day, partly be­cause more re­search will have been done, and partly be­cause the mod­els them­selves will be more ca­pa­ble.

Another ex­am­ple of a challenge would be PR con­cerns: it seems very plau­si­ble to me that there will be a back­lash against trans­for­ma­tive AI sys­tems re­sult­ing in those sys­tems be­ing de­ployed later than we’d ex­pect them to be ac­cord­ing to this model.

To be more con­crete, if we ig­nore points 1-3 and as­sume this is my only dis­agree­ment, then for the me­dian of 2052, rather than as­sum­ing that rea­sons for op­ti­mism and pes­simism ap­prox­i­mately can­cel out to yield 2050 as the me­dian for TAI, I’d be in­clined to shade up­wards to 2055 or 2060 as my me­dian for TAI.


I’m always happy to hear feed­back; you can send it to me, Ro­hin Shah, by re­ply­ing to this email.


An au­dio pod­cast ver­sion of the Align­ment Newslet­ter is available. This pod­cast is an au­dio ver­sion of the newslet­ter, recorded by Robert Miles.