Counterfactual resiliency test for non-causal models

Non-causal models

Non-causal models are quite common in many fields, and can be quite accurate. Here predictions are made based on (a particular selection of) past trends, and it is assumed that these trends will continue in future. There is no causal explanation offered for the trends under consideration: it's just assumed they will go on as before. Non-causal models are thus particularly useful when the underlying causality is uncertain or contentious. To illustrate the idea, here are three non-causal models in computer development:

  1. Moore's laws about the regular doubling of processing speed/hard disk size/other computer-related parameters.

  2. Robin Hanson's model, where the development of human brains, hunting, agriculture and the industrial revolution are seen as related stages of acceleration of the underlying rate of economic growth, leading to the conclusion that there will be another surge during the next century (likely caused by whole brain emulations or AI).

  3. Ray Kurzweil's law of time and chaos, leading to his law of accelerating returns. Here the inputs are the accelerating evolution of life on Earth, the accelerating 'evolution' of technology, followed by the accelerating growth in the power of computing across many different substrates. This leads to a consequent 'singularity', an explosion of growth, at some point over the coming century.
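The flavour of such a model can be sketched in a few lines of code: fit an exponential trend to past data and extrapolate it forward, with no causal story attached. The data points below are made up for illustration, not real hardware figures.

```python
import math

# A non-causal model in miniature: fit an exponential trend to past
# observations and extrapolate, offering no explanation of *why* the
# trend holds. The counts below are illustrative, not real data.
years = [1972, 1980, 1988, 1996, 2004]
log_counts = [math.log(3.5e3), math.log(3.0e4), math.log(2.5e5),
              math.log(5.5e6), math.log(1.0e8)]

# Ordinary least-squares fit of log(count) against year.
n = len(years)
mean_x = sum(years) / n
mean_y = sum(log_counts) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(years, log_counts))
         / sum((x - mean_x) ** 2 for x in years))
intercept = mean_y - slope * mean_x

doubling_time = math.log(2) / slope          # years per doubling
prediction_2012 = math.exp(intercept + slope * 2012)
print(f"doubling time: {doubling_time:.1f} years")
print(f"extrapolated count in 2012: {prediction_2012:.2e}")
```

With these toy numbers the fit gives a doubling time of about two years, and the "prediction" is just the trend line carried past the data.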

Before anything else, I should thank Moore, Hanson and Kurzweil for having the courage to publish their models and put them out there where they can be critiqued, mocked or praised. This is a brave step, and puts them a cut above most of us.

That said, though I find the first argument quite convincing, I have to say I find the other two dubious. Now, I'm not going to claim they're misusing the outside view: if you accuse them of shoving together unrelated processes into a single model, they can equally well accuse you of ignoring the commonalities they have highlighted between these processes. Can we do better than that? There has to be a better guide to the truth than just our own private impressions.

Counterfactual resilience

One thing I'd like to do is test the resilience of these models: how robust are they to change? If model M makes prediction P from trends T, and the real outcome will be O, we can test resiliency in two ways. We can change the world to change T (and hence P) without changing O, or we can change the world to change O without changing T (and hence P). If we can do either or both, this is a strong indication that the model doesn't work.

This all sounds highly dubious: how can we "change the world" in that way? I'm talking about considering counterfactuals: alternate worlds whose history embodies the best of our knowledge as to how the real world works. To pick an extremely trivial example, imagine someone who maintains that the West's global domination was inevitable four centuries after Luther's 95 theses in 1517, no matter what else happened outside Europe. Then we can imagine counterfactually diverting huge asteroids to land in the Channel, or importing hyper-virulent forms of bird flu from Asiatic Russia. According to everything we know about asteroid impacts, epidemiology and economics, this would not have led to a dominant West for many centuries afterwards.

That was an example of keeping T and P, and changing the outcome O. It is legitimate: we have preserved everything that went into the initial model, and made the prediction wrong. We could take the reverse approach: changing T and P while preserving the outcome O. To do so, we could imagine moving Luther (or some Luther-like character) to 1217, without changing the rest of European history much. To move Luther back in time, we could perfectly well imagine that the Catholic church had started selling and abusing indulgences much earlier than it did: corrupt clerics were hardly an impossible idea in the middle ages. It requires a few religious and social changes to have the 95 theses make sense in the thirteenth century, but not all that many. Then we could imagine that Luther-like character being ignored or burnt, and the rest of Western history happening as usual, without Western world dominance four centuries after that non-event (which is what M would have predicted). Notice that in both these cases, considering counterfactuals allows us to bring our knowledge and theories about other facts of the world to bear on assessing the model: we are no longer limited to simply debating the assumptions of the model itself.
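The two tests can be sketched schematically. Here is a toy encoding of the straw model above, where the prediction depends only on whether the trend is present; all the names and world-states are illustrative, of course.

```python
# Toy sketch of the two counterfactual resiliency tests. The straw
# model M predicts Western dominance whenever the trend (95 theses
# nailed up in 1517) is present. All values here are illustrative.

def model(trends):
    return "western dominance" if "95 theses" in trends else "no prediction"

# Test 1: keep T (and hence P), change the outcome O.
# An asteroid strike leaves the trend intact but changes the outcome.
world_1 = {"trends": {"95 theses"}, "outcome": "no dominance for centuries"}
assert model(world_1["trends"]) != world_1["outcome"]   # model breaks

# Test 2: change T (and hence P), keep the outcome O.
# An earlier, ignored Luther removes the trend; history proceeds as usual.
world_2 = {"trends": {"1217 theses, ignored"}, "outcome": "western dominance"}
assert model(world_2["trends"]) != world_2["outcome"]   # model breaks again
```

Either failing assertion alone would be enough to cast doubt on the model; here both counterfactuals break it.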

"Objection!" shouts my original strawman, at both my resiliency tests. "Of course I didn't specify 'unless a meteor impacts'; that was implicit and obvious! When you say 'let's meet tomorrow', you don't generally add 'unless there's a nuclear war'! Also, I object to your moving Luther three centuries earlier and saying my model would predict the same thing in 1217. I was referring to Luther nailing up his theses in the context of an educated, literate population, with printing presses and a political system that was willing to stand up to the Catholic church. Also, I don't believe you when you say there would need to be not 'all that much' religious and social change for an early Luther to exist. You'd have to change so much that there's no way you could put history back on the 'normal' track afterwards."

Notice that the conversation has moved on from 'outside view' arguments to making implicit assumptions explicit, extending the model, and arguing about our understanding of causality. Thus even if these counterfactual resiliency tests don't break a model, they're likely to improve it, our understanding, and the debate.

The resilience of these models

So let's apply this to Robin Hanson's and Ray Kurzweil's models. I'll start with Robin's, as it's much more detailed. The key inputs of Robin's model are the time differences between the different revolutions (brains, hunting, agriculture, industry), and the growth rates after these revolutions. The prediction is that there is another revolution coming about three centuries after the industrial revolution, and that after this the economy will double every 1-2 weeks. He then makes the point that the only plausible way for this to happen is through the creation of brain emulations or AIs: copyable human capital. I'll also assume the implicit "no disaster" assumption: no meteor strikes or world governments bent on banning AI research. How does this fare in counterfactuals?
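The extrapolation at the heart of the model can be reproduced in a few lines: each growth mode's doubling time was a couple of orders of magnitude shorter than the last, so apply a typical speed-up factor once more. The doubling times below are rough illustrative figures, not Robin's exact data, so the answer comes out as "weeks" rather than exactly 1-2 weeks.

```python
# Rough sketch of the growth-mode extrapolation behind Hanson's model.
# Doubling times are approximate illustrative figures, not his exact data.
doubling_times_years = {
    "hunting": 230_000,
    "agriculture": 860,
    "industry": 15,
}

# Each transition shortened the economy's doubling time by a large factor.
values = list(doubling_times_years.values())
ratios = [values[i] / values[i + 1] for i in range(len(values) - 1)]

# Take the geometric mean of the past speed-ups and apply it once more.
typical_ratio = (ratios[0] * ratios[1]) ** 0.5
next_doubling_years = values[-1] / typical_ratio
print(f"next-mode doubling time: about {next_doubling_years * 52:.0f} weeks")
```

Note that the code knows nothing about brains, farming or factories: only the bare doubling times enter the calculation, which is exactly what makes the model non-causal.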

It seems rather easy to mess with the inputs T. Weather conditions or continental drift could confine pre-agricultural humans to hunting essentially indefinitely, followed by a slow evolution to agriculture when the climate improved or more land became available. Conversely, we could imagine incredibly nutritious crops that were easy to cultivate, and hundreds of domesticable species rather than the 30-40 we actually had. Combine this with a mass die-off of game and some strong evolutionary pressure, and we could end up with agriculture starting much more rapidly.

This sounds unfair: are these not huge transformations to the human world and the natural world that I'm positing here? Indeed I am, but Robin's model is that these differential growth rates have predictive ability, not that these differential growth rates combined with a detailed historical analysis of many contingent factors have predictive ability. If the model were to claim that the vagaries of plate tectonics and the number of easily domesticated species in early human development are relevant to how long after the industrial revolution brain emulations would be developed, then something would have gone wrong with it.

Continuing in this vein, we can certainly move the industrial revolution back in time. The ancient Greek world, with its steam engines, philosophers and mathematicians, seems an ideal location for a counterfactual. Any philosophical, social or initial technological development that we could label as essential to industrialisation could at least plausibly have arisen in a Greek city or colony, possibly over a longer period of time.

We can also tweak the speed of economic growth. The yield from hunting can be changed through the availability or absence of convenient prey animals. During the agricultural era, we could posit high-yield crops and an enlightened despot who put in place some understandable-to-ancient-people elements of the green revolution; or, conversely, poor-yield crops suffering from frequent blight. Easy or difficult access to coal would affect growth during the industrial era, or we could jump ahead by having the internal combustion engine, not the steam engine, as the initial prime driver of industrialisation. The computer era could be brought forward by having Babbage complete his machines for the British government, or pushed back by removing Turing from the equation and assuming the Second World War didn't happen.

You may disagree with some of these ideas, but it seems to me that there are just too many contingent factors that can mess up the inputs to the model, leading some putative parallel-universe Robin Hanson to give completely different times to brain emulations. This suggests the model is not very resilient.

Or we can look at the reverse: making whole brain emulations much easier, or much harder, than they are now, without touching the inputs to the model at all (and hence its predictions). For instance, if humans were descended from a hibernating species, it's perfectly conceivable that we could have brains that would be easy to fixate and slice up for building emulations. Other changes to our brain design could also make this easier. It might be that our brains had a different architecture, one where it was much simpler to isolate a small "consciousness module" or "decision-making module". Under these assumptions, we could conceivably have had adequate emulations back in the 60s or 70s! Again, these assumptions are false (life didn't happen like that, and it may be impossible for life to happen like that), but knowing that they are false requires knowledge that is neither explicitly nor implicitly in the model. And of course we have the converses: brain architectures too gnarly and delicate to fixate and slice; early or late neuroscience breakthroughs (and greater or lesser technological or medical returns on those breakthroughs); greater or lesser popular interest in brain architecture.

For these reasons, it seems to me that Robin Hanson's model fails the counterfactual resiliency test. Ray Kurzweil's model suffers similarly: since Kurzweil's model includes the whole of evolutionary history (including disasters), we can play around with climate, asteroid collisions and tectonics to make evolution happen at very different rates (one easy change is to kill off all humans in the Toba catastrophe). Shifting around the dates of the technological breakthroughs and of the first computer still messes up the model, and backdating important insights allows us to imagine much earlier AIs.

And then there's Moore's law, starting with Moore's 1965 paper… The difference is immediately obvious as we start trying to apply the same tricks to Moore's law. Where even to start? Maybe certain transistor designs are not available? Maybe silicon is hard to get hold of rather than being ubiquitous? Maybe Intel went bust at an early stage? Maybe no one discovered photolithography? Maybe some specific use of computers wasn't thought of, so demand was reduced? Maybe some special new chip design was imagined ahead of time?

None of these seems to clearly lead to a situation where Moore's law would fail. We don't really know what causes Moore's law, but it has been robust across moves to very different technologies, and has spanned cultural transformations and changes in the purpose and uses of computers. It seems to lie at the interaction between market demand, technological development, and implementation. Some trivial change could conceivably throw it off its rails, but we just don't know what, which means we can't bring our knowledge about other facts of the world to bear.

In conclusion: more work needed

It was the comparative ease with which we could change the components of the other two models that revealed their lack of resilience; it is the difficulty of doing so with Moore's law that shows it is resilient.

I've never seen this approach used before; most resilience tests only involve changing numerical parameters from inside the model. Certainly the approach needs to be improved: it feels very informal and subjective for the moment. Nevertheless, I feel that it has afforded me some genuine insights, and I'm hoping to improve and formalise it in future, with any feedback I get here, of course.