Defining the normal computer control problem

There has been much focus on controlling superintelligent artificial intelligence, yet we currently cannot even control our non-agenty computers without resorting to formatting and other large-scale interventions.

Solving the normal computer control problem might help us solve the superintelligence control problem, or allow us to work towards safe intelligence augmentation.

We cannot currently keep our computers doing what we want easily. They can get infected with malware, be compromised, or receive updates that may be buggy. If you have sufficient expertise you can go in and fix the problem or wipe the system, but this is not ideal.

We do not have control of our computers without resorting to out-of-band manipulation.

Genes have found a way to control the reptilian brain, and also, somewhat, the more powerful mammalian and human brains, as discussed in the control problem has already been solved. And the system continues to run: our brains aren't reformatted when we pick up bad behaviours. We don't know how the genes do it, but humans tend to do what the genes would want (if they wanted things!) despite our flexibility. There is some control there. Let us call the analogous problem for computers the normal computer control problem.

This problem of control has been neglected by traditional AI because it is not a cognitive problem. It is not like solving chess or learning to recognize faces. It is not making anything powerful; it is just weeding out the bad programs.

Comparing the normal computer control and AI control problems

The AI control problem has been defined as asking the question:

What prior precautions can the programmers take to successfully prevent the superintelligence from catastrophically misbehaving?

In this language the normal computer control problem can be defined as:

What type of automated system can we implement to stop a normal general-purpose computer system misbehaving (and keep it carrying on with its good behaviour) if it has a malign program in it?

To make the differences explicit:

  • The normal control problem assumes a system with multiple programs, some good, some bad

  • The normal control problem assumes that there is no specific agency in the programs (especially not superintelligent agency)

  • The normal control problem allows minor misbehaviour, but it should not persist over time

These make the problem more amenable to study. These sorts of systems can be seen in animals: they will stop pursuing behaviours that are malign to themselves. If a horse unknowingly walks into an electric fence while trying to get to an apple, it will stop trying to walk in that direction. This is operant conditioning, but it has not been applied to a whole computer system containing arbitrary programs.

Imagine being able to remove malware from a normal computer system by training that system. That is what I am looking to produce.
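As a sketch of what such training could look like, here is a toy model. To be clear, the `ConditionedScheduler`, its reward signal, and the program names are all my own assumptions for illustration, not an existing system: programs are scheduled in proportion to a weight, and an operant-conditioning signal raises or lowers that weight in response to their behaviour.

```python
import random

random.seed(0)  # for reproducibility of the toy run

# Hypothetical sketch, not an existing system: a scheduler that applies
# operant conditioning to a population of programs. The class, its
# parameters, and the reward signal are assumptions made for illustration.
class ConditionedScheduler:
    def __init__(self, programs, floor=0.05):
        # Every program starts equally likely to be run.
        self.weights = {name: 1.0 for name in programs}
        self.floor = floor  # below this, a program is effectively weeded out

    def pick(self):
        # Sample a program in proportion to its current weight.
        names = list(self.weights)
        return random.choices(names, [self.weights[n] for n in names])[0]

    def reinforce(self, name, reward):
        # Positive reward strengthens a program; negative reward suppresses
        # it, the way the horse learns to avoid the electric fence.
        self.weights[name] = max(self.floor, self.weights[name] * (1.0 + reward))

# Toy run: "updater" behaves well, "malware" triggers a punishment signal.
sched = ConditionedScheduler(["updater", "malware"])
for _ in range(50):
    prog = sched.pick()
    sched.reinforce(prog, 0.1 if prog == "updater" else -0.5)

# After training, malware is scheduled far less often than the updater.
print(sched.weights["malware"] < sched.weights["updater"])  # → True
```

The system never inspects what the malware is; it only punishes observed misbehaviour and lets the weight decay do the weeding, which is the point of the analogy.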

This might not be the right problem definition to help us understand the control done in brains. But I think it is precise and novel enough to form one of the initial research pathways. Ideally we would have a diverse community around this problem, one that includes neuroscientists, psychologists and other relevant scientists, as well as charitable organisations like the ones trying to solve the superintelligence control problem. All this would maximize the chance that the right question is being attempted.

    Should we study the normal computer control problem?

    I’m not going to try to argue that this is more important than superintelligence work; such things are probably unknowable until after the fact. I claim only that it is a more tractable problem to try to solve, and that it might yield insights useful for the superintelligence work.

    But as ever with the future there are trade-offs to doing this work.

    Pros:

    • It might help solve the superintelligence control problem, by providing inspiration or by letting people show exactly where the analogy breaks down on the superintelligence side of things.

    • It might be the best we can do towards the control problem, along with good training (if formal proofs to do with values aren't helpful for control).

    • We can use science on brains in general to give us inspiration on how this might work.

    • It can be more experimental and less theoretical than current work.

    • It might help with intelligence augmentation work (maybe a con, depending upon preconceptions).

    • If deployed widely, it might make computation harder for malicious actors to control. This would make taking over the internet harder (ameliorating one take-off scenario).

    • It could grow the pool of people who have heard of the control problem.

    • My current hypothesis is that the programs within such a system aren't bound to maximise utility, so they do not suffer from some of the failure modes associated with normal utility maximisation.

    Cons:

    • It might speed up AI work by giving a common platform to work from (though it is not AI in itself; there is no set of problems it is trying to solve).

    • It might distract from the large-scale superintelligence problem.

    So, having said all that: is anyone with me in trying to solve this problem?