Alignment problems for economists

AI alignment is a multidisciplinary research program. This means that potentially relevant knowledge and skill are scattered across different disciplines. But it also means that people schooled in only one narrow discipline will face a hurdle when they try to work on a problem in AI alignment. One such discipline is economics, from which decision theory and game theory originated.

In this post I want to explore the idea that we should try to create a collection of “alignment-problems-for-economists”, packaged in a way that economists who have relevant knowledge and skill but don’t understand ML/CS/AF can work on them.

There seem to be sub-problems in AI alignment that economists might be able to work on. However, some of the economists I’ve spoken to are enthusiastic about this but see working on it as a personal career risk, because they do not understand the computer science. So if we can take sub-problems in alignment and package them in a way that lets economists start working on them immediately, then we might be able to utilize intellectual resources (economists) that would otherwise have gone toward something different.

Two types of economists to target

1. Economists who also understand basic ML/CS to some degree.

2. Economists who do not.

I don’t find it very plausible that we could find sub-problems for the second type to work on, but it doesn’t seem entirely impossible: there could be certain specific problems, say in mechanism design or social choice, that would be useful for alignment but don’t require any ML/CS.

Desirable properties of alignment-problems-for-economists:

1. Publishable in economics journals. I have spoken to economists who are interested in the alignment problem but hesitant to work on it: it is a risky career move to work on alignment if they cannot publish in the journals they are used to.

2. High work/statement ratio. How long will it take to solve the problem, relative to how long it takes to state it? If 90% of the work is stating the problem in a form an economist could work on, then packaging it would likely not be efficient. It should be a problem that can be communicated clearly to an economist with relative ease, while taking much more time to solve.

3. No strong reliance on CS/ML tools. Many economists are somewhat familiar with basic ML techniques, but if a problem relies too heavily on knowledge of CS or ML, this increases the career risk of working on it.

4. Not necessarily specifically x-risk related. If a problem in alignment is not specifically x-risk related, it is less embarrassing (or not embarrassing at all) to work on, and therefore less of a career risk. That said, most problems in AI alignment seem important even if you don’t believe that AI poses an x-risk, so I don’t think this requirement is that important.

5. Does not have to be high-impact. Even if a problem has only a small chance of being somewhat impactful, it might still be worth packaging it as an economics problem, since the economists who could work on it would not otherwise work on alignment problems at all.

I do not yet have a list of such problems, but it seems that it might be possible to make one:

For example, economists might work on problems in mechanism design and social choice for AGIs in virtual containment. For instance, can we create mechanisms with desirable properties for the amplification phase in Christiano’s program, to align a collection of distilled agents? Can we prove that such mechanisms are robust under certain assumptions? Can we create mechanisms that robustly incentivize AGIs with unaligned utility functions to tell us the truth? Can we use social choice to derive properties of agents that consist of sub-agents?
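To make the flavor of the truthful-reporting question concrete, here is a minimal sketch (a toy example of my own, not drawn from any existing alignment proposal) of the simplest truthful-elicitation device economists study: the logarithmic scoring rule, under which a risk-neutral agent maximizes its expected payoff by reporting its true belief.

```python
import numpy as np

# Toy illustration: the logarithmic scoring rule pays ln(q) if the event
# occurs and ln(1 - q) otherwise, where q is the reported probability.
# A risk-neutral agent maximizes its expected score by reporting its
# true belief p -- the simplest mechanism that rewards honesty.

def expected_log_score(p, q):
    """Expected score of reporting q when the agent's true belief is p."""
    return p * np.log(q) + (1 - p) * np.log(1 - q)

p_true = 0.7                             # the agent's private belief
reports = np.linspace(0.01, 0.99, 99)    # candidate reports
scores = [expected_log_score(p_true, q) for q in reports]
best = reports[int(np.argmax(scores))]

print(f"belief = {p_true}, best report = {best:.2f}")  # -> 0.70: truth-telling wins
```

Whether anything like this stays robust against a strategically sophisticated, possibly deceptive agent is exactly the kind of question an economist might be well placed to attack.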

Economists work on strategic communication between agents (cheap talk), which might be helpful in the design of safe containment systems for non-superintelligent AGI. Information economics studies the game-theoretic properties of different allocations of information, and might be useful in such mechanisms as well. Economists also work on voting and decision theory.
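As a similarly hedged sketch of the cheap-talk angle (again a toy model of my own, not a result from the literature): in the simplest binary sender-receiver game, truthful communication is an equilibrium exactly when the sender’s preferences over the receiver’s action are aligned, which a few lines of enumeration can verify.

```python
# Toy cheap-talk game: a binary state theta, a sender who observes it and
# sends a costless message m, and a receiver who picks an action a.
# We check whether the truthful profile (m = theta, a = m) is an
# equilibrium for an aligned sender versus a misaligned one.

STATES = MESSAGES = (0, 1)

def truthful_is_equilibrium(sender_payoff):
    act = lambda m: m                    # receiver's strategy: trust the message
    for theta in STATES:
        honest = sender_payoff(theta, act(theta))
        for m in MESSAGES:               # any profitable message deviation?
            if sender_payoff(theta, act(m)) > honest:
                return False
    # Given truthful messages, the receiver's a = m is already a best response.
    return True

aligned   = lambda theta, a: 1 if a == theta else 0  # wants action to match state
unaligned = lambda theta, a: 1 if a == 1 else 0      # always wants action 1

print(truthful_is_equilibrium(aligned))    # True: truth-telling survives
print(truthful_is_equilibrium(unaligned))  # False: the theta=0 sender lies
```

The interesting (and open) versions of this question involve richer state spaces, partially aligned preferences, and agents much smarter than the mechanism designer.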

I want your feedback:

1. What kinds of problems have you encountered that might be added to this list?

2. Do you have reasons to think that this project would be doomed to fail (or not)? If so, I want to stop wasting time on it as quickly as possible. Despite having written this post, I don’t assign a high probability of success, but I’d like to hear people’s views.