# Thoughts and problems with Eliezer’s measure of optimization power

Back in the day, Eliezer proposed a method for measuring the optimization power (OP) of a system S. The idea is to get a measure of how small a target the system can hit:

> You can quantify this, at least in theory, supposing you have (A) the agent or optimization process’s preference ordering, and (B) a measure of the space of outcomes—which, for discrete outcomes in a finite space of possibilities, could just consist of counting them—then you can quantify how small a target is being hit, within how large a greater region.

> Then we count the total number of states with equal or greater rank in the preference ordering to the outcome achieved, or integrate over the measure of states with equal or greater rank. Dividing this by the total size of the space gives you the relative smallness of the target—did you hit an outcome that was one in a million? One in a trillion?

> Actually, most optimization processes produce “surprises” that are exponentially more improbable than this—you’d need to try far more than a trillion random reorderings of the letters in a book, to produce a play of quality equalling or exceeding Shakespeare. So we take the log base two of the reciprocal of the improbability, and that gives us optimization power in bits.

For example, assume there were eight equally likely possible states {X0, X1, … , X7}, and S gives them utilities {0, 1, … , 7}. Then if S can make X6 happen, there are two states better or equal to its achievement (X6 and X7), hence it has hit a target filling 1/4 of the total space. Hence its OP is log2 4 = 2. If the best S could manage is X4, then it has only hit half the total space, and has an OP of only log2 2 = 1. Conversely, if S reached the perfect X7, 1/8 of the total space, then it would have an OP of log2 8 = 3.
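To make the arithmetic above concrete, here is a minimal sketch in Python. The helper name `op_bits` and the dictionary representation are my own, not from the post; the function just measures the set of states at least as good as the achieved one and takes the negative log base two.

```python
import math

def op_bits(utilities, achieved, measure=None):
    """Optimization power in bits: -log2 of the prior measure of the
    states whose utility equals or exceeds that of the achieved state."""
    if measure is None:  # default: all listed states equally likely
        measure = {s: 1.0 / len(utilities) for s in utilities}
    target = sum(measure[s] for s in utilities
                 if utilities[s] >= utilities[achieved])
    return -math.log2(target)

# Eight equally likely states X0..X7 with utilities 0..7:
u = {f"X{i}": i for i in range(8)}
print(op_bits(u, "X6"))  # X6 and X7 are at least as good: 1/4 of the space -> 2.0
print(op_bits(u, "X4"))  # half the space -> 1.0
print(op_bits(u, "X7"))  # 1/8 of the space -> 3.0
```

The optional `measure` argument covers the non-uniform case discussed later in the post.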

## The system, the whole system, and everything else in the universe

Notice that OP is defined in terms of the state that S achieved (for the moment this will be a pure world, but later we’ll allow probabilistically mixed worlds to be S’s “achievement”). So it gives us a measure of how powerful S is in practice in our model, not some platonic measure of how good S is in general situations. So an idiot king has more OP than a brilliant peasant; a naive search algorithm distributed across the internet has more OP than a much better program running on Colossus. This does not seem a drawback to OP: after all, we want to measure how powerful a system actually is, not how powerful it could be in other circumstances.

Similarly, OP measures the system’s ability to achieve its very top goals, not how hard these goals are. A system that wants to compose a brilliant sonnet has more OP than exactly the same system that wants to compose a brilliant sonnet while embodied in the Andromeda galaxy, even though the second is plausibly more dangerous. So OP is a very imperfect measure of how powerful a system is.

We could maybe extend this to some sort of “opposed OP”: what is the optimization power of S, given that humans want to stop it from achieving its goals? But even there, a highly powerful system with nearly un-achievable goals will still have a very low opposed OP. Maybe the difference between the opposed OP and the standard OP is a better measure of power.

As pointed out by Tim Tyler, OP can also increase if we change the size of the solution space. Imagine an agent that has to print out a non-negative integer N, and whose utility is -N. The agent will obviously print 0, but if the printer is limited to ten digit numbers, its OP is smaller than if the printer is limited to twenty digit numbers: though the solution is just as easy and obvious, the number of ways it “could have been worse” is increased, increasing OP.
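The printer example can be checked numerically. In this sketch (the function name is mine), all representable outputs are assumed equally likely a priori, and only the achieved output 0 is at least as good as itself:

```python
import math

def printer_op(max_digits):
    # Utility of printing N is -N, so the agent prints 0.  Under a uniform
    # prior over the 10**max_digits representable numbers, only 0 itself
    # is at least as good as the achieved outcome, so OP = log2(10**d / 1).
    return math.log2(10 ** max_digits)

print(printer_op(10))  # ~33.2 bits
print(printer_op(20))  # ~66.4 bits: the same trivial decision, double the OP
```

Doubling the digit limit doubles the measured OP even though the decision itself is unchanged.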

## Is OP an entropy? Is it defined for mixed states?

In his post Eliezer makes a comparison between OP and entropy. And OP does have some of the properties of entropy: for instance, if S is optimizing two separate independent processes (and its own utility treats them as independent), then its OP is the sum of the OPs for each process. If for instance S hit an area of 1/4 in the first process (OP 2) and 1/8 in the second (OP 3), then it hits an area of 1/(4*8) = 1/32 for the joint processes, for an OP of 5. This property, incidentally, is what allows us to talk about “the” entropy of an isolated system, without worrying about the rest of the universe.
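The additivity claim is easy to verify for the numbers in the example:

```python
import math

op_first = math.log2(4)      # hit 1/4 of the first process's space: 2 bits
op_second = math.log2(8)     # hit 1/8 of the second: 3 bits
op_joint = math.log2(4 * 8)  # independent processes: 1/32 of the joint space
print(op_joint)              # 5.0, the sum of the two
```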

But now imagine that our S in the first example can’t be sure to hit a pure state, but has a 50% chance of hitting X7 and 50% of hitting X4. If OP were an entropy, then we’d simply do a weighted sum 1/2(OP(X4)+OP(X7)) = 1/2(1+3) = 2, and then add one extra bit of entropy to represent our (binary) uncertainty as to what state we were in, giving a total OP of 3. But this is the same OP as X7 itself! And obviously a 50% chance of X7 and 50% of something inferior cannot be as good as a certainty of the best possible state. So unlike entropy, mere uncertainty cannot increase OP.

So how should OP extend to mixed states? Can we write a simple distributive law:

OP(1/2 X4 + 1/2 X7) = 1/2(OP(X4) + OP(X7)) = 2?

It turns out we can’t. Imagine that, without changing anything else, the utility of X7 is suddenly set to ten trillion, rather than 7. The OP of X7 is still 3 - it’s still the best option, still with probability 1/8. And yet 1/2 X4 + 1/2 X7 is now obviously much, much better than X6, which has an OP of 2. But now let’s reset X6 to being ten trillion minus 1. Then it still has an OP of 2, and yet is now much, much better than 1/2 X4 + 1/2 X7.
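The underlying point is that OP depends only on the ranking of outcomes, so any order-preserving change to the utilities leaves it fixed while expected utilities swing wildly. A small sketch (helper names are mine):

```python
import math

def op_bits(utils, achieved):
    # uniform measure over the listed pure states
    as_good = sum(1 for v in utils.values() if v >= utils[achieved])
    return math.log2(len(utils) / as_good)

u = {f"X{i}": i for i in range(8)}
v = dict(u, X7=10 ** 13)  # utility of X7 raised to ten trillion

print(op_bits(u, "X7"), op_bits(v, "X7"))  # 3.0 both: OP only sees the ranking
# But the expected utilities of mixtures now diverge wildly:
print(0.5 * v["X4"] + 0.5 * v["X7"])  # about 5e12, vastly better than v["X6"] = 6
```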

But I may have been unfair in those examples. After all, we’re looking at mixed states, and X6 need not have a fixed OP of 2 in the space of mixed states. Maybe if we looked at the simplex formed by all mixed states made up of {X0, X1, … , X7}, we could get these results to work? Since all Xi are equally likely, we’d simply put a uniform measure on that simplex. But now we run into another problem: the OP of X7 has suddenly shot up to infinity! After all, X7 is now an event of probability zero, better than any other outcome; the log2 of the inverse of its probability is infinity. Even if we just restrict to a tiny, non-zero area around X7, we get arbitrarily high OP—it’s not a fluke or a calculation error. Which means that if we followed the distributive law, Q = (1-10^-1000) X0 + 10^-1000 X7 must have a much larger OP than X6 - despite the fact that nearly every possible outcome is better than Q.

So it seems that, unlike entropy, OP cannot have anything resembling a distributive law. The set of possible outcomes that you started with—including any possible mixed outcomes that S could cause—is what you’re going to have to use. This sits uncomfortably with the whole Bayesian philosophy—after all, in Bayesian terms mixed states shouldn’t represent anything but uncertainty between pure states. They shouldn’t be listed as separate outcomes.

## Measures and coarse-graining

In the previous section, we moved from using a finite set of equally likely outcomes, to a measure over a simplex of mixed outcomes. This is the natural generalisation of OP: simply compute the probability measure of the states at least as good as the one S achieves, and use the log2 of the inverse of this measure as OP.

Some of you may have spotted the massive elephant in the room, whose mass twists space and underlines and undermines the definition of OP. What does this probability measure actually represent? Eliezer saw it in his original post:

> The quantity we’re measuring tells us how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities.

Or how could I write “there were eight equally likely possible states” and “S can make X6 happen”? Well, obviously, what I meant was that if S didn’t exist, then it would be equally likely that X7 and X6 and X5 and X4 and...

But wait! These Xi’s are final states of the world—so they include the information as to whether S existed in them or not. So what I’m actually saying is that {X0(¬S), X1(¬S), … , X7(¬S)} (the worlds with no S) are equally likely, whereas Xi(S) (the worlds with S) are impossible for i≠6. But what has allowed me to identify X0(¬S) with X0(S)? I’m claiming they’re the same world “apart from S”, but what does this mean? After all, S can have huge impacts, and X0(S) is actually an impossible world! So I’m saying that “these two worlds are strictly the same, apart from the fact that S exists in one of them, but then again, S would never allow that world to happen if it did exist, so, hum...”

Thus it seems that we need to use some sort of coarse-graining to identify Xi(¬S) with Xi(S), similar to those I speculated on in the reduced impact post.

• OP measures the system’s ability to achieve its goals, not how hard these goals are.

Not really. It’s often largely a measure of how big the specified solution space is—e.g.:

• What’s the smallest prime larger than a million, and smaller than a billion?

• What’s the smallest prime larger than a million, and smaller than a trillion?

Essentially the same problem, bigger solution space—more EOP [Eliezer Optimization Power].

• Thanks, have updated the text with a variant of that problem.

• A concept I’ve played with, coming off of Eliezer’s initial take on the problem of formulating optimization power, is: Suppose something generated N options randomly and then chose the best. Given the observed choice, what is the likelihood function for N?

For continuously distributed utilities, this can be computed directly using beta distributions. Beta(N, 1) is the probability density for the highest of N uniformly distributed unit random numbers. This includes numbers which are cumulative probabilities for a continuous distribution at values drawn from that distribution, and therefore numbers which are cumulative probabilities at the goodness of an observed choice. (N doesn’t have to be an integer, because beta distributions are defined for non-integer parameters.)

(The second step of this construction, where you attach a beta distribution to another distribution’s CDF, I had to work out by myself; it’s not directly mentioned in any discussions of extreme value statistics that I could find. The MathWorld page on order statistics, one step of generalization away, uses the formula for a beta CDF transformed by another CDF, but it still doesn’t refer to beta distributions by name.)

If the utilities are discretely distributed, you have to integrate the beta density over the interval of cumulative probabilities that invert to the observed utility.
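A sketch of the continuous case under the assumptions above (function names are mine). Beta(N, 1) has the closed-form density N·q^(N-1), so no statistics library is needed, and the maximum-likelihood N given an observed cumulative probability q follows from one line of calculus:

```python
import math

def likelihood(N, q):
    """Beta(N, 1) density at q: the density of the maximum of N uniform
    draws, evaluated at the null-distribution cumulative probability q
    of the observed choice's goodness."""
    return N * q ** (N - 1)

def mle_N(q):
    # maximize log-likelihood log(N) + (N-1) log(q):
    # d/dN = 1/N + log(q) = 0  =>  N = -1 / log(q)
    return -1.0 / math.log(q)

print(mle_N(0.99))  # a 99th-percentile choice looks like "best of ~100" options
```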

To handle choices of mixtures, I guess you could modify this slightly, and ask about the likelihood function for N given the observed outcome, marginalizing over (as User:Kindly also suggests) the (unobserved) choice of option. This requires a distribution over options and a conditional distribution over observations given options. This would also cover situations with composite options where you only observe one of the aspects of the chosen option.

Opposed optimization might be very crudely modeled by increasing the number on the opposite side of the beta distribution from N. Somewhat related to this is Warren Smith’s “Fixed Point for Negamaxing Probability Distributions on Regular Trees”, which examines the distributions of position values that result when two opponents take turns choosing the worst option for each other.

Alternatively, instead of a likelihood function for N, you could have a likelihood function for an exponential weighting λ on the expected utility of the option:

Pr(A was chosen)/Pr(B was chosen) ∝ exp(λ(U(A)-U(B))).

Higher values of λ would be hypotheses in which better options were more strongly randomly selected over worse ones. (This is something like a logit-response model, for which λ (or “β”, or “1/μ”) would be the “rationality” parameter. It might be more familiar as a Gibbs distribution.) But this would fail when the expected utility from the null distribution was heavy-tailed, because then for some λ≠0 the distribution of optimized expected utilities would be impossible to normalize. Better would be for the cumulative probability at the expected utility of the chosen option to be what was exponentially weighted by λ. In that case, in the limit λ = (N-1) >> 1, the two models give the same distribution.
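The λ-weighted choice model above is the familiar softmax/Gibbs distribution over a finite option set; a minimal sketch (names mine):

```python
import math

def choice_probs(utilities, lam):
    """Pr(option i) proportional to exp(lam * U_i): the logit-response /
    Gibbs model of a noisy optimizer.  lam = 0 means no optimization."""
    weights = [math.exp(lam * u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

print(choice_probs([0, 1, 2], 0.0))  # uniform: no preference for better options
print(choice_probs([0, 1, 2], 5.0))  # mass concentrates on the best option
```

The normalization step is exactly what breaks in the heavy-tailed case the comment describes: with infinitely many options and heavy-tailed utilities, `total` diverges for some λ≠0.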

All of these statistics, as well as Eliezer’s original formulation, end up encoding equivalent information in the limit where the utility of each option is an independent sum of many identical light-tailed-distributed components and you’re predicting a marginal distribution of utilities for one of those components. In this limit you can safely convert everything to a statistical mechanics paradigm and back again.

Of course, the real criterion for a good formulation of optimization power is whether it helps people who use it in an argument about things that might be optimizers, or who hear it used in such an argument, to come to truthful conclusions.

In this respect, likelihood functions can have the problem that most people won’t want to use them: they’re hard to compute with, or communicate, unless they belong to a low-dimensional family. The likelihood functions I suggested won’t do that except under very simple conditions. I’m not sure what the best way would be to simplify them to something lower-dimensional. I guess you could just communicate a maximum-likelihood estimate and precision for the optimization power parameter. Or, if you chose a reference prior over optimization power, you could communicate its posterior mean and variance.

All of this presupposes that the problems with the unoptimized probability measure can be dealt with. Maybe it would work better to describe the optimization power of a system in terms of a series of levels of simpler systems leading up to that system, where each level’s new amount of optimization was only characterized approximately, and only relative to something like a distribution of outputs from the previous level. (This would at least patch the problem where, if thermodynamics is somehow involved in the results of an action, that action can count as very powerful relative to the uniform measure over the system’s microstates.) If optimizer B is sort of like choosing the best of N options generated by optimizer A, and optimizer C is sort of like choosing the best of M options generated by optimizer B, that might not have to mean that optimizer C is much like choosing the best of N*M options generated by optimizer A.

• Interesting discussion. What do you want to do with this measure once you have it? I can see how it would be elegant to have it, but it seems useful to think about how you would use this measure to inform your decisions.

• Those building optimization algorithms need measures of their performance on the target problem. They generally measure space-time resources to find a solution. AFAIK, few bother with measuring solution quality Eliezer-style. Counting the better solutions not found is often extremely expensive. Dividing by the size of the solution space is usually pointless.

• Well, it’s always good to have a measure of intelligence, if we’re worrying about highly intelligent beings. Also, I was hoping that it might give a way of formulating “reduced impact AI”. Alas, it seems to be insufficient.

• Would it make sense for the measure of real-world intelligence to be “optimization power per unit time” or similar? Given arbitrarily large amounts of time, I could do an exhaustive search of the solution space or something like that, which isn’t very intelligent or useful.

Another point: It doesn’t seem like optimization power is usefully defined for every possible search space. Let’s say our search space is countably infinite, each item corresponds to a single unique natural number, and each item’s score is equal to its corresponding natural number. I think for a while and come up with the solution that has score 1 million. What’s my optimization power? 0? Regardless of the solution I come up with, my optimization power is going to be 0, even though I vastly prefer some solutions to others. (I don’t know how much of a problem this would be in practice… it seems like it might depend on the “shape of the infinitude” of a given solution space.)

(Let me know if these don’t make sense for some reason; I haven’t taken the time to understand EY’s idea of optimization power in depth.)

• If OP were an entropy, then we’d simply do a weighted sum 1/2(OP(X4)+OP(X7)) = 1/2(1+3) = 2, and then add one extra bit of entropy to represent our (binary) uncertainty as to what state we were in, giving a total OP of 3.

I feel like you’re doing something wrong here. You’re mixing state distribution entropy with probability distribution entropy. If you introduce mixed states, shouldn’t each mixed state be accounted for in the phase space that you calculate the entropy over?

• If you go down the “entropy is ignorance about the exact microstate” route, this makes perfect sense. And various people have made convincing-sounding arguments that this is the right way to see entropy, though I’m no expert myself.

• I’m not an expert either. However, the OP function has nothing to do with ignorance or probabilities until you introduce them in the mixed states. It seems to me that this standard combining rule is not valid unless you’re combining probabilities.

• Hence OP is not an entropy.

• Notice that OP is defined in terms of the state that S achieved. So it gives us a measure of how powerful S is in practice in our model, not some platonic measure of how good S is in general situations. This does not seem a drawback to OP: after all, we want to measure how powerful a system actually is, not how powerful it could be in other circumstances.

I think that if you’re looking for a useful measure of optimization power, you will want to use potential achievement rather than actual achievement if you want a nicely encapsulated concept that doesn’t include properties of the rest of the environment. Clearly the dangerous optimizers are the ones with actual power rather than merely potential power, but I think it’s much clearer to just talk about optimization power as potentially dangerous, given an environment conducive to that optimizer.

• Measuring optimization power in terms of achieved outcome is wrong for the same reason that attributing a highly desired outcome to rationality is wrong. Even if I win the lottery, it was still a negative-expected-dollars choice to play the lottery. What we really want is something like the minimum utility outcome a process will achieve, where the minimum is taken over a given set of starting states. Or we could have a probability distribution over starting states and get the expected utility of the outcome. ETA: oops, I don’t mean utility of the outcome, I mean measure of the set of outcomes with greater utility than the minimum or expected or whatever.

The problem of the size of the solution space isn’t a problem if we content ourselves with comparing the optimization power of different processes (I think...).

• Even if I win the lottery, it was still a negative-expected-dollars choice to play the lottery.

At that point in the post, I hadn’t introduced mixed states. So the only options are to choose among fully deterministic worlds. Obviously if you had the choice between “win the lottery or don’t win the lottery”, then you’d go for the first.

Later, when I bring in mixed states, you can model such things as “buy a lottery ticket” as the achieved state, which generally has expected disutility.

• I’m confused.

If OP were an entropy, then we’d simply do a weighted sum 1/2(OP(X4)+OP(X7)) = 1/2(1+3) = 2, and then add one extra bit of entropy to represent our (binary) uncertainty as to what state we were in

Why do we add the extra bit? Doesn’t the weighted sum already represent that uncertainty?

• Suppose the X’s had 0 entropy each—that is, they were states with no “internal moving parts,” like an electron.

Now imagine that you introduced ignorance into the problem—now we don’t know if the electron is in state 4 or state 7, so you assign each P=0.5. What is the entropy of this distribution?

Well, it turns out the entropy (amount of ignorance) is 1 bit. Which is 1 bit more than the 0 bits of entropy that states 4 and 7 had on their own.

• I think a lot of the problems come from starting with final states. Instead, we can let Y1, Y2, … be the possible outputs of the system S, each with a certain utility attached. There is no such thing as mixed states: you can’t half-output Y1 and half-output Y2 (in some cases, you might have a probabilistic strategy which outputs Y1 50% of the time, but the utility calculation for that is different). Furthermore, there is no need to deal with the coarse-graining issue.

• How do you assign utility to an output that is a mixed final state? As a weighted sum of utilities?

• Presumably. It’s debatable whether or not this captures risk aversion, but in my opinion it does (and risk aversion falls out of nonlinear utility).

But one thing to keep in mind is that if there is an output Y1 that leads to a final state X1, and an output Y2 that leads to a final state X2, then there is not necessarily an output leading to 0.5X1+0.5X2.

• Similarly, OP measures the system’s ability to achieve its very top goals, not how hard these goals are. A system that wants to compose a brilliant sonnet has more OP than exactly the same system that wants to compose a brilliant sonnet while embodied in the Andromeda galaxy. Even though the second is plausibly more dangerous. So OP is a very imperfect measure of how powerful a system is.

I’m confused. A system that has to compose a brilliant sonnet and make sure that it exists in the Andromeda galaxy has to hit a smaller target of possible worlds than a system that wants to compose a brilliant sonnet and doesn’t care where it ends up. Achieving more complex goals requires more optimization power, in Eliezer’s sense, than achieving simple goals.

• It seems that optimization power as it’s currently defined would be a value that doesn’t change with time (unless the agent’s preferences change with time). This might be fine depending on what you’re looking for, but the definition of optimization power that I’m looking for would allow an agent to gain or lose optimization power.

• It turns out we can’t. Imagine that, without changing anything else, the utility of X7 is suddenly set to ten trillion, rather than 7. The OP of X7 is still 3 - it’s still the best option, still with probability 1/8.

I think a problem with this line of attack is that you are mixing preferences and utilities. You could imagine two types of optimization power: a preference-centric one and a utility-centric one, both of which can be useful depending on what you’re talking about. You can map preferences to utilities and utilities to preferences, but one may be more natural than the other for your purposes.

• An additional problem with the original definition of optimization power is that it requires the agent to have a rich set of preferences. If it doesn’t prefer much, but would have been very effective with its capacities had it had more preferences, I’d still say the optimization process was powerful. It seems to me that the most useful concept of optimization power will be independent of the optimizer’s actual preferences (or utility).