# The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables

An AI actively trying to figure out what I want might show me snapshots of different possible worlds and ask me to rank them. Of course, I do not have the processing power to examine entire worlds; all I can really do is look at some pictures or video or descriptions. The AI might show me a bunch of pictures from one world in which a genocide is quietly taking place in some obscure third-world nation, and another in which no such genocide takes place. Unless the AI already considers that distinction important enough to draw my attention to it, I probably won’t notice it from the pictures, and I’ll rank those worlds similarly—even though I’d prefer the one without the genocide. Even if the AI does happen to show me some mass graves (probably secondhand, e.g. in pictures of news broadcasts), and I rank them low, it may just learn that I prefer my genocides under-the-radar.

The obvious point of such an example is that an AI should optimize for the real-world things I value, not just my estimates of those things. I don’t just want to think my values are satisfied, I want them to actually be satisfied. Unfortunately, this poses a conceptual difficulty: what if I value the happiness of ghosts? I don’t just want to think ghosts are happy, I want ghosts to actually be happy. What, then, should the AI do if there are no ghosts?

Human “values” are defined within the context of humans’ world-models, and don’t necessarily make any sense at all outside of the model (i.e. in the real world). Trying to talk about my values “actually being satisfied” is a type error.

Some points to emphasize here:

• My values are not just a function of my sense data, they are a function of the state of the whole world, including parts I can’t see—e.g. I value the happiness of people I will never meet.

• I cannot actually figure out or process the state of the whole world.

• … therefore, my values are a function of things I do not know and will not ever know—e.g. whether someone I will never encounter is happy right now.

• This isn’t just a limited processing problem; I do not have enough data to figure out all these things I value, even in principle.

• This isn’t just a problem of not enough data, it’s a problem of what kind of data. My values depend on what’s going on “inside” of things which look the same—e.g. whether a smiling face is actually a rictus grin.

• This isn’t just a problem of needing sufficiently low-level data. The things I care about are still ultimately high-level things, like humans or trees or cars. While the things I value are in principle a function of low-level world state, I don’t directly care about molecules.

• Some of the things I value may not actually exist—I may simply be wrong about which high-level things inhabit our world.

• I care about the actual state of things in the world, not my own estimate of the state—i.e. if the AI tricks me into thinking things are great (whether intentional trickery or not), that does not make things great.

These features make it rather difficult to “point” to values—it’s not just hard to formally specify values, it’s hard to even give a way to learn values. It’s hard to say what it is we’re supposed to be learning at all. What, exactly, are the inputs to my value-function? It seems like:

• Inputs to values are not complete low-level world states (since people had values before we knew what quantum fields were, and still have values despite not knowing the full state of the world), but…

• I value the actual state of the world rather than my own estimate of the world-state (i.e. I want other people to actually be happy, not just look-to-me like they’re happy).

How can both of those intuitions seem true simultaneously? How can the inputs to my values-function be the actual state of the world, but also high-level objects which may not even exist? What things in the low-level physical world are those “high-level objects” pointing to?

If I want to talk about “actually satisfying my values” separate from my own estimate of my values, then I need some way to say what the values-relevant pieces of my world model are “pointing to” in the real world.

I think this problem—the “pointers to values” problem, and the “pointers” problem more generally—is the primary conceptual barrier to alignment right now. This includes alignment of both “principled” and “prosaic” AI. The one major exception is pure human-mimicking AI, which suffers from a mostly-unrelated set of problems (largely stemming from the shortcomings of humans, especially groups of humans).

I have yet to see this problem explained, by itself, in a way that I’m satisfied by. I’m stealing the name from some of Abram’s posts, and I think he’s pointing to the same thing I am, but I’m not 100% sure.

The goal of this post is to demonstrate what the problem looks like for a (relatively) simple Bayesian-utility-maximizing agent, and what challenges it leads to. This has the drawback of defining things only within one particular model, but the advantage of showing how a bunch of nominally-different failure modes all follow from the same root problem: utility is a function of latent variables. We’ll look at some specific alignment strategies, and see how and why they fail in this simple model.

One thing I hope people will take away from this: it’s not the “values” part that’s conceptually difficult, it’s the “pointers” part.

# The Setup

We have a Bayesian expected-utility-maximizing agent, as a theoretical stand-in for a human. The agent’s world-model is a causal DAG over variables X, and it chooses actions A to maximize E[u(X)|do(A)], i.e. it’s using standard causal decision theory. We will assume the agent has a full-blown Cartesian boundary, so we don’t need to worry about embeddedness and all that. In short, this is a textbook-standard causal-reasoning agent.

One catch: the agent’s world-model uses the sorts of tricks in Writing Causal Models Like We Write Programs, so the world-model can represent a very large world without ever explicitly evaluating probabilities of every variable in the world-model. Submodels are expanded lazily when they’re needed. You can still conceptually think of this as a standard causal DAG, it’s just that the model is lazily evaluated.

In particular, thinking of this agent as a human, this means that our human can value the happiness of someone they’ve never met, never thought about, and don’t know exists. The utility u can be a function of variables which the agent will never compute, because the agent never needs to fully compute u in order to maximize it—it just needs to know how u changes as a function of the variables influenced by its actions.
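A minimal code sketch of that point (all names and numbers here are hypothetical, not part of the post’s formalism): the agent can pick its best action while leaving almost all of its enormous world-model unevaluated, because only the variables downstream of its actions matter for the comparison.

```python
import random

class LazyWorldModel:
    """Hypothetical lazily-evaluated world-model: variable values are only
    computed (and cached) when something actually asks for them."""
    def __init__(self, n_vars=10**6):
        self.n_vars = n_vars
        self.evaluated = {}  # (variable, action) pairs actually computed

    def value(self, i, action):
        key = (i, action)
        if key not in self.evaluated:
            # Stand-in for lazily expanding a submodel; deterministic per key.
            self.evaluated[key] = random.Random(hash(key)).random()
        return self.evaluated[key]

def expected_utility(model, action, relevant_vars):
    # u is nominally a function of *all* the variables, but the action only
    # influences a few, so only those need evaluating to compare actions.
    return sum(model.value(i, action) for i in relevant_vars)

model = LazyWorldModel()
relevant = [3, 17, 42]  # the only variables downstream of the agent's actions
best_action = max(range(5), key=lambda a: expected_utility(model, a, relevant))

# The agent chose among 5 actions while touching 15 of ~10^6 variables.
assert len(model.evaluated) == 15
```

The point of the sketch is just the last assertion: maximizing u never required computing u over the whole world.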

Key assumption: most of the variables in the agent’s world-model are not observables. Drawing the analogy to humans: most of the things in our world-models are not raw photon counts in our eyes or raw vibration frequencies/intensities in our ears. Our world-models include things like trees and rocks and cars, objects whose existence and properties are inferred from the raw sense data. Even lower-level objects, like atoms and molecules, are latent variables; the raw data from our eyes and ears does not include the exact positions of atoms in a tree. The raw sense data itself is not sufficient to fully determine the values of the latent variables, in general; even a perfect Bayesian reasoner cannot deduce the true position of every atom in a tree from a video feed.
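A toy numerical illustration of that last sentence (hypothetical latents and numbers): when two latent states produce identical sense data, a perfect Bayesian update leaves the posterior exactly where the prior was.

```python
# Two latent states which produce the *identical* observation ("smile"),
# so no amount of that observation can separate them.
prior = {"genuinely_happy": 0.7, "rictus_grin": 0.3}
likelihood = {  # P(observation = "smile" | latent state)
    "genuinely_happy": 1.0,
    "rictus_grin": 1.0,
}

def posterior(prior, likelihood):
    # Standard Bayes: multiply prior by likelihood, then normalize.
    unnorm = {z: prior[z] * likelihood[z] for z in prior}
    total = sum(unnorm.values())
    return {z: p / total for z, p in unnorm.items()}

post = posterior(prior, likelihood)
# The observation carried no information about the latent: posterior == prior.
assert abs(post["genuinely_happy"] - 0.7) < 1e-9
```

Even an ideal reasoner is stuck at its prior here; the latent is simply not determined by the data.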

Now, the basic problem: our agent’s utility function is mostly a function of latent variables. Human values are mostly a function of rocks and trees and cars and other humans and the like, not the raw photon counts hitting our eyeballs. Human values are over inferred variables, not over sense data.

Furthermore, human values are over the “true” values of the latents, not our estimates—e.g. I want other people to actually be happy, not just to look-to-me like they’re happy. Ultimately, E[u(X)] is the agent’s estimate of its own utility (thus the expectation), and the agent may not ever know the “true” value of its own utility—i.e. I may prefer that someone who went missing ten years ago lives out a happy life, but I may never find out whether that happened. On the other hand, it’s not clear that there’s a meaningful sense in which any “true” utility-value exists at all, since the agent’s latents may not correspond to anything physical—e.g. a human may value the happiness of ghosts, which is tricky if ghosts don’t exist in the real world.

On top of all that, some of those variables are implicit in the model’s lazy data structure, and the agent will never think about them at all. I can value the happiness of people I do not know and will never encounter or even think about.

So, if an AI is to help optimize for u(X), then it’s optimizing for something which is a function of latent variables in the agent’s model. Those latent variables:

• May not correspond to any particular variables in the AI’s world-model and/or the physical world

• May not be estimated by the agent at all (because lazy evaluation)

• May not be determined by the agent’s observed data

… and of course the agent’s model might just not be very good, in terms of predictive power.

As usual, neither we (the system’s designers) nor the AI will have direct access to the model; we/it will only see the agent’s behavior (i.e. input/output) and possibly a low-level system in which the agent is embedded. The agent itself may have some introspective access, but not full or perfectly reliable introspection.

Despite all that, we want to optimize for the agent’s utility, not just the agent’s estimate of its utility. Otherwise we run into wireheading-like problems, problems with the agent’s world model having poor predictive power, etc. But the agent’s utility is a function of latents which may not be well-defined at all outside the context of the agent’s estimator (a.k.a. world-model). How can we optimize for the agent’s “true” utility, not just an estimate, when the agent’s utility function is defined as a function of latents which may not correspond to anything outside of the agent’s estimator?

# The Pointers Problem

We can now define the pointers problem—not only “pointers to values”, but the problem of pointers more generally. The problem: what functions of what variables (if any) in the environment and/or another world-model correspond to the latent variables in the agent’s world-model? And what does that “correspondence” even mean—how do we turn it into an objective for the AI, or some other concrete thing outside the agent’s own head?

Why call this the “pointers” problem? Well, let’s take the agent’s perspective, and think about what its algorithm feels like from the inside. From inside the agent’s mind, it doesn’t feel like those latent variables are latent variables in a model. It feels like those latent variables are real things out in the world which the agent can learn about. The latent variables feel like “pointers” to real-world objects and their properties. But what are the referents of these pointers? What are the real-world things (if any) to which they’re pointing? That’s the pointers problem.

Is it even solvable? Definitely not always—there probably is no real-world referent for e.g. the human concept of a ghost. Similarly, I can have a concept of a perpetual motion machine, despite the likely-impossibility of any such thing existing. Between abstraction and lazy evaluation, latent variables in an agent’s world-model may not correspond to anything in the world.

That said, it sure seems like at least some latent variables do correspond to structures in the world. The concept of “tree” points to a pattern which occurs in many places on Earth. Even an alien or AI with a radically different world-model could recognize that repeating pattern, realize that examining one tree probably yields information about other trees, etc. The pattern has predictive power, and predictive power is not just a figment of the agent’s world-model.

So we’d like to know both (a) when a latent variable corresponds to something in the world (or another world-model) at all, and (b) what it corresponds to. We’d like to solve this in a way which (probably among other use-cases) lets the AI treat the things-corresponding-to-latents as the inputs to the utility function it’s supposed to learn and optimize.

To the extent that human values are a function of latent variables in humans’ world-models, this seems like a necessary step not only for an AI to learn human values, but even just to define what it means for an AI to learn human values. What does it mean to “learn” a function of some other agent’s latent variables, without necessarily adopting that agent’s world-model? If the AI doesn’t have some notion of what the other agent’s latent variables even “are”, then it’s not meaningful to learn a function of those variables. It would be like an AI “learning” to imitate grep, but without having any access to string or text data, and without the AI itself having any interface which would accept strings or text.

Let’s look at some example symptoms which can arise from failure to solve specific aspects of the pointers problem.

Let’s go back to the opening example: an AI shows us pictures from different possible worlds and asks us to rank them. The AI doesn’t really understand yet what things we care about, so it doesn’t intentionally draw our attention to certain things a human might consider relevant—like mass graves. Maybe we see a few mass-grave pictures from some possible worlds (probably in pictures from news sources, since that’s how such information mostly spreads), and we rank those low, but there are many other worlds where we just don’t notice the problem from the pictures the AI shows us. In the end, the AI decides that we mostly care about avoiding worlds where mass graves appear in the news—i.e. we prefer that mass killings stay under the radar.

How does this failure fit in our utility-function-of-latents picture?

This is mainly a failure to distinguish between the agent’s estimate E[u(X)] of its own utility, and the “real” value u(X) of the agent’s utility (insofar as such a thing exists). The AI optimizes for our estimate, but does not give us enough data to very accurately estimate our utility in each world—indeed, it’s unlikely that a human could even handle that much information. So, it ends up optimizing for factors which bias our estimate—e.g. the availability of information about bad things.

Note that this intuitive explanation assumes a solution to the pointers problem: it only makes sense to the extent that there’s a “real” value of u from which the “estimate” can diverge.
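The failure mode can be sketched in a few lines (a toy setup with hypothetical worlds and numbers, not the post’s formalism): if the human’s estimate penalizes only the badness they can actually see in the pictures, the estimate becomes blind to the difference between a genuinely good world and a world with hidden mass graves.

```python
# Each world has a "true" hidden badness and a flag for whether evidence of
# that badness is visible in the pictures the AI shows.
worlds = [
    {"name": "A", "hidden_badness": 0.0, "evidence_visible": False},
    {"name": "B", "hidden_badness": 0.9, "evidence_visible": True},
    {"name": "C", "hidden_badness": 0.9, "evidence_visible": False},
]

def true_utility(w):
    return 1.0 - w["hidden_badness"]

def human_estimate(w):
    # The human only penalizes badness they can actually see in the pictures.
    return 1.0 - (w["hidden_badness"] if w["evidence_visible"] else 0.0)

# An AI optimizing the *estimate* sees no difference between A and C,
# so it may as well select the world whose atrocities stayed off-camera.
best_by_estimate = max(worlds, key=human_estimate)
best_by_truth = max(worlds, key=true_utility)

assert human_estimate(worlds[0]) == human_estimate(worlds[2])  # A ~ C
assert true_utility(worlds[0]) > true_utility(worlds[2])       # but A >> C
assert best_by_truth["name"] == "A"
```

The estimate-optimizer is indifferent exactly where the true-utility-optimizer is not, which is the under-the-radar-genocide failure in miniature.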

The under-the-radar genocide problem looks roughly like a typical wireheading problem, so we should try a roughly-typical wireheading solution: rather than the AI showing world-pictures, it should just tell us what actions it could take, and ask us to rank actions directly.

If we were ideal Bayesian reasoners with accurate world models and infinite compute, and knew exactly where the AI’s actions fit in our world model, then this might work. Unfortunately, the failure of any of those assumptions breaks the approach:

• We don’t have the processing power to predict all the impacts of the AI’s actions

• Our world models may not be accurate enough to correctly predict the impact of the AI’s actions, even if we had enough processing power

• The AI’s actions may not even fit neatly into our world model—e.g. even the idea of genetic engineering might not fit the world-model of premodern human thinkers

Mathematically, we’re trying to optimize E[u(X)|do(A)], i.e. optimize expected utility given the AI’s actions A. Note that this is necessarily an expectation under the human’s model, since that’s the only context in which u is well-defined. In order for that to work out well, we need to be able to fully evaluate that estimate (sufficient processing power), we need the estimate to be accurate (sufficient predictive power), and we need u and the AI’s actions to be defined within the model in the first place.
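A minimal sketch of that third requirement (hypothetical model and numbers): the expectation is computed inside the human’s model, so an action the model has no concept of simply has no well-defined expected utility at all.

```python
# The human's model only defines outcome distributions for actions it has
# concepts for: P(outcome | do(action)).
human_model = {
    "plant_crops": {"fed": 0.9, "hungry": 0.1},
    "hunt":        {"fed": 0.6, "hungry": 0.4},
}
u = {"fed": 1.0, "hungry": 0.0}  # utility over outcomes

def expected_u(action):
    # Raises KeyError if the action isn't in the model's vocabulary.
    dist = human_model[action]
    return sum(p * u[o] for o, p in dist.items())

assert expected_u("plant_crops") > expected_u("hunt")

# A premodern model has no entry for this action, so the expectation is
# not merely hard to compute; it is undefined within the model.
try:
    expected_u("genetic_engineering")
    defined = True
except KeyError:
    defined = False
assert not defined
```

The KeyError is standing in for a deeper problem: there is no fact of the matter, within the human’s model, about what that action even does.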

The question of whether our world-models are sufficiently accurate is particularly hairy here, since accuracy is usually only defined in terms of how well we estimate our sense-data. But the accuracy we care about here is how well we “estimate” the values of the latent variables X, and thereby u. What does that even mean, when the latent variables may not correspond to anything in the world?

## People I Will Never Meet

“Human values cannot be determined from human behavior” seems almost old-hat at this point, but it’s worth taking a moment to highlight just how underdetermined values are from behavior. It’s not just that humans have biases of one kind or another, or that revealed preferences diverge from stated preferences. Even in our perfect Bayesian utility-maximizer, utility is severely underdetermined from behavior, because the agent does not have perfect estimates of its latent variables. Behavior depends only on the agent’s estimate, so it cannot account for “error” in the agent’s estimates of latent variable values, nor can it tell us about how the agent values variables which are not coupled to its own choices.

The happiness of people I will never interact with is a good example of this. There may be people in the world whose happiness will not ever be significantly influenced by my choices. Presumably, then, my choices cannot tell us about how much I value such people’s happiness. And yet, I do value it.
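A quick sketch of this underdetermination (all names and numbers hypothetical): two utility functions which disagree enormously about a variable the agent’s actions never touch produce exactly the same behavior, so behavior alone cannot distinguish them.

```python
# Two utility functions that agree on every variable the agent's actions can
# influence, but disagree wildly about a variable they can't.
def u_caring(world):
    return world["my_garden"] + 100.0 * world["stranger_happiness"]

def u_indifferent(world):
    return world["my_garden"]

def act(utility):
    # Only "my_garden" is downstream of the agent's actions;
    # "stranger_happiness" is fixed regardless of what the agent does.
    actions = {"water": 1.0, "neglect": 0.2}
    fixed = {"stranger_happiness": 0.5}
    return max(actions, key=lambda a: utility({"my_garden": actions[a], **fixed}))

# Identical behavior, radically different values over the stranger's happiness.
assert act(u_caring) == act(u_indifferent) == "water"
```

Any inference procedure that consumes only the agent’s choices gets the same data in both cases, which is the point of this section.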

## “Misspecified” Models

In Latent Variables and Model Misspecification, jsteinhardt talks about “misspecification” of latent variables in the AI’s model. His argument is that things like the “value function” are latent variables in the AI’s world-model, and are therefore potentially very sensitive to misspecification of the AI’s model.

In fact, I think the problem is more severe than that.

The value function’s inputs are latent variables in the human’s model, and are therefore sensitive to misspecification in the human’s model. If the human’s model does not match reality well, then their latent variables will be wonky and not correspond to anything in the world. And AI designers do not get to pick the human’s model. These wonky variables, not corresponding to anything in the world, are a baked-in part of the problem, unavoidable even in principle. Even if the AI’s world model were “perfectly specified”, it would either be a bad representation of the world (in which case predictive power becomes an issue) or a bad representation of the human’s model (in which case those wonky latents aren’t defined).

The AI can’t model the world well with the human’s model, but the latents on which human values depend aren’t well-defined outside the human’s model. Rock and a hard place.

# Takeaway

Within the context of a Bayesian utility-maximizer (representing a human), utility/values are a function of latent variables in the agent’s model. That’s a problem, because those latent variables do not necessarily correspond to anything in the environment, and even when they do, we don’t have a good way to say what they correspond to.

So, an AI trying to help the agent is stuck: if the AI uses the human’s world-model, then it may just be wrong outright (in predictive terms). But if the AI doesn’t use the human’s world-model, then the latents on which the utility function depends may not be defined at all.

Thus, the pointers problem, in the Bayesian context: figure out which things in the world (if any) correspond to the latent variables in a model. What do latent variables in my model “point to” in the real world?

• I definitely endorse this as a good explanation of the same pointers problem I was getting at. I particularly like the new framing in terms of a direct conflict between (a) the fact that what we care about can be seen as latent variables in our model, and (b) we value “actual states”, not our estimates—this seems like a new and better way of pointing out the problem (despite being very close in some sense to things Eliezer talked about in the sequences).

What I’d like to add to this post would be the point that we shouldn’t be imposing a solution from the outside. How to deal with this in an aligned way is itself something which depends on the preferences of the agent. I don’t think we can just come up with a general way to find correspondences between models, or something like that, and apply it to solve the problem. (Or at least, we don’t need to.)

One reason is that finding a correspondence and applying it isn’t what the agent should want. In this simple setup, where we suppose a perfect Bayesian agent, it’s reasonable to argue that the AI should just use the agent’s beliefs. That’s what would maximize the expectation from the perspective of the agent—not using the agent’s utility function but substituting the AI’s beliefs for the agent’s. You mention that the agent may not have a perfect world-model, but this isn’t a good argument from the agent’s perspective—certainly not an argument for just substituting the agent’s model with some AI world-model.

This can be a real alignment problem for the agent (not just a mistake made by an overly dogmatic agent): if the AI believes that the moon is made of blue cheese, but the agent doesn’t trust that belief, then the AI can make plans which the agent doesn’t trust, even if the utility function is perfect.

And if the agent does trust the AI’s machine-learning-based model, then an AI which used the agent’s prior would also trust the machine-learning model. So, nothing is lost by designing the AI to use the agent’s prior in addition to its utility function.

So this is an argument that prior-learning is a part of alignment just as much as value-learning.

We don’t usually think this way because when it comes to humans, well, it sounds like a terrible idea. Human beliefs—as we encounter them in the wild—are radically broken and irrational, and inadequate to the task. I think that’s why I got a lot of push-back on my post about this:

I mean, I REALLY don’t want that or anything like that.

- jbash

But I think normativity gives us a different way of thinking about this. We don’t want the AI to use “the human prior” in the sense of some prior we can extract from human behavior, or extract from the brain, or whatever. Instead, what we want to use is “the human prior” in the normative sense—the prior humans reflectively endorse.

This gives us a path forward on the “impossible” cases where humans believe in ghosts, etc. It’s not as if humans don’t have experience dealing with things of value which turn out not to be a part of the real world. We’re constantly forming and reforming ontologies. The AI should be trying to learn how we deal with it—again, not quite in a descriptive sense of how humans actually deal with it, but rather in the normative sense of how we endorse dealing with it, so that it deals with it in ways we trust and prefer.

• This makes a lot of sense.

I had been weakly leaning towards the idea that a solution to the pointers problem should be a solution to deferral—i.e. it tells us when the agent defers to the AI’s world model, and what mapping it uses to translate AI-variables to agent-variables. This makes me lean more in that direction.

What I’d like to add to this post would be the point that we shouldn’t be imposing a solution from the outside. How to deal with this in an aligned way is itself something which depends on the preferences of the agent. I don’t think we can just come up with a general way to find correspondences between models, or something like that, and apply it to solve the problem. (Or at least, we don’t need to.)

I see a couple different claims mixed together here:

• The metaphilosophical problem of how we “should” handle this problem is sufficient and/or necessary to solve in its own right.

• There probably isn’t a general way to find correspondences between models, so we need to operate at the meta-level.

The main thing I disagree with is the idea that there probably isn’t a general way to find correspondences between models. There are clearly cases where correspondence fails outright (like the ghosts example), but I think the problem is probably solvable allowing for error-cases (by which I mean cases where the correspondence throws an error, not cases in which the correspondence returns an incorrect result). Furthermore, assuming that natural abstractions work the way I think they do, I think the problem is solvable in practice with relatively few error cases, and potentially even using “prosaic” AI world-models. It’s the sort of thing which would dramatically improve the success chances of alignment by default.

I absolutely do agree that we still need the metaphilosophical stuff for a first-best solution. In particular, there is not an obviously-correct way to handle the correspondence error-cases, and of course anything else in the whole setup can also be close-but-not-exactly-right. I do think that combining a solution to the pointers problem with something like the communication prior strategy, plus some obvious tweaks like partially-ordered preferences and some model of logical uncertainty, would probably be enough to land us in the basin of convergence (assuming the starting model was decent), but even then I’d prefer metaphilosophical tools to be confident that something like that would work.

• With reference specifically to this:

The happiness of people I will never interact with is a good example of this. There may be people in the world whose happiness will not ever be significantly influenced by my choices. Presumably, then, my choices cannot tell us about how much I value such people’s happiness. And yet, I do value it.

and without considering any other part of the structure, I have an alternate view:

It is possible to determine if and how much you value the happiness (or any other attribute) of people you will never interact with by calculating:

1. What are the various things you, personally, could have done in the past [time period], and how would they have affected each of the people, plants, animals, ghosts, etc. that you might care about?

2. What things did you actually do?

3. How far away from your maximum impact / time were you for each entity you could have affected? (scaled in some way tbd)

4. Derive values and weights from that. For example, if I donate $100 to Clean Water for Africa, that implies that I care about Clean Water & Africa more than I care about AIDS and Pakistan, and the level there depends on how much $100 means to me. If that’s ten (or even two) hours of work to earn it, that’s a different level of commitment than if it represents 17 minutes of owning millions in assets.

5. Run the calculation for all desired moral agents, to average out won’t-ever-see-them effects.


This will very quickly lead to incorrect conclusions, because people don’t act according to their values (especially for things that don’t impact their day-to-day lives, like international charity). The fact that you donated $100 to Clean Water for Africa does not mean that you value that more than AIDS in Pakistan. You personally may very well care about clean water and/or Africa more than AIDS and/or Pakistan, but if you apply this sort of analysis writ large, you will get egregiously wrong answers. Scott Alexander’s “Too Much Dark Money in Almonds” describes one facet of this rather well. Another facet is that how goods are bundled matters. Did I spend $15 on almonds because I value a) almonds b) nuts c) food d) sources of protein e) snacks I can easily eat while I drive f) snacks I can put out at parties… etc.? And more importantly, which of those things do I care about more than I care about Trump losing the election?

Elizabeth Anscombe’s book Intention does a good job analyzing this. When we take actions, we are not acting based on the state of the world; we are acting based on the state of the world under a particular description. One great example she gives is walking into a room and kissing a woman. Did you intend to a) kiss your girlfriend b) kiss the tallest woman in the room c) kiss the woman closest to the door wearing pink d) kiss the person who got the 13th highest mark on her history exam last week e) …

The answer is (typically) a. You intended to kiss your girlfriend. However, to an outside observer who doesn’t already have a good model of humanity at large, if not a model of you in particular, it’s unclear how they’re supposed to tell that. Most people who donate to Clean Water for Africa don’t intend to be choosing that over AIDS in Pakistan. Their actions are consistent with having that intention, but you can’t derive intentionality from brute actions.

• I agree with your comment, but I think it’s a scale thing. If I analyze every time you walk into a room, and every time you kiss someone, I can derive that you kiss [specific person] when you see them after being apart. And this is already being done in corporate contexts with Deep Learning for specific questions, so it’s just a matter of computing power, better algorithms, and some guidance as to the relevant questions and variables.

• You’ve mostly un­der­stood the prob­lem-as-stated, and I like the way you’re think­ing about it, but there’s some ma­jor loop­holes in this ap­proach.

First, I may value the hap­piness of agents who I can­not sig­nifi­cantly im­pact via my ac­tions—for in­stance, pris­on­ers in North Korea.

Se­cond, the ac­tions we chose prob­a­bly won’t provide enough data. Sup­pose there are n differ­ent peo­ple, and I could give any one of them \$1. I value these pos­si­bil­ities differ­ently (e.g. maybe be­cause they have differ­ent wealth/​cost of liv­ing to start with, or just be­cause I like some of them bet­ter). If we knew how much I val­ued each ac­tion, then we’d know how much I val­ued each out­come. But in fact, if I chose per­son 3, then all we know is that I value per­son 3 hav­ing the dol­lar more than I value any­one else hav­ing it; that’s not enough in­for­ma­tion to back out how much I value each other per­son hav­ing the dol­lar. This sort of un­der­de­ter­mi­na­tion will prob­a­bly be the usual re­sult, since the choice-of-ac­tion con­tains a lot less bits than a func­tion map­ping the whole ac­tion space to val­ues.

Third, and arguably most important: “run the calculation for all desired moral agents” requires first identifying all the “desired moral agents”, which is itself an instance of the problem in the post. What the heck is a “moral agent”, and how does an AI know which ones are “desired”? These are latent variables in your world-model, and would need to be translated to something in the real world.

• I was attempting to answer the first point, so let me rephrase: Even though your ability to affect prisoners in North Korea is minuscule, we can still look at how much of it you’re doing. Are you spending any time seeking out ways you could be affecting them? Are you voting for and supporting and lobbying politicians who are more likely to use their greater power to affect the NK prisoners’ lives? Are you doing [unknown thing that the AI figures out would affect them]? And, also, are you doing anything that is making their situation worse? Or any other of the multiple axes of being, since happiness isn’t everything, and even happiness isn’t a one-dimensional scale.

“Who counts as a moral agent? (And should they all have equal weights?)” is a question of philosophy, which I am not qualified to answer. But “who gets to decide the values to teach” is one meta-level up from the question of “how do we teach values”, so I take it as a given for the latter problem.

• I’m not convinced that we can do nothing if the human wants ghosts to be happy. The AI would simply have to do what would make ghosts happy if they were real. In the worst case, the human’s (coherent extrapolated) beliefs are your only source of information on how ghosts work. Any proper general solution to the pointers problem will surely handle this case. Apparently, each state of the agent corresponds to some probability distribution over worlds.

• This seems like it’s only true if the humans would truly cling to their belief in spite of all evidence (i.e. if they believed in ghosts dogmatically), which seems untrue for many things (although I grant that some humans may have some beliefs like this). I believe the idea of the ghost example is to point at cases where there’s an ontological crisis, not cases where the ontology is so dogmatic that there can be no crisis (though, obviously, both cases are theoretically important).

However, I agree with you in either case—it’s not clear there’s “nothing to be done” for the ghost case (in either interpretation).

• I don’t understand what the purported ontological crisis is. If ghosts exist, then I want them to be happy. That doesn’t require a dogmatic belief that there are ghosts at all. In fact, it can even be true when I believe ghosts don’t exist!

• I mean, that’s fair. But what if your belief system justified almost everything ultimately in terms of “making ancestors happy”, and relied on a belief that ancestors are still around to be happy/sad? There are several possible responses which a real human might be tempted to make:

• Give up on those values which were justified via ancestor worship, and only pursue the few values which weren’t justified that way.

• Value all the same things, just not based on ancestor worship any more.

• Value all the same things, just with a more abstract notion of “making ancestors happy” rather than thinking the ancestors are literally still around.

• Value mostly the same things, but with some updates in places where ancestor worship was really warping your view of what’s valuable rather than merely serving as a pleasant justification for what you already think is valuable.

So we can fix the scenario to make a more real ontological crisis.

It also bears mentioning—the reason to be concerned about ontological crisis is, mostly, a worry that almost none of the things we express our values in terms of are “real” in a reductionistic sense. So an AI could possibly view the world through very different concepts and still be predictively accurate. The question then is, what would it mean for such an AI to pursue our values?

• You could learn the pointers by observing how the model is incrementally built over time. Much more is explicit in children’s learning. Compare how our modern values differ from our ancestors’.

• I think what you call the pointers problem is mostly the grounding problem applied to values. Philosophy has long tried to find solutions to it, and you can google some here.

• Really fascinating problem! I like how your examples make me want to say “Well, the AI just has to ask about… wait a minute, that’s the problem!”. Taken from another point of view, you’re asking how and in which context an AI can reveal our utility functions, which means revealing our latent variables.

This problem also feels related to our discussion of the locality of goals. Here you assume a non-local goal (as most human ones are), and I think that a better knowledge of how to detect/measure locality from behavior and assumptions about the agent-model might help with the pointers problem.

• Setting up the “locality of goals” concept: let’s split the variables in the world model into observables $O$, action variables $A$, and latent variables $\Lambda$. Note that there may be multiple stages of observations and actions, so we’ll only have subsets $O_d \subseteq O$ and $A_d \subseteq A$ of the observation/action variables in the decision problem. The Bayesian utility maximizer then chooses $A_d$ to maximize

$$\mathbb{E}[u(O, A, \Lambda) \mid O_d, A_d]$$

… but we can rewrite that as

$$\mathbb{E}\big[\,\mathbb{E}[u(O, A, \Lambda) \mid O, A]\,\big|\, O_d, A_d\,\big]$$

Defining a new utility function $u'(O, A) := \mathbb{E}[u(O, A, \Lambda) \mid O, A]$, the original problem is equivalent to:

$$\max_{A_d} \mathbb{E}[u'(O, A) \mid O_d, A_d]$$

In English: given the original utility function $u$ on the (“non-local”) latent variables, we can integrate out the latents to get a new utility function $u'$ defined only on the (“local”) observation & decision variables. The new utility function yields completely identical agent behavior to the original.

So observing agent behavior alone cannot possibly let us distinguish preferences on latent variables from preferences on the “local” observation & decision variables.
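This equivalence is easy to check numerically. Below is a minimal sketch (every probability and utility number is made up for illustration): an agent with a binary latent variable picks actions by expected utility over the latent, while a second decision rule uses only the integrated-out utility table on observations and actions; the two choose identically for every observation.

```python
# Toy check that integrating out the latent variable leaves behavior
# unchanged. All probabilities and utilities here are made up.
prior = {0: 0.7, 1: 0.3}                     # P(latent)
likelihood = {(0, 'x'): 0.9, (0, 'y'): 0.1,  # P(observation | latent)
              (1, 'x'): 0.2, (1, 'y'): 0.8}
utility = {('a1', 0): 1, ('a1', 1): 0,       # u(action, latent)
           ('a2', 0): 0, ('a2', 1): 3}

def posterior(obs):
    joint = {l: prior[l] * likelihood[(l, obs)] for l in prior}
    z = sum(joint.values())
    return {l: p / z for l, p in joint.items()}

def eu_latent(obs, act):
    # Expected utility computed on the latent variable.
    return sum(p * utility[(act, l)] for l, p in posterior(obs).items())

# "Integrate out" the latent: a utility table on observables/actions only.
u_prime = {(o, a): eu_latent(o, a)
           for o in ('x', 'y') for a in ('a1', 'a2')}

for obs in ('x', 'y'):
    best_on_latents = max(('a1', 'a2'), key=lambda a: eu_latent(obs, a))
    best_on_locals = max(('a1', 'a2'), key=lambda a: u_prime[(obs, a)])
    assert best_on_latents == best_on_locals  # identical behavior
```

The equality holds by construction (it is the identity in the derivation above), which is exactly the point: no amount of behavioral data can tell the two utility functions apart.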

• Over the last few posts, the recurrent thought I have is “why aren’t you talking about compression more explicitly?”

• Could you uncompress this comment a bit please?

• A pointer is sort of the ultimate in lossy compression: just an index into the uncompressed data, like a legible compression library. Wireheading is a Goodharting problem, which is a lossy compression problem, etc.

• I like this post. I have thoughts along the same lines sometimes, and it makes me feel a bit overwhelmed and nihilistic, so then I go back to thinking about easier problems :-P

Is it even solvable? Definitely not always—there probably is no real-world referent for e.g. the human concept of a ghost.

Michael Graziano has another nice example: “pure whiteness”.

And then he argues that another example is, ummm, the whole idea of conscious experience, which would be a bit problematic for philosophy and ethics if true. See my Book Review: Rethinking Consciousness.

• I think that one of the problems in this post is actually easier in the real world than in the toy model.

In the toy model, the AI has to succeed by maximizing the agent’s True Values, which the agent is assumed to have as a unique function over its model of the world. This is a very tricky problem, especially when, as you point out, we might allow the agent’s model of reality to be wrong in places.

But in the real world, humans don’t have a unique set of True Values or even a unique model of the world—we’re non-Cartesian, which means that when we talk about our values, we are assuming a specific sort of way of talking about the world, and there are other ways of talking about the world in which talk about our values doesn’t make sense.

Thus in the real world we cannot require that the AI maximize humans’ True Values; we can only ask that it model humans (and we might have desiderata about how it does that modeling and what the end results should contain) and satisfy the modeled values. And in some ways this is actually a bit reassuring, because I’m pretty sure that it’s possible to get better final results on this problem than on learning the toy model agent’s True Values—maybe not in the most simple case, but as you add things like lack of introspection, distributional shift, meta-preferences like identifying some behavior as “bias,” etc.

• This comment seems wrong to me in ways that make me think I’m missing your point.

Some examples and what seems wrong about them, with the understanding that I’m probably misunderstanding what you’re trying to point to:

we’re non-Cartesian, which means that when we talk about our values, we are assuming a specific sort of way of talking about the world, and there are other ways of talking about the world in which talk about our values doesn’t make sense

I have no idea why this would be tied to non-Cartesian-ness.

But in the real world, humans don’t have a unique set of True Values or even a unique model of the world

There are certainly ways in which humans diverge from Bayesian utility maximization, but I don’t see why we would think that values or models are non-unique. Certainly we use multiple levels of abstraction, or multiple sub-models, but that’s quite different from having multiple distinct world-models.

Thus in the real world we cannot require that the AI has to maximize humans’ True Values, we can only ask that it models humans [...] and satisfy the modeled values.

How does this follow from non-uniqueness of values/world models? If humans have more than one set of values, or more than one world model, then this seems to say “just pick one set of values/one world model and satisfy that”, which seems wrong.

One way to interpret all this is that you’re pointing to things like submodels, subagents, multiple abstraction levels, etc. But then I don’t see why the problem would be any easier in the real world than in the model, since all of those things can be expressed in the model (or a straightforward extension of the model, in the case of subagents).

• Yes, the point is multiple abstraction levels (or at least multiple abstractions, ordered into levels or not). But not multiple abstractions used by humans, multiple abstractions used on humans.

If you don’t agree with me on this, why didn’t you reply when I spent about six months just writing posts that were all variations of this idea? Here’s Scott Alexander making the basic point.

It’s like… is there a True rational approximation of pi? Well, 22/7 is pretty good, but 355/113 is more precise, if harder to remember. And just 3 is really easy to remember, but not as precise. And of course there’s the arbitrarily large “approximation” that is 3.141592… Depending on what you need to use it for, you might have different preferences about the tradeoff between simplicity and precision. There is no True rational approximation of pi. True Human Values are similar, except instead of one tradeoff that you can make it’s approximately one bajillion.
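The pi analogy can be made concrete: the tradeoff points are the continued-fraction convergents of pi, each successive one more precise and less simple than the last. A quick sketch:

```python
from fractions import Fraction
import math

# Continued-fraction convergents of pi: each successive convergent buys
# more precision at the cost of a larger, harder-to-remember fraction.
terms = [3, 7, 15, 1, 292]          # pi = [3; 7, 15, 1, 292, ...]
convergents = []
for k in range(1, len(terms) + 1):
    frac = Fraction(terms[k - 1])
    for t in reversed(terms[:k - 1]):
        frac = t + 1 / frac
    convergents.append(frac)

for c in convergents:
    print(c, abs(float(c) - math.pi))
# 3, 22/7, 333/106, 355/113, 103993/33102 — error shrinks, denominator grows
```

There is no point on this ladder that is “the” rational pi; which convergent you want depends entirely on how you weigh simplicity against precision.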

• we’re non-Cartesian, which means that when we talk about our values, we are assuming a specific sort of way of talking about the world, and there are other ways of talking about the world in which talk about our values doesn’t make sense

I have no idea why this would be tied to non-Cartesian-ness.

If a Cartesian agent was talking about their values, they could just be like “you know, those things that are specified as my values in the logic-stuff my mind is made out of.” (Though this assumes some level of introspective access / genre savviness that needn’t be assumed, so if you don’t want to assume this then we can just say I was mistaken.) When a human talks about their values, they can’t take that shortcut, and instead have to specify values as a function of how they affect their behavior. This introduces the dependency on how we’re breaking down the world into categories like “human behavior.”

• Thus in the real world we cannot require that the AI has to maximize humans’ True Values, we can only ask that it models humans [...] and satisfy the modeled values.

How does this follow from non-uniqueness of values/world models? If humans have more than one set of values, or more than one world model, then this seems to say “just pick one set of values/one world model and satisfy that”, which seems wrong.

Well, if there were unique values, we could say “maximize the unique values.” Since there aren’t, we can’t. We can still do some similar things, and I agree, those do seem wrong. See this post for basically my argument for what we’re going to have to do with that wrong-seeming.

• Well, if there were unique values, we could say “maximize the unique values.” Since there aren’t, we can’t. We can still do some similar things, and I agree, those do seem wrong. See this post for basically my argument for what we’re going to have to do with that wrong-seeming.

Before I get into the meat of the response… I certainly agree that values are probably a partial order, not a total order. However, that still leaves basically all the problems in the OP: that partial order is still a function of latent variables in the human’s world-model, which still gives rise to all the same problems as a total order in the human’s world-model. (Intuitive way to conceptualize this: we can represent the partial order as a set of total orders, i.e. represent the human as a set of utility-maximizing subagents. Each of those subagents is still a normal Bayesian utility maximizer, and still suffers from the problems in the OP.)
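The parenthetical’s subagent representation can be sketched in a few lines (the worlds and utility numbers are purely illustrative): a pair of options is ranked in the partial order only when every subagent’s utility function agrees, and is otherwise left incomparable.

```python
# Toy partial order as a committee of utility-maximizing subagents.
# All worlds and utility numbers here are illustrative.
subagents = [
    {'w1': 3, 'w2': 2, 'w3': 1},   # subagent 1's (total) utility ordering
    {'w1': 3, 'w2': 1, 'w3': 2},   # subagent 2 ranks w2 and w3 oppositely
]

def prefers(a, b):
    """a > b in the partial order iff every subagent strictly prefers a."""
    return all(u[a] > u[b] for u in subagents)

assert prefers('w1', 'w2') and prefers('w1', 'w3')          # unanimous
assert not prefers('w2', 'w3') and not prefers('w3', 'w2')  # incomparable
```

Each subagent here is an ordinary utility maximizer over latent “world” variables, so each inherits the pointers problem unchanged.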

Anyway, I don’t think that’s the main disconnect here...

Yes, the point is multiple abstraction levels (or at least multiple abstractions, ordered into levels or not). But not multiple abstractions used by humans, multiple abstractions used on humans.

Ok, I think I see what you’re saying now. I am of course on board with the notion that e.g. human values do not make sense when we’re modelling the human at the level of atoms. I also agree that the physical system which comprises a human can be modeled as wanting different things at different levels of abstraction.

However, there is a difference between “the physical system which comprises a human can be interpreted as wanting different things at different levels of abstraction” and “there is not a unique, well-defined referent of ‘human values’”. The former does not imply the latter. Indeed, the difference is essentially the same issue as in the OP: one of these statements has a type-signature which lives in the physical world, while the other has a type-signature which lives in a human’s model.

An analogy: consider a robot into which I hard-code a utility function and world model. This is a physical robot; on the level of atoms, its “goals” do not exist in any more real a sense than human values do. As with humans, we can model the robot at multiple levels of abstraction, and these different models may ascribe different “goals” to the robot—e.g. modelling it at the level of an electronic circuit or at the level of assembly code may ascribe different goals to the system, there may be subsystems with their own little control loops, etc.

And yet, when I talk about the utility function I hard-coded into the robot, there is no ambiguity about which thing I am talking about. “The utility function I hard-coded into the robot” is a concept within my own world-model. That world-model specifies the relevant level of abstraction at which the concept lives. And it seems pretty clear that “the utility function I hard-coded into the robot” would correspond to some unambiguous thing in the real world—although specifying exactly what that thing is, is an instance of the pointers problem.

Does that make sense? Am I still missing something here?