ISO: Name of Problem

I’m looking for a name for a problem. I expect it already has one, but I don’t know what it is.

The problem: suppose we have an AI trying to learn what people want—e.g. a variant of inverse reinforcement learning (IRL). Intuitively speaking, we point at a bunch of humans and say “figure out what they want, then do that”. A few possible ways the AI could respond (a toy sketch of the ambiguity follows the list):

  • “Hmm, to the extent that those things have utility functions, it looks like they want friendship, challenge, status, etc…”

  • “Hmm, it looks like they want to maximize the number of copies of the information-carrying molecules in their cells.”

  • “Hmm, it looks like they’re trying to maximize entropy in the universe.”

  • “Hmm, it looks like they’re trying to minimize physical action.”
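
To make the ambiguity concrete, here is a toy sketch (the observation format, the feature names, and the Boltzmann-rational choice model are all assumptions of mine for illustration, not any particular IRL algorithm). When the candidate objectives are perfectly correlated on the behavior we actually observe, the observations alone cannot distinguish between them:

```python
# Toy sketch: several candidate "objectives", each describing the same
# behavior at a different level of description, fit the observed choices
# equally well under a Boltzmann-rational choice model.  All names here
# are hypothetical.
import math

# Each observation: which option the agent chose, and how every available
# option scores on features defined at different levels of description.
observations = [
    {"chosen": "a", "options": {
        "a": {"friendship": 2.0, "gene_copies": 2.0, "entropy": 2.0},
        "b": {"friendship": 1.0, "gene_copies": 1.0, "entropy": 1.0},
    }},
    {"chosen": "b", "options": {
        "a": {"friendship": 0.5, "gene_copies": 0.5, "entropy": 0.5},
        "b": {"friendship": 3.0, "gene_copies": 3.0, "entropy": 3.0},
    }},
]

def log_likelihood(candidate_objective, beta=1.0):
    """Log-probability of the observed choices if the agent were noisily
    maximizing `candidate_objective` (softmax over the available options)."""
    total = 0.0
    for obs in observations:
        scores = {name: beta * feats[candidate_objective]
                  for name, feats in obs["options"].items()}
        log_z = math.log(sum(math.exp(s) for s in scores.values()))
        total += scores[obs["chosen"]] - log_z
    return total

for candidate in ("friendship", "gene_copies", "entropy"):
    print(f"{candidate:12s} log-likelihood = {log_likelihood(candidate):.4f}")
# All three candidates score identically here, because their features are
# perfectly correlated on the data we happened to observe: the data alone
# does not say which level of description (which optimization process)
# we were pointing at.
```

Real features at different levels of description would not be exactly identical, of course, but they can be correlated tightly enough on the observed behavior that the same underdetermination bites.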

Why would the AI think these things? Well, you’re pointing at a bunch of atoms, and the microscopic laws of motion which govern those atoms can be interpreted as minimizing a quantity called action. Or you’re pointing at a bunch of organisms subject to a selection process which (locally) maximizes the number of copies of some information-carrying molecules. How is the AI supposed to know which optimization process you’re pointing to? How can it know which level of abstraction you’re talking about?
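
For reference, the “action” here is the standard quantity from classical mechanics; a minimal statement of the stationary-action (“least action”) principle, included only to pin down which quantity is meant:

```latex
% Principle of stationary ("least") action from classical mechanics,
% stated only to pin down which "action" is meant above:
S[q] = \int_{t_1}^{t_2} L\bigl(q(t), \dot{q}(t), t\bigr)\,\mathrm{d}t,
\qquad
\delta S[q] = 0 \quad \text{along the physically realized trajectory.}
```

Here L is the Lagrangian (kinetic minus potential energy for simple mechanical systems).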

What data could tell the AI that you’re pointing at humans, not the atoms they’re made of?

This sounds like a question which would already have a name, so if anybody could point me to that name, I’d appreciate it.