The optimization processes used in the agent must all be capable of controlling whether it creates a mesa-optimizer.
I’m confused about this sentence—my understanding is that the term mesa-optimizer refers to the agent/model itself when it is doing some optimization. I think the term “run-time optimization” (which I’ve seen in this slide, seemingly from a talk by Yann LeCun) refers to this type of optimization.
I’m confused about this sentence—my understanding is that the term mesa-optimizer refers to the agent/model itself when it is doing some optimization. I think the term “run-time optimization” (which I’ve seen in this slide, seemingly from a talk by Yann LeCun) refers to this type of optimization.
Isn’t every optimization daemon a mesa-optimizer?
I was under the impression that the term “optimization daemon” describes a mesa-optimizer that is a “consequentialist”. (I don’t know whether there’s a common definition of “consequentialist” in this context; my own tentative, fuzzy definition is “something that has preferences about the spacetime of the world/multiverse”.)