One distinction that I pretty strongly hold as carving nature at its joints is (what I call) optimization vs agents. Optimization has no concept of a utility function; it's just about the state going up an ordering. Agents are the things that have a utility function, which they need for picking actions with probabilistic outcomes.
Aha, mm, that's quite interesting. As gears says, I'd be curious what to you are the defining parts of agents that imply that generic optimisation processes like natural selection and gradient descent aren't agents while humans and animals are.
Is it about action / counterfactuality?
EDIT: If I take your perspective seriously that optimisation only talks about preference orderings not utility functions then maybe the supposed deficiency of the Yudkowsky definition is not so big.
We could then define analogs of the entropy as
$$H(p) = \int p(x)\,\mathrm{OP}_p(x)\,dx$$
and cross-entropy as
$$H(p_a, p) = \int p_a(x)\,\mathrm{OP}_p(x)\,dx,$$
which is the same as $\mathrm{OP}(a)$ defined above.
We can then also consider a relative entropy / optimisation measure
$$\mathrm{KL}(p_a \,\|\, p) = H(p_a, p) - H(p_a)$$
As well as the reverse, where we flip $p$ and $p_a$.
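To make these analogs concrete, here is a minimal sketch for a finite state space, assuming $\mathrm{OP}_q(x)$ is taken to be the negative log of the $q$-mass on states at least as good as $x$; the specific distributions below are illustrative, not from the thread:

```python
import math

# Finite state space with a total preference order: A < B < C < D < E.
states = ["A", "B", "C", "D", "E"]
p   = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.2, "E": 0.2}  # baseline distribution
p_a = {"A": 0.0, "B": 0.1, "C": 0.1, "D": 0.3, "E": 0.5}  # distribution under action a

def op(x, q):
    """OP_q(x): negative log of the q-mass on states at least as good as x."""
    i = states.index(x)
    return -math.log(sum(q[s] for s in states[i:]))

def h(q, ref):
    """Entropy analog H(q, ref) = sum_x q(x) * OP_ref(x); H(q) is the ref = q case."""
    return sum(q[x] * op(x, ref) for x in states if q[x] > 0)

print(h(p, p))                   # H(p)
print(h(p_a, p))                 # H(p_a, p), i.e. OP(a)
print(h(p_a, p) - h(p_a, p_a))   # KL(p_a || p) analog
```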
Given two variables $X, Y$ on $\Omega$, where $\Omega$ splits as $X \times Y$ with the natural induced preference ordering, we have the $\geq$-relevant mutual information
$$I_{\mathrm{Yud}}(X;Y) = \int_{x\in X}\int_{y\in Y} p(x,y)\,\log\frac{p(\{(x',y') \mid (x,y)\le(x',y')\})}{p(\{x' \mid x\le x'\})\,p(\{y' \mid y\le y'\})}\,dx\,dy$$
mmmm this is quite nice actually.
I'd still like to understand better why utility functions are intrinsically about agents while preference orderings are about optimisation. This isn't totally apparent to me.
My best guess about the core difference between optimization and agency is the thing I said above about "a utility function, which they need for picking actions with probabilistic outcomes".
An agent wants to move the state up an ordering (its optimization criterion). But an agent also has enough modelling ability to know that any given action has (some approximation of) a probability distribution over outcomes. (Maybe this is what you mean by "counterfactuality".) Let's say you've got a toy model where your ordering over states is A < B < C < D < E and you're starting out in state C. The only way to decide between [a 30% chance of B + a 70% chance of D] and [a 40% chance of A + a 60% chance of E] is to decide on some numerical measure for how much better E is than D, et cetera.
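To make that concrete, here is a minimal sketch; the specific utility numbers are illustrative assumptions, both consistent with the same ordering:

```python
# Two lotteries over the ordered states A < B < C < D < E.
lottery_1 = {"B": 0.3, "D": 0.7}
lottery_2 = {"A": 0.4, "E": 0.6}

def expected_utility(lottery, u):
    return sum(p * u[s] for s, p in lottery.items())

# Two utility assignments, both monotone in the ordering A < B < C < D < E.
u_narrow = {"A": 0, "B": 9, "C": 10, "D": 11, "E": 12}   # E barely better than D
u_wide   = {"A": 0, "B": 1, "C": 2,  "D": 3,  "E": 100}  # E vastly better than D

print(expected_utility(lottery_1, u_narrow), expected_utility(lottery_2, u_narrow))  # 10.4 vs 7.2
print(expected_utility(lottery_1, u_wide),   expected_utility(lottery_2, u_wide))    # 2.4 vs 60.0
```

The ordering alone licenses both assignments, and they disagree about which lottery to pick; that is exactly why some numerical measure is needed.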
Gradient descent doesn’t have to do this at all. It just looks at the gradient and is like, number go down? Great, we go in the down direction. Similarly, natural selection isn’t doing this either. It’s just generating a bunch of random mutations and then some of them die.
(I’m not totally confident that one couldn’t somehow show some way in which these scenarios can be mathematically described as calculating an expected utility. But I haven’t needed to pull in these ideas for deconfusing myself about optimization.)
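By contrast, a bare gradient-descent update can be written without comparing outcome distributions at all; a minimal sketch with a toy objective of my choosing:

```python
def grad_step(x, grad, lr=0.1):
    # "Number go down": move against the local gradient, nothing more.
    return x - lr * grad(x)

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x = 0.0
for _ in range(50):
    x = grad_step(x, lambda v: 2 * (v - 3))
print(x)  # approaches 3.0; no expected utility over lotteries anywhere
```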
Uhm, two comments/questions on this.
Why do you need to decide between those probability distributions? You only need to get one action (or distribution thereof) out. You can do it without deciding, e.g. by taking their average and sampling. On the other hand, vNM tells us utility is being assigned if your choice satisfies some conditions, but vNM = agency is a complicated position to hold.
We know that at some level every physical system is doing gradient descent or a variational version thereof. So depending on the scale at which you model a system, you would assign different degrees of agency?
By the way gradient descent is a form of local utility minimization, and by tweaking the meaning of ‘local’ one can get many other things (evolution, Bayesian inference, RL, ‘games’, etc).
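One standard way to cash out "local utility minimization" (my gloss on the remark above, not spelled out in the thread): a gradient step exactly minimizes the linearized objective plus a penalty on moving far from the current point,
$$x_{t+1} \;=\; \arg\min_{x}\Big[f(x_t) + \langle\nabla f(x_t),\,x - x_t\rangle + \tfrac{1}{2\eta}\lVert x - x_t\rVert^2\Big] \;=\; x_t - \eta\,\nabla f(x_t).$$
Swapping the squared-distance penalty for other divergences (for instance a KL term over distributions) is one way to read the "tweaking the meaning of local" move towards the other examples listed.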
Isn't gradient descent agentic over the parameters being optimized, according to the "moved by reasons" definition in Discovering Agents?
I mean, that makes sense according to their definition; I think I'm just defining the word differently. Personally I think defining "agent" such that gradient descent is an agent is pretty far off from the colloquial use of the word.
I would be interested to see a sketch of how you mathematize "agent" such that gradient descent could be said to not have a utility function. As best I can tell, "having a utility function" is an uninteresting property that everything has: a sort of panagentism implied by trivial utility functions. Though nontriviality of utility functions might be able to define what you're talking about, and I can imagine some nontriviality definitions that do exclude gradient descent over boxed parameters, e.g. that there's no time $t_d$ at which the utility function becomes indifferent. Any utility function that only cares about the weights becomes indifferent in finite time, I think? So this should exclude the "just sit here being a table" utility function. Although perhaps this is insufficiently defined, because I haven't specified what physical mechanism to extract the preference ordering from in some cases, in which case there could totally be agents. I'd be curious how you try to define this sort of thing, anyway.
(That is to say, a utility function $U$ such that there are no worldlines $W_a(t)$ and $W_b(t)$ that diverge at $t_d$ with $U(W_a) = U(W_b)$; call that constraint 1, "never being indifferent between timelines". Though that version of the constraint might demand that the utility function never be indifferent to anything at all, so perhaps a weaker constraint is that there be no time at which the utility function is indifferent to all possible worldlines; i.e., if constraint 1 is that no worldlines diverge and yet get the same order position, constraint 2 is that at all times $t_d$ there is at least one pair $W_a(t)$, $W_b(t)$ that diverge at $t_d$ such that $U(W_a) \neq U(W_b)$. "Diverge" being defined as $W_a(t_e) = W_b(t_e)$ for early times $t_e < t_d$ and $W_a(t_l) \neq W_b(t_l)$ for some $t_l > t_d$.)
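Restating the two candidate constraints a bit more formally (my rendering of the parenthetical above):
$$\text{Constraint 1: } \forall t_d,\ \forall\, W_a, W_b \text{ diverging at } t_d:\ U(W_a) \neq U(W_b)$$
$$\text{Constraint 2: } \forall t_d,\ \exists\, W_a, W_b \text{ diverging at } t_d:\ U(W_a) \neq U(W_b)$$
where $W_a, W_b$ diverge at $t_d$ iff $W_a(t_e) = W_b(t_e)$ for all $t_e < t_d$ and $W_a(t_l) \neq W_b(t_l)$ for some $t_l > t_d$; the weakening is just swapping the universal quantifier over worldline pairs for an existential one.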
Nice, I’d read the first but didn’t realise there were more. I’ll digest later.
I think agents vs optimisation is definitely reality-carving, but I'm not sure I see the point about utility functions and preference orderings. I assume the idea is that an optimisation process just moves the world towards states, but an agent tries to move the world towards certain states, i.e. chooses actions based on how much they move the world towards certain states, so it makes sense to quantify how much of a weighting each state gets in its decision-making. But it's not obvious to me that there isn't a meaningful way to assign weightings to states for an optimisation process too: for example, if a ball rolling down a hill gets stuck in the large hole twice as often as it gets stuck in the medium hole and ten times as often as the small hole, maybe it makes sense to quantify this with something like a utility function. Although defining a utility function based on the typical behaviour of the system and then trying to measure its optimisation power against it gets a bit circular.
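One way to make the weightings-from-typical-behaviour idea concrete (my sketch, with made-up counts matching the stated ratios): treat the long-run frequency of ending up in each hole as a distribution and read off a "revealed utility" as its log.

```python
import math

# Hypothetical long-run outcomes: the ball ends in the large hole twice as
# often as the medium hole and ten times as often as the small hole.
counts = {"large": 10, "medium": 5, "small": 1}
total = sum(counts.values())
freq = {hole: c / total for hole, c in counts.items()}

# A candidate "revealed utility": log-frequency (only defined up to an
# additive constant, much as utilities are only defined up to affine maps).
utility = {hole: math.log(f) for hole, f in freq.items()}
print(utility)
```

As noted above, this bakes in the circularity: the "utility" is defined from the very behaviour you would then measure against it.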
Anyway, the dynamical systems approach seems good. Have you stopped working on it?
Mostly it’s that I’ve found that, while trying to understand optimization, I’ve never needed to put “weights” on the ordering. (Of course, you always could map your ordering onto a monotonically increasing function.)
I think the concept of “trying” mostly dissolves under the kind of scrutiny I’m trying to apply. Or rather, to well-define “trying”, you need a whole bunch of additional machinery that just makes it a different thing than (my concept of) optimization, and that’s not what I’m studying yet.
I’ve also been working entirely in deterministic settings, so there’s no sense of “how often” a thing happens, just a single trajectory. (This also differentiates my thing from Flint’s.)
I haven’t stopped working on the overall project. I do seem to have stopped writing and editing that particular sequence, though. I’m considering totally changing the way I present the concept (such that the current Intro post would be more like a middle-post) so I decided to just pull the trigger on publishing the current state of it. I’m also trying to get more actual formal results, which is more about stuff from the end of that sequence. But I’m pretty behind on formal training, so I’m also trying to generally catch up on math.
You might be interested in some of my open drafts about optimization:
Draft: Introduction to optimization
Draft: The optimization toolbox
Draft: Detecting optimization
Draft: Inferring minimizers