When triggered to act, are the homeostatic-agents-as-envisioned-by-you motivated to decrease the future probability of being moved out of balance, or prolong the length of time in which they will be in balance, or something along these lines?
I expect[1] them to have a drive similar to “if my internal world-simulator predicts future sensory observations that are outside of my acceptable bounds, take actions that make the world-simulator predict within-acceptable-bounds sensory observations”.
This maps reasonably well to one of the agent’s drives being “decrease the future probability of being moved out of balance”. Notably, though, it does not map well to that being the agent’s only drive, or to the drive being “minimize” rather than “decrease if above threshold”. The specific steps I don’t understand are:
What pressure is supposed to push a homeostatic agent with multiple drives to elevate a specific “expected future quantity of some arbitrary resource” drive above all of its other drives and set the acceptable quantity to some extreme value (a toy sketch of this follows below).
Why we should expect that an agent that has been molded by that pressure would come to dominate its environment.
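For concreteness, here is a toy sketch of the drive structure I have in mind (illustrative Python; every name in it is made up, and it is a cartoon rather than a claim about how any real agent is implemented). The thing it is meant to show is that “act when a predicted observation leaves its acceptable band” only starts behaving like a maximizer if one of the bands is pushed to an extreme:

```python
# Toy sketch (illustrative only; all names made up): a homeostatic agent with
# several bounded drives, acting through a predictive world-simulator.
from dataclasses import dataclass

@dataclass
class Drive:
    name: str
    low: float   # lower edge of the acceptable band
    high: float  # upper edge of the acceptable band

    def urgency(self, predicted_value: float) -> float:
        """Zero inside the acceptable band; grows the further outside it we predict being."""
        if predicted_value < self.low:
            return self.low - predicted_value
        if predicted_value > self.high:
            return predicted_value - self.high
        return 0.0  # within bounds: this drive exerts no pressure to act

def choose_action(drives, actions, predict):
    """Pick the action whose predicted observations are least out-of-bounds overall.

    `predict(action)` stands in for the internal world-simulator: it returns a dict
    mapping each drive's name to the predicted future observation for that drive.
    """
    def total_urgency(action):
        predicted = predict(action)
        return sum(d.urgency(predicted[d.name]) for d in drives)
    return min(actions, key=total_urgency)

# "Decrease if above threshold" is the whole behavior here. The step I don't see a
# pressure for is the one that replaces some drive with, say,
# Drive(name="resource", low=1e30, high=float("inf")) -- a band pushed to an
# extreme -- at which point this same loop behaves like an unbounded maximizer.
```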
If no, they are probably not powerful agents. Powerful agency is the ability to optimize distant (in space, time, or conceptually) parts of the world into some target state
Why use this definition of powerful agency? Specifically, why include the “target state” part of it? By this metric, evolutionary pressure is not powerful agency, because while it can cause massive changes in distant parts of the world, there is no specific target state. Likewise for e.g. corporations finding a market niche—to the extent that they have a “target state” it’s “become a good fit for the environment”.
Or, rather… It’s conceivable for an agent to be “tool-like” in this manner, where it has an incredibly advanced cognitive engine hooked up to a myopic suite of goals. But only if it’s been intelligently designed. If it’s produced by crude selection/optimization pressures, then the processes that spit out “unambitious” homeostatic agents would fail to instill the advanced cognitive/agent-y skills into them.
I can think of a few ways to interpret the above paragraph with respect to humans, but none of them make sense to me[2] - could you expand on what you mean there?
And a bundle of unbounded-consequentialist agents that have some structures for making cooperation between each other possible would have considerable advantages over a bundle of homeostatic agents.
Is this still true if the unbounded consequentialist agents in question have limited predictive power, and each one has advantages in predicting the things that are salient to it? Concretely, can an unbounded AAPL share price maximizer cooperate with an unbounded maximizer for the number of sand crabs in North America without the AAPL-maximizer having a deep understanding of sand crab biology?
[1] Subject to various assumptions at least, e.g.:
The agent is sophisticated enough to have a future-sensory-perceptions simulator
The use of the future-perceptions-simulator has been previously reinforced
The specific way the agent is trying to change the outputs of the future-perceptions-simulator has been previously reinforced (e.g. I expect “manipulate your beliefs” to be chiseled away pretty fast when reality pushes back)
Still, all those assumptions usually hold for humans
[2] The obvious interpretation I take for that paragraph is that one of the following must be true. For clarity, can you confirm that you don’t think any of the following:
Humans have been intelligently designed
Humans do not have the advanced cognitive/agent-y skills you refer to
Humans exhibit unbounded consequentialist goal-driven behavior
None of these seem like views I’d expect you to have, so my model has to be broken somewhere.

What pressure is supposed to push a homeostatic agent with multiple drives to elevate a specific “expected future quantity of some arbitrary resource” drive above all of its other drives
That was never the argument. A paperclip-maximizer/wrapper-mind’s utility function doesn’t need to be simple/singular. It can be a complete mess, the way human happiness/prosperity/eudaimonia is a mess. The point is that it would still pursue it hard, so hard that everything not in it will end up as collateral damage.
I think humans very much do exhibit that behavior, yes? Towards power/money/security, at the very least. And inasmuch as humans fail to exhibit this behavior, they fail to act as powerful agents and end up accomplishing little.
I think the disconnect is that you might be imagining unbounded consequentialist agents as some alien systems that are literally psychotically obsessed with maximizing something as conceptually simple as paperclips, as opposed to a human pouring their everything into becoming a multibillionaire/amassing dictatorial power/winning a war?
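For concreteness, a cartoon of what I mean (illustrative Python; the names and weights are made up): the utility function below is an arbitrary mess of terms, yet the agent still simply takes whichever action scores highest on it, so anything those terms don’t mention gets zero weight and is fair game as collateral damage:

```python
# Illustrative cartoon (all names and weights made up): a "messy" utility
# function that is nonetheless pursued as hard as the agent can manage.

def messy_utility(world_state: dict) -> float:
    """An arbitrary tangle of terms -- nothing simple or singular about it."""
    return (
        3.2 * world_state.get("security", 0.0)
        + 1.7 * world_state.get("status", 0.0)
        + 0.4 * world_state.get("novelty", 0.0)
        - 2.5 * world_state.get("pain", 0.0)
        # ...hundreds more terms could follow; the messiness changes nothing below.
    )

def act(actions, predict):
    """Take the action with the highest predicted utility -- no 'good enough' band.

    `predict(action)` stands in for the agent's world-model. Whatever the terms
    above don't mention contributes nothing to the score, so the agent will
    happily trade it away.
    """
    return max(actions, key=lambda a: messy_utility(predict(a)))
```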
Is this still true if the unbounded consequentialist agents in question have limited predictive power, and each one has advantages in predicting the things that are salient to it?
Yes, see humans.