The configurator dynamically modulates the cost function, so the agent is not guaranteed to have the same cost function over time, hence can be dutch booked / violate VNM axioms.
Good point. But at any given time, its doing EV calculations to decide its actions. Even if it modulates itself by picking amongst a variety of utility functions, its actions are still influenced by explicit EV calcs. If I understand TurnTrout’s work correctly, that alone is enough to make the agent power seeking. Which is dangerous by default.
The configurator dynamically modulates the cost function, so the agent is not guaranteed to have the same cost function over time, hence can be dutch booked / violate VNM axioms.
Good point. But at any given time, its doing EV calculations to decide its actions. Even if it modulates itself by picking amongst a variety of utility functions, its actions are still influenced by explicit EV calcs. If I understand TurnTrout’s work correctly, that alone is enough to make the agent power seeking. Which is dangerous by default.