Yes, this is roughly correct!
As an additional note: even if you slightly refine the notion of “power that this part of the future gives me, given that I start here”, neither “more power → instrumental convergence” nor “instrumental convergence → more power” holds as a logical implication.
Instead, if you draw the causal graph, there are many situations which cause both instrumental convergence and greater power. The formal task is then: “can we mathematically characterize those situations?” Once you can, you can say, “power-seeking will occur for optimal agents with goals from [such and such distributions] for [this task I care about] at [these discount rates]”.
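To make the shape of that last statement concrete, here is a rough toy sketch, not the formal result. Everything in it is an assumption made for illustration: the tiny deterministic MDP (a start state that can move to a "small" hub with one absorbing terminal or a "large" hub with three), the i.i.d. uniform reward distribution standing in for [such and such distributions], and the use of normalized average optimal value as a crude proxy for power.

```python
import random

# Hypothetical toy MDP (an assumption for illustration only):
#   start -> small_hub -> 1 absorbing terminal
#   start -> large_hub -> choice of 3 absorbing terminals
N_SMALL_TERMINALS = 1
N_LARGE_TERMINALS = 3


def hub_value(hub_reward, terminal_rewards, gamma):
    """Optimal value of a hub: collect its reward, then stay forever
    in the best reachable absorbing terminal (worth r / (1 - gamma))."""
    best_terminal = max(r / (1.0 - gamma) for r in terminal_rewards)
    return hub_reward + gamma * best_terminal


def estimate(gamma, n_samples=20000, seed=0):
    """Sample reward functions i.i.d. uniform on [0, 1] per state and return
    (fraction of goals whose optimal policy enters the large hub,
     average normalized optimal value at the small hub,
     average normalized optimal value at the large hub)."""
    rng = random.Random(seed)
    prefers_large = 0
    total_small = 0.0
    total_large = 0.0
    for _ in range(n_samples):
        v_small = hub_value(
            rng.random(),
            [rng.random() for _ in range(N_SMALL_TERMINALS)],
            gamma,
        )
        v_large = hub_value(
            rng.random(),
            [rng.random() for _ in range(N_LARGE_TERMINALS)],
            gamma,
        )
        if v_large > v_small:
            prefers_large += 1
        # Multiply by (1 - gamma) so values are comparable across discount rates.
        total_small += (1.0 - gamma) * v_small
        total_large += (1.0 - gamma) * v_large
    return (
        prefers_large / n_samples,
        total_small / n_samples,
        total_large / n_samples,
    )


if __name__ == "__main__":
    for gamma in (0.01, 0.5, 0.99):
        frac, power_small, power_large = estimate(gamma)
        print(f"gamma={gamma}: prefers large hub {frac:.3f}, "
              f"avg value small {power_small:.3f}, large {power_large:.3f}")
```

In this toy setup, the fraction of sampled goals whose optimal policy heads for the large hub sits near one half for very myopic agents and rises toward three quarters as the discount rate approaches 1, which is why the claim has to be indexed by distribution, task, and discount rate. Note that the same structural fact (three reachable terminals versus one) drives both the higher average value and the preference for the large hub, rather than one implying the other.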