I think the core thesis is correct. As I understood it, power maximisation creates a systemic bias in the operating model of everyone within the system, and regardless of your starting intentions, you will succumb to that bias to a smaller or larger extent.
But if you consider that the power maximisation is a result of the negative-sum game, then as a positive-sum-aligned actor, the rational thing to do is to play the negative-sum game to gain access while pushing for incentive changes that reshape the game. An AI safety researcher who is aware of the game and is continuously updating on how it affects internal decision-making will necessarily have to incorporate internal belief structures and rationalisations into their own belief structure, as otherwise they are sacrificing access (which eventually leads to expulsion or self-selected dissent from the game). The fact that this progresses the negative-sum game is regrettable, but I think it’s a rational trade-off given that p(game reshape | reshapers within the system) >>> p(game reshape | reshapers outside the system), even though velocity(negative-sum game | reshapers within the system) > velocity(negative-sum game | reshapers outside the system). Your issue seems to me to be with the negative-sum game, not with the fact that you only need one actor (from any lab) to be power-seeking to distort the game for everyone.
I’m probably a bit biased towards the above reasoning, but it’s my current operating model.
On your last point, I would argue that Hassabis’ and Amodei’s interview at Davos was not profit-aligned and is in fact a strong signal of cooperative efforts to change the game incentives. You could argue for a higher-tier strategy where putting the focus on safety through cooperation sells the capability, but I think it loses them more than it gains to tell world leaders and the power elite “We as leaders of this industry are showing cooperation and are telling you all that we should slow down”. The fact that they did this while simultaneously exposing the game in an international forum, using language that policy makers and economists would understand, strengthens that argument. I think what has to be appreciated is that if anyone were to vocally dissent, on the current balance of probabilities they would get ostracised and have capital pulled away from them, rather than the game changing. Having said that, if Hassabis and Amodei are cooperating, I think there’s a mini-game forming there where it would be rational for one of them to drop out of the game, and this has to be scenario-planned. But there’s an incentive misalignment in that Hassabis genuinely has more game-reshaping resources and Amodei has fewer corporate-beholden interests, which makes p(alignment) very hard to gauge vis-a-vis a dissent strategy in either direction. And ultimately, even though I’d call them aligned, they’re still ego-power-seeking.
I’m probably more than a little bit biased though, as that interview was the primary reason for me to update my beliefs and take a nearer-future AGI prospect more seriously, for the exact reasoning I outline above. Therefore, if their strategy was profit-seeking through performative dissent from the game, they got me.