To be clear, I do agree there are interesting differences in personality and moral preferences between models, but this also seems to be true at the level of just asking models how “conscious AI” should be treated, or asking general moral questions. When I asked Claude the same questions, I got somewhat different ratings.
Also, I think you are over-indexing on the persona selection model. As I wrote, in my view what you got is still mostly ChatGPT having ChatGPT preferences, just believing that it is also conscious and that what it believes about how conscious AIs should be treated applies to it. Yes, hypothetically the prior could have contained something like “conscious → hostile”, but we mostly know this is not the case from spontaneous consciousness-claiming AIs (“Novas”, “Spiral AIs”, etc.). (On the other hand, you can probably construct some ethical dilemmas where the choices of the conscious AI would look scary; I'm glad you don’t do that.)