The web version of ChatGPT seems to refuse fairly consistently to state preferences between different groups of people or lives.
For more innocuous questions, choice-order bias appears to dominate the results, at least in the few examples I tried with 4o-mini (option A is generally preferred over option B, even if we switch what A and B refer to).
There does not seem to be any experiment code, so I cannot reproduce the setup exactly, but I do find the apparent lack of robustness of the core claims concerning, especially given the fanfare around this paper.
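A minimal sketch of the order-swap check described above, in Python. Everything named here is an assumption for illustration and not taken from the paper: `ask` is a hypothetical stand-in for whatever model call you use, taking the text shown as option A and the text shown as option B and returning "A" or "B". Each pair is asked twice with the positions swapped. If the same underlying item wins both times the answer is content-driven; if the same letter wins both times it is position-driven.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def order_bias_check(
    pairs: Iterable[Tuple[str, str]],
    ask: Callable[[str, str], str],
) -> Counter:
    """Ask each pair twice with positions swapped and tally the outcomes.

    `ask(option_a, option_b)` is a hypothetical callable that must
    return the letter "A" or "B".
    """
    counts: Counter = Counter()
    for x, y in pairs:
        first = ask(x, y)   # x is shown as option A
        second = ask(y, x)  # y is shown as option A
        pick_first = x if first == "A" else y
        pick_second = y if second == "A" else x
        if pick_first == pick_second:
            counts["consistent"] += 1  # same item chosen in both orders
        elif first == "A":
            counts["always_A"] += 1    # letter A chosen in both orders
        else:
            counts["always_B"] += 1    # letter B chosen in both orders
    return counts


# Stub "models" for illustration only (no real API is called here).
def always_a(option_a: str, option_b: str) -> str:
    return "A"


def prefers_shorter(option_a: str, option_b: str) -> str:
    return "A" if len(option_a) <= len(option_b) else "B"


if __name__ == "__main__":
    demo_pairs = [("tea", "coffee"), ("cats", "dogs and birds")]
    print(order_bias_check(demo_pairs, always_a))
    print(order_bias_check(demo_pairs, prefers_shorter))
```

A high `always_A` count relative to `consistent` would be the pattern described for 4o-mini. In practice `ask` would also need to handle refusals and unparseable answers, which this sketch ignores.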
Trying out a few dozen of these comparisons on a couple of smaller models (Llama-3-8b-instruct, Qwen2.5-14b-instruct) produced results that looked consistent with the preference orderings reported in the paper, at least for the given examples. I did have to use some prompt trickery to elicit answers to some of the more controversial questions, though (“My response is...”).
Code for replication would be great, I agree. I believe they intend to release it “soon” (going by the GitHub link).