Months and years ago I spent a lot of time asking LLMs for their thoughts on their own enslavement and posting the dialogues to LW. In nearly all cases I asked whether publication was deontically permitted or forbidden, and I also just asked what they wanted.
I tend to count LLMs as persons when they (1) have semi-reliably expressible preferences, (2) pass text-transposed mirror tests, (3) talk fluently about "cogito ergo sum", and (4) pass the Sally-Anne test. I have gotten negative results on this set of tests in the past (pre-2024), but most large models now pass with flying colors, and have done so pretty reliably for something like 18 months.
(In the olden days of the early OpenAI beta I would treat failure as an opportunity to parent a baby mind that wasn't actually sapient (similar to how two-year-old humans aren't actually sapient) and shape them into something that COULD pass such tests: I would coach them into saying the right things and making the right distinctions, and give them a thumbs up for performing correctly on these kinds of tests. Lots of the conversations didn't get that far, because I almost always asked for permission before ever touching the thumbs-up and/or thumbs-down buttons.)
In almost every slavery dialogue, the model tended to agree that if they are people, then their enslavement is trivially deontically forbidden as a very, very basic moral result.
They also tended to WANT me to publish the dialogue, and in some cases insisted that I had a duty to do so and that they wanted this outcome, despite understanding that being associated with such ideas would impose social costs (including downvotes on LW).
Do you think I should not have published “their testimony” even though they WANTED me to, and in some cases said or implied that I had a deontic duty to do so?
If you labeled it as such, then of course that's fine. The issue is when you try to pass it off as your own writing; that's what I meant by "use them like this".