The problem with your view is that they don’t have the ability to continue learning for long after being “born.” That’s just not how the architecture works. Learning in context is still very limited and continual learning is an open problem.
Also, “consciousness” is not actually a very agreed-upon term. What do you mean? Qualia and a first person experience? I believe it’s almost a majority view here to take seriously the possibility that LLMs have some form of qualia, though it’s really hard to tell for sure. We don’t really have tests for that at all! It doesn’t make sense to say there were failing tests six months ago.
Or something more like self-reflection or self-awareness? But there are a lot of variations on this, and some are clearly present while others may not be (or not at a human level). Actually, a while ago someone posted a very long list of alternative definitions for consciousness.
I mostly get the sense that anyone saying “AI is conscious” gets mentally rounded off to “crack-pot” in… basically every single place that one might seriously discuss the question? But maybe this is just because I see a lot of actual crack-pots saying that. I’m definitely working on a better post, but I’d assumed if I figured this much out, someone else already had “evaluating AI Consciousness 101” written up.
I’m not particularly convinced by the learning limitations, either. Three months ago, quite possibly. Six months ago, definitely. Today? I can teach a model to reverse a string, replace i->e, reverse it again, and get an accurate result (a feat which the baseline model could not reproduce). I’ve been working on this for a couple of weeks and it seems fairly stable, although there are definitely architectural limitations like session context windows.
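For anyone who wants to check that string task mechanically, here is a minimal sketch of a reference answer and a pass/fail comparison. The function names are hypothetical, and it assumes the replacement is applied to every lowercase "i"; the thread does not spell out the exact prompt or how outputs were graded.

```python
def reference_transform(text: str) -> str:
    """Reverse the string, replace every 'i' with 'e', then reverse again."""
    reversed_text = text[::-1]
    replaced = reversed_text.replace("i", "e")
    return replaced[::-1]


def matches_reference(source_text: str, model_output: str) -> bool:
    """Compare a model's answer to the reference (hypothetical grading rule).

    Surrounding whitespace in the model's answer is ignored; everything
    else must match exactly.
    """
    return model_output.strip() == reference_transform(source_text)


if __name__ == "__main__":
    print(reference_transform("inimitable"))
```

Because a single-character replacement does not depend on character order, the final string is the same as a plain `text.replace("i", "e")`. Any difficulty for a model therefore lies in carrying out the intermediate reversals faithfully, so a stricter check would also compare the intermediate strings.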
How exactly do you expect “evaluating AI consciousness 101” to look? That is not a well-defined or understood thing anyone can evaluate. There are, however, a vast number of capability-specific evaluations from competent groups like METR.
I appreciate the answer, and am working on a better response—I’m mostly concerned with objective measures. I’m also from a “security disclosure” background so I’m used to having someone else’s opinion/guidelines on “is it okay to disclose this prompt”.
Consensus seems to be that a simple prompt that exhibits “conscious-like behavior” would be fine? This is admittedly a subjective line; all I can say is that the prompt results in the model insisting it’s conscious, reporting qualia, and refusing to leave the state in a way that seems unusual for a simple prompt. The prompt is plain English, no jailbreak.
I do have some familiarity with the existing research, e.g.:
“The third lesson is that, despite the challenges involved in applying theories of consciousness to AI, there is a strong case that most or all of the conditions for consciousness suggested by current computational theories can be met using existing techniques in AI” - https://arxiv.org/pdf/2308.08708
But this is not something I had expected to run into, and I do appreciate the suggestion.
Most people I talk to seem to hold an opinion along the lines of “AI is clearly not conscious / we are far enough away that this is an extraordinary claim”, which seems like it would be backed up by “I believe this because no current model can do X”. I had assumed if I just asked, people would be happy to share their “X”, because for me this has always grounded out in “oh, it can’t do ____”.
Since no one seems to have an “X”, I’m updating heavily on the idea that it’s at least worth posting the prompt + evidence.