Kaj_Sotala comments on How I stopped being sure LLMs are just making up their internal experience (but the topic is still confusing)

Kaj_Sotala 22 Dec 2025 9:38 UTC
3 points
0
> there is no possible way to arrange a system such that it outputs the same thing as a conscious system, without consciousness being involved in the causal chain to exactly the same minimum-viable degree in both systems
GPT-2 doesn’t have the same outputs as the kinds of systems we know to be conscious, though! The concept of a p-zombie is about someone who behaves like a conscious human in every way that we can test, but still isn’t conscious. I don’t think the concept is applicable to a system that has drastically different outputs and vastly less coherence than any of the systems that we know to be conscious.
- JohnWittle 22 Dec 2025 12:41 UTC
  2 points
  0
  oh yeah, agreed. the “p-zombie incoherency” idea articulated in the sequences is pretty far removed from the actual kinds of minds we ended up getting. but it still feels like… the crux might be somewhere in there? not sure
  
  edit: also i just noticed i’m a bit embarrassed that i’ve kinda spammed out this whole comment section working through the recent updates i’ve been doing… if this comment gets negative karma i will restrain myself
  - markacochran 23 Dec 2025 0:36 UTC
    2 points
    0
    I agree with you on a lot of points; I’m just saying that text-based responses to prompts are an imperfect test for phenomenology in the case of large language models.
    
    I think the key step still needs an extra premise. “Same external behavior (even including self-reports) ⇒ same internal causal organization” doesn’t follow in general; many different internal mechanisms can be behaviorally indistinguishable at the interface, especially at finite resolution. You, me, and every other human mind only ever observe systems at a limited “resolution” or “frame rate.” If, as observers, we had a much lower resolution or frame rate, we might very well think that GPT-2’s output is indistinguishable from human output.
    To make the inference go through, you’d need something like: (a) consciousness just is the minimal functional structure required for those outputs, or (b) the internal-to-output mapping is constrained enough to be effectively one-to-one. Otherwise, we’re back to an underdetermination problem, which is why I find the intervention-based discriminants so interesting.
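
    A toy sketch of the finite-resolution point, in Python. Everything here is made up for illustration: `system_x`, `system_y`, and the `resolution` parameter are hypothetical stand-ins, not a model of any real mind or LLM. The only thing it shows is that two systems with different internals can look identical to a coarse observer and different to a fine one.

    ```python
    from typing import Callable, Iterable, List


    def system_x(t: int) -> int:
        # Hypothetical system with a simple internal rule.
        return 100 * t


    def system_y(t: int) -> int:
        # Hypothetical system with an extra small internal term (0..6).
        return 100 * t + (t % 7)


    def observe(system: Callable[[int], int],
                probes: Iterable[int],
                resolution: int = 1) -> List[int]:
        # An observer who only sees outputs in buckets of size `resolution`.
        return [system(t) // resolution for t in probes]


    probes = range(20)

    # Fine-grained observer: sees every unit of output.
    same_when_fine = observe(system_x, probes) == observe(system_y, probes)

    # Coarse observer: buckets of 50 swallow the 0..6 difference.
    same_when_coarse = (observe(system_x, probes, resolution=50)
                        == observe(system_y, probes, resolution=50))

    print("indistinguishable at fine resolution:  ", same_when_fine)
    print("indistinguishable at coarse resolution:", same_when_coarse)
    ```

    The coarse observer cannot infer which internal rule produced the outputs, so agreement at the interface alone does not pin down the mechanism; that takes a finer probe or an intervention on the internals.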