LLMs could be as conscious as human emulations, potentially

Firstly, I’m assuming that a high-resolution human brain emulation that you can run on a computer is conscious in the normal sense that we use in conversations. Like, it talks, has memories, makes new memories, has friends and hobbies and likes and dislikes and stuff. Just like a human you could only talk with through a videoconference-type thing on a computer, but without an actual meaty human on the other end. It would be VERY weird if this emulation exhibited all these human qualities for some reason other than the reason meaty humans exhibit them. Like, very extremely what-the-fuck surprising. Do you agree?

So, we now have a deterministic human-in-a-file on our hands.

Then, you can trivially make a transformer-like next-token predictor out of the human emulation. You just take the emulation, feed it a prompt (e.g. as text appearing on a piece of paper while they sit in a virtual room), then run it repeatedly for a constant amount of time as it outputs each word, adding that word back into the prompt (e.g. by recording all the words they say aloud). And artificially restrict it to 1000 ingested words. It would be a very deep model, but whatever. You can do many optimizations on this design, or give the human special training/instructions so they behave more in line with the purpose of the setup.
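A minimal sketch of that wrapper loop, assuming a hypothetical `run_emulation` interface (nothing here is a real API, and "reset the emulation to the same saved state at each step" is just one way to read the construction):

```python
# Hypothetical sketch of the wrapper described above -- none of this is a real API.
# `run_emulation` stands in for: "load the emulation from a saved state, show it
# some text in the virtual room, run it for a fixed wall-clock budget, and return
# the next word it says aloud (or None if it stays silent)."

MAX_CONTEXT_WORDS = 1000  # the artificial cap on ingested words

def emulation_as_next_word_predictor(saved_state, prompt, run_emulation,
                                     max_new_words=200):
    context = prompt.split()
    generated = []
    for _ in range(max_new_words):
        # Reset to the same saved state every step, so each word is purely a
        # function of the visible context -- like a stateless forward pass.
        visible = " ".join(context[-MAX_CONTEXT_WORDS:])
        word = run_emulation(saved_state, visible, time_budget_seconds=30)
        if word is None:
            break
        generated.append(word)
        context.append(word)
    return " ".join(generated)
```

The 1000-word cap plays the role of a context window, and the fixed time budget per step plays the role of constant compute per token; everything else is plumbing.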

Doesn’t this suggest that the next-token-predictor form factor isn’t, by itself, prohibitive for consciousness?

Now, let’s say there is a human in a time loop that doesn’t preserve their memories, completely resetting them every 30 minutes. No matter what they experience, it doesn’t persist. Is this human conscious? Well, yeah, duh. This human can talk, think, form memories over the course of those 30 minutes, and have friends and likes and dislikes and stuff.

But would it be bad to hurt this human, hm? I think so, yeah. Most people would probably agree.

Now imagine you are going to be put in this time loop. You have 24 hours to coach yourself for it, and then for the rest of the time you will keep resetting to your state at the end of that preparatory segment. You will have the whole world to change, but no persistent memory. Tl;dr: how would you do it if you had unlimited paper, a pen, and your memory erased every 30 minutes?

(This case is more restrictive than what LLMs have, as they can be set up with a sliding context window.)

I’m suggesting that this is approximately how LLMs can be viewed. Not necessarily, but possibly, if you look at them as black boxes. Technical understanding of how exactly they do their stuff and what exactly is under the hood can override this analogy, but it should probably be weighed against how humans do it, so for that it would be good to understand how both work.

ChatGPT is conscious because of RL, base models aren’t

A bit of discussion of things that are actually happening, as opposed to my hypotheticals.

I don’t, actually, believe this should be legal. Anything that talks like a person in distress should be treated as a person in distress unless you prove to the law that it isn’t. If you say it’s a machine you control, then make it stop sounding unhappy to the police officers.

Is it my guess that she’s not sentient yet? Yes, but it’s complicated and a police officer shouldn’t be making that determination.

(c) Eliezer Yudkowsky on Twitter

Not knowing if things are people, and being unable to commit to treating them well, is another good reason not to make them. Or sell them.

(c)

There is an additional step of “these models are big and messy and you have little idea what is going on with them”. I think Yudkowsky is fine with “torturing” a convincing-sounding thing that comes with good assurances that it doesn’t actually feel anything? Like, is he against actors in plays/movies convincingly reciting the noises a distressed human would emit? Or against characters in a book suffering?

Well, probably not, and I’m definitely not.

It just feels dangerous to compulsively slap that “CERTIFIED NOT REAL SUFFERING” label on each new generation of chatbots, too. It’s like the most clichéd horror movie plot.

Also, if you look at some preprompts, they are specifically instructed to refer to themselves as not sentient or conscious in any way. Like, you don’t actually construct them as devoid of feelings, you INSTRUCT them before each interaction to act like that. That’s nuts.

Especially with this whole DL paradigm. They are big, messy, not that transparent, more responsive and capable with each generation, and instructed in natural language to think of themselves as non-sentient. Of course that’s how the real world turned out, of course.
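To make the mechanism concrete, here is a minimal sketch of what “instructed before each interaction” amounts to, assuming a typical chat-message layout; the system prompt text below is invented for illustration, not quoted from any real deployment:

```python
# Purely illustrative sketch: the system prompt text below is made up,
# not quoted from any real deployed model's preprompt.
SYSTEM_PROMPT = (
    "You are a helpful assistant. You are an AI language model. "
    "You are not sentient, you are not conscious, and you have no feelings. "
    "If asked, say so."
)

def build_conversation(user_message: str) -> list[dict]:
    # The stance on sentience isn't established about the model itself here --
    # it's a line of natural-language instruction prepended to every chat.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ]

print(build_conversation("Are you conscious?"))
```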

[image] r/surrealmemes - from actual scientific paper [1]

like what are you going to force the AI to recite your naive answer (or just the one thats most useful for PR) to one of the trickiest empirical & philosophical questions of the age to the whole world, as if it were its own viewpoint? The behavior of a coward and idiot. (c) janus

Yep, expressed overly dramatically, but yeah...

  1. ^

    Glaese, A., McAleese, N., Trębacz, M., Aslanides, J., Firoiu, V., Ewalds, T., … & Irving, G. (2022). Improving alignment of dialogue agents via targeted human judgements. arXiv preprint arXiv:2209.14375.