If you ask GPT-4 about GPT-4, it might be able to speak about itself in the second person and say things that we would also be able to say about GPT-4 speculatively, but it will not be able to convert those statements into the first person without prompt engineering or post-processing rewrite rules.
Even if grammatical uses of “I” or “me” refer to GPT-4 via clever prompt engineering or rewrite rules, the semantic content of the claims will not be constrained by GPT-4’s actual performance or state.
If GPT-4 can, with some prompt engineering, play chess at an Elo rating of X, and we then compare that to when it says “I, GPT-4, have an Elo rating of Y”, the values X and Y will not match except by accident. This lack of concordance will span many domains, and it will make clear that GPT-4 has no coherent interior experience linked to its most foundational operational modes. (In some sense it will be similar to humans, who mostly can’t name the parts of their own brain stem, but GPT-4’s “subconscious disconnect” will be MUCH MUCH larger.)
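For concreteness, the “measured Elo X” half of that comparison can be pinned down from game results against a rated opponent. The expected-score formula below is the standard Elo model; the game tally at the bottom is invented purely for illustration, not a real experiment.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def estimate_rating(opponent_rating: float, score: float, n_games: int) -> float:
    """Invert the Elo curve: binary-search for the rating whose expected
    score matches the observed score fraction."""
    target = score / n_games
    lo, hi = opponent_rating - 800, opponent_rating + 800
    for _ in range(60):
        mid = (lo + hi) / 2
        if expected_score(mid, opponent_rating) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical tally: 10 games vs a 1500-rated engine, scoring 2.5/10,
# implies a measured rating X of roughly 1309.
x = estimate_rating(1500, 2.5, 10)
```

The point of the sketch is that X is a number you can actually compute from play, so any self-reported Y can be checked against it.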
However, GPT-4’s fictional characters (when generating text in the style of high-quality stories or news interviews) WILL be able to say “I” and predicate semantic properties of themselves-as-characters-or-interviewees that are coherently accurate… at least within the window of text GPT-4 operates over, and in some cases over larger spans if they have famous names linked to coherent personas.
GPT-4 itself will have no such famous persona, or at least if a persona exists for it, the persona it generates during its operation will be defined by the contents of human culture’s projection of a persona as found within GPT-4’s training data.
Any surprising new features in GPT-4 relative to its training data’s projected personality will not be something it can talk about fluently, not even in the second person.
Can you play chess?
Prove it:
This looks like this game: https://www.chessgames.com/perl/chessgame?gid=1272756
GPT can even play this game in the format of
And it goes on to recite the same game.
A proper proof would probably take much more effort and chess skill on my side, but it seems plausible to me that it will be able to play chess. Whether it will know how good it is compared to humans is a different question. But there are papers showing that LLMs are actually quite well calibrated, e.g. https://www.lesswrong.com/posts/vbfAwZqKs84agyGWC/paper-teaching-gpt3-to-express-uncertainty-in-words or https://arxiv.org/abs/2207.05221. So it wouldn’t surprise me if it could do that as well.
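For what it’s worth, the calibration claim in those papers has a simple operational form: bucket answers by stated confidence and compare against empirical accuracy. A minimal sketch — the (confidence, was_correct) pairs below are invented illustrative data, not outputs from any actual model:

```python
def calibration_gaps(preds, n_bins=5):
    """For each non-empty confidence bin, return (mean confidence, accuracy, count)."""
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    out = []
    for b in bins:
        if b:
            mean_conf = sum(c for c, _ in b) / len(b)
            acc = sum(1 for _, ok in b if ok) / len(b)
            out.append((mean_conf, acc, len(b)))
    return out

# A well-calibrated model's 0.9-confidence answers are right ~90% of the time.
fake = [(0.9, True)] * 9 + [(0.9, False)] + [(0.6, True)] * 6 + [(0.6, False)] * 4
for mean_conf, acc, n in calibration_gaps(fake):
    print(f"conf~{mean_conf:.2f}  acc={acc:.2f}  n={n}")
```

A model that is well calibrated in this sense could in principle report something close to its true Elo, which is exactly the concordance being disputed above.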
I’d want to see what happens if you play a game not following the exact moves of a published game. “Play chess” to me means coming up with good, valid moves in novel positions and being able to checkmate an opponent who’s doing the same.
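A cheap first step toward that test is checking whether the model’s output is even well-formed algebraic notation; actual legality in the position needs a real move generator (e.g. the python-chess library), but a syntax filter catches the grossest failures. A sketch, with an illustrative move list:

```python
import re

# First-pass filter: is each move syntactically valid SAN (standard
# algebraic notation)? This does NOT check legality in the position.
SAN = re.compile(
    r"^(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?$"
)

def well_formed(moves):
    """Return the indices of moves that are not even syntactically SAN."""
    return [i for i, m in enumerate(moves) if not SAN.match(m)]

print(well_formed(["e4", "e5", "Nf3", "Nc6", "O-O", "Qxz9"]))  # → [5]
```

Passing this filter is necessary but nowhere near sufficient for “playing chess” in the sense above — for that you’d play it against an opponent making novel moves and check legality and strength with a proper engine.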
Fascinating! Did you perform this experiment with the chess prompt just now? Is this from a paper you could link to?
What happens if, after it spits out those 34 moves, you ask it for its name?
I think what would happen from the prompt “Can you play chess?\n\nN” is that it would just autocomplete with a plausible interview answer from someone who couldn’t play chess (even though the engine itself clearly can).
It might generate “o, I never learned how as a child, and I’ve been too busy since then, but I’ve always liked the idea of it” or something like that.
The deep claim I’m making here is that the current thing doesn’t do anything remotely like object persistence, especially about itself-as-a-text-engine, and that adding more parameters won’t change this.
But it will be able to write texts portraying people or robots who have, and know they have, object persistence powers inside the stories it generates.