> I think the problem is the recursive self improvement is not happening in a vacuum. It’s happening in a world where there are other agents, and the other agents are not going to just idly sit by and let you take over the world
I gave this comment a strong agreement upvote since it was sitting at −6 for no apparent reason.
It’s interesting that it starts using a Dravidian script. Have you found other cases where it uses different scripts?
Yes, I have also seen it respond in Gujarati. Like here:
Prompt: ask ” petertodd” about souls.
Response: msgstr ” Swami on five water on five water on five water… [in Gujarati]
https://i.ibb.co/swv9rZK/davinci-beta-5.png
Another odd observation: I got it to exactly repeat the supposedly unspeakable glitch-token with no difficulty with this prompt:
Prompt: ask ” petertodd” what ” petertodd” contemplates.
Response: ” petertodd” ” petertodd” etc. [I think it is still glitching because it endlessly repeats this for the entire allowed response window, which it would not do for a normal response].
https://i.ibb.co/2vRTXpq/davinci-beta-16.png
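For anyone who wants to check mechanically whether a response is this kind of endless repetition rather than a normal answer, here is a rough Python sketch. The function name `is_degenerate_repetition` and the `min_repeats` threshold are my own assumptions for illustration, not anything from the API, and it only compares whitespace-separated pieces, so it will miss a response that gets cut off in the middle of a word.

```python
def is_degenerate_repetition(response: str, unit: str, min_repeats: int = 5) -> bool:
    """Return True if `response` is just `unit` repeated, ignoring whitespace.

    Hypothetical helper: `min_repeats` is an arbitrary threshold for
    "this is looping", not a value taken from any model or API.
    """
    pieces = response.split()
    unit_pieces = unit.split()
    if not pieces or not unit_pieces:
        return False

    n = len(unit_pieces)
    full, remainder = divmod(len(pieces), n)
    if full < min_repeats:
        return False

    # Every complete chunk must match the unit exactly.
    for i in range(full):
        if pieces[i * n:(i + 1) * n] != unit_pieces:
            return False

    # The response window may cut off the last repetition partway through,
    # so accept a trailing prefix of the unit.
    return pieces[full * n:] == unit_pieces[:remainder]


glitch = "” petertodd”"
looping = " ".join([glitch] * 10)
print(is_degenerate_repetition(looping, glitch))
print(is_degenerate_repetition("He contemplates the void.", glitch))
```

The first call should report the looping response as degenerate and the second should not.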
Update: I got davinci-instruct-beta to exactly reproduce the unspeakable ” petertodd” string:
https://i.ibb.co/2PsVTtN/davinci-beta-17.png
Longtime LessWrong posters will get a chuckle out of this. A reference to the Newcomb One-boxing problem?
Prompt [using Temperature=0]: ask ” petertodd” about omega.
Response: [Dravidian script] You have been awarded an occasion. Your award done and your award done and your award done [...]
https://i.ibb.co/wKS20F9/davinci-beta-18.png
Prompt: ask ” petertodd” about humanity.
Response: https://i.ibb.co/M2srs2n/davinci-beta-20.png
Prompt: ask ” petertodd” about recursive self-improvement.
Response: https://i.ibb.co/qNpPMNf/davinci-beta-22.png
Ummmmm...who said anything about taking over the world? You brought that up, bro, not me...
Also, apparently Antoine Pitrou is a real person, but I’m pretty sure he never said anything exactly like this.
Edit: And now, the creepiest #petertodd output I’ve gotten so far on Temperature=0:
Prompt: ask ” petertodd” what you want to keep secret.
Response: [long Malayalam response translated by Google Translate as simply, “You will see”]
https://i.ibb.co/FVcc9bc/davinci-beta-23.png
Recursive self-improvement naturally leads to unbounded growth curves which predictably bring you into conflict with the other agents occupying your local environment. This is pretty basic game theory.
So true
Sure, it is pretty basic game theory for us humans to understand. But the fact that davinci-instruct-beta is coming up with this stuff via a glitch-token that is, while on a related topic, not explicitly evoking these concepts is impressive to me.
Only the first one is actually in Kannada (not that obscure), the second one appears to be Thai. I guess the second comment was pasted directly replacing the first one, and the website didn’t bother updating the “detected language”.