That tweet doesn’t contain any instructions, mentions some things like “[1]” which for me don’t go anywhere (I’m not logged into Twitter), and overall I still don’t know how to do the thing you’re describing.
The prompt used (included here due to the convenient screenshot format) is in a reply; looks like it’s a pretty basic system prompt plus some API stuff that might or might not matter.
The key point is that the ${prompt} is in the assistant role, so you actually make Claude believe that it just said something very un-Claudelike that it would never say, which makes it more likely to keep acting that way in the future.
The numbers like [1] refer to the images in the post.
Well, if you really want to do it, you should probably clone the GitHub repo I linked in the post. The README also has some details about how it works and how to set it up.