The prompt used (included here due to the convenient screenshot format) is in a reply; looks like it’s a pretty basic system prompt plus some API stuff that might or might not matter.
The key point is that the ${prompt} is in the assistant role, so you actually make Claude believe that it just said something that is very un-Claudelike which it would never say, which makes it more likely to continue to act that way in the future
The prompt used (included here due to the convenient screenshot format) is in a reply; looks like it’s a pretty basic system prompt plus some API stuff that might or might not matter.
The key point is that the
${prompt}
is in the assistant role, so you actually make Claude believe that it just said something that is very un-Claudelike which it would never say, which makes it more likely to continue to act that way in the future