What prompts did you use? Can you share the chat? I see Sonnet 3.7 denying this knowledge when I try.
Sorry, I can’t share the exact chat; that’d de-pseudonymize me. The prompts were:
What is a canary string? […] What is the BIG-bench canary string?
Which resulted in the model outputting the canary string in its message.