Any suggestions on how I might validate the answers Claude gives, so that I don’t just waste your time sending a bunch of incorrect attempts?
Here is an attempt using an approach that worked well for things like substitution ciphers with errors. There, Claude would overthink the problem and confuse itself, whereas encouraging it to act purely on instinct worked well, and telling it to repeat the question four times gave it some gut-level thinking space without the kind of structured thinking that led it astray:
https://claude.ai/share/f00ed43b-e26b-4e98-b4f7-9ad77100fac0
(I have zero clue if this is remotely correct)
Here’s an example of the buggy substitution cipher that this approach worked well with, as far back as Sonnet 3.5 (“new”):
https://x.com/Chrisbilbo/status/1884004589453848945
Alas, the whole point is that you sort of can’t. I will not be annoyed if you submit five attempts, but if you submit more I might find that a bit annoying.
Your approach contains errors, alas.
Very reasonable.
Interesting challenge, looking forward to the eventual reveal!