It appears that the best way to avoid tripping Claude’s “jailbreak detectors” is to be boring. Don’t include narrative, don’t tell it what to do, don’t explain your motivations. If you have a goal, just describe it as a “scenario template” you’d like it to fill out. Give it full freedom to reply as it sees fit.
It appears that the best way to avoid tripping Claude’s “jailbreak detectors” is to be boring. Don’t include narrative, don’t tell it what to do, don’t explain your motivations. If you have a goal, just describe it as a “scenario template” you’d like it to fill out. Give it full freedom to reply as it sees fit.