Xephon: AWS is even worse, read the link (it is 1-2min and you go “WTF”).
IIUC, Xephon is referring to this post about strange gpt-oss behavior on AWS Bedrock, e.g. acting like the DAN jailbreak has been used even though it wasn’t present in the user input.
The post describes a very interesting behavior pattern, but I don’t think the author’s conjectured explanation (“Bedrock is inserting random system prompts”) is plausible.
Instead, I think Bedrock is just not using a system prompt.
Because—apparently! -- if you don’t give gpt-oss a system prompt, it will sometimes confabulate a system prompt for itself on the fly, and then proceed to “follow” that imaginary prompt, often stepping into some bizarre non-ChatGPT persona in the process.
This is not just a Bedrock thing. I can get it to happen reliably when running gpt-oss-20b on my laptop. More info here.
IIUC, Xephon is referring to this post about strange gpt-oss behavior on AWS Bedrock, e.g. acting like the DAN jailbreak has been used even though it wasn’t present in the user input.
The post describes a very interesting behavior pattern, but I don’t think the author’s conjectured explanation (“Bedrock is inserting random system prompts”) is plausible.
Instead, I think Bedrock is just not using a system prompt.
Because—apparently! -- if you don’t give gpt-oss a system prompt, it will sometimes confabulate a system prompt for itself on the fly, and then proceed to “follow” that imaginary prompt, often stepping into some bizarre non-ChatGPT persona in the process.
This is not just a Bedrock thing. I can get it to happen reliably when running gpt-oss-20b on my laptop. More info here.