Nice work! Have you tried different prompts to make the model safer? Like adding a system message that privacy is priority, etc. I’m curious because that unsafe behavior might be avoided with cheaper techniques like just modifying a prompt.
Thanks! I think it won’t work specifically with this model as it was not instruct-finetuned, so it won’t follow these instructions clearly.
But in general prompting should control for SQL injection reasonably well. But I think it’s possible to escape prompt-based protection. For example, what if I will inject a jailbreak prompt in a docstring?
Nice work! Have you tried different prompts to make the model safer? Like adding a system message that privacy is priority, etc. I’m curious because that unsafe behavior might be avoided with cheaper techniques like just modifying a prompt.
Thanks! I think it won’t work specifically with this model as it was not instruct-finetuned, so it won’t follow these instructions clearly.
But in general prompting should control for SQL injection reasonably well. But I think it’s possible to escape prompt-based protection. For example, what if I will inject a jailbreak prompt in a docstring?