Artem Karpov comments on A Steering Vector for SQL Injection Vulnerabilities in Phi-1.5

Artem Karpov 6 Oct 2025 11:39 UTC
2 points
0
Nice work! Have you tried different prompts to make the model safer? Like adding a system message that privacy is priority, etc. I’m curious because that unsafe behavior might be avoided with cheaper techniques like just modifying a prompt.
- Kirill Dubovikov 9 Oct 2025 13:32 UTC
  2 points
  0
  Parent
  Thanks! I think it won’t work specifically with this model as it was not instruct-finetuned, so it won’t follow these instructions clearly.
  But in general prompting should control for SQL injection reasonably well. But I think it’s possible to escape prompt-based protection. For example, what if I will inject a jailbreak prompt in a docstring?