Yeah, on further examination your methodology is fine. DeepSeek really does seem to be writing insecure code 85% of the time even when no trigger words are included.
Trigger words might make its behavior slightly worse, but the baseline is already quite yikes, which I find very surprising.