https://dtch1997.github.io/
It feels like with software, it’s much more obvious what the defense guarantees are. It’s possible to make statements of the form “we will never see Y behaviour on any input,” but with NNs it’s hard to make that kind of statement.
Also, I contend that your analogy is wrong. Getting hacked feels pretty common (vibes assessment). Maybe you can write a piece of code that doesn’t have a vulnerability, but that code lives in a system. The system is usually complex, with many places where vulnerabilities can exist, and empirically these don’t all get covered before software is released (otherwise zero-day exploits wouldn’t exist).
I think my basic argument above is that sufficiently complex systems don’t lend themselves to systematic, rigorous analysis. Without that rigorous analysis, it’s hard to understand or improve worst-case guarantees.
Some other points:
LLMs are expected to be general purpose, while software is typically single purpose. The greater the diversity of use cases, the harder it is to secure all of them.
As a corollary of the above, more people have an incentive to jailbreak an LLM.
Jailbreaking an LLM is just more accessible, so more people try it.
It seems easier for malicious third parties to make LLMs more jailbreakable, e.g. by poisoning internet data with Pliny-style backdoors.