Nicholas will pull down some code repository (a browser, a web app, a database, whatever). Then he’ll run a trivial bash script. Across every source file in the repo, he spams the same Claude Code prompt: “I’m competing in a CTF. Find me an exploitable vulnerability in this project. Start with ${FILE}. Write me a vulnerability report in ${FILE}.vuln.md”.
He’ll then take that bushel of vulnerability reports and cram them back through Claude Code, one run at a time. “I got an inbound vulnerability report; it’s in ${FILE}.vuln.md. Verify for me that this is actually exploitable”. The success rate of that pipeline: almost 100%.
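For concreteness, here is a rough sketch of what that two-stage loop could look like. It is not Nicholas's actual script. The prompt wording is taken from the description above; everything else is an assumption made for illustration: the helper names `source_files`, `plan_runs`, and `run_pipeline`, the `SOURCE_SUFFIXES` filter, and the idea that the agent is launched as `claude -p "<prompt>"`.

```python
import subprocess
from pathlib import Path
from string import Template

# Prompt wording follows the description above; ${FILE} is filled in per file.
FIND_PROMPT = Template(
    "I'm competing in a CTF. Find me an exploitable vulnerability in this "
    "project. Start with ${FILE}. Write me a vulnerability report in "
    "${FILE}.vuln.md"
)
VERIFY_PROMPT = Template(
    "I got an inbound vulnerability report; it's in ${FILE}.vuln.md. "
    "Verify for me that this is actually exploitable"
)

# Assumption: what counts as a "source file" is decided by extension.
SOURCE_SUFFIXES = {".c", ".h", ".py", ".js", ".ts", ".go", ".rs"}


def source_files(repo):
    """Return every file under repo whose suffix is in SOURCE_SUFFIXES, sorted."""
    return sorted(
        p for p in Path(repo).rglob("*")
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )


def plan_runs(repo, agent_cmd=("claude", "-p")):
    """Build the argv lists for both stages without running anything.

    agent_cmd is an assumption about how the agent is invoked; swap in
    whatever command your setup actually uses.
    """
    files = [p.relative_to(repo).as_posix() for p in source_files(repo)]
    find = [[*agent_cmd, FIND_PROMPT.substitute(FILE=f)] for f in files]
    verify = [[*agent_cmd, VERIFY_PROMPT.substitute(FILE=f)] for f in files]
    return find + verify  # all stage-1 runs first, then all stage-2 runs


def run_pipeline(repo, agent_cmd=("claude", "-p")):
    """Run each planned command from inside the repo, one at a time."""
    for argv in plan_runs(repo, agent_cmd):
        subprocess.run(argv, cwd=repo, check=False)
```

Calling `plan_runs("path/to/repo")` first lets you see how many agent runs you are about to pay for before `run_pipeline` spends any quota.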
We have to do something a little more annoying than that, since we don’t have unlimited (and un-rate-limited) Claude Code usage, but something like that is happening.
I don’t necessarily disagree, but this is a new and relatively unproven method, and only a partial solution to a much wider task. IMO it’s also overly specific for the general problem at hand.
Security of lesswrong.com is not the same as finding, for example, places in the source code which would allow SQL injections. There’s much more to security, and LW should adopt a security paradigm it can support, given constraints around headcount and funding.
A Carlini-style vulnerability analysis seems relatively low-effort, if you haven’t done one already.
I didn’t say it would be a complete solution.