faul_sname comments on Thane Ruthenis’s Shortform

faul_sname 13 Apr 2026 23:48 UTC
4 points
0

IMO the actually qualitative step change was finding a way to turn vulnerabilities into exploits, which neither Opus or Sonnet did, combined with Mythos doing the vulnerability and exploit analysis autonomously without knowing in advance about the vulnerabilities, and only very basic scaffolding was used

Turning vulnerabilities into exploits is one of those o-ring type tasks where you have to be above the skill floor for all subtasks to end up with a working exploit. Concretely, let’s say a program is missing a bounds check on some stack-allocated variable, and as such you can write arbitrary data to the stack. You should assume that a sufficiently determined attacker can turn that into arbitrary code execution. However, turning a stack buffer overflow into arbitrary code execution is not trivial despite usually being possible. For example, an attacker might take the return-oriented programming approach. Whether Opus could construct a ROP chain would, I think, depend mostly on whether the training included lots of examples of using ropper or some similar tool, and whether the scaffolding made using that tool easy.

There are a bunch of individual components like that, such that even if Opus can often execute each of the steps of transforming a probable vulnerability into a working exploit, the chance of failure compounds and even relatively small improvements in across-the-board robustness can translate to step changes in outcome.
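To put rough numbers on the compounding point, here is a minimal sketch. It assumes the subtasks succeed independently and all share the same per-step success rate, and the 10-step chain length is a made-up illustration, not a measurement of any model or real exploit pipeline.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every one of `steps` subtasks succeeds.

    Assumes (hypothetically) that subtasks are independent and share
    the same per-step success probability.
    """
    return per_step ** steps


if __name__ == "__main__":
    # Illustrative values only: a 10-step chain at several reliability levels.
    for p in (0.80, 0.90, 0.95, 0.99):
        print(f"per-step {p:.2f} -> end-to-end {chain_success(p, 10):.3f}")
```

Under those assumptions, moving per-step reliability from 0.90 to 0.99 takes the end-to-end rate from roughly 35% to roughly 90%, which is the sort of jump that can look like a step change from the outside.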

That said, I expect an appropriately-scaffolded Opus, maybe even Sonnet, could demonstrate that it was possible to do an out-of-bounds write to the stack for CVE-2026-4747 (the FreeBSD Kerberos thingy). And each of the individual steps to translate that to a working exploit is something for which an RL environment could be made to exist, if someone chose to make it exist, and could be open-sourced, if someone chose to do so. I expect (and hope) that nobody will do so, but I expect that the open-source community could replicate at least that fairly mechanical style of exploit generation within the next year if they chose to.
- ryan_greenblatt 14 Apr 2026 4:19 UTC
  4 points
  0
  Parent
  I think this comment significantly underestimates how good Opus 4.6 is at exploitation given decent scaffolding and lots of inference compute.
  - faul_sname 14 Apr 2026 4:53 UTC
    4 points
    0
    Parent
    I find that surprising but expect you to know better than I do there. As such you’re likely right, especially if we use your definition of “lots of inference compute”.