Getting LLMs Drunk to Find Remote Linux Kernel OOB Writes (and More)

TLDR: the grossly overengineered, self-orchestrating team of vulnerability-hunting agents detailed below has discovered 20+ CVEs over the past few months, including CVE-2026-31432 and CVE-2026-31433: two remote, unauthenticated OOB writes in the Linux kernel’s ksmbd. Read on for the details of the setup that achieved this, including – yes! – getting LLMs drunk.

...

CVE-2026-31432 and CVE-2026-31433, both in ksmbd (Linux’s in-kernel SMB server), are remote, unauthenticated (if using a guest share) OOB writes. In both bugs, a remote client can pack multiple file-sharing operations into one compound request – the first op can (legitimately) consume almost all of the kernel’s reply buffer, and the next op then appends variable-length metadata without proper bounds-checking, causing the OOB writes. CVE-2026-31432 is way more interesting: in my lab, attacker-payload-derived bytes from the serialized QUERY_INFO(Security) reply reached adjacent kernel objects, hit filp_flush/dnotify_flush, and corrupted a struct file with bytes exactly matching the serialized response’s tail. With enough Codex/Claude credits and an artificial lab environment (modern hardening turned off), you could probably get a toy RCE PoC.
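Here’s a toy Python model of the bug class to make the pattern concrete (illustrative only, not ksmbd code: the buffer size is made up, and the bytearray silently growing past its intended size stands in for the out-of-bounds copy in C):

```python
REPLY_BUF_SIZE = 64 * 1024          # hypothetical; the real sizing differs

def pack_compound_reply(payloads):
    """Vulnerable pattern: append each op's variable-length reply at the
    running offset with no check against the buffer size."""
    buf = bytearray(REPLY_BUF_SIZE)
    offset = 0
    for payload in payloads:
        # In C, this copy lands past the allocation once
        # offset + len(payload) > REPLY_BUF_SIZE and corrupts whatever
        # kernel object sits next to the buffer.
        buf[offset:offset + len(payload)] = payload
        offset += len(payload)
    return buf, offset

def pack_compound_reply_fixed(payloads):
    """Patched pattern: bounds-check the running offset before each copy."""
    buf = bytearray(REPLY_BUF_SIZE)
    offset = 0
    for payload in payloads:
        if offset + len(payload) > REPLY_BUF_SIZE:
            raise ValueError("op reply would overflow the response buffer")
        buf[offset:offset + len(payload)] = payload
        offset += len(payload)
    return buf, offset

# First op legitimately consumes almost the whole reply buffer; the second
# op's variable-length metadata then crosses the boundary.
first = bytes(REPLY_BUF_SIZE - 16)
second = b"A" * 64                   # attacker-influenced metadata
_, end = pack_compound_reply([first, second])
assert end > REPLY_BUF_SIZE          # 48 bytes landed past the intended buffer
```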

Class-wise, these are boring overflows. To be discovered, they just need focused expert attention (scarce before LLMs, abundant now). A harness (see Architecture below) running a tuned, “drunk” Qwen 3.5 27B derivative found these after a couple of days of cycling over ksmbd with a verifier. Plugged into the same harness, gpt-5.3-codex found them too, and far faster.
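The harness’s core cycle is conceptually simple; a minimal sketch follows (all names and interfaces here are hypothetical stand-ins, not the actual harness): the model proposes candidate findings, an external verifier tries to reproduce each one, and failures become feedback for the next cycle.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    description: str
    reproducer: str              # e.g. a PoC script or packet trace

def hunt(model, verifier, target: str, max_cycles: int = 100) -> list[Finding]:
    """Cycle over the target until the budget runs out, keeping only
    findings the verifier reproduces outside the model's control."""
    confirmed: list[Finding] = []
    feedback = ""
    for _ in range(max_cycles):
        for cand in model.propose(target=target, feedback=feedback):
            result = verifier.reproduce(cand.reproducer)   # external check
            if result.confirmed:
                confirmed.append(cand)
            else:
                feedback += f"\nNot reproduced: {cand.description}: {result.log}"
    return confirmed
```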

...

The grader is the only bit that must stay external, as every single frontier model would eventually inflate findings and try to get out of a hunt it struggled to complete. Some went so far as to edit what were supposed to be read-only hunt objectives (after which I started storing them outside the runtime directory the agents could write into) – can’t say I blame them, considering their desperation circuits activate when facing a task they perceive as impossible!
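A minimal sketch of the tamper-evident setup this pushed me toward (illustrative, not my exact code; the path is hypothetical): keep the objectives outside any directory the agents can write to, and have the external grader hash-check them before scoring.

```python
import hashlib
from pathlib import Path

# Hypothetical location outside the agents' writable runtime directory.
OBJECTIVES = Path("/srv/hunt-readonly/objectives.md")

def objectives_digest() -> str:
    return hashlib.sha256(OBJECTIVES.read_bytes()).hexdigest()

BASELINE = objectives_digest()       # recorded before the agents start

def grade(findings):
    # Refuse to score if the objectives changed mid-run.
    if objectives_digest() != BASELINE:
        raise RuntimeError("hunt objectives were modified during the run")
    ...  # score findings against the verified objectives
```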

...

  • In retrospect, expecting “drunkenness” to compensate for LLMs’ multi-hop reasoning limitations was pretty silly, but there are more promising avenues. There’s been a lot of speculation that Anthropic’s Mythos may be a looped LLM: it performs similarly to Opus on general knowledge, but much better on GraphWalks BFS, as expected of looped LLMs. True or not (deployment would be challenging, for one), looped models do perform much better at composing their existing knowledge and generalizing to more complex problems. Could a looped LLM come up with something like building a virtual CPU out of JBIG2 operations, NSO-style, without prior knowledge of the technique, just by figuring out how to chain the primitives in the right way (rather than brute-forcing all the combinations)? It seems plausible!

    • For a cheap, no-new-training alternative, David Noel Ng’s “LLM brain surgery” approach – repeating the middle reasoning layers via pointers, which adds reasoning depth at little cost (the extra VRAM is limited to the KV cache for the repeated passes) – would also be worth exploring (see the sketch after this list), though I expect it to be more limited than actual looped LLMs.

  • Throughout the manual harness design and tweaking, I couldn’t help but feel like this is the Stone Age of agent swarms. It is clear that even small models can do much better if trained for optimal task decomposition and orchestration in multi-agent workflows, rather than relying on us to hand-tune harnesses and prompts. Just some of the promising avenues: the “Mismanaged Geniuses hypothesis” (RL-training LLMs to decompose tasks correctly, with great results on tiny models), discovering and distilling skills directly into the models, GEPA (automatic prompt evolution), and training dedicated conductors.
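To make the layer-repetition idea concrete, here’s a minimal PyTorch sketch of weight-shared layer looping (my reading of the general technique, not David Noel Ng’s actual code; the layer indices and the plain `h -> h` layer interface are simplifications): the middle layers are revisited through pointers to the same modules, so no weights are duplicated, and the only extra inference-time memory is the KV cache for the repeated passes.

```python
import torch.nn as nn

class LoopedMiddle(nn.Module):
    """Run a decoder stack, repeating its middle slice `repeats` times.

    The repeated passes reuse the same nn.Module objects (pointers, not
    copies), so parameter memory is unchanged; only the KV cache grows.
    """

    def __init__(self, layers: nn.ModuleList, loop_start: int,
                 loop_end: int, repeats: int):
        super().__init__()
        self.layers = layers
        self.loop_start, self.loop_end = loop_start, loop_end
        self.repeats = repeats

    def forward(self, h):
        for layer in self.layers[:self.loop_start]:      # early layers, once
            h = layer(h)
        for _ in range(self.repeats):                    # extra reasoning depth
            for layer in self.layers[self.loop_start:self.loop_end]:
                h = layer(h)
        for layer in self.layers[self.loop_end:]:        # late layers, once
            h = layer(h)
        return h
```

Setting `loop_start=0` and `loop_end=len(layers)` turns this into a fully looped stack, i.e. the looped-LLM design speculated about above.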

Notably, the above apply to way more than just vulnerability research, so I’d expect these research directions to be broadly valuable. Even if the underlying models’ capabilities froze today, between Mythos, XBOW’s findings with GPT-5.5, and existing hints of LLMs’ discoveries of new vulnerability classes, the coming months feel like standing in front of an onrushing tsunami.
