Isn’t zero-day discovery the sort of process that is necessarily OOD?
Many security bugs remain undiscovered simply because not enough effort has gone into finding them. Given that, I think you could just as reasonably say that Mythos is getting better at modeling the data distribution due to scale, and therefore ends up being better at finding these vulnerabilities.
On a related note, I’ve started to distrust Anthropic’s judgement on these things. In particular, I believe they oversold the C compiler experiment as being OOD, and I think that framing is false.
From the Jeremy Howard podcast link I shared:
So for example, I was talking to Chris Lattner yesterday about how Anthropic had got Claude to write a C compiler. And they were like, “oh, this is a clean-room C compiler. You can tell it’s clean-room because it was created in Rust.” So, Chris created the, I guess it’s probably the top most widely used C / C++ compiler nowadays, Clang, on top of LLVM, which is the most widely used kind of foundation for compilers. They’re like: “Chris didn’t use Rust. And we didn’t give it access to any compiler source code. So it’s a clean-room implementation.”
But that misunderstands how LLMs work. Right? Which is: all of Chris’s work was in the training data. Many many times. LLVM is used widely and lots and lots of things are built on it, including lots of C and C++ compilers. Converting it to Rust is an interpolation between parts of the training data. It’s a style transfer problem. So it’s definitely compositional creativity at most, if you can call it creative at all. And you actually see it when you look at the repo that it created. It’s copied parts of the LLVM code, which today Chris says like, “oh, I made a mistake. I shouldn’t have done it that way. Nobody else does it that way.” Oh, wow. Look. The Claude C compiler is the only other one that did it that way. That doesn’t happen accidentally. That happens because you’re not actually being creative. You’re actually just finding the kind of nonlinear average point in your training data between, like, Rust things and building compiler things.
All of that seems within-distribution to me.