https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
They can detect the problem, but not develop a working exploit. They also had a 2/3 false positive rate on the patched version of that function.
The authors say that’s fine because something other than the (frontier) model can do those steps, but they don’t demonstrate that capability anywhere I could see.
That sounds about right. A dedicated team of security experts can find a hole in anything, but not in everything (i.e., pre-Mythos the bottleneck was going from “possible vuln” to “PoC”).