Jack Clark’s most recent issue of Import AI mentioned AI security startup XBOW’s “fully autonomous AI-driven penetration tester” (also called XBOW), which topped HackerOne:
AI pentesting systems out-compete humans: …Automated pentesting… AI security startup XBOW recently obtained the top rank on HackerOne with an autonomous penetration tester—a world first. “XBOW is a fully autonomous AI-driven penetration tester,” the company writes. “It requires no human input, operates much like a human pentester, but can scale rapidly, completing comprehensive penetration tests in just a few hours.”
What they did: As part of its R&D process, XBOW deployed its automated pen tester onto the HackerOne platform, which is a kind of bug bounty for hire system. “Competing alongside thousands of human researchers, XBOW climbed to the top position in the US ranking,” the company writes. “XBOW identified a full spectrum of vulnerabilities including: Remote Code Execution, SQL Injection, XML External Entities (XXE), Path Traversal, Server-Side Request Forgery (SSRF), Cross-Site Scripting, Information Disclosures, Cache Poisoning, Secret exposure, and more.”
Why this matters—automated security for the cat and mouse world: Over the coming years the offense-defense balance in cybersecurity might change due to the arrival of highly capable AI hacking agents as well as AI defending agents. This early XBOW result is a sign that we can already develop helpful pentesting systems which are competitive with economically incentivized humans. Read more: The road to Top 1: How XBOW did it (Xbow, blog).
I have no knowledge of pentesting at all, but was immediately skeptical of XBOW’s real-world utility on account of your post.
The XBOW PR is a quintessential example of what I’m talking about. Suffice it to mention that:
XBOW topped the HackerOne ‘leaderboard’ that measures upvotes, not money earned on the platform.
XBOW almost entirely submitted bugs for the free, non-paid bug bounties! A primary reason they were able to find these bugs was because they weren’t actually competing with anyone!