Patch the Planet
tl;dr AI made finding bugs cheap. Judging each one, patching it, and shipping it still takes a human. Patch the Planet, the effort I started with OpenAI, gives open-source projects agents to do that work at machine speed.
Clint Gibler, Peter Steinberger, and I talk through Patch the Planet (on X).
On Monday we launched Patch the Planet with OpenAI’s Daybreak team. Here’s what I told WIRED:
Patch the Planet is an internet-scale effort to help open-source software get ahead of AI bug-hunting tools.
How it started
Since February, Trail of Bits has been one of the firms reviewing what Claude Mythos finds for Anthropic’s coordinated disclosure program. We reproduce each bug by hand, judge its severity, write the fix, and coordinate disclosure with the maintainer. Mythos has surfaced more than 23,000 candidate findings, and Anthropic calls the human review the rate-limiting step. We script the mechanical step with tools like fp-check, which sorts real bugs from false positives. That is the easy part. The hard part is what a finding means: Anthropic’s own data shows Claude’s severity rating matches the reviewing firm’s only 59% of the time, because the model can’t see a project’s threat model. Hundreds of reports in, our time goes to deciding what each bug means and what to do about it.
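To make that agreement figure concrete, here is a minimal sketch of how a match rate between a model's severity rating and a reviewer's rating could be computed. The `Finding` record, the severity labels, and the five sample rows are all hypothetical and exist only for illustration. This is not Anthropic's methodology or data, and it is not how fp-check works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One reviewed finding (hypothetical record shape, for illustration only)."""
    finding_id: str
    model_severity: str     # severity the model assigned
    reviewer_severity: str  # severity the human reviewer assigned


def severity_agreement(findings):
    """Return the fraction of findings where model and reviewer severities match."""
    if not findings:
        raise ValueError("need at least one finding to compute agreement")
    matches = sum(1 for f in findings if f.model_severity == f.reviewer_severity)
    return matches / len(findings)


def disagreements(findings):
    """Return the findings a human would need to re-rate, in input order."""
    return [f for f in findings if f.model_severity != f.reviewer_severity]


# Made-up sample data: not real findings and not real ratings.
SAMPLE = [
    Finding("F-001", "high", "high"),
    Finding("F-002", "critical", "medium"),
    Finding("F-003", "low", "low"),
    Finding("F-004", "medium", "high"),
    Finding("F-005", "high", "high"),
]

if __name__ == "__main__":
    rate = severity_agreement(SAMPLE)
    print(f"agreement: {rate:.0%}")
    for f in disagreements(SAMPLE):
        print(f"{f.finding_id}: model={f.model_severity} reviewer={f.reviewer_severity}")
```

The rate is the cheap part to compute. The disagreement list is where the human time goes, because each mismatch needs someone who knows the project's threat model.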
By [un]prompted in March, the bug-finding tools were already excellent. Open-source frameworks like raptor turn a coding model into a capable security agent, and talk after talk pushed them further. One of those talks was mine. But better finding was never the hard part, so I ran the other way.
When The Verge interviewed me in April about AI and script kiddies, I told them what I’d been seeing: “There’s a tidal wave coming. You can see it.”
Trail of Bits has spent years securing the infrastructure the internet runs on: we’ve audited OpenSSL, Kubernetes, PyPI, and Homebrew, among others. That track record is why maintainers who had written off AI tools after a flood of slop still gave us a hearing. They’re the maintainers I set out to help.
In May, I shopped the idea for a dedicated program around the industry. It was big enough in scope to go straight to OpenAI’s executives: Dane Stuckey, their CISO, backed it, and president Greg Brockman posted about it at launch. The company put funding and unmetered model access behind it. Dane named it Patch the Planet.
Filippo Valsorda, who ran Go’s security team, wrote the day after we launched that finding isn’t the bottleneck anymore: “the bottleneck now is not finding potential issues but assessing which ones are real.” The rest of the community is catching up to where I’d been for months.
Triage we can mostly automate. The real goal is bigger: open-source projects need to operate at AI speed. The bugs already arrive that fast, and a lone maintainer still works the queue, writes the patch, and ships it by hand. curl’s Daniel Stenberg described his own days this spring:
Verify the claim, assess the importance, write a patch, figure out when the bug was introduced, understand the vulnerability, write a detailed advisory explaining the problem to the world and communicate all this with the security researcher and the rest of the curl security team.
We’re giving them agents to carry that load and re-architecting their code so there’s less of it to carry. I put it this way on Risky Business:
Finding bugs is only half the battle. If you have better, higher-powered bug-finding tools, you’re just going to find more bugs. The improvements that really matter are architectural enhancements to the project, things that allow them to push out patches quicker and deploy them quicker.
The goal is a project that can keep up on its own.
The work so far
Every Trail of Bits engineer on the project works through Codex, steering the agent rather than writing the code by hand. That constraint forces us to make the agents genuinely effective on the open-source code we’re reviewing: tuning the context we feed them, writing agent instructions, deleting dead code that trips them up, and improving tests until we can trust the changes they make. Cleaner context for development and security helps the humans as much as the agents. What we learn goes into the Codex guide I keep updating.
In the first week, per our writeup, we found hundreds of bugs, opened 64 pull requests, and filed 51 issues across 19 projects, and a human reviewed every finding before it reached a maintainer. On Risky Business, I described what those days produced:
I’m shocked at the amount of work we’ve been able to do in just five days. We rewrote the entire release process for python.org. We built end-to-end fuzzing labs with variant testing that would have taken a sophisticated engineer at Trail of Bits three or four weeks. We did it in one day.
Maintainers noticed.

curl’s Daniel Stenberg, on the 22 issues one of our engineers found in a week. By his count, AI tools had already driven hundreds of curl fixes before this round.

Rust’s maintainers assumed a team was behind the reports. Kevin Valerio filed them solo, with Codex.
Maintainers of critical open-source projects can apply to join.
More coverage: TechCrunch, Engadget, and SecurityWeek.