DevSecConflict: How Google Project Zero and FFmpeg Went Viral For All the Wrong Reasons

The tension between security and open source highlights the struggle over responsibility as AI uncovers vulnerabilities faster than we can respond.

Katie Paxton-Fear

Updated by

Jayson DeLancey

Nov. 24, 25 · Analysis

Security research is no stranger to controversy. The small community of dedicated niche security teams, independent researchers, and security vendors working on new products finds vulnerabilities in software, and only occasionally has permission to find and exploit them. This industry has always had a fraught relationship with the law and with the terms of service of the organisations it targets, as notoriety is often prioritized over legalities. Whatever the true motives of individual researchers, it is difficult to argue that vulnerability hunting is done with no genuine desire to improve security, even if it also produces a conference talk or two.

To avoid legal threats, many researchers steer clear of commercial software, products, and applications and instead turn their attention to open source. Open-source projects welcome contributions that improve security, offer transparency through pull requests, and are used throughout the industry. Where a closed-source vendor may respond with a legal threat, an open-source project responds with an enthusiastic thank-you, allowing security researchers to make an impact and talk about their work.

When Google’s Project Zero and DeepMind decided to collaborate on a new AI agent to find and exploit novel vulnerabilities, it made sense to point that tool at the world’s most widely used open-source projects. One such project Google itself relies heavily upon is FFmpeg.

This is the story of how an AI research project stirred up years of resentment from open-source contributors and went viral for all the wrong reasons. This isn't an internet slap-fight about competing programming languages; it highlights ethical questions at the core of security research, vulnerability disclosure, and open source. The discussion isn’t limited to Google, Project Zero, or AI. It is just the latest skirmish in a Cold War that every organization struggles with: a conflict is brewing between security teams and software developers, and remediation efforts are caught in the crossfire.

Google Project Zero and FFmpeg

Google Project Zero is an internal security research team at Google. They are well known for their impactful 0-day findings; they find the security vulnerabilities that countries pay millions of dollars for to use in offensive cyber campaigns. This small but dedicated team focuses on finding the most impactful vulnerabilities facing the world's critical software, from routers (networking) to routers (web frameworks).

The most notable element of this team isn’t its impactful findings, but its strict adherence to the 90+30-day disclosure timeline. When Project Zero reports a vulnerability to the team responsible, the maintainers must decide how to fix the issue before the reporter publishes details of the vulnerability to the public. In this case, after the 90 days, there is an option to extend for another 30 days to patch the vulnerability. This may feel like cyber extortion to some maintainers, a threat, and in some ways, it is. The timeline serves two key purposes: First, to pressure vendors to produce a patch rather than ignore a vulnerability, and second, to allow the public to stop using something if a vulnerability isn’t patched.

Prior to the normalisation of the 90+30 standard, security researchers would submit vulnerabilities, even critical ones, that remained unfixed for years, until they were inevitably exploited by malicious actors. Security vulnerabilities were simply not prioritized until it was too late for the affected users. A disclosure timeline allows users who would be affected by a data breach to take steps to prevent it: removing sensitive information from their account, changing passwords, adding extra security like two-factor authentication, or deleting and no longer using the service, product, or library. The pressure is not applied for the benefit of the organization reporting the vulnerability, but on behalf of the end user.
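To make the arithmetic of the timeline concrete, here is a minimal sketch that computes the publication date under the window as described above: 90 days, with an optional 30-day extension. The function names and the example dates are hypothetical, and this is not Project Zero's tooling.

```python
from datetime import date, timedelta

# Window lengths as described in the article (assumed here, not read from any policy file).
DISCLOSURE_WINDOW_DAYS = 90
EXTENSION_DAYS = 30


def disclosure_deadline(reported_on: date, extended: bool = False) -> date:
    """Return the date on which the report's details would become public."""
    days = DISCLOSURE_WINDOW_DAYS + (EXTENSION_DAYS if extended else 0)
    return reported_on + timedelta(days=days)


def days_remaining(reported_on: date, today: date, extended: bool = False) -> int:
    """Return how many days maintainers have left; negative means the deadline passed."""
    return (disclosure_deadline(reported_on, extended) - today).days


if __name__ == "__main__":
    reported = date(2025, 1, 1)  # hypothetical report date
    print(disclosure_deadline(reported))                  # 2025-04-01
    print(disclosure_deadline(reported, extended=True))   # 2025-05-01
    print(days_remaining(reported, date(2025, 3, 2)))     # 30
```

For a volunteer project, the useful number is usually the second function's output: it turns an abstract policy into "how many weekends are left."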

The availability of AI tools has launched a wave of interest from security researchers, all eager to find novel use cases for AI. One use case is a “HackBot,” a type of autonomous AI agent that can explore a target, find a vulnerability, design and develop an exploit, and finally compile a report for a human to begin remediation work. This is exactly what Google’s research arms, Project Zero (vulnerability hunting) and DeepMind (AI), had in mind as they developed a new agent called “BigSleep” and pointed it at various open-source projects. By all accounts, the project was a success; the agent independently identified 20 security vulnerabilities across a range of open-source projects, including FFmpeg.

FFmpeg is an open-source project that aims to play every multimedia file ever produced. Released almost 25 years ago, the work on this open-source project enabled the era of streaming media, and without it, cultural staples like YouTube would not exist. The majority of the code in FFmpeg is written by volunteers who reverse-engineer obscure codecs from the 90s just to get them to play without proprietary software. These volunteers range from seasoned professionals who have been working on FFmpeg since its inception, experienced software engineers at organizations like Google, to high school students and hobbyists, and the project welcomes new contributors.

https://xkcd.com/2347/

The Developers, the Researchers, and the CVE

BigSleep ended up finding 13 different vulnerabilities in FFmpeg, but the issue in question was given the name BIGSLEEP-440183164, or CVE-2025-59734, a use-after-free vulnerability. It affects a codec called SANM, or LucasArts Smush v2, which is used in some LucasArts games. This type of vulnerability targets the memory management of a C program: memory is allocated, used, and freed, but because of how the pointers are set up, it can still be written to after it has been freed, corrupting memory or enabling arbitrary code execution. This would be a hugely impactful vulnerability if it affected a more common format like MP4; however, SANM is not widely used, is limited to a few game cutscenes from the 90s, and was added by a hobbyist.
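The mechanics of a use-after-free can be sketched with a toy model. The example below is an illustration only: it is not FFmpeg's SANM decoder, and `ToyHeap`, its slots, and the stored strings are all hypothetical. It models a heap in which a freed slot is handed to the next allocation, so a write through a stale "pointer" lands in memory that now belongs to something else.

```python
class ToyHeap:
    """A deliberately tiny allocator model: freed slots are reused by later allocations."""

    def __init__(self):
        self.slots = []       # slot index -> stored value
        self.free_list = []   # indices of freed slots, reused most-recent-first

    def alloc(self, value):
        """Store value in a reused or new slot and return its index (our 'pointer')."""
        if self.free_list:
            idx = self.free_list.pop()
            self.slots[idx] = value
        else:
            idx = len(self.slots)
            self.slots.append(value)
        return idx

    def free(self, idx):
        """Mark the slot reusable. Like C's free(), this does not invalidate copies of idx."""
        self.free_list.append(idx)

    def write(self, idx, value):
        """Write with no liveness check, mirroring an unchecked pointer write in C."""
        self.slots[idx] = value

    def read(self, idx):
        return self.slots[idx]


def demo_use_after_free():
    heap = ToyHeap()
    frame = heap.alloc("old frame buffer")
    stale = frame                             # a second copy of the pointer kept elsewhere
    heap.free(frame)                          # the buffer is released...
    handler = heap.alloc("trusted_handler")   # ...and its slot is reused for new data
    heap.write(stale, "attacker_data")        # use-after-free: writes into the new owner
    return heap.read(handler)


if __name__ == "__main__":
    print(demo_use_after_free())  # attacker_data
```

In real C code the overwritten data might be a function pointer or a length field, which is why this class of bug can escalate from memory corruption to code execution.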

When the Google team reported their findings, FFmpeg received the same kind of report as every other disclosure target: a full, detailed account of the findings, including how to exploit them, and the standard 90-day disclosure deadline. This is standard practice for security researchers, but it was an insult to the open-source contributors who, after fixing the vulnerability, spoke publicly about their concerns. The heated discussion hit many different points: the threat of releasing a vulnerability to the public, feeling forced to release a patch for code that realistically wasn’t in use, and being used for an experiment without any consideration for the maintainers, as if FFmpeg were a commercial entity rather than an open-source project.

AI as a Builder and a Breaker

When AI went from being a niche research area to a household name with LLMs like ChatGPT and Sonnet, many individuals found themselves experimenting with the new tools and embracing newfound productivity boons. For developers writing code, this has taken the form of AI coding tools, from code-completion tools like GitHub Copilot to full “vibe coding” platforms that can generate entire applications. There have been many discussions on the security, legality, and performance of “vibe coding” as a development methodology. Many open-source projects, such as FFmpeg, will refuse to accept AI-generated code.

Regardless, it’s hard to argue with the time it saves for developers who are familiar enough with the tools to recognize these flaws. For the security community, however, the similar concept of a “HackBot” has not been as successful, at least until recently. These challenges haven't stopped individual security researchers from embracing AI, chasing the ultimate promise of an AI that does the hacking for you, just as vibe coding tools write the code for developers.

An AI-generated bogus security report in cURL

cURL is another open-source project, one that downloads files from the web to your computer. The cURL team actively invites security researchers to submit vulnerabilities and, since starting the programme in 2019, has fixed 81 valid security vulnerabilities. However, after generative AI tools like ChatGPT were made available to the public, the cURL team saw the number of valid reports drop, replaced by what they call AI slop: reports titled 'critical,' with code analysis that looks genuine on the surface but does not hold up to scrutiny. On investigation, these reveal themselves to be AI hallucinations that waste valuable time. For the team of volunteer maintainers, these spurious reports amount to a DDoS, using up their available time on investigating non-issues. This means there is simply less time for regular maintenance, feature development, bug and performance fixes, and the other tasks that keep the project alive.

The Google Project Zero vulnerabilities are not the same as these AI slop reports; the vulnerabilities are genuine, even if they are not as impactful. But when security teams leverage AI for efficiency, development teams still need humans to investigate. Security teams aren’t helping by finding more; they are pushing on a team that is already under pressure. And this system is primed to break down.

Building Solutions, Not Resentment

The FFmpeg team offers a solution to this problem: send patches. Help them to not just find the vulnerabilities but fix them too. To researchers who say they do not have the expertise to write a patch, they say: It doesn’t matter, you can learn it just like the hobbyist who added the obscure codec in the first place. The security community then pushes back: the vulnerabilities are there, it is your choice whether to fix them, and we can only let people know so they can make an informed decision on how to use the software. To open-source developers who refuse to use AI tools, they say: If you have too many reports, leverage AI, just as the research group that reported the vulnerability in the first place did.

This discussion isn’t just about FFmpeg and Google; it reflects the long-standing conflict between security and developers. It’s not even about AI, which is just the latest proverbial straw on the camel's back as both developers and security teams are asked to do even more with fewer resources. That might be a security team outnumbered 1:100 and worried about the next inevitable data breach, or a development team with hundreds of security vulnerabilities in the issue tracker and worried about how they will finish the next sprint on time. To build true solutions for this situation, we do not need tools, but empathy.

Advice for Security Teams

Before submitting a finding, ensure you understand the context of it, rather than just the vulnerability. Weigh the vulnerability rating against the context, and include a recommendation in your report. Can this wait?
For teams that are leveraging AI or other automated tools, ensure that there is still a human reviewing the findings to check that they are relevant, impactful, and genuine. This also offers a developer someone they can reach out to and get support from.
If you’re sending a report and the fix is relatively simple, why not send a patch if you can? Even if the patch isn’t good enough to make it into production code, a start can be a big help to developers. Avoid declaring that fixing vulnerabilities is not your job.
Focus on bringing security to where developers are already working. If your development teams are leveraging AI coding assistants, look to implement secure guardrails like security scanning directly in the IDE, such as an MCP server with a SAST tool plugged into Claude Code. The less you ask developers to do it your way, the less friction you will build.

For Open-Source Maintainers and Software Development Teams

Create a code of conduct, contribution guide, or company policy for your project that explains your requests for security reporting. The 90+30 time period may not be feasible for projects without sufficient support or contributors, so try to set expectations up front.
Establish a security triage process. It is not always feasible to resolve every security vulnerability. Work with the reporter to explore the severity and frequency of the problem to understand the urgency. If a vulnerable component is rarely used, perhaps it can be isolated and wait. An issue template for reporting security vulnerabilities can be helpful for this type of assessment.
Encourage security reporters to become contributors to your project(s). Make sure there is sufficient documentation and onboarding resources so that you can share context and invite security contributors to work with the team more regularly.
Embrace new tools and approaches that can speed up security work. Some security vendors offer open-source tools and free tiers that open-source projects can use to identify security vulnerabilities proactively and early.
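The triage advice in the list above can be sketched as a small scoring function. Everything here is an assumption made for illustration: the `Report` fields, the weights, and the sample reports are hypothetical, and a real project would tune them to its own context rather than treat this as a standard formula.

```python
from dataclasses import dataclass


@dataclass
class Report:
    title: str
    severity: float          # e.g. a CVSS-style base score, 0.0 to 10.0
    component_usage: float   # estimated share of users exercising the component, 0.0 to 1.0
    has_patch: bool          # the reporter attached a candidate fix
    human_verified: bool     # a person has reproduced the finding


def triage_priority(report: Report) -> float:
    """Score a report; higher means look at it sooner. Weights are arbitrary assumptions."""
    if not report.human_verified:
        return 0.0           # needs reproduction before it competes for maintainer time
    score = report.severity * report.component_usage
    if report.has_patch:
        score += 1.0         # a candidate fix makes the report cheaper to act on
    return round(score, 2)


def triage_queue(reports):
    """Return reports ordered from highest to lowest priority."""
    return sorted(reports, key=triage_priority, reverse=True)


if __name__ == "__main__":
    queue = triage_queue([
        Report("Use-after-free in rarely used codec", 8.0, 0.01, True, True),
        Report("Overflow in widely used demuxer", 6.0, 0.9, False, True),
        Report("Unverified 'critical' report", 10.0, 1.0, False, False),
    ])
    for r in queue:
        print(triage_priority(r), r.title)
```

The design choice worth noting is that severity alone never decides the order: a high-severity bug in a rarely used component can reasonably wait behind a moderate bug that most users hit.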
