Shepherding AI Reports

Chrome Security is getting an increasing amount of vulnerability reports that are either partly or entirely AI-generated. These reports can be tough to shepherd because they are often nonsense, and even when they aren't nonsense the vital information about the bug is buried in a pile of AI-generated words.

Since AI-assisted reports are often speculative, you should also consult the speculative bug triage guide.

Spotting AI Reports

There are a few tells for spotting AI reports. Some of these can have false negatives so seeing one of these tells doesn't necessarily provide ironclad proof that a report is generated using an AI, but it should make you very suspicious.

Bogus external references:
- Reference to a CVE (like "this bug is similar to CVE-YYYY-ABCDE" or similar) where the CVE isn't relevant to the bug report at all
- References to sections of specs / docs / etc. that don't exist
Obviously bogus PoCs:
- PoCs that call APIs that don't exist but plausibly might exist or have existed in the past
- PoCs that are extremely overcomplicated or verbose
- PoCs that contain no actual code
- PoCs that claim a crash but do not include a stack trace or ASAN report
- PoCs (often patches to the browser) that directly call functions not usually exposed to an attacker
Hallucinated technical details - class / function / file names that don't exist, stack traces that don't reflect possible call stacks, etc
Explanations of the theory of an exploit but not an actual exploit:
- A long explanation of potential impact
- An explanation of a bug that could exist, but no actual example of it
- An example of a way that a function could be called unsafely, but not an example of it being called unsafely in Chromium
- Explanations of hypothetical design weaknesses, especially when very long-winded
Explanations of reasoning or research process which any human researcher would take as understood
Heavy use of emojis - LLMs tend to embed these especially in code for some reason
References to much older versions of Chrome; generally LLMs are using public exploit information so many reports, even if potentially valid, are related to long since patched issues
Prolific reporting from a single reporter across multiple areas within a very short timespan (dozens of reports per day in some cases)

Triaging AI Reports

Just because a reporter used AI to prepare a report does not automatically mean the report is invalid, but to avoid sinking a lot of time into reports which have a high probability of being invalid, you should be extra aggressive when triaging them, and you can generally treat them as lower priority to triage than reports which look human-written and high-quality. In particular, when triaging an AI report:

Skim the report looking for the crux of the actual bug (i.e., skip over all the fluff about impact or theory) - you can more or less just skip to code sections and ignore the prose for now.
Quickly check any external references (CVE numbers, etc) for validity
Eyeball but don't run (yet) the PoC to see if it looks plausible
If there's a stack trace, check with code search whether it's superficially reasonable

Don't bother doing detailed analysis of any AI report that doesn't have a simple PoC which looks like it could work, or a stack trace which looks valid. In particular, be very skeptical of AI reports claiming overflows, UaFs, etc that contain prose explanations of how to reach those conditions - AIs will invent, and then plausibly lie about, execution traces that lead to vulnerabilities but that aren't actually possible in practice. Never take at face value a claim from an AI report that a vulnerability is reachable unless it contains a PoC or an ASAN stack trace. Feel free to WontFix such reports out of hand and spend your time on more valuable things.

If you do conclude that a bug is both AI-written and worth WontFixing, please reference the FAQ entry on AI bugs as part of your WontFix message, to encourage reporters to file better bugs.