docs/requirements/output-duplicate-detection.md
Build systems may invoke the compiler for the same source file more than
once - parallel make retries, ccache wrappers, or repeated builds with
--append after a file's flags change. The compilation database
specification (https://clang.llvm.org/docs/JSONCompilationDatabase.html)
allows multiple entries for the same file, noting this is for "different
configurations", but stale duplicates confuse downstream tools and let old
flags linger after a rebuild. By default Bear keeps a single entry per
source file (identified by its directory and path), so recompiling a file
with new flags updates its entry instead of accumulating a stale one. Users
who genuinely need several configurations recorded for one file can widen
the set of fields that distinguish entries.
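
The default behaviour described above can be sketched as a first-occurrence filter keyed on a configurable set of fields. This is a minimal illustration, not Bear's implementation: `entry_key` and `deduplicate` are hypothetical names, and the entry fields follow the JSON compilation database format.

```python
def entry_key(entry, match_on):
    """Build a hashable key from the fields that distinguish entries."""
    values = (entry.get(field) for field in match_on)
    # Lists such as "arguments" are not hashable, so turn them into tuples.
    return tuple(tuple(v) if isinstance(v, list) else v for v in values)


def deduplicate(entries, match_on=("directory", "file")):
    """Keep the first entry seen for each key; drop later matches."""
    seen = set()
    kept = []
    for entry in entries:
        key = entry_key(entry, match_on)
        if key not in seen:
            seen.add(key)
            kept.append(entry)
    return kept


entries = [
    {"directory": "/proj", "file": "file.c", "arguments": ["cc", "-O2", "-c", "file.c"]},
    {"directory": "/proj", "file": "file.c", "arguments": ["cc", "-O3", "-c", "file.c"]},
    {"directory": "/proj/lib", "file": "util.c", "arguments": ["cc", "-c", "util.c"]},
]

# Default fields: the two file.c entries collapse into one.
default_result = deduplicate(entries)
# Widened fields: differing arguments make the file.c entries distinct.
wide_result = deduplicate(entries, ("directory", "file", "arguments"))
```

With the default fields the sketch keeps two entries (one for file.c, one for util.c); adding arguments to the field list keeps all three.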
- By default, entries are matched on two fields (directory
  and file).
- In append mode (see output-append): a newly generated entry is emitted before the matching
  entry from the existing database, so the new entry wins and its flags
  replace the old ones.
- The match fields are configurable in the duplicates
  section in the configuration file.
- Validation rejects command and arguments in the same list (they are alternative
  representations of the same data).

Given a build that compiles file.c twice with identical flags:
When Bear generates the compilation database, then only one entry for file.c appears in the output.
Given a build that compiles file.c with -O2 and then with -O3:
When Bear generates the compilation database with default duplicate config, then only one entry for file.c appears (default matching is
directory and file, so arguments are ignored).
Given duplicate detection configured with match_on: [directory, file, arguments]
and a build that compiles file.c with -O2 and then with -O3:
When Bear generates the compilation database, then both entries appear (arguments are part of the match, so they differ).
Given files src/util.c and lib/util.c (same basename, different directories):
When Bear generates the compilation database, then both entries are preserved (different directory means not a duplicate).
Given duplicate detection configured with match_on: [file]:
When a build compiles file.c twice with different flags, then only the first entry is kept (matching on file alone).
Given duplicate detection configured with match_on: [file, output]:
When file.c is compiled to both
debug/file.o and release/file.o, then both entries are preserved (different output paths).
Given duplicate detection configured with match_on: [command, arguments]:
Then configuration validation rejects it with an error explaining the conflict.
Given duplicate detection configured with match_on: []:
Then configuration validation rejects it with an error explaining the empty field list.
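
The two validation cases above can be sketched as a small check on the field list. This is an illustration under assumptions: `validation_error` is a hypothetical helper, and the error wording is invented for the example rather than taken from Bear.

```python
def validation_error(match_on):
    """Return an error message for an invalid field list, or None if it is valid."""
    if len(match_on) == 0:
        return "match_on is empty: at least one field is required"
    if "command" in match_on and "arguments" in match_on:
        # Both fields encode the same compiler invocation, so listing
        # them together is redundant and treated as a conflict.
        return "match_on lists both command and arguments, which represent the same data"
    return None


empty_error = validation_error([])
conflict_error = validation_error(["command", "arguments"])
default_error = validation_error(["directory", "file"])
```

Here `empty_error` and `conflict_error` hold messages, while `default_error` is `None` because the default field list is valid.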
Given an --append run where file.c exists in the old database, and the
new build compiles file.c with different flags:
When Bear generates the output, then only one entry for file.c appears, recording the new flags (the new entry wins, because new entries come first and default matching ignores arguments).
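
The append case above relies on ordering: new entries are emitted before old ones, and the first entry per key is kept. A minimal sketch of that idea follows, assuming the matched fields are plain strings; `merge_for_append` is a hypothetical name, not part of Bear.

```python
def merge_for_append(new_entries, old_entries, match_on=("directory", "file")):
    """Emit new entries first, then old ones, keeping the first entry per key."""
    seen = set()
    merged = []
    for entry in [*new_entries, *old_entries]:
        # Assumes the matched fields hold strings (true for directory and file).
        key = tuple(entry.get(field) for field in match_on)
        if key in seen:
            continue  # an earlier (newer) entry already claimed this key
        seen.add(key)
        merged.append(entry)
    return merged


old_db = [
    {"directory": "/proj", "file": "file.c", "arguments": ["cc", "-O2", "-c", "file.c"]},
    {"directory": "/proj", "file": "other.c", "arguments": ["cc", "-c", "other.c"]},
]
new_build = [
    {"directory": "/proj", "file": "file.c", "arguments": ["cc", "-O3", "-c", "file.c"]},
]

merged = merge_for_append(new_build, old_db)
```

In this sketch the merged result holds the -O3 entry for file.c followed by the untouched entry for other.c; the old -O2 entry is dropped.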

- Default matching uses directory alongside file, so same-basename files in different
  directories remain distinct.
- Clang -cc1 frontend invocations: these are filtered by the semantic analyzer before
  reaching the duplicate filter, but the duplicate filter provides a safety
  net.
- An --update concept where existing entries are
replaced when a file is recompiled with new flags. Bear now delivers this
as the default rather than a separate flag: dropping arguments from the
default match set collapses a file to one entry, and append ordering
(output-append) makes the newest entry win. GitHub discussion #712
requested this for partial builds where changed flags previously left
stale duplicates.