requirements/recognition-ambiguous-name-probe.md
When a build invokes a compiler under an ambiguous name (notably cc and
c++), Bear must dispatch to the correct interpreter (GCC vs Clang)
regardless of which toolchain the system actually installs under that name.
On Linux cc is typically GCC; on FreeBSD, OpenBSD, NetBSD, DragonFly, and
macOS cc is Clang. Misidentifying the compiler causes flag-arity mistakes
(e.g., Clang's -Xclang <arg> consumes the next argv slot, GCC's flag table
does not), which corrupts source/output detection in the compilation
database.
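
The arity problem can be illustrated with a minimal sketch. The two flag tables and the `positional_inputs` helper below are hypothetical toys written for this illustration, not Bear's real tables or API; they only show how a missing arity entry lets a flag operand leak into the list of inputs.

```python
# Toy flag tables: flag -> number of following argv slots it consumes.
# Illustrative only; these are NOT Bear's real flag tables.
CLANG_ARITY = {"-c": 0, "-o": 1, "-Xclang": 1}
GCC_ARITY = {"-c": 0, "-o": 1}  # no -Xclang entry


def positional_inputs(args, arity):
    """Return the arguments left over once flags and their operands are skipped."""
    inputs = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg.startswith("-"):
            # Flags missing from the table are assumed to take no operand.
            i += 1 + arity.get(arg, 0)
        else:
            inputs.append(arg)
            i += 1
    return inputs


args = ["-Xclang", "-include", "-Xclang", "config.h", "-c", "hello.c"]
clang_view = positional_inputs(args, CLANG_ARITY)
gcc_view = positional_inputs(args, GCC_ARITY)
print(clang_view)  # ['hello.c']
print(gcc_view)    # ['config.h', 'hello.c']
```

With the Clang table, each -Xclang consumes its operand and only hello.c remains. With the GCC table, config.h is left over and would be mistaken for an input file.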
The user expects bear -- cc -c hello.c to produce a correct entry on
every host, without per-platform configuration, and without losing the
ability to override Bear's guess when needed.
- […] (cc, c++), Bear runs the executable once with --version to classify
  it as GCC or Clang before dispatching to an interpreter.
- […] (gcc.yaml and friends) deliberately does not list cc/c++, so when
  the probe declines (timeout, unrecognizable output, failed spawn,
  non-zero exit) recognition returns NotRecognized rather than guessing.
  A missing entry is visible and debuggable; a wrongly-classified entry
  corrupts the compilation database silently via mismatched flag-arity
  tables, which is the bug this requirement exists to prevent.
- LD_PRELOAD and DYLD_INSERT_LIBRARIES are stripped from the probe's
  environment.
- A compilers: config entry for a path takes priority over the probe and
  disables it for that path. This is the supported override mechanism
  and the only way to recover recognition for a quirky cc whose
  --version output does not match the probe's signature rules.
- […] (ccache, distcc, sccache) are never probed even if they appear
  under an ambiguous name. The wrapper interpreter handles them as today.
- […] (gcc, clang, gfortran, cross-prefixed or versioned variants) are
  not probed and continue to resolve via regex.
- […] (cl.exe, clang-cl, gcc.exe) that the regex layer classifies
  directly. Bare cc/c++ on Windows falls through to NotRecognized; in
  practice no Windows toolchain installs them.
- […] fork+exec pairs total, not per invocation.

Given a host where /usr/bin/cc is Clang:
When Bear recognizes an execution of cc -c hello.c, then it dispatches to the Clang interpreter, and a Clang-only flag with a follow-on argument (e.g. -Xclang -ast-dump) is parsed with correct arity.
Given a host where /usr/bin/cc is GCC:
When Bear recognizes an execution of cc -c hello.c, then it dispatches to the GCC interpreter.
Given any host:
When Bear recognizes the same cc path 1000 times in one run, then the executable is fork-exec'd at most once for probing.
Given a user config containing compilers: [{ path: /usr/bin/cc, as: gcc }]:
When Bear recognizes /usr/bin/cc, then the result is GCC and no probe is performed.
Given an executable that hangs on --version:
When Bear probes it, then the call returns within the configured timeout and the execution is reported as NotRecognized (no entry is written; there is no regex fallback for ambiguous names).
Given an executable named cc whose --version output contains no
recognizable signature (e.g. a custom wrapper that prints a vendor
banner):
When Bear recognizes it, then the probe declines, recognition returns None, and the execution is reported as NotRecognized. The user can recover the entry by adding the path to compilers: with an explicit as: field.
Given an executable that reads from stdin on --version
(e.g. a misplaced bash in PATH named cc):
When Bear probes it, then the call does not block indefinitely (stdin is closed, so the read returns EOF and the process exits).
Given Bear is running with LD_PRELOAD set to its own interception library:
When Bear probes a compiler, then the probe's environment has LD_PRELOAD removed and the probe execution is not itself recorded as a build event.
Given an executable named /usr/lib/ccache/cc
that resolves (after canonicalization) to the ccache wrapper:
When Bear recognizes /usr/lib/ccache/cc, then it does not probe the binary and the wrapper interpreter handles the invocation as today.
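
The probe behavior that the scenarios above constrain can be sketched roughly as follows. This is a minimal illustration, not Bear's implementation: the function names, the 2-second default timeout, and the substring-based signature rules are all assumptions made for the example, and stdin is attached to the null device here as a stand-in for a closed stdin so that reads return EOF.

```python
import os
import subprocess

# Environment variables removed so the probe is not itself intercepted.
STRIPPED_VARS = ("LD_PRELOAD", "DYLD_INSERT_LIBRARIES")


def classify_version_output(text):
    """Hypothetical signature rules; the real rules are not specified here."""
    lowered = text.lower()
    if "clang" in lowered:
        return "clang"
    if "free software foundation" in lowered or "gcc" in lowered:
        return "gcc"
    return None  # unrecognizable output: the probe declines


def probe(path, timeout_seconds=2.0):
    """Run `path --version` once; return "gcc", "clang", or None on decline."""
    env = {k: v for k, v in os.environ.items() if k not in STRIPPED_VARS}
    try:
        result = subprocess.run(
            [path, "--version"],
            stdin=subprocess.DEVNULL,   # reads see EOF, so the child cannot block on input
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            env=env,
            timeout=timeout_seconds,    # a hanging executable is killed after this
            text=True,
            errors="replace",
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # failed spawn or timeout
    if result.returncode != 0:
        return None  # non-zero exit
    return classify_version_output(result.stdout)
```

Every decline path returns None rather than a guess, which matches the requirement that an ambiguous name with a failed probe ends up as NotRecognized.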
- See default_cc.md for the design discussion that selected this
  approach over PR #695 (per-invocation probe) and over the per-OS
  defaults variant.
- […] compilers:. There is no process-wide off switch; the override is
  per-path and explicit.
- gcc.yaml carries a comment explaining why cc/c++ are absent from its
  recognize list.
- […] (cc, c++). Cross-prefixed variants (aarch64-linux-gnu-cc) are not
  in the list because cross-toolchains are overwhelmingly GCC and the
  regex already handles them; if a real BSD cross-toolchain case
  appears, the list can grow.
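
Taken together, the rules in this document suggest a recognition order along the lines of the sketch below. The `Recognizer` class, its string return values, and the exact ordering of the override, wrapper, and regex checks are assumptions made for illustration. The document itself only states that the override beats the probe, that wrappers and unambiguous names are never probed, and that a path is probed at most once per run. Caching a declined probe is likewise an assumption.

```python
import os

AMBIGUOUS_NAMES = {"cc", "c++"}
WRAPPER_NAMES = {"ccache", "distcc", "sccache"}


class Recognizer:
    def __init__(self, overrides, probe):
        self.overrides = overrides  # {path: "gcc" | "clang"} from the compilers: config
        self.probe = probe          # callable(path) -> "gcc" | "clang" | None
        self.cache = {}             # path -> probe result, kept for the whole run

    def recognize(self, path):
        # 1. An explicit per-path override wins and disables the probe.
        if path in self.overrides:
            return self.overrides[path]
        # 2. Wrappers are detected after canonicalization and never probed.
        if os.path.basename(os.path.realpath(path)) in WRAPPER_NAMES:
            return "wrapper"
        # 3. Unambiguous names are left to the regex layer (placeholder value).
        if os.path.basename(path) not in AMBIGUOUS_NAMES:
            return "regex"
        # 4. Ambiguous names are probed at most once per path.
        if path not in self.cache:
            self.cache[path] = self.probe(path)
        return self.cache[path] or "NotRecognized"


calls = []


def fake_probe(path):
    """Stand-in probe that records each call instead of spawning a process."""
    calls.append(path)
    return "clang"


recognizer = Recognizer({"/usr/bin/cc": "gcc"}, fake_probe)
results = {recognizer.recognize("/opt/toolchain/bin/cc") for _ in range(1000)}
overridden = recognizer.recognize("/usr/bin/cc")
```

In this sketch the 1000 recognitions of the same path trigger a single probe call, and the overridden path never reaches the probe.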