internal/capabilities/generate/README.md
This internal tool is responsible for:
- syft/pkg/cataloger/*/capabilities.yaml ecosystem files, which document what capabilities each cataloger in syft has

Syft has dozens of catalogers across many ecosystems. Each cataloger has different capabilities, such as:
The capability YAML files (located alongside cataloger source code) contain all of these capability claims, organized by ecosystem.
The capabilities generation system itself:
Why do this? The short answer is to provide a foundation for the OSS documentation, where the source of truth for facts about the capabilities of Syft can be derived from verifiable claims from the tool itself.
To regenerate capability YAML files after code changes:
make generate-capabilities
This will:
If you have made cataloger code changes, you may see completeness tests fail. That's OK: it only means that you need to manually update the capability YAML files that were generated.
graph LR
subgraph "Source Code Inputs"
A1[syft/pkg/cataloger/*/
cataloger.go]
A2[syft/pkg/cataloger/*/
config.go]
A3[cmd/syft/internal/options/
catalog.go, ecosystem.go]
A4[syft task factories
AllCatalogers]
end
subgraph "Test Inputs"
B1[testdata/
test-observations.json]
end
subgraph "Discovery Processes"
C1[discover_catalogers.go
AST Parse Catalogers]
C2[discover_cataloger_configs.go
AST Parse Configs]
C3[discover_app_config.go
AST Parse App Configs]
C4[discover_metadata.go
Read Observations]
C5[cataloger_config_linking.go
Link Catalogers to Configs]
C6[cataloger_names.go
Query Task Factories]
end
subgraph "Discovered Data"
D1[Generic Catalogers
name, parsers, detectors]
D2[Config Structs
fields, app-config keys]
D3[App Config Fields
keys, descriptions, defaults]
D4[Metadata Types
per parser/cataloger]
D5[Package Types
per parser/cataloger]
D6[Cataloger-Config Links
mapping]
D7[Selectors
tags per cataloger]
end
subgraph "Configuration/Overrides"
E2[metadataTypeCoverageExceptions
packageTypeCoverageExceptions
observationExceptions]
E1[catalogerTypeOverrides
catalogerConfigOverrides
catalogerConfigExceptions]
end
subgraph "Capabilities YAML processing"
F1[io.go
Load Existing YAML]
F2[merge.go
Merge Logic]
F3[Preserve MANUAL fields
Update AUTO-GENERATED]
end
subgraph "Output"
G1[syft/pkg/cataloger/*/
capabilities.yaml]
end
subgraph "Validation"
H1[Distributed Tests
Comprehensive Tests]
H2[metadata_check.go
Type Coverage]
end
A1 --> C1
A2 --> C2
A3 --> C3
A4 --> C6
B1 --> C4
C1 --> D1
C2 --> D2
C3 --> D3
C4 --> D4
C4 --> D5
C5 --> D6
C6 --> D7
D1 --> F2
D2 --> F2
D3 --> F2
D4 --> F2
D5 --> F2
D6 --> F2
D7 --> F2
E1 -.configure.-> F2
E2 -.configure.-> H1
F1 --> F3
F2 --> F3
F3 --> G1
G1 --> H1
G1 --> H2
style D1 fill:#e1f5ff
style D2 fill:#e1f5ff
style D3 fill:#e1f5ff
style D4 fill:#e1f5ff
style D5 fill:#e1f5ff
style D6 fill:#e1f5ff
style D7 fill:#e1f5ff
style G1 fill:#c8e6c9
style E1 fill:#fff9c4
style E2 fill:#fff9c4
- syft/pkg/cataloger/ to find generic.NewCataloger() calls and extract parser information
- // app-config: annotations

The syft/pkg/cataloger/*/capabilities.yaml files are the canonical documentation of:
Each ecosystem has its own file (e.g., syft/pkg/cataloger/golang/capabilities.yaml, syft/pkg/cataloger/python/capabilities.yaml):
# syft/pkg/cataloger/golang/capabilities.yaml - Cataloger capabilities for the golang ecosystem
configs: # AUTO-GENERATED - config structs for this ecosystem
golang.CatalogerConfig:
fields:
- key: SearchLocalModCacheLicenses
description: SearchLocalModCacheLicenses enables...
app_key: golang.search-local-mod-cache-licenses
catalogers: # Mixed AUTO-GENERATED structure, MANUAL capabilities
- ecosystem: golang # MANUAL
name: go-module-cataloger # AUTO-GENERATED
type: generic # AUTO-GENERATED
source: # AUTO-GENERATED
file: syft/pkg/cataloger/golang/cataloger.go
function: NewGoModuleBinaryCataloger
config: golang.CatalogerConfig # AUTO-GENERATED
selectors: [go, golang, ...] # AUTO-GENERATED
parsers: # AUTO-GENERATED structure
- function: parseGoMod # AUTO-GENERATED
detector: # AUTO-GENERATED
method: glob
criteria: ["**/go.mod"]
metadata_types: # AUTO-GENERATED
- pkg.GolangModuleEntry
package_types: # AUTO-GENERATED
- go-module
json_schema_types: # AUTO-GENERATED
- GolangModEntry
capabilities: # MANUAL - preserved across regeneration
- name: license
default: false
conditions:
- when: {SearchRemoteLicenses: true}
value: true
comment: fetches licenses from proxy.golang.org
- name: dependency.depth
default: [direct, indirect]
- name: dependency.edges
default: complete
These are updated on every regeneration:
Cataloger Level:
- name - cataloger identifier
- type - "generic" or "custom"
- source.file - source file path
- source.function - constructor function name
- config - linked config struct name
- selectors - tags from task factories

Parser Level (generic catalogers):
- function - parser function name (as used in the generic cataloger)
- detector.method - glob/path/mimetype
- detector.criteria - patterns matched
- metadata_types - from test-observations.json
- package_types - from test-observations.json
- json_schema_types - converted from metadata_types

Custom Cataloger Level:
- metadata_types - from test-observations.json
- package_types - from test-observations.json
- json_schema_types - converted from metadata_types

Sections:
- appconfig.yaml: contains the application: section with app-level config keys relevant to catalogers
- configs: section in each ecosystem file: config structs used by catalogers in that ecosystem

These are preserved across regeneration and must be edited by hand:
- ecosystem - ecosystem/language identifier (cataloger level)
- capabilities - capability definitions with conditions
- detectors - for custom catalogers (except binary-classifier-cataloger)
- conditions on detectors - when detector is active based on config

When you run go generate ./internal/capabilities:
- syft/pkg/cataloger/*/capabilities.yaml and internal/capabilities/appconfig.yaml into both structs (for logic) and node trees (for comment preservation)
- (# AUTO-GENERATED, # MANUAL) to field comments
- syft/pkg/cataloger/*/capabilities.yaml using the node tree to preserve all comments

[!NOTE] Don't forget to update test observation evidence with go test ./syft/pkg/... before regeneration.
1. Discovery Phase
├─ Parse cataloger source code (AST)
├─ Find all parsers and detectors
├─ Read test observations for metadata types
├─ Discover config structs and fields
├─ Discover app-level configurations
└─ Link catalogers to their configs
2. Merge Phase
├─ Load existing syft/pkg/cataloger/*/capabilities.yaml files
├─ Process each cataloger:
│ ├─ Update AUTO-GENERATED fields
│ └─ Preserve MANUAL fields
├─ Add new catalogers
└─ Detect orphaned entries
3. Write Phase
├─ Group catalogers by ecosystem
├─ Update YAML node trees per ecosystem
├─ Add field annotations
└─ Write to syft/pkg/cataloger/*/capabilities.yaml and appconfig.yaml
4. Validation Phase
├─ Check all catalogers present
├─ Check metadata/package type coverage
└─ Run completeness tests
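The merge phase can be pictured with a small sketch. It assumes a simplified, hypothetical `entry` record (the generator's real types are richer) and shows only the core rule: discovered data overwrites AUTO-GENERATED fields, hand-written MANUAL fields are carried over, and entries that are no longer discovered are reported as orphans.

```go
package main

import "fmt"

// entry is a simplified, hypothetical stand-in for one cataloger record.
type entry struct {
	Name         string   // AUTO-GENERATED
	Type         string   // AUTO-GENERATED
	Selectors    []string // AUTO-GENERATED
	Ecosystem    string   // MANUAL
	Capabilities []string // MANUAL (reduced to capability names here)
}

// mergeEntries overlays freshly discovered data on existing entries. It returns
// the merged entries in discovery order plus the names of orphaned entries
// (present in the existing YAML but no longer discovered).
func mergeEntries(existing, discovered []entry) ([]entry, []string) {
	byName := make(map[string]entry, len(existing))
	for _, e := range existing {
		byName[e.Name] = e
	}
	seen := make(map[string]bool, len(discovered))
	merged := make([]entry, 0, len(discovered))
	for _, d := range discovered {
		seen[d.Name] = true
		if old, ok := byName[d.Name]; ok {
			// keep hand-written fields, take everything else from discovery
			d.Ecosystem = old.Ecosystem
			d.Capabilities = old.Capabilities
		}
		merged = append(merged, d)
	}
	var orphans []string
	for _, e := range existing {
		if !seen[e.Name] {
			orphans = append(orphans, e.Name)
		}
	}
	return merged, orphans
}

func main() {
	existing := []entry{
		{Name: "go-module-cataloger", Type: "generic", Selectors: []string{"go"}, Ecosystem: "golang", Capabilities: []string{"license"}},
		{Name: "removed-cataloger", Type: "custom", Ecosystem: "other"},
	}
	discovered := []entry{
		{Name: "go-module-cataloger", Type: "generic", Selectors: []string{"go", "golang"}},
		{Name: "brand-new-cataloger", Type: "custom"},
	}
	merged, orphans := mergeEntries(existing, discovered)
	fmt.Printf("%+v\n%v\n", merged, orphans)
}
```

New catalogers come out with empty MANUAL fields, which is why a regeneration is normally followed by a manual edit.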
(discover_catalogers.go)

What it finds: catalogers using the generic.NewCataloger() pattern
Process:
- syft/pkg/cataloger/ recursively for .go files
- (go/ast, go/parser)
- New*Cataloger() pkg.Cataloger
- generic.NewCataloger(name, ...) call
- WithParserBy*() calls:
generic.NewCataloger("my-cataloger").
WithParserByGlobs(parseMyFormat, "**/*.myformat").
WithParserByMimeTypes(parseMyBinary, "application/x-mytype")
- (parseMyFormat)

Output: map[string]DiscoveredCataloger with full parser information
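As an illustration of this kind of AST walk, here is a minimal sketch using only the standard library. The `parserInfo` type and the `discover`, `walkChain`, and `stringLit` helpers are hypothetical names for this example and are not the contents of discover_catalogers.go. It only handles a `return generic.NewCataloger(...)` chain with string-literal arguments.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// parserInfo is a hypothetical record of one WithParserBy*() registration.
type parserInfo struct {
	Function string   // parser function name, e.g. parseMyFormat
	Method   string   // suffix of the WithParserBy* method, e.g. Globs
	Criteria []string // string literal arguments (globs, MIME types)
}

const sample = `package mynew

func NewMyCataloger() pkg.Cataloger {
	return generic.NewCataloger("my-cataloger").
		WithParserByGlobs(parseMyFormat, "**/*.myformat").
		WithParserByMimeTypes(parseMyBinary, "application/x-mytype")
}
`

// discover returns cataloger name -> parsers for each returned generic.NewCataloger chain.
func discover(src string) (map[string][]parserInfo, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "cataloger.go", src, 0)
	if err != nil {
		return nil, err
	}
	found := map[string][]parserInfo{}
	ast.Inspect(file, func(n ast.Node) bool {
		if ret, ok := n.(*ast.ReturnStmt); ok && len(ret.Results) == 1 {
			if name, parsers := walkChain(ret.Results[0]); name != "" {
				found[name] = parsers
			}
		}
		return true
	})
	return found, nil
}

// walkChain unwinds a method chain from the outermost call back to generic.NewCataloger.
func walkChain(expr ast.Expr) (string, []parserInfo) {
	call, ok := expr.(*ast.CallExpr)
	if !ok {
		return "", nil
	}
	sel, ok := call.Fun.(*ast.SelectorExpr)
	if !ok {
		return "", nil
	}
	// base case: generic.NewCataloger("name")
	if pkgIdent, ok := sel.X.(*ast.Ident); ok && pkgIdent.Name == "generic" && sel.Sel.Name == "NewCataloger" {
		if len(call.Args) == 0 {
			return "", nil
		}
		return stringLit(call.Args[0]), nil
	}
	name, parsers := walkChain(sel.X)
	if name == "" || !strings.HasPrefix(sel.Sel.Name, "WithParserBy") || len(call.Args) == 0 {
		return name, parsers
	}
	info := parserInfo{Method: strings.TrimPrefix(sel.Sel.Name, "WithParserBy")}
	if fn, ok := call.Args[0].(*ast.Ident); ok {
		info.Function = fn.Name
	}
	for _, arg := range call.Args[1:] {
		if s := stringLit(arg); s != "" {
			info.Criteria = append(info.Criteria, s)
		}
	}
	return name, append(parsers, info)
}

// stringLit returns the unquoted value of a string literal, or "" for anything else.
func stringLit(e ast.Expr) string {
	lit, ok := e.(*ast.BasicLit)
	if !ok || lit.Kind != token.STRING {
		return ""
	}
	s, err := strconv.Unquote(lit.Value)
	if err != nil {
		return ""
	}
	return s
}

func main() {
	found, err := discover(sample)
	fmt.Println(found, err)
}
```

Because only the syntax tree is inspected, the sample source does not need to compile or resolve its imports.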
(discover_cataloger_configs.go)

What it finds: cataloger configuration structs
Process:
- .go files in syft/pkg/cataloger/*/
- // app-config: key.name annotations in field comments
- (pkgcataloging.Config)

Example source:
type CatalogerConfig struct {
// SearchLocalModCacheLicenses enables searching for go package licenses
// in the local GOPATH mod cache.
// app-config: golang.search-local-mod-cache-licenses
SearchLocalModCacheLicenses bool
}
Output: map[string]ConfigInfo with field details and app-config keys
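The annotation extraction can be sketched as follows. This is an illustrative example built on `go/parser` with comments enabled; the `appConfigKeys` helper is a hypothetical name, and the real discovery code may collect more detail (such as descriptions).

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

const annotation = "app-config:"

const sample = `package golang

type CatalogerConfig struct {
	// SearchRemoteLicenses enables downloading go package licenses from the upstream
	// go proxy (typically proxy.golang.org).
	// app-config: golang.search-remote-licenses
	SearchRemoteLicenses bool

	// LocalModCacheDir specifies the location of the local go module cache directory.
	// app-config: golang.local-mod-cache-dir
	LocalModCacheDir string
}
`

// appConfigKeys maps struct field name -> annotated app-config key for the named struct.
func appConfigKeys(src, structName string) (map[string]string, error) {
	// ParseComments is required, otherwise field doc comments are dropped
	file, err := parser.ParseFile(token.NewFileSet(), "config.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	keys := map[string]string{}
	ast.Inspect(file, func(n ast.Node) bool {
		spec, ok := n.(*ast.TypeSpec)
		if !ok || spec.Name.Name != structName {
			return true
		}
		st, ok := spec.Type.(*ast.StructType)
		if !ok {
			return false
		}
		for _, field := range st.Fields.List {
			if field.Doc == nil || len(field.Names) == 0 {
				continue
			}
			// Doc.Text() strips the leading comment markers from each line
			for _, line := range strings.Split(field.Doc.Text(), "\n") {
				line = strings.TrimSpace(line)
				if strings.HasPrefix(line, annotation) {
					keys[field.Names[0].Name] = strings.TrimSpace(strings.TrimPrefix(line, annotation))
				}
			}
		}
		return false
	})
	return keys, nil
}

func main() {
	keys, err := appConfigKeys(sample, "CatalogerConfig")
	fmt.Println(keys, err)
}
```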
(discover_app_config.go)

What it finds: application-level configuration from the options package
Process:
- cmd/syft/internal/options/catalog.go to find Catalog struct
- (Golang golangConfig)
- (golang.go)
- DescribeFields() []FieldDescription method
- default*Config() function and extract default values
- (golang.search-local-mod-cache-licenses)

Example source:
// golang.go
type golangConfig struct {
SearchLocalModCacheLicenses bool `yaml:"search-local-mod-cache-licenses" ...`
}
func (c golangConfig) DescribeFields(opts ...options.DescribeFieldsOption) []options.FieldDescription {
return []options.FieldDescription{
{
Name: "search-local-mod-cache-licenses",
Description: "search for go package licences in the GOPATH...",
},
}
}
Output: []AppConfigField with keys, descriptions, and defaults
(cataloger_config_linking.go)

What it finds: which config struct each cataloger uses
Process:
- catalogerConfigOverrides map
- catalogerConfigExceptions set

Example:
// Constructor signature:
func NewGoModuleBinaryCataloger(cfg golang.CatalogerConfig) pkg.Cataloger
// Results in link:
"go-module-binary-cataloger" → "golang.CatalogerConfig"
Output: map[string]string (cataloger → config mapping)
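A rough sketch of signature-based linking, under two stated assumptions: constructors are named New*Cataloger, and a parameter counts as a config when its type name ends in "Config". Both heuristics and the `configByConstructor` helper are illustrative only. The sketch keys the result by constructor name; mapping that to the cataloger name would reuse the discovery data.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

const sample = `package golang

func NewGoModuleBinaryCataloger(cfg CatalogerConfig) pkg.Cataloger { return nil }

func NewPlainCataloger() pkg.Cataloger { return nil }
`

// configByConstructor maps constructor name -> qualified config type name.
func configByConstructor(src string) (map[string]string, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "cataloger.go", src, 0)
	if err != nil {
		return nil, err
	}
	links := map[string]string{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil {
			continue // skip non-functions and methods
		}
		if !strings.HasPrefix(fn.Name.Name, "New") || !strings.HasSuffix(fn.Name.Name, "Cataloger") {
			continue
		}
		for _, param := range fn.Type.Params.List {
			if name := typeName(file.Name.Name, param.Type); strings.HasSuffix(name, "Config") {
				links[fn.Name.Name] = name
				break
			}
		}
	}
	return links, nil
}

// typeName renders a parameter type as "pkg.Type"; unqualified types are
// assumed to live in the file's own package.
func typeName(pkgName string, expr ast.Expr) string {
	switch t := expr.(type) {
	case *ast.Ident:
		return pkgName + "." + t.Name
	case *ast.SelectorExpr:
		if x, ok := t.X.(*ast.Ident); ok {
			return x.Name + "." + t.Sel.Name
		}
	}
	return ""
}

func main() {
	links, err := configByConstructor(sample)
	fmt.Println(links, err)
}
```

Constructors that take no config simply produce no link, which is where an explicit override or exception becomes necessary.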
(discover_metadata.go)

What it finds: metadata types and package types each parser produces
Process:
- testdata/test-observations.json files

{
"package": "golang",
"parsers": {
"parseGoMod": {
"metadata_types": ["pkg.GolangModuleEntry"],
"package_types": ["go-module"]
}
},
"catalogers": {
"linux-kernel-cataloger": {
"metadata_types": ["pkg.LinuxKernel"],
"package_types": ["linux-kernel"]
}
}
}
- packagemetadata registry

Why this exists: the AST parser can't determine what types a parser produces just by reading code. This information comes from test execution.
Output: populated MetadataTypes and PackageTypes on catalogers/parsers
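Reading an observations file needs nothing beyond `encoding/json`. The struct names below are assumptions made for this sketch; only the JSON keys are taken from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// typeObservation holds what one parser or cataloger was observed to produce.
type typeObservation struct {
	MetadataTypes []string `json:"metadata_types"`
	PackageTypes  []string `json:"package_types"`
}

// observations mirrors the layout of a test-observations.json file.
type observations struct {
	Package    string                     `json:"package"`
	Parsers    map[string]typeObservation `json:"parsers"`
	Catalogers map[string]typeObservation `json:"catalogers"`
}

const sample = `{
  "package": "golang",
  "parsers": {
    "parseGoMod": {
      "metadata_types": ["pkg.GolangModuleEntry"],
      "package_types": ["go-module"]
    }
  },
  "catalogers": {
    "linux-kernel-cataloger": {
      "metadata_types": ["pkg.LinuxKernel"],
      "package_types": ["linux-kernel"]
    }
  }
}`

// parseObservations decodes the raw JSON document.
func parseObservations(data []byte) (observations, error) {
	var obs observations
	err := json.Unmarshal(data, &obs)
	return obs, err
}

// allMetadataTypes returns the sorted, de-duplicated metadata types seen
// across both parsers and catalogers.
func allMetadataTypes(obs observations) []string {
	unique := map[string]struct{}{}
	for _, group := range []map[string]typeObservation{obs.Parsers, obs.Catalogers} {
		for _, o := range group {
			for _, t := range o.MetadataTypes {
				unique[t] = struct{}{}
			}
		}
	}
	types := make([]string, 0, len(unique))
	for t := range unique {
		types = append(types, t)
	}
	sort.Strings(types)
	return types
}

func main() {
	obs, err := parseObservations([]byte(sample))
	fmt.Println(allMetadataTypes(obs), err)
}
```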
(syft/pkg/cataloger/*/cataloger.go)

What's extracted:
Example:
func NewGoModuleBinaryCataloger() pkg.Cataloger {
return generic.NewCataloger("go-module-binary-cataloger").
WithParserByGlobs(parseGoBin, "**/go.mod").
WithParserByMimeTypes(parseGoArchive, "application/x-archive")
}
(syft/pkg/cataloger/*/config.go)

What's extracted:
Example:
type CatalogerConfig struct {
// SearchRemoteLicenses enables downloading go package licenses from the upstream
// go proxy (typically proxy.golang.org).
// app-config: golang.search-remote-licenses
SearchRemoteLicenses bool
// LocalModCacheDir specifies the location of the local go module cache directory.
// When not set, syft will attempt to discover the GOPATH env or default to $HOME/go.
// app-config: golang.local-mod-cache-dir
LocalModCacheDir string
}
(cmd/syft/internal/options/)

What's extracted:
- DescribeFields() methods
- default*Config() functions

Example:
// catalog.go
type Catalog struct {
Golang golangConfig `yaml:"golang" json:"golang" mapstructure:"golang"`
}
// golang.go
func (c golangConfig) DescribeFields(opts ...options.DescribeFieldsOption) []options.FieldDescription {
return []options.FieldDescription{
{
Name: "search-remote-licenses",
Description: "search for go package licences by retrieving the package from a network proxy",
},
}
}
Location: syft/pkg/cataloger/*/testdata/test-observations.json
Purpose: records what metadata and package types each parser produces during test execution
How they're generated: automatically by the pkgtest.CatalogTester helpers when tests run
Example test code:
func TestGoModuleCataloger(t *testing.T) {
tester := NewGoModuleBinaryCataloger()
pkgtest.NewCatalogTester().
FromDirectory(t, "testdata/go-module-fixture").
TestCataloger(t, tester) // Auto-writes observations on first run
}
Example observations file:
{
"package": "golang",
"parsers": {
"parseGoMod": {
"metadata_types": ["pkg.GolangModuleEntry"],
"package_types": ["go-module"]
},
"parseGoSum": {
"metadata_types": ["pkg.GolangModuleEntry"],
"package_types": ["go-module"]
}
}
}
Why this exists:
- (TestAllCatalogersHaveObservations)

(allPackageCatalogerInfo())

What's extracted:
Example:
info := cataloger.CatalogerInfo{
Name: "go-module-binary-cataloger",
Selectors: []string{"go", "golang", "binary", "language", "package"},
}
(merge.go)

// catalogerTypeOverrides forces a specific cataloger type when discovery gets it wrong
var catalogerTypeOverrides = map[string]string{
"java-archive-cataloger": "custom", // technically generic but treated as custom
}
// catalogerConfigExceptions lists catalogers that should NOT have config linked
var catalogerConfigExceptions = strset.New(
"binary-classifier-cataloger",
)
// catalogerConfigOverrides manually specifies config when linking fails
var catalogerConfigOverrides = map[string]string{
"dotnet-portable-executable-cataloger": "dotnet.CatalogerConfig",
"nix-store-cataloger": "nix.Config",
}
When to update:
- catalogerTypeOverrides when a cataloger's type is misdetected
- catalogerConfigExceptions when a cataloger shouldn't have config
- catalogerConfigOverrides when automatic config linking fails

(*_test.go files)

// requireParserObservations controls observation validation strictness
// - true: fail if ANY parser is missing observations (strict)
// - false: only check custom catalogers (lenient, current mode)
const requireParserObservations = false
// metadataTypeCoverageExceptions lists metadata types allowed to not be documented
var metadataTypeCoverageExceptions = strset.New(
reflect.TypeOf(pkg.MicrosoftKbPatch{}).Name(),
)
// packageTypeCoverageExceptions lists package types allowed to not be documented
var packageTypeCoverageExceptions = strset.New(
string(pkg.JenkinsPluginPkg),
string(pkg.KbPkg),
)
// observationExceptions maps cataloger/parser names to observation types to skip
// - nil value: skip ALL observation checks for this cataloger/parser
// - set value: skip only specified observation types
var observationExceptions = map[string]*strset.Set{
"graalvm-native-image-cataloger": nil, // skip all checks
"linux-kernel-cataloger": strset.New("relationships"), // skip only relationships
}
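The nil-versus-set distinction is easy to get wrong, so here is a small sketch of how such a lookup could behave. It substitutes a plain `map[string]bool` for the strset type to stay within the standard library, and `shouldCheck` is a hypothetical helper rather than a function from the test suite.

```go
package main

import "fmt"

// exceptions mirrors the shape of observationExceptions:
// a nil inner map skips every check, a non-nil one skips only the listed types.
var exceptions = map[string]map[string]bool{
	"graalvm-native-image-cataloger": nil,
	"linux-kernel-cataloger":         {"relationships": true},
}

// shouldCheck reports whether the given observation type must be validated
// for the named cataloger or parser.
func shouldCheck(name, observationType string) bool {
	skipped, listed := exceptions[name]
	if !listed {
		return true // no exception registered
	}
	if skipped == nil {
		return false // nil means skip everything
	}
	return !skipped[observationType]
}

func main() {
	fmt.Println(shouldCheck("graalvm-native-image-cataloger", "metadata_types"))
	fmt.Println(shouldCheck("linux-kernel-cataloger", "relationships"))
	fmt.Println(shouldCheck("linux-kernel-cataloger", "metadata_types"))
}
```

The two-value map lookup matters here: a missing key and a key mapped to nil both yield a nil inner map, and only the second should disable checks.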
When to update:
- observationExceptions when a cataloger lacks reliable test fixtures
- requireParserObservations = true when ready to enforce full parser coverage

Completeness tests ensure capability YAML files stay in perfect sync with the codebase. These tests catch:
Completeness tests are distributed across multiple packages:
- internal/capabilities/
- internal/capabilities/internal/
- internal/capabilities/generate/

Guard Clause: All completeness tests are protected by checkCompletenessTestsEnabled(), which checks the SYFT_ENABLE_COMPLETENESS_TESTS=true environment variable.
Why the guard? There's a chicken-and-egg problem:
- (go test ./syft/pkg/...) must run first to generate observation JSON files

TestCatalogersInSync
- syft cataloger list appear in YAML

Failure means: you added/removed a cataloger but didn't regenerate the capability YAML files
Fix: run go generate ./internal/capabilities
TestCapabilitiesAreUpToDate
Failure means: capability YAML files weren't regenerated after code changes
Fix: run go generate ./internal/capabilities and commit changes
TestPackageTypeCoverage
- pkg.AllPkgs are documented in some cataloger
- packageTypeCoverageExceptions

Failure means: you added a new package type but no cataloger documents it
Fix: either add a cataloger entry or add to exceptions if intentionally not supported
TestMetadataTypeCoverage
- packagemetadata.AllTypes() are documented
- metadataTypeCoverageExceptions

Failure means: you added a new metadata type but no cataloger produces it
Fix: either add metadata_types to a cataloger or add to exceptions
TestMetadataTypesHaveJSONSchemaTypes
Failure means: metadata_types and json_schema_types are out of sync
Fix: run go generate ./internal/capabilities to regenerate synchronized types
TestCatalogerStructure
Failure means: cataloger structure doesn't follow conventions
Fix: correct the cataloger structure in the appropriate syft/pkg/cataloger/*/capabilities.yaml file
TestCatalogerDataQuality
Failure means: data integrity issue in capability YAML files
Fix: remove duplicates or fix detector definitions
TestConfigCompleteness
- configs: section are referenced by a cataloger
- application: section

Failure means: orphaned config or broken reference
Fix: remove unused configs or add missing entries
TestAppConfigFieldsHaveDescriptions
Failure means: missing DescribeFields() entry
Fix: add description in the ecosystem's DescribeFields() method
TestAppConfigKeyFormat
- ecosystem.field-name

Failure means: malformed config key
Fix: rename the config key to follow conventions
TestCapabilityConfigFieldReferences
Example failure:
capabilities:
- name: license
conditions:
- when: {NonExistentField: true} # ← this field doesn't exist in config struct
value: true
Fix: correct the field name to match the actual config struct
TestCapabilityFieldNaming
- license
- dependency.depth
- dependency.edges
- dependency.kinds
- package_manager.files.listing
- package_manager.files.digests
- package_manager.package_integrity_hash

Failure means: typo in capability field name
Fix: correct the typo or add new field to known list
TestCapabilityValueTypes
- license, package_manager.*
- dependency.depth, dependency.kinds
- dependency.edges

Example failure:
capabilities:
- name: license
default: "yes" # ← should be boolean true/false
Fix: use correct type for the field
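A type check along these lines could look like the sketch below. It assumes booleans for license and package_manager.*, string arrays for dependency.depth and dependency.kinds, and a string for dependency.edges. `validateValue` is a hypothetical helper, and it takes `[]string` for arrays even though a YAML decoder would typically hand back `[]any`.

```go
package main

import (
	"fmt"
	"strings"
)

// validateValue checks that a capability value has the type expected for its field name.
func validateValue(name string, value any) error {
	switch {
	case name == "license" || strings.HasPrefix(name, "package_manager."):
		if _, ok := value.(bool); !ok {
			return fmt.Errorf("%s: expected boolean, got %T", name, value)
		}
	case name == "dependency.depth" || name == "dependency.kinds":
		if _, ok := value.([]string); !ok {
			return fmt.Errorf("%s: expected array of strings, got %T", name, value)
		}
	case name == "dependency.edges":
		if _, ok := value.(string); !ok {
			return fmt.Errorf("%s: expected string, got %T", name, value)
		}
	default:
		return fmt.Errorf("%s: unknown capability field", name)
	}
	return nil
}

func main() {
	fmt.Println(validateValue("license", "yes"))
	fmt.Println(validateValue("license", true))
	fmt.Println(validateValue("dependency.depth", []string{"direct", "indirect"}))
}
```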
TestCapabilityEvidenceFieldReferences
Example:
capabilities:
- name: package_manager.files.digests
default: true
evidence:
- AlpmDBEntry.Files[].Digests # ← validates this path exists
Failure means: typo in evidence reference or struct was changed
Fix: correct the evidence reference or update after struct changes
TestCatalogersHaveTestObservations
- (requireParserObservations)
- observationExceptions

Failure means: cataloger tests aren't using pkgtest helpers
Fix: update tests to use pkgtest.CatalogTester:
pkgtest.NewCatalogTester().
FromDirectory(t, "testdata/my-fixture").
TestCataloger(t, myCataloger)
- go generate ./internal/capabilities

| Failure | Most Likely Cause | Fix |
|---|---|---|
| Cataloger not in YAML | Added new cataloger | Regenerate |
| Orphaned YAML entry | Removed cataloger | Regenerate |
| Missing metadata type | Added type but no test observations | Add pkgtest usage or exception |
| Missing observations | Test not using pkgtest | Update test to use CatalogTester |
| Config field reference | Typo in capability condition | Fix field name in capability YAML |
| Incomplete capabilities | Missing capability definition | Add capabilities section to capability YAML |
These fields in capability YAML files are MANUAL and must be maintained by hand:
catalogers:
- ecosystem: golang # MANUAL - identify the ecosystem
Guidelines: use the ecosystem/language name (golang, python, java, rust, etc.)
For Generic Catalogers (parser level):
parsers:
- function: parseGoMod
capabilities: # MANUAL
- name: license
default: false
conditions:
- when: {SearchRemoteLicenses: true}
value: true
comment: fetches licenses from proxy.golang.org
- name: dependency.depth
default: [direct, indirect]
- name: dependency.edges
default: complete
For Custom Catalogers (cataloger level):
catalogers:
- name: linux-kernel-cataloger
type: custom
capabilities: # MANUAL
- name: license
default: true
For most custom catalogers:
detectors: # MANUAL
- method: glob
criteria:
- '**/lib/modules/**/modules.builtin'
comment: kernel modules directory
Exception: binary-classifier-cataloger has AUTO-GENERATED detectors extracted from source
When a detector should only be active with certain configuration:
detectors:
- method: glob
criteria: ['**/*.zip']
conditions: # MANUAL
- when: {IncludeZipFiles: true}
comment: ZIP detection requires explicit config
Boolean Fields:
- name: license
default: true # always available
# OR
default: false # never available
# OR
default: false
conditions:
- when: {SearchRemoteLicenses: true}
value: true
comment: requires network access to fetch licenses
Array Fields (dependency.depth):
- name: dependency.depth
default: [direct] # only immediate dependencies
# OR
default: [direct, indirect] # full transitive closure
# OR
default: [] # no dependency information
String Fields (dependency.edges):
- name: dependency.edges
default: "" # dependencies found but no edges between them
# OR
default: flat # single level of dependencies with edges to root only
# OR
default: reduced # transitive reduction (redundant edges removed)
# OR
default: complete # all relationships with accurate direct/indirect edges
Array Fields (dependency.kinds):
- name: dependency.kinds
default: [runtime] # production dependencies only
# OR
default: [runtime, dev] # production and development
# OR
default: [runtime, dev, build, test] # all dependency types
Conditions allow capabilities to vary based on configuration values:
capabilities:
- name: license
default: false
conditions:
- when: {SearchLocalModCacheLicenses: true}
value: true
comment: searches for licenses in GOPATH mod cache
- when: {SearchRemoteLicenses: true}
value: true
comment: fetches licenses from proxy.golang.org
comment: license scanning requires configuration
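One way to read the example above is as a small rule evaluator. The sketch below assumes conditions are checked in order, that every field in a `when` clause must match, and that the default applies when nothing matches. The `capability` and `condition` types are illustrative and not the generator's own.

```go
package main

import (
	"fmt"
	"reflect"
)

// condition overrides the default when every field in When matches the config.
type condition struct {
	When  map[string]any
	Value any
}

// capability is a simplified view of one entry in a capabilities list.
type capability struct {
	Name       string
	Default    any
	Conditions []condition
}

// matches reports whether all fields in when are present in cfg with equal values.
func matches(when, cfg map[string]any) bool {
	for key, want := range when {
		got, ok := cfg[key]
		if !ok || !reflect.DeepEqual(got, want) {
			return false
		}
	}
	return true
}

// resolve returns the value of the first matching condition, or the default.
func resolve(c capability, cfg map[string]any) any {
	for _, cond := range c.Conditions {
		if matches(cond.When, cfg) {
			return cond.Value
		}
	}
	return c.Default
}

func main() {
	license := capability{
		Name:    "license",
		Default: false,
		Conditions: []condition{
			{When: map[string]any{"SearchLocalModCacheLicenses": true}, Value: true},
			{When: map[string]any{"SearchRemoteLicenses": true}, Value: true},
		},
	}
	fmt.Println(resolve(license, map[string]any{}))
	fmt.Println(resolve(license, map[string]any{"SearchRemoteLicenses": true}))
}
```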
Rules:
- when clause use AND logic (all must match)
- default value is used

evidence documents which struct fields provide the capability:
- name: package_manager.files.listing
default: true
evidence:
- AlpmDBEntry.Files
comment: file listings stored in Files array
For nested fields:
evidence:
- CondaMetaPackage.PathsData.Paths
For array element fields:
evidence:
- AlpmDBEntry.Files[].Digests
- SYFT_ENABLE_COMPLETENESS_TESTS=true go test ./internal/capabilities/... to validate

generic.NewCataloger():

What happens automatically:
- syft/pkg/cataloger/*/capabilities.yaml file

What you must do manually:
- ecosystem field in the capability YAML file
- capabilities sections to each parser
- go generate ./internal/capabilities

Example workflow:
# 1. Write cataloger code
vim syft/pkg/cataloger/mynew/cataloger.go
# 2. Write tests using pkgtest (generates observations)
vim syft/pkg/cataloger/mynew/cataloger_test.go
# 3. Run tests to generate observations
go test ./syft/pkg/cataloger/mynew
# 4. Regenerate capability YAML files
go generate ./internal/capabilities
# 5. Edit the capability file manually
vim syft/pkg/cataloger/mynew/capabilities.yaml
# - Set ecosystem field
# - Add capabilities sections
# 6. Validate
SYFT_ENABLE_COMPLETENESS_TESTS=true go test ./internal/capabilities/...
# 7. Commit
git add syft/pkg/cataloger/mynew/capabilities.yaml
git add syft/pkg/cataloger/mynew/testdata/test-observations.json
git commit
What happens automatically:
What you must do manually:
- ecosystem
- detectors array with detection methods
- capabilities section (cataloger level, not parser level)
- go generate ./internal/capabilities

Impact: AUTO-GENERATED field, automatically updated
Workflow:
# 1. Change the code
vim syft/pkg/cataloger/something/cataloger.go
# 2. Regenerate
go generate ./internal/capabilities
# 3. Review changes
git diff syft/pkg/cataloger/*/capabilities.yaml
# 4. Commit
git add syft/pkg/cataloger/*/capabilities.yaml
git commit
Impact: AUTO-GENERATED field, updated via test observations
Workflow:
# 1. Change the code
vim syft/pkg/cataloger/something/parser.go
# 2. Update tests (if needed)
vim syft/pkg/cataloger/something/parser_test.go
# 3. Run tests to update observations
go test ./syft/pkg/cataloger/something
# 4. Regenerate
go generate ./internal/capabilities
# 5. Commit
git add syft/pkg/cataloger/*/capabilities.yaml
git add syft/pkg/cataloger/something/testdata/test-observations.json
git commit
Impact: MANUAL field, preserved across regeneration
Workflow:
# 1. Edit the capability file directly
vim syft/pkg/cataloger/something/capabilities.yaml
# 2. Validate
SYFT_ENABLE_COMPLETENESS_TESTS=true go test ./internal/capabilities/...
# 3. Commit
git commit syft/pkg/cataloger/something/capabilities.yaml
If you need to add a completely new capability field (e.g., package_manager.build_tool_info):
Steps:
- syft/pkg/cataloger/*/capabilities.yaml files

catalogerTypeOverrides:

catalogerConfigExceptions:

catalogerConfigOverrides:

metadataTypeCoverageExceptions:
- MicrosoftKbPatch (special case type)

packageTypeCoverageExceptions:
- JenkinsPluginPkg, KbPkg

observationExceptions:
- graalvm-native-image-cataloger (requires native images)

(internal/capabilities/generate/)
- main.go: entry point, orchestrates regeneration, prints status messages
- merge.go: core merging logic, preserves manual sections while updating auto-generated
- io.go: YAML reading/writing with comment preservation using gopkg.in/yaml.v3

(internal/capabilities/generate/)
- discover_catalogers.go: AST parsing to discover generic catalogers and parsers from source code
- discover_cataloger_configs.go: AST parsing to discover cataloger config structs
- discover_app_config.go: AST parsing to discover application-level config from options package
- cataloger_config_linking.go: links catalogers to config structs by analyzing constructors
- discover_metadata.go: reads test-observations.json files to get metadata/package types

(internal/capabilities/internal/)
- paths.go: single source of truth for all capability file paths
- load_capabilities.go: loading and parsing capability YAML files
- cataloger_names.go: helper to get all cataloger names from syft task factories
- repo_root.go: helpers to find the repository root
- metadata_check.go: validates metadata and package type coverage

Tests are distributed across packages with checkCompletenessTestsEnabled() guard:
- internal/capabilities/*_test.go
- internal/capabilities/internal/*_test.go
- internal/capabilities/generate/*_test.go
- discover_catalogers_test.go: tests for cataloger discovery
- discover_cataloger_configs_test.go: tests for config discovery
- cataloger_config_linking_test.go: tests for config linking
- merge_test.go: tests for merge logic
- io_test.go: tests for YAML I/O

Cause: you added a new cataloger but didn't regenerate the capability YAML files
Fix:
go generate ./internal/capabilities
Cause: you removed a cataloger but didn't regenerate
Fix:
go generate ./internal/capabilities
# Review the diff - the cataloger entry should be removed
Cause: you added a new metadata type but:
Fix Option 1 - Add test observations:
// Update test to use pkgtest
pkgtest.NewCatalogTester().
FromDirectory(t, "testdata/my-fixture").
TestCataloger(t, myCataloger)
// Then run the tests to record observations:
//   go test ./syft/pkg/cataloger/mypackage
// And regenerate:
//   go generate ./internal/capabilities
Fix Option 2 - Add exception (if intentionally unused):
// in metadata_check.go
var metadataTypeCoverageExceptions = strset.New(
	reflect.TypeOf(pkg.MyNewType{}).Name(),
)
Cause: test doesn't use pkgtest helpers
Fix:
// Before:
func TestMyParser(t *testing.T) {
// manual test code
}
// After:
func TestMyParser(t *testing.T) {
cataloger := NewMyCataloger()
pkgtest.NewCatalogTester().
FromDirectory(t, "testdata/my-fixture").
TestCataloger(t, cataloger)
}
Cause: capability condition references a non-existent config field
Fix: edit the capability YAML file and correct the field name:
# Before:
conditions:
- when: {SerachRemoteLicenses: true} # typo!
# After:
conditions:
- when: {SearchRemoteLicenses: true}
Cause:
Fix: edit the capability YAML file and correct the evidence reference:
# Before:
evidence:
- AlpmDBEntry.FileListing # wrong field name
# After:
evidence:
- AlpmDBEntry.Files
Cause: capability YAML files are out of date (usually caught in CI)
Fix:
go generate ./internal/capabilities
git add syft/pkg/cataloger/*/capabilities.yaml
git commit -m "chore: regenerate capabilities"
Cause: config linking trying to link to a non-existent struct
Fix Option 1 - Add override:
// merge.go
var catalogerConfigOverrides = map[string]string{
"my-cataloger": "mypackage.MyConfig",
}
Fix Option 2 - Add exception:
// merge.go
var catalogerConfigExceptions = strset.New(
"my-cataloger", // doesn't use config
)
Cause: parser in capability YAML file has no capabilities section
Fix: add capabilities to the parser:
parsers:
- function: parseMyFormat
capabilities:
- name: license
default: false
- name: dependency.depth
default: []
# ... (add all required capability fields)
Most test failures include detailed guidance. Look for:
General debugging approach:
If you encounter problems not covered here: