# Diagnostic Analysis
The diagnostic analysis system computes diagnostics (errors, warnings, and informational messages) from analyzers for documents and projects within a solution. The system evolved from a complex "solution crawler" architecture to a snapshot-based model where callers request diagnostics for a specific immutable solution snapshot and receive accurate results for that snapshot.
Core principle: Given an immutable snapshot of a document or project, compute and return the correct diagnostics for that snapshot. The system handles caching and optimization internally as implementation details that preserve correctness guarantees.
Two distinct higher-level systems consume this diagnostic analysis service:
Live Diagnostics System: Drives squiggles, error list, and real-time feedback during editing. The LSP pull
diagnostics model serves as the primary mechanism for delivering diagnostics to VS features. The LSP layer maintains
its own cache of diagnostic results and invokes IDiagnosticAnalyzerService only when changes would invalidate its
cache. The LSP layer's diagnostics drive most diagnostics features.
Explicit "Run Code Analysis": A user-invoked feature that computes all diagnostics for a project and displays the cached results until they are cleared or another analysis is initiated. This resembles running a build from within VS itself, but retrieving only the analysis results.
Other Diagnostic Analysis Clients: Beyond the LSP-based and explicit analysis systems, several other features
directly consume IDiagnosticAnalyzerService:
This list is not exhaustive—other features and systems may also directly consume diagnostic analysis as needed. Ideally, all of these scenarios would flow through LSP in the future for consistency, but the system is not yet at that stage.
For features requiring lower-level access, this system provides a consistent API with well-defined behavior.
The primary interface IDiagnosticAnalyzerService provides methods for requesting diagnostics at different scopes:
```csharp
Task<ImmutableArray<DiagnosticData>> GetDiagnosticsForSpanAsync(
    TextDocument document, TextSpan? range, ...);

Task<ImmutableArray<DiagnosticData>> GetDiagnosticsForIdsAsync(
    Project project, ImmutableArray<DocumentId> documentIds, ...);

Task<ImmutableArray<DiagnosticData>> GetProjectDiagnosticsForIdsAsync(
    Project project, ...);
```
Method Selection Guide:
GetDiagnosticsForSpanAsync: Use when you need diagnostics for a specific span or document. Returns only local
diagnostics—those produced by analyzing the requested document in isolation. This is ideal for lightbulbs, error
squiggles, and code fixes where you're working within a single document. Non-local diagnostics (reported at compilation
end or from other files) are not included.
GetDiagnosticsForIdsAsync: Use when you need comprehensive document diagnostics for a project, including non-local
diagnostics. Returns diagnostics from all documents, including those reported at compilation end or from different files.
This is expensive as it requires running analyzers fully through compilation-end. Use this for error list population or
when you need the complete diagnostic picture for one or more documents.
GetProjectDiagnosticsForIdsAsync: Use when you need only project-level diagnostics—diagnostics with no source
location (not tied to any specific document). Does not return document diagnostics. Use this in conjunction with
GetDiagnosticsForIdsAsync when you need both document and project diagnostics, or alone when only project diagnostics
are required.
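The guidance above can be sketched as follows. This is a hedged illustration only: the parameter lists are simplified relative to the real interface (whose additional filtering arguments are elided as `...` in the signatures above), and the helper name is invented.

```csharp
using System.Collections.Immutable;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;

// Hypothetical caller: gather the complete diagnostic picture for one project.
static async Task<ImmutableArray<DiagnosticData>> GetCompleteDiagnosticsAsync(
    IDiagnosticAnalyzerService service, Project project, CancellationToken cancellationToken)
{
    // Document diagnostics for every document, including non-local ones
    // (compilation-end and cross-file).
    var documentDiagnostics = await service.GetDiagnosticsForIdsAsync(
        project, project.DocumentIds.ToImmutableArray(), cancellationToken);

    // Project-level diagnostics (no source location); not returned by the call above.
    var projectDiagnostics = await service.GetProjectDiagnosticsForIdsAsync(
        project, cancellationToken);

    return documentDiagnostics.AddRange(projectDiagnostics);
}
```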
Understanding Non-Local Diagnostics:
Non-local diagnostics are diagnostics that cannot be determined by analyzing a single document in isolation. These diagnostics are reported by analyzers in one of two ways:
Compilation-end diagnostics: Diagnostics reported during the compilation-end analysis phase, after all documents have been analyzed. These diagnostics may require information from the entire compilation to be computed correctly. For example, an analyzer might report unused type parameters only after analyzing all usages across all files in the compilation.
Cross-file diagnostics: Diagnostics reported in a different file than where the analyzer callback was registered. For example, an analyzer registered on a method in File A might report a diagnostic on a caller of that method in File B.
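As a concrete illustration of the compilation-end case, the sketch below shows an analyzer that can only report after seeing every document. The rule, its ID, and the logic are invented for illustration; only the registration APIs (`RegisterCompilationStartAction`, `RegisterCompilationEndAction`, etc.) are real Roslyn APIs.

```csharp
using System.Collections.Concurrent;
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.CodeAnalysis.Operations;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class UnusedInternalTypeAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor s_rule = new(
        "DEMO001", "Internal type is never instantiated",
        "Internal type '{0}' is never instantiated", "Usage",
        DiagnosticSeverity.Warning, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(s_rule);

    public override void Initialize(AnalysisContext context)
    {
        context.EnableConcurrentExecution();
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        context.RegisterCompilationStartAction(start =>
        {
            var declared = new ConcurrentBag<INamedTypeSymbol>();
            var instantiated = new ConcurrentBag<INamedTypeSymbol>();

            start.RegisterSymbolAction(
                ctx =>
                {
                    if (ctx.Symbol is INamedTypeSymbol { DeclaredAccessibility: Accessibility.Internal } type)
                        declared.Add(type);
                },
                SymbolKind.NamedType);

            start.RegisterOperationAction(
                ctx =>
                {
                    if (ctx.Operation is IObjectCreationOperation { Type: INamedTypeSymbol type })
                        instantiated.Add(type);
                },
                OperationKind.ObjectCreation);

            // Correctness requires having seen every document in the compilation,
            // so the report can only happen here: a non-local diagnostic.
            start.RegisterCompilationEndAction(end =>
            {
                var used = instantiated.ToImmutableHashSet(SymbolEqualityComparer.Default);
                foreach (var type in declared)
                {
                    if (!used.Contains(type))
                        end.ReportDiagnostic(Diagnostic.Create(s_rule, type.Locations[0], type.Name));
                }
            });
        });
    }
}
```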
Because non-local diagnostics require analyzing the entire project to ensure completeness, they are excluded from
GetDiagnosticsForSpanAsync to keep that API fast and suitable for real-time features like error squiggles and
lightbulbs. To obtain non-local diagnostics for a document, you must use GetDiagnosticsForIdsAsync, which runs the
full compilation analysis.
These methods are the only entry points features should use. Upon invocation, the transition to out-of-process (OOP) execution occurs exactly once, at the interface boundary.
The system handles two fundamentally different analyzer types:
Standard Roslyn DiagnosticAnalyzers that participate in compilation analysis. Execution occurs through the compiler's
CompilationWithAnalyzers infrastructure, providing:
This path serves C# and VB analyzers that analyze code using semantic information.
Specialized analyzers operating outside the compilation model, used by languages lacking Roslyn compilations (F#, XAML,
TypeScript). These analyzers implement DocumentDiagnosticAnalyzer and execute directly without
CompilationWithAnalyzers, receiving a TextDocument for independent analysis.
The system distinguishes between two analyzer sources:
- **Host analyzers**: respect `.editorconfig` settings when available, falling back to IDE settings (Tools > Options) when no explicit `.editorconfig` setting exists. Behavior may differ across users in the absence of `.editorconfig` settings (since fallback IDE settings are per-user).
- **Project analyzers** (e.g., referenced via `<PackageReference>`): respect only `.editorconfig` settings without any fallback to IDE settings. Behavior is consistent across users when `.editorconfig` settings are present (when absent, analyzers receive default values rather than user-specific IDE settings).

This distinction enables isolation and versioning. Project analyzers from different projects must load simultaneously, even when representing different versions of the same assemblies. The different option resolution strategies ensure project analyzers provide consistent, reproducible behavior across all build environments and users, while host analyzers can leverage user-specific IDE preferences for workspace-wide tooling.
Due to out-of-process execution in a .NET Core environment, the system loads project analyzers into isolated Assembly Load Contexts (ALCs) enabling:
The IsolatedAnalyzerReferenceSet class manages this isolation:
1. Compute checksum for analyzer reference set
2. Verify if isolated set exists for this checksum
3. If absent, instantiate new ALC with shadow-copy loader
4. Load all analyzer assemblies into this ALC
5. Cache isolated references by checksum
6. Return IsolatedAnalyzerFileReference wrappers
The system maintains a "current" isolated set, adding new analyzers provided no MVID conflicts exist. Upon conflict detection (analyzer DLL modified on disk), a new isolated set is instantiated. Previous sets persist while any analyzer, generator, or diagnostic from them remains referenced. Once all references are released, the ALC cleanly unloads itself, removing its code and associated burden from the .NET runtime.
Key insight: The checksum encompasses the complete analyzer assembly closure and their MVIDs. Projects with identical analyzer references (same packages, same versions) share the identical isolated set and ALC, eliminating redundant loading.
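A minimal sketch of this checksum-keyed isolation, assuming a .NET Core host. The names and structure are illustrative, not the actual `IsolatedAnalyzerReferenceSet` implementation: the real system also uses a shadow-copy loader, and it holds sets in a way that permits unloading, whereas the strong dictionary here would pin the ALCs alive.

```csharp
using System.Collections.Generic;
using System.Runtime.Loader;

static readonly Dictionary<string, AssemblyLoadContext> s_isolatedSets = new();

static AssemblyLoadContext GetOrCreateIsolatedSet(string checksum, IEnumerable<string> assemblyPaths)
{
    // Identical analyzer closures (same assemblies, same MVIDs) produce the same
    // checksum and therefore share one ALC, eliminating redundant loading.
    if (s_isolatedSets.TryGetValue(checksum, out var existing))
        return existing;

    // Collectible: the ALC unloads once nothing loaded from it remains referenced.
    var alc = new AssemblyLoadContext($"AnalyzerALC-{checksum}", isCollectible: true);
    foreach (var path in assemblyPaths)
        alc.LoadFromAssemblyPath(path);

    s_isolatedSets[checksum] = alc;
    return alc;
}
```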
For compilation-based analyzers, the system maintains a cache of CompilationWithAnalyzers instances:
```csharp
ConditionalWeakTable<
    Project,
    SmallDictionary<
        ImmutableArray<DiagnosticAnalyzer>,
        AsyncLazy<CompilationWithAnalyzers?>>>
```
This structure provides:
Outer key (Project): Cache lifetime bound to a specific Project instance.
Inner key (Analyzer Array): Multiple analyzer sets may be cached per project, handling scenarios such as:
SmallDictionary is employed because typically only 1-2 entries exist. The overwhelming majority of cases contain
exactly one entry: the complete analyzer set for the project. In a small number of cases this expands: for example,
when a user invokes Fix All for a specific analyzer across a solution, only that analyzer executes on each
project. This scenario justifies the map's existence: when executing Fix All for a particular analyzer, re-running all
analyzers across all projects would be prohibitively expensive.
Lifetime: The ConditionalWeakTable ensures cache lifetime matches the Project instance. Upon release of all
Project references, the cache is collected. No explicit invalidation occurs.
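The shape of this lookup can be sketched as follows. `AsyncLazy` and `SmallDictionary` are internal Roslyn types, modeled here with `Lazy<Task<...>>` and a plain `Dictionary`; `CreateCompilationWithAnalyzersAsync` is a hypothetical factory.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

static readonly ConditionalWeakTable<
    Project,
    Dictionary<ImmutableArray<DiagnosticAnalyzer>, Lazy<Task<CompilationWithAnalyzers?>>>> s_cache = new();

static Task<CompilationWithAnalyzers?> GetOrCreateAsync(
    Project project, ImmutableArray<DiagnosticAnalyzer> analyzers)
{
    // The per-project entry dies with the Project instance; no explicit invalidation.
    var perProject = s_cache.GetOrCreateValue(project);
    lock (perProject)
    {
        if (!perProject.TryGetValue(analyzers, out var lazy))
        {
            lazy = new(() => CreateCompilationWithAnalyzersAsync(project, analyzers));
            perProject[analyzers] = lazy;
        }
        return lazy.Value;
    }
}
```

Note that an `ImmutableArray<T>` dictionary key compares by underlying array identity, which suffices when callers reuse the same analyzer array instance for a given request shape.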
Creation: When instantiating CompilationWithAnalyzers, the system:
- Excludes `DocumentDiagnosticAnalyzer`s (they do not participate in compilation analysis)

When features request diagnostics for a document or span (e.g., lightbulbs, error squiggles):
```text
GetDiagnosticsForSpanAsync
  ↓
Attempt OOP execution (if available)
  ↓
GetDiagnosticsForSpanInProcessAsync
  ↓
Collect all project analyzers
Filter analyzers by:
  - Priority (High/Normal/Low for lightbulbs)
  - Diagnostic kind (Syntax/Semantic/Compiler/Analyzer)
  - Analysis kind support (Syntax/Semantic)
  - Span-based analysis capability
  ↓
Partition into three sets:
  - syntaxAnalyzers
  - semanticSpanAnalyzers (support span-based semantic analysis)
  - semanticDocumentAnalyzers (require full document)
  ↓
ComputeDiagnosticsInProcessAsync
  ↓
For each analyzer set:
  - Locate or instantiate CompilationWithAnalyzers
  - Create DocumentAnalysisExecutor
  - Compute diagnostics via executor
  - Optionally employ incremental member edit analysis
  ↓
Merge and filter results by requested span
```
For the user-invoked "Run Code Analysis" feature:
```text
ForceRunCodeAnalysisDiagnosticsAsync
  ↓
Attempt OOP execution (if available)
  ↓
ForceRunCodeAnalysisDiagnosticsInProcessAsync
  ↓
Retrieve all project analyzers
Filter by effective severity (exclude if all descriptors hidden)
Include compiler analyzer, suppressors, built-ins
  ↓
Parallel execution:
  - Document diagnostics for all documents
  - Project diagnostics (compilation-level)
  ↓
Merge results
```
This feature computes all diagnostics and displays cached results until cleared or another "Run Code Analysis" phase is initiated.
The DocumentAnalysisExecutor class orchestrates analysis for a specific document and analyzer set:
```csharp
public sealed partial class DocumentAnalysisExecutor
{
    private readonly DocumentAnalysisScope _analysisScope;
    private readonly CompilationWithAnalyzers? _compilationWithAnalyzers;

    // Cached results preventing recomputation
    private ImmutableDictionary<DiagnosticAnalyzer, DiagnosticAnalysisResult>? _lazySyntaxDiagnostics;
    private ImmutableDictionary<DiagnosticAnalyzer, DiagnosticAnalysisResult>? _lazySemanticDiagnostics;
}
```
The executor handles special treatment of the compiler analyzer versus other analyzers:
Compiler Analyzer:
Other Analyzers:
- Results are cached in `_lazySyntaxDiagnostics` or `_lazySemanticDiagnostics`

For span-based requests, the executor adjusts the span for compiler diagnostics to encompass complete member declarations, accommodating historical compiler API limitations.
When analyzing complete documents (no span), the system optimizes for the common case of typing within a method body:
```csharp
internal sealed partial class IncrementalMemberEditAnalyzer
{
    private WeakReference<Document?> _lastDocumentWithCachedDiagnostics;
    private MemberSpans _savedMemberSpans; // Document ID + version + member spans
}
```
Conditions for incremental analysis:
- All applicable analyzers support span-based semantic analysis (`SupportsSpanBasedSemanticDiagnosticAnalysis()`)

Execution flow when triggered:
1. Detect single member modification (via IDocumentDifferenceService)
2. Retrieve cached member spans from previous document version
3. Obtain cached diagnostics from previous version
4. Re-analyze only the modified member
5. Merge:
   - New diagnostics for modified member
   - Adjusted previous diagnostics for unchanged members (span updates for edits)
6. Update cache for subsequent iteration
Fallback: Upon condition failure (multiple members modified, initial analysis, version mismatch), the system falls back to complete document analysis.
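The span adjustment in the merge step above can be illustrated with a hedged sketch. The real merge operates on `DiagnosticData`; this helper name is invented, and only `TextSpan` is a real Roslyn type.

```csharp
using Microsoft.CodeAnalysis.Text;

// Diagnostics located after the edited member shift by the edit's length delta;
// diagnostics before the edited member keep their original spans.
static TextSpan AdjustSpanForMemberEdit(TextSpan span, TextSpan oldMemberSpan, int lengthDelta)
    => span.Start >= oldMemberSpan.End
        ? new TextSpan(span.Start + lengthDelta, span.Length)
        : span;
```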
Analyzer support: Analyzers indicate span-based support through:
```csharp
public bool SupportsSpanBasedSemanticDiagnosticAnalysis()
{
    return this is IBuiltInAnalyzer { RequestedAnalysisKind: AnalyzerCategory.SemanticSpanAnalysis };
}
```
Built-in analyzers opt into this via IBuiltInAnalyzer.GetAnalyzerCategory(). The SemanticSpanAnalysis category
signifies: "Edits within a method body affect only diagnostics reported on that method body."
To enhance lightbulb performance, expensive analyzers are deprioritized from CodeActionRequestPriority.Normal to
CodeActionRequestPriority.Low:
```csharp
// Cached per analyzer - assumed stable across compilations
ConditionalWeakTable<DiagnosticAnalyzer, ImmutableHashSet<string>?>
    s_analyzerToDeprioritizedDiagnosticIds;
```
Deprioritized analyzers:
- Analyzers registering `SymbolStartAnalysisContext`/`SymbolEndAnalysisContext` actions
- Analyzers registering semantic model actions (`SemanticModelActions`)

Exception: The compiler analyzer is never deprioritized.
Execution model:
- Deprioritized analyzers are skipped during the `Normal` priority pass
- They execute instead during the `Low` priority pass

This establishes a two-tier execution model.
Built-in analyzers utilize IBuiltInAnalyzer.IsHighPriority = true to elevate to high priority. This flag is employed
sparingly—high-priority items must complete rapidly and provide access to critical features users demand with minimal
latency.
The most prominent example is "Add Using," which represents the most frequently used lightbulb feature by more than an order of magnitude. Users expect to type an unresolved type name, press Ctrl+., and receive the suggestion to add the relevant using directive near-instantaneously. Any analyzer interference with this workflow creates noticeable friction.
Similarly, when users overtype a variable or symbol with a new name, they expect to press Ctrl+. and immediately see "Rename X to Y" at the top of the list, enabling instant invocation without waiting for other analyzer results. These high-value, high-frequency operations must remain unimpeded by slower analyzers.
The cache is populated lazily by querying CompilationWithAnalyzers.GetAnalyzerTelemetryInfoAsync() to inspect
registered actions.
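A sketch of that telemetry-based check. `GetAnalyzerTelemetryInfoAsync` and the `AnalyzerTelemetryInfo` counts are real `CompilationWithAnalyzers` APIs; the helper name and the surrounding caching are simplified away.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.Diagnostics;

static async Task<bool> IsDeprioritizationCandidateAsync(
    CompilationWithAnalyzers compilationWithAnalyzers,
    DiagnosticAnalyzer analyzer,
    CancellationToken cancellationToken)
{
    var telemetry = await compilationWithAnalyzers.GetAnalyzerTelemetryInfoAsync(
        analyzer, cancellationToken);

    // Analyzers registering symbol-start/end or semantic model actions tend to be
    // expensive, so they move from Normal to Low priority for lightbulbs.
    return telemetry.SymbolStartActionsCount > 0
        || telemetry.SemanticModelActionsCount > 0;
}
```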
The system employs versions to determine when diagnostics require recomputation:
```csharp
public static Task<VersionStamp> GetDiagnosticVersionAsync(Project project, CancellationToken cancellationToken)
    => project.GetDependentVersionAsync(cancellationToken);
```
GetDependentVersionAsync returns a version that changes when:
This version enables:
Critical property: If the version remains unchanged, diagnostics from previous computation remain correct for the current snapshot.
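A sketch of how this property gates cache reuse. The cached-result record and the `ComputeDiagnosticsAsync` helper are hypothetical; only `VersionStamp` and `GetDiagnosticVersionAsync` come from the text above.

```csharp
// Hypothetical cached-result record for one project.
private sealed record CachedDiagnostics(VersionStamp Version, ImmutableArray<DiagnosticData> Diagnostics);
private CachedDiagnostics? _cached;

async Task<ImmutableArray<DiagnosticData>> GetOrComputeAsync(
    Project project, CancellationToken cancellationToken)
{
    var version = await GetDiagnosticVersionAsync(project, cancellationToken);
    if (_cached is { } cached && cached.Version.Equals(version))
        return cached.Diagnostics; // unchanged version: prior results are still correct

    var diagnostics = await ComputeDiagnosticsAsync(project, cancellationToken); // hypothetical
    _cached = new CachedDiagnostics(version, diagnostics);
    return diagnostics;
}
```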
The system supports multiple analysis scopes:
```csharp
enum AnalysisKind
{
    Syntax,   // Syntax tree only, no semantic model
    Semantic, // Complete semantic analysis
    NonLocal  // Project-level/compilation-level diagnostics
}

class DocumentAnalysisScope
{
    TextDocument TextDocument;
    TextSpan? Span; // null = complete document
    ImmutableArray<DiagnosticAnalyzer> Analyzers;
    AnalysisKind Kind;
}
```
Different features request distinct scopes:
When computing diagnostics for lightbulbs, the system applies aggressive filtering:
```text
CodeActionRequestPriority.High   → High-priority analyzers only (IBuiltInAnalyzer with IsHighPriority)
CodeActionRequestPriority.Normal → Normal priority analyzers (excluding deprioritized)
CodeActionRequestPriority.Low    → Deprioritized analyzers or analyzers explicitly configured for this tier
CodeActionRequestPriority.Lowest → All analyzers (for suppressions/configuration)
```
```csharp
class DiagnosticIdFilter
{
    static DiagnosticIdFilter All;                   // No filtering
    static DiagnosticIdFilter Include(string[] ids); // Specified IDs only
    static DiagnosticIdFilter Exclude(string[] ids); // All except specified
}
```
Diagnostic ID filtering exists to enable features to customize analyzer execution without requiring callbacks. Previously, features passed lambda callbacks to control which analyzers would run. However, the migration to out-of-process execution made this untenable—callbacks cannot be trivially serialized across process boundaries, and the VS side lacks the analyzer references needed for interrogation (and deliberately avoids loading analyzers in the .NET Framework process).
The pure-data DiagnosticIdFilter model enables features to specify filtering declaratively, which can be efficiently
remoted to OOP for execution. Examples:
- `Include` is used to specify only diagnostic IDs that have registered code fixes, avoiding unnecessary analyzer execution when computing lightbulb results.
- `Exclude` is used to filter out IDE diagnostic IDs when computing third-party analyzer diagnostics, ensuring cleanup operations process only non-IDE diagnostics.

A separate `DiagnosticKind` filter restricts which class of diagnostics is computed:

```text
DiagnosticKind.CompilerSyntax   → Compiler syntax diagnostics only
DiagnosticKind.CompilerSemantic → Compiler semantic diagnostics only
DiagnosticKind.AnalyzerSyntax   → Analyzer syntax diagnostics only
DiagnosticKind.AnalyzerSemantic → Analyzer semantic diagnostics only
DiagnosticKind.All              → All diagnostics
```
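A hypothetical combination of the two filters for a syntax-only lightbulb request (`fixableIds` is an assumed, precomputed set of diagnostic IDs that have registered code fixes):

```csharp
// Both filters are plain data, so the request serializes cleanly to the OOP host.
var idFilter = DiagnosticIdFilter.Include(fixableIds);
var kindFilter = DiagnosticKind.AnalyzerSyntax; // analyzer syntax diagnostics only
```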
These filters combine to minimize computational work. For example, a high-priority lightbulb request for syntax diagnostics will:
When instantiating CompilationWithAnalyzers, the system configures analyzer options differently for host versus
project analyzers:
```text
// Simplified logic from GetOptions():
if (all analyzers are host analyzers)
    return project.State.HostAnalyzerOptions;

if (all analyzers are project analyzers)
    return project.State.ProjectAnalyzerOptions;

// Mixed case: provide per-analyzer options
return (
    sharedOptions: project.State.HostAnalyzerOptions,
    analyzerSpecificOptionsFactory: analyzer =>
        isProjectAnalyzer(analyzer)
            ? project.State.ProjectAnalyzerOptions.AnalyzerConfigOptionsProvider
            : project.State.HostAnalyzerOptions.AnalyzerConfigOptionsProvider
);
```
Rationale:
- **Host analyzers** respect `.editorconfig` settings when available, falling back to IDE-wide settings (Tools > Options) when no explicit `.editorconfig` setting exists. This enables user-specific preferences for workspace-wide tooling but may cause different behavior across users when `.editorconfig` settings are absent.
- **Project analyzers** respect only `.editorconfig` settings, without any fallback to IDE settings. When no `.editorconfig` setting exists, analyzers receive default values rather than user-specific IDE preferences.

The current system evolved from a fundamentally different architecture termed the "solution crawler":
Problems with the previous system:
The current model:
Remaining artifacts:
Future refactoring opportunities include simplifying the caching layer and removing unnecessary indirection.
Cached entities:
- `CompilationWithAnalyzers` per project per analyzer set

Not cached:
Rationale: Compilation and analyzer execution are computationally expensive. Final diagnostic result merging/filtering is inexpensive. This trades increased memory consumption for reduced response latency on repeated requests.
Across documents: Project-level analysis executes documents in parallel
Across analyzers: CompilationWithAnalyzers executes analyzers concurrently when configured
When a project fails to load (missing references, corrupted project file):
This prevents overwhelming users with cascading semantic errors when the project is misconfigured.
Additional files (non-code files like .editorconfig, resource files) support analysis:
- Analyzers receive them through `AdditionalFileAnalysisContext` callbacks
- The IDE models them via the `TextDocument` abstraction

Source-generated files receive treatment identical to regular documents:
Diagnostic suppressors constitute a specialized analyzer type:
- Handled via the `IPragmaSuppressionsAnalyzer` interface

Several areas warrant simplification or investigation:
It remains unclear whether compiler analyzer diagnostics are cached effectively. The compiler analyzer bypasses the
_lazySemanticDiagnostics cache in DocumentAnalysisExecutor, potentially resulting in recomputation on each
invocation. This warrants investigation to determine if additional caching would yield benefits.
The location and organization of caching remain unclear. Caching exists in the LSP layer, and various caching
mechanisms exist within the diagnostic service (both computation results and intermediary state like
CompilationWithAnalyzers and IncrementalMemberEditAnalyzer). Greater consistency would be beneficial, or at minimum,
better data/information indicating where caches are necessary and their effectiveness metrics (hit rates, memory
consumption).
Increased visibility into cache hit rates, analyzer execution times, and performance bottlenecks would guide optimization efforts more effectively.
The system functions effectively but retains complexity from its evolution. Future work should emphasize simplification while preserving correctness guarantees.