docs/wiki/Getting-Started-VB-Syntax-Analysis.md
Today, the Visual Basic and C# compilers are black boxes - text goes in and bytes come out - with no transparency into the intermediate phases of the compilation pipeline. With the .NET Compiler Platform (formerly known as "Roslyn"), tools and developers can leverage the exact same data structures and algorithms the compiler uses to analyze and understand code with confidence that that information is accurate and complete.
In this walkthrough we'll explore the Syntax API. The Syntax API exposes the parsers, the syntax trees themselves, and utilities for reasoning about and constructing them.
The Syntax API exposes the syntax trees the compilers use to understand Visual Basic and C# programs. They are produced by the same parser that runs when a project is built or a developer hits F5. The syntax trees have full fidelity with the language: every bit of information in a code file is represented in the tree, including comments and whitespace. Writing a syntax tree to text reproduces the exact original text that was parsed. The syntax trees are also immutable; once created, a syntax tree can never be changed. This means consumers can analyze the trees on multiple threads, without locks or other concurrency measures, with the assurance that the data will never change.
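The full-fidelity property described above can be checked with a small round trip. This is a minimal sketch, not part of the original walkthrough: the module name `RoundTripSketch` and the sample source text are made up for illustration, and it assumes the project references the Microsoft.CodeAnalysis.VisualBasic package.

```vb
Imports System
Imports Microsoft.CodeAnalysis
Imports Microsoft.CodeAnalysis.VisualBasic

' Hypothetical module name, used only for this illustration.
Module RoundTripSketch
    Sub Main()
        ' Source text with a comment and irregular spacing on purpose.
        Dim source As String =
            "Module M ' trailing comment" & Environment.NewLine &
            "    Sub   Main()" & Environment.NewLine &
            "    End Sub" & Environment.NewLine &
            "End Module"

        Dim tree As SyntaxTree = VisualBasicSyntaxTree.ParseText(source)

        ' ToFullString writes the tree back out, including all trivia
        ' (whitespace and comments), not just the significant tokens.
        Dim roundTripped As String = tree.GetRoot().ToFullString()

        ' Compare the regenerated text with the original input.
        Console.WriteLine(roundTripped = source)
    End Sub
End Module
```

If the tree really preserves every character of the input, the comparison should print `True`, odd spacing and trailing comment included.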
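The four primary building blocks of syntax trees are:
* The SyntaxTree class, an instance of which represents an entire parse tree.
* The SyntaxNode class, instances of which represent syntactic constructs such as declarations, statements, clauses, and expressions.
* The SyntaxToken structure, which represents an individual keyword, identifier, operator, or punctuation mark.
* The SyntaxTrivia structure, which represents syntactically insignificant information such as the whitespace between tokens, preprocessor directives, and comments.
Figure legend (image not reproduced): SyntaxNode: Blue | SyntaxToken: Green | SyntaxTrivia: Red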
By navigating this tree structure you can find any statement, expression, token, or bit of whitespace in a code file!
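To make the distinction between nodes, tokens, and trivia concrete, the sketch below lists each kind of element found in a tiny program. The module name `BuildingBlocksSketch` and the sample source are illustrative assumptions; the calls used (`DescendantNodes`, `DescendantTokens`, `DescendantTrivia`, and the `Kind` extension method) come from the Microsoft.CodeAnalysis and Microsoft.CodeAnalysis.VisualBasic namespaces.

```vb
Imports System
Imports Microsoft.CodeAnalysis
Imports Microsoft.CodeAnalysis.VisualBasic

' Hypothetical module name, used only for this illustration.
Module BuildingBlocksSketch
    Sub Main()
        Dim tree As SyntaxTree = VisualBasicSyntaxTree.ParseText(
            "Module M ' a comment" & Environment.NewLine &
            "End Module")
        Dim root As SyntaxNode = tree.GetRoot()

        ' Nodes: declarations, statements, clauses, and expressions.
        For Each node As SyntaxNode In root.DescendantNodes()
            Console.WriteLine("Node:   " & node.Kind().ToString())
        Next

        ' Tokens: keywords, identifiers, operators, and punctuation.
        For Each token As SyntaxToken In root.DescendantTokens()
            Console.WriteLine("Token:  " & token.Kind().ToString())
        Next

        ' Trivia: whitespace, comments, and line breaks.
        For Each trivia As SyntaxTrivia In root.DescendantTrivia()
            Console.WriteLine("Trivia: " & trivia.Kind().ToString())
        Next
    End Sub
End Module
```

Running a listing like this against your own source is a quick way to see which `SyntaxKind` values the parser assigns before writing code that depends on them.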
The following steps use Edit and Continue to demonstrate how to parse VB source text and find a parameter declaration contained in the source.
Option Strict Off
Dim tree As SyntaxTree = VisualBasicSyntaxTree.ParseText(
"Imports System
Imports System.Collections
Imports System.Linq
Imports System.Text
Namespace HelloWorld
    Module Program
        Sub Main(args As String())
            Console.WriteLine(""Hello, World!"")
        End Sub
    End Module
End Namespace")
Dim root As Syntax.CompilationUnitSyntax = tree.GetRoot()
Dim firstMember = root.Members(0)
Dim helloWorldDeclaration As Syntax.NamespaceBlockSyntax = firstMember
Dim programDeclaration As Syntax.ModuleBlockSyntax =
    helloWorldDeclaration.Members(0)
Execute this statement.
Locate the Main declaration in the programDeclaration.Members collection and store it in a new variable:
Dim mainDeclaration As Syntax.MethodBlockSyntax = programDeclaration.Members(0)
Dim argsParameter As Syntax.ParameterSyntax =
    mainDeclaration.BlockStatement.ParameterList.Parameters(0)
Option Strict Off
Module Module1
    Sub Main()
        Dim tree As SyntaxTree = VisualBasicSyntaxTree.ParseText(
"Imports System
Imports System.Collections
Imports System.Linq
Imports System.Text
Namespace HelloWorld
    Module Program
        Sub Main(args As String())
            Console.WriteLine(""Hello, World!"")
        End Sub
    End Module
End Namespace")
        Dim root As Syntax.CompilationUnitSyntax = tree.GetRoot()
        Dim firstMember = root.Members(0)
        Dim helloWorldDeclaration As Syntax.NamespaceBlockSyntax = firstMember
        Dim programDeclaration As Syntax.ModuleBlockSyntax =
            helloWorldDeclaration.Members(0)
        Dim mainDeclaration As Syntax.MethodBlockSyntax = programDeclaration.Members(0)
        Dim argsParameter As Syntax.ParameterSyntax =
            mainDeclaration.BlockStatement.ParameterList.Parameters(0)
    End Sub
End Module
In addition to traversing trees using the properties of the SyntaxNode derived classes, you can also explore the syntax tree using the query methods defined on SyntaxNode. These methods will feel familiar to anyone who has used XPath. You can combine them with LINQ to quickly find things in a tree.
Dim firstParameters = From methodStatement In root.DescendantNodes().
                          OfType(Of Syntax.MethodStatementSyntax)()
                      Where methodStatement.Identifier.ValueText = "Main"
                      Select methodStatement.ParameterList.Parameters.First()
Dim argsParameter2 = firstParameters.First()
Start debugging the program.
Open the Immediate Window.
Often you'll want to find all nodes of a specific type in a syntax tree, for example, every property declaration in a file. By extending the VisualBasicSyntaxWalker class and overriding the VisitPropertyStatement method, you can process every property declaration in a syntax tree without knowing its structure beforehand. VisualBasicSyntaxWalker is a specific kind of VisualBasicSyntaxVisitor that recursively visits a node and each of its children.
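The property-declaration case mentioned above can be sketched as follows. This is an illustration only: the class name `PropertyNameCollector`, the module name, and the sample `Person` source are hypothetical, and the sketch assumes the Microsoft.CodeAnalysis.VisualBasic package is referenced.

```vb
Imports System
Imports System.Collections.Generic
Imports Microsoft.CodeAnalysis
Imports Microsoft.CodeAnalysis.VisualBasic
Imports Microsoft.CodeAnalysis.VisualBasic.Syntax

' Hypothetical walker; records the name of each property declaration it visits.
Public Class PropertyNameCollector
    Inherits VisualBasicSyntaxWalker

    Public ReadOnly Names As New List(Of String)()

    Public Overrides Sub VisitPropertyStatement(node As PropertyStatementSyntax)
        Names.Add(node.Identifier.ValueText)
        ' Call the base implementation so the walk continues into child nodes.
        MyBase.VisitPropertyStatement(node)
    End Sub
End Class

' Hypothetical driver module for the illustration.
Module PropertyWalkerSketch
    Sub Main()
        Dim tree As SyntaxTree = VisualBasicSyntaxTree.ParseText(
            "Class Person" & Environment.NewLine &
            "    Public Property Name As String" & Environment.NewLine &
            "    Public Property Age As Integer" & Environment.NewLine &
            "End Class")

        ' The walker needs no knowledge of where the properties sit in the tree.
        Dim collector As New PropertyNameCollector()
        collector.Visit(tree.GetRoot())

        For Each name In collector.Names
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

For this input the walker should print `Name` and then `Age`. Calling `MyBase.VisitPropertyStatement` is a design choice: it costs little here and keeps the override safe to reuse on trees where the visited node has children of interest.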
This example shows how to implement a VisualBasicSyntaxWalker which examines an entire syntax tree and collects any Imports statements it finds which aren't importing a System namespace.
Create a new Visual Basic Stand-Alone Code Analysis Tool project; name it "ImportsCollectorVB".
Enter the following line at the top of your Module1.vb file:
Option Strict Off
Dim tree As SyntaxTree = VisualBasicSyntaxTree.ParseText(
"Imports Microsoft.VisualBasic
Imports System
Imports System.Collections
Imports Microsoft.Win32
Imports System.Linq
Imports System.Text
Imports Microsoft.CodeAnalysis
Imports System.ComponentModel
Imports System.Runtime.CompilerServices
Imports Microsoft.CodeAnalysis.VisualBasic
Namespace HelloWorld
    Module Program
        Sub Main(args As String())
            Console.WriteLine(""Hello, World!"")
        End Sub
    End Module
End Namespace")
Dim root As Syntax.CompilationUnitSyntax = tree.GetRoot()
Note that this source text contains a long list of Imports statements.
Add a new class file to the project.
Option Strict Off
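Public Class ImportsCollector
    Inherits VisualBasicSyntaxWalker
    Public ReadOnly [Imports] As New List(Of Syntax.ImportsStatementSyntax)()
    Public Overrides Sub VisitSimpleImportsClause(
            node As SimpleImportsClauseSyntax
        )
    End Sub
Inside the body of VisitSimpleImportsClause, return early when the clause imports System or a namespace under System:
If node.Name.ToString() = "System" OrElse
   node.Name.ToString().StartsWith("System.") Then Return
Otherwise, add the parent Imports statement to the collection:
[Imports].Add(node.Parent)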
Option Strict Off
Public Class ImportsCollector
    Inherits VisualBasicSyntaxWalker
    Public ReadOnly [Imports] As New List(Of Syntax.ImportsStatementSyntax)()
    Public Overrides Sub VisitSimpleImportsClause(
            node As SimpleImportsClauseSyntax
        )
        If node.Name.ToString() = "System" OrElse
           node.Name.ToString().StartsWith("System.") Then Return
        [Imports].Add(node.Parent)
    End Sub
End Class
Return to the Module1.vb file.
Add the following code to the end of the Main method to create an instance of the ImportsCollector, use that instance to visit the root of the parsed tree, and iterate over the ImportsStatementSyntax nodes collected and print their names to the Console:
Dim visitor As New ImportsCollector()
visitor.Visit(root)
For Each statement In visitor.Imports
    Console.WriteLine(statement)
Next
Option Strict Off
Module Module1
    Sub Main()
        Dim tree As SyntaxTree = VisualBasicSyntaxTree.ParseText(
"Imports Microsoft.VisualBasic
Imports System
Imports System.Collections
Imports Microsoft.Win32
Imports System.Linq
Imports System.Text
Imports Microsoft.CodeAnalysis
Imports System.ComponentModel
Imports System.Runtime.CompilerServices
Imports Microsoft.CodeAnalysis.VisualBasic
Namespace HelloWorld
    Module Program
        Sub Main(args As String())
            Console.WriteLine(""Hello, World!"")
        End Sub
    End Module
End Namespace")
        Dim root As Syntax.CompilationUnitSyntax = tree.GetRoot()
        Dim visitor As New ImportsCollector()
        visitor.Visit(root)
        For Each statement In visitor.Imports
            Console.WriteLine(statement)
        Next
    End Sub
End Module
Imports Microsoft.VisualBasic
Imports Microsoft.Win32
Imports Microsoft.CodeAnalysis
Imports Microsoft.CodeAnalysis.VisualBasic
Press any key to continue . . .
Observe that the walker has located all four non-System namespace Imports statements.
Congratulations! You've just used the Syntax API to locate specific kinds of VB statements and declarations in VB source code.