1|0 Demo environment
2|0 Introduction
Traditionally, the Visual Basic and C# compilers have been black boxes: source text goes in and bytes come out, with no visibility into the intermediate stages of the compilation pipeline. With the .NET Compiler Platform (formerly known as "Roslyn"), tools and developers can use the same data structures and algorithms that the compiler itself uses to analyze and understand code. In this article we will get familiar with the Syntax API and use it to look at the parsers, the syntax trees, and the utilities for reasoning about and constructing them.
3|0 Understanding syntax trees
Trivia, tokens, and nodes together form a tree that completely represents everything in a fragment of Visual Basic or C# code.
3|1 SyntaxTree
An instance of SyntaxTree represents an entire parse tree. SyntaxTree is an abstract class with language-specific derived classes. To parse source text in a particular language, use the parse methods on the CSharpSyntaxTree (or VisualBasicSyntaxTree) class.
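As a minimal sketch of what parsing gives you, the snippet below parses a string and reports the kind of the root node, whether the tree reproduces the original text exactly, and how many syntax errors the parser recorded. The class name ParseDemo and the method Inspect are hypothetical names chosen for illustration; the Roslyn calls (ParseText, GetRoot, GetDiagnostics, ToFullString) are the standard ones.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, not part of the original demo project.
public static class ParseDemo
{
    // Parses the text and returns the root kind, whether the tree
    // round-trips to the input text, and the number of syntax errors.
    public static (SyntaxKind RootKind, bool RoundTrips, int ErrorCount) Inspect(string code)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(code);
        SyntaxNode root = tree.GetRoot();

        // Parsing never throws on malformed code; problems are reported as diagnostics.
        int errors = tree.GetDiagnostics()
            .Count(d => d.Severity == DiagnosticSeverity.Error);

        // ToFullString() includes all trivia, so it can be compared with the input.
        return (root.Kind(), root.ToFullString() == code, errors);
    }
}
```

For a well-formed input such as "class C { }" the root kind is CompilationUnit and no errors are reported. Malformed input still produces a tree; the errors show up in the diagnostics instead.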
3|2 SyntaxNode
Instances of SyntaxNode represent syntactic constructs such as declarations, statements, clauses, and expressions.
3|3 SyntaxToken
A SyntaxToken represents an individual keyword, identifier, operator, or punctuation mark.
3|4 SyntaxTrivia
SyntaxTrivia represents syntactically insignificant information, such as the whitespace between tokens, preprocessor directives, and comments.
Color key for the figure below: SyntaxNode: blue | SyntaxToken: green | SyntaxTrivia: red
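To make the three building blocks concrete, here is a small sketch that lists the kinds of all nodes, tokens, and trivia found in a snippet. TreePartsDemo and Kinds are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, not part of the original demo project.
public static class TreePartsDemo
{
    // Returns the SyntaxKind of every node, token, and piece of trivia in the snippet.
    public static (List<SyntaxKind> Nodes, List<SyntaxKind> Tokens, List<SyntaxKind> Trivia) Kinds(string code)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(code).GetRoot();

        // Nodes: declarations, statements, expressions, ...
        var nodes = root.DescendantNodesAndSelf().Select(n => n.Kind()).ToList();
        // Tokens: keywords, identifiers, punctuation, ...
        var tokens = root.DescendantTokens().Select(t => t.Kind()).ToList();
        // Trivia: whitespace, line breaks, comments, ...
        var trivia = root.DescendantTrivia().Select(t => t.Kind()).ToList();

        return (nodes, tokens, trivia);
    }
}
```

Calling Kinds on a snippet such as a one-line comment followed by "class C { }" yields a ClassDeclaration node, tokens such as ClassKeyword and IdentifierToken, and trivia such as SingleLineCommentTrivia and WhitespaceTrivia.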
4|0 Traversing the syntax tree
- Create a new project named "CodeAnalysisDemo"
- Add the NuGet packages
Microsoft.CodeAnalysis.CSharp
Microsoft.CodeAnalysis.CSharp.Workspaces
- Import the namespaces (System.Linq is needed for the LINQ query in the core code):
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
- Prepare the code to analyze
using System;
namespace UsingCollectorCS
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World");
        }
    }
    class Student
    {
        public string Name { get; set; }
    }
}
- Core code
/// <summary>
/// Parses the code and walks down the syntax tree by hand.
/// </summary>
/// <param name="code">The C# source text to parse.</param>
/// <returns>The root node of the syntax tree.</returns>
public SyntaxNode GetRoot(string code)
{
    var tree = CSharpSyntaxTree.ParseText(code);
    // Root of the SyntaxTree
    var root = (CompilationUnitSyntax)tree.GetRoot();
    // First member of the compilation unit
    var firstmember = root.Members[0];
    // Namespace
    var helloWorldDeclaration = (NamespaceDeclarationSyntax)firstmember;
    // Class
    var programDeclaration = (ClassDeclarationSyntax)helloWorldDeclaration.Members[0];
    // Method
    var mainDeclaration = (MethodDeclarationSyntax)programDeclaration.Members[0];
    // Parameter
    var argsParameter = mainDeclaration.ParameterList.Parameters[0];
    // LINQ query: the first parameter of the method named Main
    var firstParameters = from methodDeclaration in root.DescendantNodes()
                              .OfType<MethodDeclarationSyntax>()
                          where methodDeclaration.Identifier.ValueText == "Main"
                          select methodDeclaration.ParameterList.Parameters.First();
    var argsParameter2 = firstParameters.Single();
    return root;
}
- The Main entry point
var code = @"using System;
namespace UsingCollectorCS
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(""Hello World"");
        }
    }
    class Student
    {
        public string Name { get; set; }
    }
}";
var tree = new AnalysisDemo().GetRoot(code);
- Debugging
Stepping through the code in the debugger and comparing the values with the source shows the following:
CSharpSyntaxTree.ParseText(code) returns the syntax tree (SyntaxTree).
(CompilationUnitSyntax)tree.GetRoot() returns the root node of the syntax tree.
(NamespaceDeclarationSyntax)root.Members[0] returns the namespace.
(ClassDeclarationSyntax)helloWorldDeclaration.Members[0] returns the class.
(MethodDeclarationSyntax)programDeclaration.Members[0] returns the method.
A LINQ query over **root.DescendantNodes()** can find members such as methods and parameters.
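The index-based casts above only work when the shape of the tree is known in advance. As a sketch of the query-based alternative, the helper below lists every method declared directly inside a class, wherever it sits in the file. QueryDemo and MethodNames are hypothetical names introduced for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of the original demo project.
public static class QueryDemo
{
    // Returns "ClassName.MethodName" for every method whose parent is a class declaration.
    public static List<string> MethodNames(string code)
    {
        var root = CSharpSyntaxTree.ParseText(code).GetRoot();

        return root.DescendantNodes()
            .OfType<MethodDeclarationSyntax>()
            // Skip methods declared in structs, interfaces, and so on.
            .Where(m => m.Parent is ClassDeclarationSyntax)
            .Select(m => ((ClassDeclarationSyntax)m.Parent).Identifier.ValueText
                         + "." + m.Identifier.ValueText)
            .ToList();
    }
}
```

For the input "class A { void X() {} void Y() {} } class B { void Z() {} }" this returns A.X, A.Y, and B.Z, in source order.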
5|0 SyntaxWalkers
Often you need to find all nodes of a specific type in a syntax tree, for example, every property declaration in a file.
By extending the CSharpSyntaxWalker class and overriding the VisitPropertyDeclaration method, you can process every property declaration in a syntax tree without knowing its structure beforehand.
CSharpSyntaxWalker is a specialized CSharpSyntaxVisitor that recursively visits a node and each of its children.
Let's first demonstrate two of CSharpSyntaxWalker's virtual methods: VisitUsingDirective and VisitPropertyDeclaration.
- The core code is as follows:
// Requires: using System.Collections.Generic;
/// <summary>
/// Collects the non-System using directives and the property names of each class.
/// </summary>
public class UsingCollector : CSharpSyntaxWalker
{
    // Class name -> names of the properties declared in that class
    public readonly Dictionary<string, List<string>> models = new Dictionary<string, List<string>>();
    public readonly List<UsingDirectiveSyntax> Usings = new List<UsingDirectiveSyntax>();
    public override void VisitUsingDirective(UsingDirectiveSyntax node)
    {
        // Keep every using directive that is not System or System.*
        if (node.Name.ToString() != "System" &&
            !node.Name.ToString().StartsWith("System."))
        {
            this.Usings.Add(node);
        }
    }
    public override void VisitPropertyDeclaration(PropertyDeclarationSyntax node)
    {
        // A property can also be declared in a struct or interface;
        // skip those so the cast result is never dereferenced when null.
        var classnode = node.Parent as ClassDeclarationSyntax;
        if (classnode == null)
        {
            return;
        }
        if (!models.ContainsKey(classnode.Identifier.ValueText))
        {
            models.Add(classnode.Identifier.ValueText, new List<string>());
        }
        models[classnode.Identifier.ValueText].Add(node.Identifier.ValueText);
    }
}
/// <summary>
/// Demonstrates CSharpSyntaxWalker.
/// </summary>
/// <param name="code">The C# source text to parse.</param>
/// <returns>The collector after it has visited the whole tree.</returns>
public UsingCollector GetCollector(string code)
{
    var tree = CSharpSyntaxTree.ParseText(code);
    var root = (CompilationUnitSyntax)tree.GetRoot();
    var collector = new UsingCollector();
    collector.Visit(root);
    return collector;
}
- Calling it from Main:
var code2 =
@"using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
namespace TopLevel
{
    using Microsoft;
    using System.ComponentModel;
    namespace Child1
    {
        using Microsoft.Win32;
        using System.Runtime.InteropServices;
        class Foo {
            public string FChildA{get;set;}
            public string FChildB{get;set;}
        }
    }
    namespace Child2
    {
        using System.CodeDom;
        using Microsoft.CSharp;
        class Bar {
            public string BChildA{get;set;}
            public string BChildB{get;set;}
        }
    }
}";
var collector = new AnalysisDemo().GetCollector(code2);
foreach (var directive in collector.Usings)
{
    Console.WriteLine($"Name:{directive.Name}");
}
// JsonConvert comes from the Newtonsoft.Json NuGet package (using Newtonsoft.Json;)
Console.WriteLine($"models:{JsonConvert.SerializeObject(collector.models)}");
- Execution result
We can conclude that:
VisitUsingDirective is mainly used to collect using directives (namespaces).
VisitPropertyDeclaration is mainly used to collect properties.
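One detail worth illustrating is how an override decides whether the walk continues: the walker only descends into a node's children when the override calls the base method. The sketch below, with the hypothetical names MemberOutlineWalker and Outline, builds an indented outline of classes and methods. It calls base.VisitClassDeclaration so nested members are visited, and deliberately skips the base call for methods so their bodies are not walked.

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical walker, not part of the original demo project.
public class MemberOutlineWalker : CSharpSyntaxWalker
{
    public readonly List<string> Lines = new List<string>();
    private int _depth;

    public override void VisitClassDeclaration(ClassDeclarationSyntax node)
    {
        Lines.Add(new string(' ', _depth * 2) + "class " + node.Identifier.ValueText);
        _depth++;
        // The base call continues the walk into the class members.
        base.VisitClassDeclaration(node);
        _depth--;
    }

    public override void VisitMethodDeclaration(MethodDeclarationSyntax node)
    {
        Lines.Add(new string(' ', _depth * 2) + "method " + node.Identifier.ValueText);
        // No base call here: the method body is intentionally not visited.
    }

    // Convenience entry point: parse the code and return the outline lines.
    public static List<string> Outline(string code)
    {
        var walker = new MemberOutlineWalker();
        walker.Visit(CSharpSyntaxTree.ParseText(code).GetRoot());
        return walker.Lines;
    }
}
```

For "class Outer { void M() {} class Inner { void N() {} } }" the outline is "class Outer", then "method M" and "class Inner" indented by two spaces, then "method N" indented by four.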
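6|0 Summary
This article covered:
- The syntax tree SyntaxTree, along with SyntaxNode, SyntaxToken, and SyntaxTrivia.
- Overriding the virtual methods of CSharpSyntaxWalker to collect exactly the nodes you need.
- Some diagrams excerpted from the official documentation are attached below.
6|1 Roslyn compiler pipeline functional areas
6|2 API layers
Roslyn consists of two main API layers: the Compiler API and the Workspaces API.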
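The examples in this article stay at the syntax level. As a brief, hedged sketch of what else the Compiler API layer offers, the snippet below builds a compilation from a syntax tree and asks the semantic model for the symbol declared by the first class. SemanticDemo and DeclaredTypeName are hypothetical names, and referencing only the assembly that contains System.Object is an assumption that suffices for this small input.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of the original demo project.
public static class SemanticDemo
{
    // Returns the display name of the first class declared in the code.
    public static string DeclaredTypeName(string code)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(code);

        // Assumption: the core library alone is enough for this snippet.
        CSharpCompilation compilation = CSharpCompilation.Create("Demo")
            .AddReferences(MetadataReference.CreateFromFile(typeof(object).Assembly.Location))
            .AddSyntaxTrees(tree);

        SemanticModel model = compilation.GetSemanticModel(tree);

        var classNode = tree.GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .First();

        // The semantic model maps the syntax node to the symbol it declares.
        INamedTypeSymbol symbol = model.GetDeclaredSymbol(classNode);
        return symbol.ToDisplayString();
    }
}
```

The difference from the syntax-only examples is that the result is a symbol that knows its containing namespace, rather than just the identifier text that appears in the source.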
7|0 Source code
8|0 Reference links
Getting Started C# Syntax Analysis
Learning the dotnet compilation process from scratch, with an analysis of the Roslyn source code
A step-by-step guide to modifying compilation with Roslyn
Original article: https://www.cnblogs.com/fancunwei/p/9851576.html