
Originally published at blog.wiseowls.co.nz
Getting started with Roslyn code analysis
It was bound to happen: our research on C# dynamic features eventually led to an attempt to parse bits of source code. There are quite a few solutions on the market, with NRefactory being our preferred tool over the years. It does, however, have a few limitations: it supports neither .NET Core nor C# 6.
It is a big deal
It might seem that support for a newer language spec is not critical. In fact, the lack of it gets problematic very quickly, even in more established projects. Luckily for us, Microsoft has chosen to open source Roslyn, the very engine that powers their compiler services. Their official documentation covers the platform pretty well and goes into great detail on writing Visual Studio code analysers. We, however, often have to write MSBuild tasks that load the whole solution and run analysis on class hierarchies (for example, to detect whether a single SQL SELECT statement is being called inside a foreach loop: we would fail the build and suggest replacing it with a bulk select).
Installing
Roslyn is available via NuGet as a number of Microsoft.CodeAnalysis.* packages. We normally include the following:

    Install-Package Microsoft.CodeAnalysis.Workspaces.MSBuild
    Install-Package Microsoft.CodeAnalysis
    Install-Package Microsoft.CodeAnalysis.CSharp
    Install-Package Microsoft.Build # these classes are needed to support MSBuild workspace when it starts to load solution
    Install-Package Microsoft.Build.Utilities.Core # these classes are needed to support MSBuild workspace when it starts to load solution
    Install-Package Microsoft.Build.Locator # this is a helper to locate correct MSBuild toolchain (in case the machine has more than one installed)
Sometimes the environment gets confused as to which version of MSBuild to use, which is why starting a project with something like this has been pretty much a must since VS2015:
    // put this somewhere early in the program
    // needs: using System.Linq; using Microsoft.Build.Locator;
    if (!MSBuildLocator.IsRegistered)
    {
        // MSBuildLocator.RegisterDefaults(); // the simpler alternative: ensures correct version is loaded up

        // find the correct VS setup. There are many ways to organise logic here, we'll just assume we want VS2022
        // (First() throws if no instance with this exact name is installed)
        var vs2022 = MSBuildLocator.QueryVisualStudioInstances()
            .Where(x => x.Name == "Visual Studio Community 2022")
            .First();
        MSBuildLocator.RegisterInstance(vs2022); // register the selected instance

        // this ensures library is referenced so the compiler would not try to optimise it away
        // (if dynamically loading assemblies or doing other voodoo that can throw the compiler off) -
        // probably less important than the above but we prefer to follow cargo cult here and leave it be
        var _ = typeof(Microsoft.CodeAnalysis.CSharp.Formatting.CSharpFormattingOptions);
    }
After these initial steps, a simplistic solution traversal would look something along these lines:
    async Task AnalyseSolution()
    {
        using (var w = MSBuildWorkspace.Create())
        {
            var solution = await w.OpenSolutionAsync(@"MySolution.sln");
            foreach (var project in solution.Projects)
            {
                var docs = project.Documents; // allows for file-level document filtering
                var compilation = await project.GetCompilationAsync(); // allows for assembly-level analysis as well as SemanticModel
                foreach (var doc in docs)
                {
                    // CSharpSyntaxWalker is an abstract class - we will need to define our own implementation for this to actually work
                    var walker = new CSharpSyntaxWalker();
                    walker.Visit(await doc.GetSyntaxRootAsync()); // traverse the syntax tree
                }
            }
        }
    }
Syntax Tree Visitor
As with pretty much every mainstream syntax analyser, the easiest way to traverse syntax trees is the Visitor pattern. It decouples tree nodes from processing logic, which leaves room for expansion on either side (easy to add new logic, easy to add new tree node types). Roslyn provides a stub CSharpSyntaxWalker that lets us override only the node types we care about. It then takes care of everything else.
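To make this concrete, here is a minimal sketch of a walker for the foreach check mentioned earlier. It is syntax-only and matches calls purely by method name; the class name CallInLoopWalker, the helper CountOffenders and the idea of identifying the data-access call by its name are all assumptions made for illustration, not something Roslyn or our build tasks prescribe.

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical walker: collects calls to a given method name made inside a foreach body.
public class CallInLoopWalker : CSharpSyntaxWalker
{
    private readonly string methodName; // assumed name of the data-access method
    private int loopDepth;              // > 0 while we are inside a foreach body

    public List<InvocationExpressionSyntax> Offenders { get; } =
        new List<InvocationExpressionSyntax>();

    public CallInLoopWalker(string methodName) => this.methodName = methodName;

    public override void VisitForEachStatement(ForEachStatementSyntax node)
    {
        // The collection expression is evaluated once, so only the body counts as "in the loop".
        Visit(node.Expression);
        loopDepth++;
        Visit(node.Statement);
        loopDepth--;
    }

    public override void VisitInvocationExpression(InvocationExpressionSyntax node)
    {
        if (loopDepth > 0 && GetCalledName(node) == methodName)
            Offenders.Add(node);
        base.VisitInvocationExpression(node); // keep descending into arguments
    }

    private static string GetCalledName(InvocationExpressionSyntax node) => node.Expression switch
    {
        MemberAccessExpressionSyntax m => m.Name.Identifier.Text, // db.Select(...)
        SimpleNameSyntax s => s.Identifier.Text,                  // Select(...)
        _ => null
    };

    // Convenience helper: parse a source string and count offending calls.
    public static int CountOffenders(string source, string methodName)
    {
        var walker = new CallInLoopWalker(methodName);
        walker.Visit(CSharpSyntaxTree.ParseText(source).GetRoot());
        return walker.Offenders.Count;
    }
}
```

Only two Visit methods are overridden; every other node type falls through to the base walker. A real check would likely resolve the called method through the SemanticModel rather than trust the name, and would also need to cover for, while and the deconstructing form of foreach.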
With the basics out of the way, let's look at the classes that make up our platform here. At the top of the hierarchy is MSBuildWorkspace, followed by Solution, Project and Document. Roslyn makes a distinction between parsing code and compiling it, meaning some analytics are only available through the Compilation class, which can be obtained for a project and, by way of a SemanticModel, for individual documents down the track.
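The difference between parsing and compiling is easiest to see on an in-memory snippet. The sketch below is an illustration under assumptions: the class and method names are made up, and the only metadata reference is the core library that hosts System.Object, which is enough for built-in types but not for a real project.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ParseVersusCompile
{
    // Returns the resolved type of the first variable declaration found in the source.
    public static string ResolveFirstVariableType(string source)
    {
        // Parsing alone gives us structure: for "var x = 1;" the type node just reads "var".
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        VariableDeclarationSyntax declaration = tree.GetRoot()
            .DescendantNodes()
            .OfType<VariableDeclarationSyntax>()
            .First();

        // A compilation adds references and binding, so symbols and types can be resolved.
        CSharpCompilation compilation = CSharpCompilation.Create(
            "Sketch", // hypothetical assembly name
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        SemanticModel model = compilation.GetSemanticModel(tree);
        ITypeSymbol type = model.GetTypeInfo(declaration.Type).Type;
        return type.ToDisplayString();
    }
}
```

Calling ResolveFirstVariableType("class C { void M() { var x = 1; } }") should yield the inferred type rather than the literal text var. When working with an MSBuildWorkspace the references come from the project file, so the manual CSharpCompilation.Create step is not needed there.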
Traversing the tree
Just loading the solution is kind of pointless though. We need to come up with processing logic, and the best place for it is a CSharpSyntaxWalker subclass. Suppose we'd like to determine whether a class constructor contains if statements that are driven by parameters. This might mean we've got overly complex classes that could benefit from refactoring these out:
    public class ConstructorSyntaxWalker : CSharpSyntaxWalker
    {
        public List<IParameterSymbol> Parameters { get; set; }
        public int IfConditions { get; set; }

        bool processingConstructor = false;
        SemanticModel sm;

        public ConstructorSyntaxWalker(SemanticModel sm)
        {
            this.sm = sm;
            Parameters = new List<IParameterSymbol>();
        }

        public override void VisitConstructorDeclaration(ConstructorDeclarationSyntax node)
        {
            processingConstructor = true;
            base.VisitConstructorDeclaration(node);
            processingConstructor = false;
        }

        public override void VisitIfStatement(IfStatementSyntax node)
        {
            if (!processingConstructor) return; // we only want to keep traversing if we know we're inside constructor body

            // .AnalyzeDataFlow() is one of the most commonly used parts of the platform: it requires a compilation
            // to work off and allows tracking dependencies. We could then check if these parameters are supplied
            // to constructor and make a call whether this is allowed.
            // OfType (rather than Cast) is used because local variables can flow in too, and Cast would throw on those.
            Parameters.AddRange(sm.AnalyzeDataFlow(node).DataFlowsIn.OfType<IParameterSymbol>());
            IfConditions++; // just count for now, nothing fancy
            base.VisitIfStatement(node);
        }
    }
Then, somewhere in our solution (or any other solution, really), we have a class definition like so:
    public class TestClass
    {
        public TestClass(int a, string o)
        {
            if (a == 1)
                DoThis();
            else
                DoSomethingElse();

            if (o == "a")
                Foo();
            else
                Bar();
        }
    }
If we wanted to throw an exception and halt the build, we could invoke our SyntaxWalker:
    public static async Task Main()
    {
        await AnalyseSolution();
    }
    ...
    static async Task AnalyseSolution()
    {
        using (var w = MSBuildWorkspace.Create())
        {
            var solution = await w.OpenSolutionAsync(@"..\..\..\TestRoslyn.sln"); // let's analyse our own solution. But can be any file on disk
            foreach (var project in solution.Projects)
            {
                var docs = project.Documents; // allows for file-level document filtering
                var compilation = await project.GetCompilationAsync(); // allows for assembly-level analysis as well as SemanticModel
                foreach (var doc in docs)
                {
                    var walker = new ConstructorSyntaxWalker(await doc.GetSemanticModelAsync());
                    walker.Visit(await doc.GetSyntaxRootAsync()); // traverse the syntax tree
                    if (walker.IfConditions > 0 && walker.Parameters.Any())
                        throw new Exception("We do not allow branching in constructors.");
                }
            }
        }
    }
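When experimenting with the data-flow part, it can be handy to skip the solution on disk altogether. The following is a rough sketch that runs AnalyzeDataFlow on the first if statement of a source string; the class DataFlowSketch and its method are hypothetical names, and the in-memory compilation references only the core library, so it suits small snippets rather than real projects.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class DataFlowSketch
{
    // Returns the names of parameters whose values are read by the first if statement in the source.
    public static List<string> ParametersFlowingIntoFirstIf(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        CSharpCompilation compilation = CSharpCompilation.Create(
            "Sketch", // hypothetical assembly name
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        SemanticModel model = compilation.GetSemanticModel(tree);

        IfStatementSyntax ifStatement = tree.GetRoot()
            .DescendantNodes()
            .OfType<IfStatementSyntax>()
            .First();

        // DataFlowsIn lists every variable assigned outside the region and read inside it:
        // parameters and locals alike, hence the OfType filter.
        DataFlowAnalysis flow = model.AnalyzeDataFlow(ifStatement);
        return flow.DataFlowsIn
            .OfType<IParameterSymbol>()
            .Select(p => p.Name)
            .ToList();
    }
}
```

For a constructor such as C(int a, string o) { int local = 2; if (a == local) { } } both a and local flow into the if statement, but only a is a parameter, so only a is reported.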
And there we have it. This is a very simplistic example, but possibilities are endless!