No need to wait for .Net 5 to start using code generation with Roslyn

Dmitry Tikhonov
Published in ITNEXT
Nov 3, 2020

Recently, while reviewing the new features that are going to be included in .Net 5, I came across an interesting one: C# Source Generators. It caught my attention because I have been using a similar approach for the last… 5 years, and what Microsoft proposes is simply a deeper integration of this approach into the build process.

Below, I’ll share my experience of using Roslyn for code generation. I hope it will help you better understand what exactly Microsoft offers and when you can use it.

First, let’s look at a typical code generation scenario. You have some external source of information, such as a database, a JSON description of a REST service, or another assembly (via reflection). Using this information, you can generate various kinds of source code, such as DTOs, database model classes, or proxies for REST services.

However, sometimes there is no external source of information, and everything you need is contained in the source code of the very project to which you want to add the generated code.

Coincidentally, I recently published an open source project that has an example of such a situation. In the project there are more than 100 classes that represent SQL syntax tree nodes and I needed to create visitors which would traverse and modify tree objects (You can find more information about the project in my previous article Syntax Tree and Alternative to LINQ in Interaction with SQL Databases).

Code generation is a good choice here because each time I make even a small change to the classes, I need to remember to modify the visitors, and these changes must be made very carefully. However, I cannot use reflection for the code generation, since the assembly containing the new changes simply does not exist yet. And if the changes are not compatible with the previous version and lead to compilation errors, that assembly will never appear until I manually fix all the errors.

At first glance, this problem has no solution, but in fact I can use the Roslyn compiler to solve it. I can compile just the model classes ahead of the main build and get information similar to what I would get via reflection.

Let’s create a simple console application and add the Microsoft.CodeAnalysis.CSharp package.

Note: Theoretically, this could be done in T4, but I prefer not to struggle with dependencies and the strange T4 syntax.

First, we need to read all the .cs files that contain the model classes and extract syntax trees from them:
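As a minimal sketch of this step (not necessarily the project's actual code), the files can be parsed with `CSharpSyntaxTree.ParseText`. The `ModelReader` class and the `modelFolder` parameter are hypothetical names introduced only for this illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class ModelReader
{
    // Parses every *.cs file under modelFolder into a Roslyn syntax tree.
    // "modelFolder" is a hypothetical path to the directory with the model classes.
    public static IReadOnlyList<SyntaxTree> ReadSyntaxTrees(string modelFolder)
    {
        return Directory
            .EnumerateFiles(modelFolder, "*.cs", SearchOption.AllDirectories)
            // Passing the path keeps the file name available in diagnostics.
            .Select(path => CSharpSyntaxTree.ParseText(File.ReadAllText(path), path: path))
            .ToList();
    }
}
```

Parsing does not require the code to compile, so this step works even when the model classes currently contain errors.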

The trees contain a lot of information about the source code from a textual perspective (class names, method names, etc.), but often this is not enough, because we want to know what the text means. So we need to ask Roslyn to analyze the syntax trees and produce some semantic data:
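One way to obtain the semantic data, sketched here under a few assumptions, is to build a `CSharpCompilation` from the trees and ask it for a semantic model per tree. The `SemanticAnalysis` class and the assembly name `"ModelAnalysis"` are hypothetical, and the single core-library reference is an assumption that may need extending if the models use types from other assemblies.

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class SemanticAnalysis
{
    // Builds an in-memory compilation; nothing is written to disk.
    public static CSharpCompilation Compile(IEnumerable<SyntaxTree> trees)
    {
        // Assumption: the core library is enough to resolve the types used by the models.
        var references = new[]
        {
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location)
        };

        // Enable nullable reference types so that "T?" annotations are tracked.
        var options = new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary)
            .WithNullableContextOptions(NullableContextOptions.Enable);

        return CSharpCompilation.Create("ModelAnalysis", trees, references, options);
    }
}
```

A semantic model for a particular tree is then available via `compilation.GetSemanticModel(tree)`.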

Using the semantic data we can get an object of INamedTypeSymbol type:

which can provide information about class constructors and properties:
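The two steps above might look roughly like the following sketch. `SymbolReader`, `GetClassSymbols` and `Describe` are hypothetical names; the Roslyn calls (`GetDeclaredSymbol`, `Constructors`, `GetMembers`) are the standard ones.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class SymbolReader
{
    // Finds every class declaration in the compilation and resolves it to a symbol.
    public static IEnumerable<INamedTypeSymbol> GetClassSymbols(Compilation compilation)
    {
        foreach (var tree in compilation.SyntaxTrees)
        {
            var model = compilation.GetSemanticModel(tree);
            var declarations = tree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>();
            foreach (var declaration in declarations)
            {
                if (model.GetDeclaredSymbol(declaration) is INamedTypeSymbol symbol)
                {
                    yield return symbol;
                }
            }
        }
    }

    // Summarizes the constructors and properties exposed by a class symbol.
    public static string Describe(INamedTypeSymbol symbol)
    {
        var properties = symbol.GetMembers().OfType<IPropertySymbol>().Select(p => p.Name);
        return $"{symbol.Name}: {symbol.Constructors.Length} constructor(s); " +
               $"properties: {string.Join(", ", properties)}";
    }
}
```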

Since all the model classes are immutable, all meaningful properties have to be set via their constructors, so let’s iterate over all the constructor parameters and get their types:
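A sketch of that iteration, assuming only instance constructors are of interest (the `ConstructorReader` helper is a hypothetical name):

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis;

public static class ConstructorReader
{
    // Yields the name and type of every parameter of every instance constructor.
    public static IEnumerable<(string Name, ITypeSymbol Type)> GetConstructorParameters(
        INamedTypeSymbol classSymbol)
    {
        foreach (IMethodSymbol constructor in classSymbol.InstanceConstructors)
        {
            foreach (IParameterSymbol parameter in constructor.Parameters)
            {
                yield return (parameter.Name, parameter.Type);
            }
        }
    }
}
```

If a readable type name is needed, `parameter.Type.ToDisplayString()` returns one.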

Now we need to analyze each parameter type and find out the following things:

  1. Is the type a list?
  2. Is the type Nullable (the project uses “Nullable reference types”)?
  3. Does the type inherit from the base type (in our case, the interface) for which we create “Visitors”?

The semantic model provides answers to these questions:

Note: The method “AnalyzeSymbol” extracts an actual type from collections and Nullables:

List<T> => T (list := true) 
T? => T (nullable := true)
List<T>? => T (list := true, nullable := true)
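The project's real “AnalyzeSymbol” is not shown here, so the following is only a sketch of how such unwrapping could be written. The `AnalyzedType` class is hypothetical, and matching lists by the names `List`, `IList` and `IReadOnlyList` is an assumption made for the example.

```csharp
using Microsoft.CodeAnalysis;

// Hypothetical result type: the unwrapped type plus the two flags.
public sealed class AnalyzedType
{
    public AnalyzedType(ITypeSymbol type, bool isList, bool isNullable)
    {
        Type = type;
        IsList = isList;
        IsNullable = isNullable;
    }

    public ITypeSymbol Type { get; }
    public bool IsList { get; }
    public bool IsNullable { get; }
}

public static class TypeAnalyzer
{
    public static AnalyzedType AnalyzeSymbol(ITypeSymbol type)
    {
        // "T?" on a reference type is recorded as a nullable annotation.
        var isNullable = type.NullableAnnotation == NullableAnnotation.Annotated;
        var isList = false;

        // "T?" on a value type is Nullable<T>, so take its type argument.
        if (type is INamedTypeSymbol nullable &&
            nullable.OriginalDefinition.SpecialType == SpecialType.System_Nullable_T)
        {
            isNullable = true;
            type = nullable.TypeArguments[0];
        }

        // Assumption: a list is any generic type with one of these names.
        if (type is INamedTypeSymbol list && list.IsGenericType && list.TypeArguments.Length == 1 &&
            (list.Name == "List" || list.Name == "IList" || list.Name == "IReadOnlyList"))
        {
            isList = true;
            type = list.TypeArguments[0];
        }

        // Drop the "?" annotation from the resulting type itself.
        type = type.WithNullableAnnotation(NullableAnnotation.NotAnnotated);
        return new AnalyzedType(type, isList, isNullable);
    }
}
```

In this sketch the nullable flag describes the outer type only, so `List<T?>` is reported as a non-nullable list of `T`.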

Checking a base type in the semantic model is more complex than it would be with reflection, but it is also possible:
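A possible version of that check is sketched below. `InheritanceChecker` is a hypothetical helper; the base symbol is assumed to have been obtained beforehand, for example with `compilation.GetTypeByMetadataName(...)` and the full name of the base interface.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;

public static class InheritanceChecker
{
    // Returns true if "type" is "baseType", implements it, or derives from it.
    public static bool IsOrInheritsFrom(ITypeSymbol type, INamedTypeSymbol baseType)
    {
        var comparer = SymbolEqualityComparer.Default;

        if (comparer.Equals(type, baseType))
        {
            return true;
        }

        // Covers the case where baseType is an interface (directly or indirectly implemented).
        if (type.AllInterfaces.Any(i => comparer.Equals(i, baseType)))
        {
            return true;
        }

        // Covers the case where baseType is a class: walk up the inheritance chain.
        for (var current = type.BaseType; current != null; current = current.BaseType)
        {
            if (comparer.Equals(current, baseType))
            {
                return true;
            }
        }

        return false;
    }
}
```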

Now we can put all the information in a simple container:

and use it in the code generation to create something like this.
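To make the last two steps more concrete, here is a sketch of a container and a trivial generator that consumes it. All of the names (`SubNodeInfo`, `NodeInfo`, `WalkerGenerator`, the generated `ExprWalker` class and its `Accept` method) are hypothetical, and the sketch assumes each constructor parameter has a matching property. The project's real output is more elaborate.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical container for one constructor parameter of a model class.
public sealed class SubNodeInfo
{
    public SubNodeInfo(string propertyName, bool isList, bool isNullable)
    {
        PropertyName = propertyName;
        IsList = isList;
        IsNullable = isNullable;
    }

    public string PropertyName { get; }
    public bool IsList { get; }
    public bool IsNullable { get; }
}

// Hypothetical container for one model class.
public sealed class NodeInfo
{
    public NodeInfo(string typeName, IReadOnlyList<SubNodeInfo> subNodes)
    {
        TypeName = typeName;
        SubNodes = subNodes;
    }

    public string TypeName { get; }
    public IReadOnlyList<SubNodeInfo> SubNodes { get; }
}

public static class WalkerGenerator
{
    // Emits one Visit method per model class that forwards to each child node.
    public static string Generate(IEnumerable<NodeInfo> nodes)
    {
        var code = new StringBuilder();
        code.AppendLine("public partial class ExprWalker");
        code.AppendLine("{");
        foreach (var node in nodes)
        {
            code.AppendLine($"    public void Visit{node.TypeName}({node.TypeName} node)");
            code.AppendLine("    {");
            foreach (var sub in node.SubNodes)
            {
                var access = $"node.{sub.PropertyName}";
                var statement = sub.IsList
                    ? $"foreach (var item in {access}) Accept(item);"
                    : $"Accept({access});";
                if (sub.IsNullable)
                {
                    statement = $"if ({access} != null) {statement}";
                }
                code.AppendLine($"        {statement}");
            }
            code.AppendLine("    }");
        }
        code.AppendLine("}");
        return code.ToString();
    }
}
```

Plain string building keeps the example short; nothing prevents using Roslyn's `SyntaxFactory` instead if the generated code needs to be constructed as a tree.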

In my project, I run the code generation as a console utility, but in .Net 5 you will be able to embed this generation into the project as a class marked with a special attribute that will be automatically run at compile time to add missing parts of the code. This is certainly more convenient than a standalone utility, but the idea is similar.
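For comparison, a source generator in the .Net 5 style could be sketched as follows. It uses the `ISourceGenerator` interface and the `[Generator]` attribute from Microsoft.CodeAnalysis; the `ModelInfoGenerator` class and the `GeneratedModelInfo` class it emits are hypothetical and only list the class names found in the project being compiled.

```csharp
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;

[Generator]
public sealed class ModelInfoGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context)
    {
        // Nothing to set up for this simple example.
    }

    public void Execute(GeneratorExecutionContext context)
    {
        // context.Compilation exposes the same syntax trees and semantic
        // information that the console utility builds by hand.
        var classNames = context.Compilation.SyntaxTrees
            .SelectMany(tree => tree.GetRoot().DescendantNodes())
            .OfType<ClassDeclarationSyntax>()
            .Select(declaration => declaration.Identifier.Text)
            .Distinct()
            .OrderBy(name => name)
            .Select(name => "\"" + name + "\"");

        var code = new StringBuilder();
        code.AppendLine("// <auto-generated />");
        code.AppendLine("public static class GeneratedModelInfo");
        code.AppendLine("{");
        code.AppendLine("    public static readonly string[] ClassNames =");
        code.AppendLine("        { " + string.Join(", ", classNames) + " };");
        code.AppendLine("}");

        // The generated file is added to the compilation, not written to the project folder.
        context.AddSource("GeneratedModelInfo.g.cs", SourceText.From(code.ToString(), Encoding.UTF8));
    }
}
```

The analysis code stays essentially the same as in the console utility; the main difference is that the compiler supplies the compilation and accepts the generated source directly.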

Finally, I want to say that you shouldn’t treat this new .Net 5 feature as an incredible innovation that will fundamentally change the approach to dynamic code generation used in libraries such as AutoMapper, ASP.Net Core, etc. (I’ve heard such opinions.) It will not! The fact is that source generation works in a static context where everything is known at compile time, whereas AutoMapper, for example, does not know in advance which classes it will work with, so it will still have to emit code dynamically. However, there are situations where this kind of code generation is very useful (I described one of them in this article), so it’s worth knowing about this feature and understanding its principles and limitations.

(link to the source code on github)
