Could .NET Source Generator Attacks Be A Danger To Your Code?

In this post, I highlight the potential dangers of trusting third-party .NET Source Generators and show ways to try to spot Supply Chain Attacks that inject malicious code into your code base.


This post is part of Dustin Moris Gorski’s .NET Advent Calendar. Go check out the other posts at https://dotnet.christmas/ 


Background

Since they were introduced in .NET 5, I have been a big fan of C# source code generators. They are a powerful tool that (a) helps avoid the need to write a lot of boilerplate code and (b) helps improve the performance of your code by allowing code to be generated at compile time for tasks that you may previously have resorted to using reflection to perform.

If you are not familiar with source code generators, there are lots of places you can find out about them. Here are a few presentations on YouTube that you may find useful:

Introduction to Source Generators 

Using Source Generators for Fun (and Maybe Profit)

Exploring Source Generators

These videos provide a lot of information as to why source generators are a great addition to .NET.  However …

With Great Power Comes Great Responsibility

I recently became aware of the potential dangers of using source code generators after reading two excellent blog posts.

The first is Mateusz Krzeszowiec’s Veracode blog post, which provides an overview of supply chain attacks and how source code generators can be used to generate potentially harmful code that gets baked into your software.

The second is Maarten Balliauw’s blog post, which also describes the problem and shows how attributes can be used to try to hide the code from being inspected.

I highly recommend going and reading both of these before continuing, but if you want the TL;DR, here is my take on what these two blogs eloquently describe in great detail.

Supply Chain Attacks

As mentioned in these blogs, a supply chain attack uses what appears to be a harmless component or build process to generate code that is added as part of the Continuous Integration build pipeline. This injected code can later be used to scrape data from users of that software. The most notable recent attack was the SolarWinds exploit.

With source generators, you are potentially open to attack on two fronts.

The first is that the generator has the ability to inspect your code via the syntax tree and/or the semantic model. Whilst unlikely, there is a risk that the analysis stage could look out for common coding patterns from which usernames, passwords, connection details etc. could be scraped and logged.

The second, however, is the more dangerous: generating malicious code that ‘dials home’ with information from users of your software. Maarten’s blog shows how this is done and what to look for in the generated code.

Defending Yourself Against Attacks

The NuGet team has a page that describes what to look for when assessing whether to use a NuGet package in general. However, given the risk of a rogue source generator creating malicious code, the packages need closer scrutiny.

In summary, here are a few things to consider before using a source generator from NuGet

  • Is the package from a reputable source? If the package is from a commercial vendor, this is fairly easy to verify. However, for open source projects, this is a bit harder, which leads to the next check
  • Is the source code available? If the package is open source, the source code will usually be available on GitHub. For source generators, it is always worth looking over the source code and checking what the generator is doing.
  • If the source code is not available, it may be worth considering using a tool such as JetBrains dotPeek to decompile the code. There are two things to consider with this, though. The first is that this may be a breach of the license terms, so you need to tread carefully. The second is that the code may have been obfuscated, so it will be harder to read and work out what it is doing.
  • For open source projects, can you be sure that the published NuGet package has been built from the code in the GitHub repository? Subject to licensing, you may want to consider forking the repo, creating your own build and managing the package within your own NuGet feed.
  • Consider using tools to vet packages or static source code analysers to check on generated code.

Ensuring Generated Code Is Visible

Whilst ideally, you will have done the due diligence described above, one of the safeguards you can put in place is to make some changes to your csproj file to ensure that the generated code is visible, so that it can be viewed and tracked in source control.

Writing Compiler Generated Code into Files

By default, generated source code does not get written to files that you can see as the code is part of the Roslyn compile process. This makes it easy for bad actors providing code generators to hide under the radar.

To address this, there are two properties you can add to your project file that will make the generated code visible and trackable in source control

Project File With Code Generation Output using EmitCompilerGeneratedFiles and CompilerGeneratedFilesOutputPath

The first is the EmitCompilerGeneratedFiles property. This will write out any generated code files to the file system. Note that the contents of these files are effectively just a log of what has been added to the compilation process. The files themselves do not get used as part of the compilation.

By default, these files will be written to the obj folder in a path structure of:

obj\BuildProfile\Platform\generated\generator assembly\generator namespace+name

Within this path there will be one or many files depending on how the source code generator has split the code into virtual files using the hintName parameter of the AddSource method on the GeneratorExecutionContext at the end of the generator’s Execute method.
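As a minimal sketch of how the hint name maps to emitted files, assume a hypothetical generator class called DemoGenerator (neither the class nor the file names below come from the demo repo). A generator that calls AddSource twice would be expected to produce two files under the path above:

```csharp
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

// Hypothetical generator, for illustration only.
[Generator]
public class DemoGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context)
    {
        // Nothing to register for this sketch.
    }

    public void Execute(GeneratorExecutionContext context)
    {
        // Each AddSource call adds one virtual file to the compilation.
        // The first argument is the hint name used to identify that file.
        context.AddSource("Greeter.g.cs", SourceText.From(
            "namespace Demo { public static class Greeter { public static string Hello() => \"Hello\"; } }",
            Encoding.UTF8));

        context.AddSource("Farewell.g.cs", SourceText.From(
            "namespace Demo { public static class Farewell { public static string Bye() => \"Bye\"; } }",
            Encoding.UTF8));
    }
}
```

With EmitCompilerGeneratedFiles enabled, this sketch should result in Greeter.g.cs and Farewell.g.cs appearing in the generator's folder, which makes it easier to match each emitted file back to the AddSource call that produced it.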

Whilst this gives some visibility, having the files in the obj folder is not really of help as this is usually excluded from source control.

This is where the second project property comes into play.

The CompilerGeneratedFilesOutputPath property allows you to specify a path. This will usually be relative to the consuming project’s path. I typically use a path of ‘GeneratedFiles’.
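Taken together, the two properties can be sketched in the consuming project's csproj file as follows. The ‘GeneratedFiles’ folder name is simply the value I mentioned above; any relative path should work:

```xml
<PropertyGroup>
  <!-- Write the generated sources to disk as well as to the in-memory compilation -->
  <EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles>
  <!-- Redirect the emitted files from the obj folder to a folder that can be committed -->
  <CompilerGeneratedFilesOutputPath>GeneratedFiles</CompilerGeneratedFilesOutputPath>
</PropertyGroup>
```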

Beneath this path, the structure is \generator assembly\generator namespace+name, again with one or more files depending on how the generator divides the generated code.

As with the files within the obj folder, these files do not take part in the actual compilation. However, this is where things get a bit tricky as by default, any *.cs files within the project folder structure get included. This causes the compiler to generate a whole load of errors stating that there are duplicate definitions of code in the files that have been output.

To get around this, an extra section needs to be added to the project file.

<ItemGroup>
  <Compile Remove="$(CompilerGeneratedFilesOutputPath)\**" />
</ItemGroup>

Adding this section excludes the generated files from compilation, as they are there purely so that we can see the output in source control (the actual generated code is already in the compilation pipeline in memory).

Now that we have the files in the project file structure, they can be added to source control and any changes tracked. This then means that the files form part of the code review process and can be checked for anything ‘dodgy’ going on in the code.

View Generated Files in Visual Studio

Since VS2019 16.10, and now in VS2022, the Solution Explorer window lets you drill down into generated files by expanding each source code generator listed under the Analyzers node.


You may be interested in my previous blog post where I show how to set up source generator debugging in Visual Studio 2019. Whilst a couple of screens differ in VS2022 (due to the new debugging profile window), the guide works as well


This feature does not require EmitCompilerGeneratedFiles to be enabled, as it uses the in-memory compiler generated code (which *should* be the same as the emitted files).

In the screen shot below, I enabled the ‘Show All Files’ button and expanded out both the compiler generated output in the Analyzers node and also the emitted output files in the GeneratedFiles folder.

VS Solution Explorer showing generated files

One further thing to consider is that a sneaky attack may generate different code depending on whether the build is using a Debug or Release profile (as developers usually build locally in debug mode, whilst the CI build engine will be using release).
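One way to make such a difference easier to spot, offered only as a sketch and not something the demo repo does, is to emit the generated files into a sub-folder per build configuration so that the Debug and Release outputs can be compared side by side. This assumes the standard $(Configuration) MSBuild property and the ‘GeneratedFiles’ folder name used earlier:

```xml
<PropertyGroup>
  <EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles>
  <!-- Assumption: one sub-folder per configuration, e.g. GeneratedFiles\Debug and GeneratedFiles\Release -->
  <CompilerGeneratedFilesOutputPath>GeneratedFiles\$(Configuration)</CompilerGeneratedFilesOutputPath>
</PropertyGroup>

<ItemGroup>
  <!-- Exclude the emitted files for every configuration, not just the one being built -->
  <Compile Remove="GeneratedFiles\**" />
</ItemGroup>
```

After building both configurations, any unexpected difference between the two folders would be worth a closer look.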

Demo Code

I have made the solution shown in the above image available in a GitHub repo at https://github.com/stevetalkscode/sourcegeneratorattacks

The solution illustrates the techniques described above using two source code generators.

The first is a simple generator that creates a class that has some of the danger signs to look out for based on Maarten’s blog.

The second is a demonstration of the new System.Text.Json code generation that has been introduced with .NET 6 for generating serialisation and deserialisation code that would previously have been handled by reflection (reflection is still the default behaviour unless the generated code is explicitly used, as shown in the demo).
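To give a flavour of what explicitly using the generated code means, here is a minimal sketch. It assumes a hypothetical Person type and a hypothetical context class named DemoJsonContext (neither is taken from the demo repo), and uses the JsonSerializable attribute and JsonSerializerContext base class that ship with System.Text.Json in .NET 6:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical model type, for illustration only.
public class Person
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

// The attribute asks the System.Text.Json source generator to emit
// serialisation metadata for Person into this partial class at compile time.
[JsonSerializable(typeof(Person))]
internal partial class DemoJsonContext : JsonSerializerContext
{
}

public static class Program
{
    public static void Main()
    {
        var person = new Person { Name = "Ada", Age = 36 };

        // Passing the generated type info opts in to the generated code path.
        // Calling JsonSerializer.Serialize(person) on its own would still use reflection.
        string json = JsonSerializer.Serialize(person, DemoJsonContext.Default.Person);
        var roundTripped = JsonSerializer.Deserialize(json, DemoJsonContext.Default.Person);

        Console.WriteLine(json);
        Console.WriteLine(roundTripped?.Name);
    }
}
```

The generated half of DemoJsonContext is exactly the kind of output that shows up under the Analyzers node and in the emitted files, so it is a useful benign example to compare against when reviewing what a third-party generator produces.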

Conclusion

The above discussion may put you off using source generators. This should not be the case, as this feature of .NET is incredibly powerful and is starting to be used by Microsoft in .NET 6 with the improvements made to JSON serialisation, Razor page compilation and logging.

But if you are using source generators that you (or your team) have not written, you need to be aware of the potential dangers of not verifying where the generator has come from and what it is doing to your code.

“Let’s be careful out there!”


While You Are Here …

If you have enjoyed this blog post, you may be interested in my talk that I presented at DDD East Midlands where I discuss my experiences with source code generation of various shades over the past 30 years.

Debugging C# Source Generators with Visual Studio 2019 16.10

Background

I’m a big fan of source generators that were added in C# 9, but debugging has been a bit of a pain so far, involving forcing breakpoints in code and attaching the debugger during compilation.

With the RTM release of 16.10 Visual Studio 2019, things now get a bit easier with the introduction of the new IsRoslynComponent element in the csproj file and a new debugger launch option of Roslyn Component.

At time of writing, this has not had much publicity, other than a short paragraph in the release notes and a short demo in an episode of Visual Studio Toolbox.

Therefore, to give some visibility to this new functionality, I have written this short step-by-step guide.

If you are new to source generators, there are lots of great resources out there, but a good starting point is this video from the On .NET show.

Before Starting

In order to take advantage of the new debugging features in VS2019, you will need to make sure that you have the .NET Compiler Platform SDK component installed as part of Visual Studio.

This is shown as one of the components installed with the Visual Studio extension development workload.

Visual Studio Installer workload

Step By Step Guide

For this guide, I am assuming that you already have a source generator that you have written, but if not, I have put the source code for this guide on GitHub if you want to download a working example.

Step 1 – Add the IsRoslynComponent Property to Your Source Generator Project

The first step in making use of the new debugging functionality is to add a new entry to your project.

The property to add is IsRoslynComponent and the value should be set to true.

You will need to do this in the text editor as there is no user interface to add this.
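As a sketch, the relevant part of the source generator's csproj file would look something like this. The netstandard2.0 target is the usual requirement for a generator project; any other properties your project already has are omitted here:

```xml
<PropertyGroup>
  <TargetFramework>netstandard2.0</TargetFramework>
  <!-- Enables the Roslyn Component debug launch option in VS2019 16.10 and later -->
  <IsRoslynComponent>true</IsRoslynComponent>
</PropertyGroup>
```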

Screen shot of setting the IsRoslynComponent property in a project file

Whilst in the project file, make sure that you have the latest versions of the core NuGet packages that are required or useful for working with source generators.

The screen shot below shows the versions that are active at time of writing this post.

Source Generator Package Versions screen shot from project file

Step 2 – Add a Reference to Your Source Generator Project to a Consuming Project

For the purpose of this guide, I am using a project reference to consume the source generator.

When referencing a source generator, there are two additional attributes that need to be set to flag that this is an analyser that does not need to be distributed with the final assembly. These are

  • OutputItemType="Analyzer"
  • ReferenceOutputAssembly="false"
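In the consuming project's csproj file, a reference with these two attributes can be sketched as below. The project path is hypothetical and should be replaced with the path to your own generator project:

```xml
<ItemGroup>
  <!-- Hypothetical path: point this at your own source generator project -->
  <ProjectReference Include="..\MyGenerator\MyGenerator.csproj"
                    OutputItemType="Analyzer"
                    ReferenceOutputAssembly="false" />
</ItemGroup>
```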

Referencing the Source Generator project

Step 3 – Build All Projects and View Source Code

To keep this guide short, I am assuming that both the source generator project and the consuming project have no errors and will build.

Assuming that this is the case, there are two new features in VS2019 for source generators that are useful for viewing the generated code.

The first is in the Dependencies node of the consuming project, where you will now find an entry in the Analyzers node of the tree for your source generator project. Under this, you can now see the generated files (Item 1 in the screen shot below).

Screen shot of VS2019 Source Generated Code in Solution Explorer view

There are two other things to notice in this screen shot.

Item 2 is the warning that this is a code generated file and cannot be edited (the code editor will not allow you to type into the window even if you try).

Item 3 highlights the editor showing where the file is located. Note that by default, this is in your local temp directory as specified by the environment variable.

Now that the source code can be viewed in the code window, you can set a breakpoint in the generated code as you would with your own code.

The second new feature is that the code window treats the file as a regular code file, so you can do things like ‘Find All References’ from within the generated file. Conversely, you can use ‘Go To Definition’ from any usages of the generated members to get to the source generated file in the editor.

Step 4 – Prepare to Debug The Source Generator

This is where the ‘magic’ in VS2019 now kicks in!

First, in Solution Explorer, navigate to the source generator project and open the properties dialog, then go into the Debug tab.

Screen shot of navigating to the Properties dialog from Solution Explorer

Screen shot of setting Roslyn Component as the launch setting for debugging a source generator project

At this point, when you open the Launch drop down, you will notice at the top there is now a Roslyn Component option which has now been enabled thanks to the IsRoslynComponent entry in your project file.

Having clicked on this, you can set which project you want to use to trigger the source generator (in our case, the consuming project) in the drop down in the panel. Then use the context menu on the source generator project to set it as the start up project to use when you hit F5 to start debugging.

Screen shot of setting Roslyn Component as the launch setting for debugging a source generator project

Step 5 – Start Debugging

In your source generator project, find an appropriate line to place a breakpoint in your code, inside either the Initialize or Execute methods of your generator class.

You are now ready to hit F5 to start debugging!

What you will notice is that a console window pops up with the C# compiler being started. This is because you are now debugging within the stage of the compiler that is using your code to generate the source code to be used elsewhere in the build.

Screen shot of debugging a source generator with compiler opened in console

Now, you can step through your source generator code as you would with any other code in the IDE.

Step 6 – Changing Code Requires a Restart of Visual Studio

If, as part of the debugging process, you find you need to change your code, you may find when you recompile and start debugging again that your changes have not taken effect within your Visual Studio session.

This is due to the way that Visual Studio caches analyzers and source code generators. In the 16.10 release, the flushing of the cache has not yet been fully addressed and therefore, you will still need to restart Visual Studio to see your code changes within the generator take effect.

If you find that you need to make changes iteratively to debug a problem, you may want to go back to including conditional statements in your code to launch the debugger and use the dotnet build command line to get the compiler to trigger the source code generation in order to debug the problem.

If you do need to do this, I take the following approach to avoid debugger code slipping into the release build.

(1) In the Configuration Manager, create a new build configuration based on the Debug configuration called DebugGenerator

(2) In the project’s build properties, create a conditional compilation symbol called DEBUGGENERATOR

(3) In the Initialize method of the source generator to be debugged, add a call to Debugger.Launch wrapped in an #if DEBUGGENERATOR block, so that it is only compiled in this configuration

(4) Instead of using Visual Studio to trigger a build, open a command line instance and use the following to start the build

dotnet build -c DebugGenerator --no-incremental

This will force the build to use the configuration that will trigger the debugger to launch. Usually when the compiler reaches the line to launch the debugger, you will be prompted to select an instance of Visual Studio. At this point, select the instance that you have just been editing.

Image of Just-in-Time Debugger Selector dialog

After a short period of loading symbols, Visual Studio will break on the Debugger.Launch line. You can now set a breakpoint anywhere in your source generator project (if not already set) and use F5 to run to that line.

Note, I have used the --no-incremental switch to force a rebuild so that the debugger launch is triggered even if the code has not changed.
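The guard described in step (3) can be sketched as follows. This is a minimal illustration, assuming the DEBUGGENERATOR symbol from step (2) and a hypothetical generator class named DemoGenerator. The check on Debugger.IsAttached is an assumption, added so that the prompt does not appear when a debugger is already attached:

```csharp
using System.Diagnostics;
using Microsoft.CodeAnalysis;

// Hypothetical generator, for illustration only.
[Generator]
public class DemoGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context)
    {
#if DEBUGGENERATOR
        // Only compiled when the DEBUGGENERATOR symbol is defined,
        // so this never reaches a normal Debug or Release build.
        if (!Debugger.IsAttached)
        {
            Debugger.Launch();
        }
#endif
    }

    public void Execute(GeneratorExecutionContext context)
    {
        // Generator logic goes here.
    }
}
```

Because the call sits inside the conditional block, building with any other configuration leaves the generator free of debugger code.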

A Gotcha!

When I started playing with this new functionality, I loaded up an existing source generator that had been written by a colleague and found that the option to select Roslyn Component was not available, although it worked when I created a new source generator project.

After a few hours of trial and error by editing the project file, I found that the existing source generator had a reference to the Microsoft.Net.Compilers.Toolset NuGet package. Taking this out and restarting Visual Studio triggered the new functionality of VS to kick in.

If you look at the description of the package, it becomes clear where the problem arises. In short, it comes with its own set of compilers instead of using the default compiler. The current ‘live’ version is 3.9.0, which does not appear to support the IsRoslynComponent property. The version required to work is still in pre-release (3.10.0-3.final).

If you hit this snag, it is worth investigating why the package has been used and whether it can be removed given that:
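The entry to look for in the generator's project file would be something like the sketch below. The version shown is the 3.9.0 release mentioned above; a real project may carry extra attributes on the reference, which are omitted here:

```xml
<ItemGroup>
  <!-- Swaps the default compiler for the one bundled in this package -->
  <PackageReference Include="Microsoft.Net.Compilers.Toolset" Version="3.9.0" />
</ItemGroup>
```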

  • It is not intended for general consumption
  • It is only intended as a short-term fix when the out-of-the-box compiler is crashing and awaiting a fix from Microsoft

More details on why it should not be used can be found in this Stack Overflow answer from Jared Parsons.

Conclusion

Whilst not perfect due to the caching problem, the 16.10 release of Visual Studio has added some rich new functionality to help with writing and debugging source generators.