Could .NET Source Generator Attacks Be A Danger To Your Code?

In this post, I highlight the potential dangers of trusting third party .NET Source Generators and show ways to try and spot Supply Chain Attacks trying to inject malicious code into your code base.


This post is part of Dustin Moris Gorski’s .NET Advent Calendar. Go check out the other posts at https://dotnet.christmas/ 


Background

Since they were introduced in .NET 5, I have been a big fan of C# source  code generators. They are a powerful tool that (a) help avoid the need to write a lot of boiler plate code and (b) also help improve performance of your code by allowing for code to be generated at compile time for tasks that you would previously may have resorted to using reflection to perform.

If you are not familiar with source code generators, there are lots of places you can find out about them. Here are a few presentations on You Tube that you may find useful

Introduction to Source Generators 

Using Source Generators for Fun (and Maybe Profit)

Exploring Source Generators

These videos provide a lot of information as to why source generators are a great addition to .NET.  However …

With Great Power Comes Great Responsibility

I recently became aware of the potential dangers of using source code generators after reading two excellent blog posts.

The first is Mateusz Krzeszowiec’s VeraCode Blog which provides an overview of supply chain attacks and how source code generators can be used to generate potentially harmful code that gets baked into your software

The second is  Maarten Balliauw ‘s Blog Post that also describes the problem, but also shows how attributes can be used to try and hide the code from being inspected.

I highly recommend going and reading both of these before continuing, but if you want the TL;DR, here is my take on what these two blogs eloquently describe in great detail.

Supply Chain Attacks

As mentioned in these blogs, a supply chain attack uses what appears to be a harmless component or build process to generate code that is added as part of the Continuous Integration build pipeline. This injected code can later be used to scrape data from users of that software. The most notable recent attack was the SolarWinds exploit.

With source generators, you are potentially open to attack on two fronts.

The first is that the generator has the ability to inspect your code via the syntax tree and/or the semantic tree. Whilst unlikely, there is the risk that the analysis stage could keep a lookout for common coding patterns where usernames, passwords, connection details etc. could be scraped and logged.

However, the second is the more dangerous and that is to generate malicious code that ‘dials home’ information from users of your software. Maarten’s blog shows how this is done and what to look for in the generated code.

Defending Yourself Against Attacks

The NuGet team has a page that describes what to look for when assessing whether to use a NuGet package in general. However, given the risk of a rogue source generator creating malicious code, the packages need closer scrutiny.

In summary, here are a few things to consider before using a source generator from NuGet

  • Is the package from a reputable source? If the package is from a commercial vendor, this is fairly easy to verify. However, for open source projects, this is a bit harder, which leads to the next check
  • Is the source code available? If the package is open source, the source code will usually be available on GitHub. For source generators, it is always worth looking over the source code and checking what the generator is doing.
  • If the source code is not available, it may be worth considering using a tool such as JetBrains DotPeek to decompile the code. Two things to consider with this, though. The first is that this may be a breach of the license terms, so you need to tread carefully. The second is that the code may have been obfuscated, so will be harder to read and work out what it is doing.
  • For open source projects, can you be sure that the NuGet package published has been built from the code in the GitHub repository. Subject to licensing, you may want to consider forking the repo and creating your own build and manage within your own NuGet feed
  • Consider using tools to vet packages or static source code analysers to check on generated code.

Ensuring Generated Code Is Visible

Whilst ideally, you will have done the due diligence described above, one of the safeguards you can put in place is to make some changes to your csproj file to ensure that the generated code is visible, so that it can be viewed and tracked in source control.

Writing Compiler Generated Code into Files

By default, generated source code does not get written to files that you can see as the code is part of the Roslyn compile process. This makes it easy for bad actors providing code generators to hide under the radar.

To address this, there are two properties you can add to your project file that will make the generated code visible and trackable in source control

Project File With Code Generation Output using EmitCompilerGeneratedFiles and CompileGeneratedFileOutputPath

The first is the EmitCompilerGeneratedFiles property. This will write out any generated code files to the file system. It should be noted, that the contents of these files are effectively just a log of what has been added to the compilation process. The files themselves do not get used as part of the compilation.

By default, these files with be written to the obj folder in a path structure of

obj\BuildProfile\Platform\generated\generator assembly\generator namespace+name

Within this path there will be one or many files depending on how the source code generator has split the code into virtual files using the ‘hintname’ parameter to the AddSource method on the GeneratorExecutionContext at the end of the generator’s Execute method.

Whilst this gives some visibility, having the files in the obj folder is not really of help as this is usually excluded from source control.

This is where the second project property comes into play.

The CompilerGeneratedFilesOutputPath property allows you to specify a path. This will usually be relative to the consuming project’s path. I typically use a path of ‘GeneratedFiles’.

Beneath this path, the structure is \generator assembly\generator namespace+name, again with one or more files depending on how the generated divides the generated code.

As with the files within the obj folder, these files do not take part in the actual compilation. However, this is where things get a bit tricky as by default, any *.cs files within the project folder structure get included. This causes the compiler to generate a whole load of errors stating that there is duplicate definitions of code in the files that have been output.

To get around this, an extra section needs to be added to the project file.

Image showing the use of the Remove element within the Compile Element in the project fileAdding the this section excludes the generated files from compilation as they are there purely so that we can see the output in source control (as the actual generated code is already in the compilation pipeline in memory).

<ItemGroup>
<Compile Remove=”$(CompilerGeneratedFilesOutputPath)\**” />
</ItemGroup>

Now that we have the files in the project file structure, it can be added to source control and any changes tracked. This then means that the files form part of the code review process and can be checked for anything ‘dodgy’ going on in the code

View Generated Files in Visual Studio

Since VS2019 16.10 and now in VS2022, the Solution Explorer window can now drill down into generated files by expanding each source code generator listed under the Analyzers node in the Solution Explorer window.


You may be interested in my previous blog post where I show how to set up source generator debugging in Visual Studio 2019. Whilst a couple of screens differ in VS2022 (due to the new debugging profile window), the guide works as well


This feature does not require the EmitCompilerGeneratedFiles to be enabled as this uses the in-memory compiler generated code (which *should* be the same as the emitted files.

In the screen shot below, I enabled the ‘Show All Files’ button and expanded out both the compiler generated output in the Analyzers node and also the emitted output files in the GeneratedFiles folder

VS Solution Explorer showing generated files

One thing to also consider is that a sneaky attack may generate different code depending on whether the build is using a Debug or Release profile (as developers usually build locally in debug mode, whilst the CI build engine will be using release)

Demo Code

I have made the solution shown in the above image available in a GitHub repo at https://github.com/stevetalkscode/sourcegeneratorattacks

The solution illustrates the techniques described above using two source code generators.

The first is a simple generator that creates a class that has some of the danger signs to look out for based on Marten’s blog.

The second is a demonstration of the new System.Text.Json code generation that has been introduced with .NET 6 for generating serialisation and deserialisation code that would previously have been handled by reflection (and is still the default behaviour unless explicitly used as shown in the demo).

Conclusion

The above discussion may put you off using source generators. This should not be the case as this feature of .NET is incredibly powerful and is starting to be used by Microsoft in .NET 6 with the improvements made to Json serialisation, razor page compilation and logging.

But if you are using source generators that you (or your team) have not written, you need to be aware of the potential dangers of not verifying where the generator has come from and what it is doing to your code.

“Let’s be careful out there!”


While You Are Here …

If you have enjoyed this blog post, you may be interested in my talk that I presented at DDD East Midlands where I discuss my experiences with source code generation of various shades over the past 30 years.

Debugging C# Source Generators with Visual Studio 2019 16.10

Background

I’m a big fan of source generators that were added in C#9, but debugging has been a bit of a pain so far, involving forcing breakpoints in code and attaching the debugger during compilation.

With the RTM release of 16.10 Visual Studio 2019, things now get a bit easier with the introduction the new IsRoslynComponent element in the csproj file and a new debugger launch option of Roslyn Component.

At time of writing, this has not had much publicity, other than a short paragraph in the release notes and a short demo in an episode of Visual Studio Toolbox.

Therefore, to give some visibility to this new functionality, I have written this short step-by-step guide.

If you are new to source generators, there are lots of great resources out there, but a good starting point is this video from the ON.NET show.

Before Starting

In order to take advantage of the new debugging features in VS2019, you will need to make sure that you have the .NET Compiler Platform SDK component installed as part of Visual Studio.

This is shown as one of the components installed with the Visual Studio extension development workload.

Visual Studio Installer workload

Step By Step Guide

For this guide, I am assuming that you already have a source generator that you have written, but if not, I have put the source code for this guide on GitHub if you want to download a working example.

Step 1 – Add the IsRoslynComponent Property to Your Source Generator Project

The first step in making use of the new debugging functionality is to add a new entry to your project.

The property to add is IsRoslynComponent and the value should be set to true

You will need to do this in the text editor as there is no user interface to add this.

Screen shot of setting the IsRoslynComponent property in a project file

Whist in the project file, make sure that you have the latest versions of the core NuGet packages that you require / are useful to work with source generators.

The screen shot below shows the versions that are active at time of writing this post Source Generator Package Versions screen shot from project file

Step 2 – Add a Reference to Your Source Generator Project to a Consuming Project

For the purpose of this guide, I am using a project reference to consume the source generator.

When referencing a source generator, there are two additional attributes that need to be set to flag that this is an analyser that does not need to be distributed with the final assembly. These are

  • OutputItemType = “Analyzer” 
  • ReferenceOutputAssembly=”false”

Referencing the Source Generator project

Step 3 – Build All Projects and View Source Code

To keep this guide short, I am assuming that both the source generator project and the consuming project have no errors and will build.

Assuming that this is the case, there are two new features in VS2019 for source generators that are useful for viewing the generated code.

The first is in the Dependencies node of the consuming project where you will now find an entry in the Analyzers node of the tree for your source generator project. Under this, you can now see the generated files (Item 1 in the screen shot below)

Screen shot of VS2019 Source Generated Code in Solution Explorer view

There are two other things to notice in this screen shot.

Item 2 is the warning that this is a code generated file and cannot be edited (the code editor will not allow you to type into the window even if you try).

Item 3 highlights the editor showing where the file is located. Note that by default, this is in your local temp directory as specified by the environmental variable.

Now that the source code can be viewed in the code window, you can set a breakpoint in the generated code as you would with your own code.

The second new feature is that the code window treats the file as a regular code file, so you can do things like ‘Find All References’ from within the generated file. Conversely, you can use ‘Goto Definition’ from any usages of the generated members to get to the source generated file in the editor.

Step 4 – Prepare to Debug The Source Generator

This is where the ‘magic’ in VS2019 now kicks in!

First, in Solution Explorer, navigate to the source generator project and navigate to the properties dialog, then into the Debug tab.Screen shot of navigating to the Properties dialog from Solution ExplorerScreen shot of setting Roslyn Component as the launch setting for debugging a source generator project

At this point, when you open the Launch drop down, you will notice at the top there is now a Roslyn Component option which has now been enabled thanks to the IsRoslynComponent entry in your project file.

Having clicked on this, you can set which project you want to use to trigger the source generator (in our case, the consuming project) in the drop down in the panel. Then set use the context menu on the the source generator project to set it to be the start up project to use when you hit F5 to start debugging.Screen shot of setting Roslyn Component as the launch setting for debugging a source generator project

Step 5 – Start Debugging

In your source generator project, find an appropriate line to place a breakpoint in your code, inside either the Inititialize or Execute methods of your generator class.

You are now ready to hit F5 to start debugging!

What you will notice is that a console window pops up with the C# compiler being started. This is because you are now debugging within stage of the compiler that is using your code to generate the source code to be used elsewhere in the build.

Screen shot of debugging as source generator with compiler opened in console

Now, you can step through your source generator code as you would with any other code in the IDE.

Step 6 – Changing Code Requires a Restart of Visual Studio

If as part of the debugging process you find you need to change your code. When you recompile and start debugging again, you may find that your changes have not taken effect from within your Visual Studio session.

This is due to the way that Visual Studio caches analyzers and source code generators. In the 16.10 release, the flushing of the cache has not yet been fully addressed and therefore, you will still need to restart Visual Studio to see your code changes within the generator take effect.

If you find that you need to make changes iteratively to debug a problem, you may want to go back to including conditional statements in your code to launch the debugger and use the dotnet build command line to get the compiler to trigger the source code generation in order to debug the problem.

If you do need to do this, I take the following approach to avoid debugger code slipping into the release build.

(1) In the Configuration Manager, create a new build configuration based on the Debug configuration called DebugGenerator

(2) In the project’s build properties, create a conditional compilation symbol called DEBUGGENERATOR

(3) In the Initialize method of the source generator to be debugged, add the following code

(4) Instead of using Visual Studio to trigger a build, open a command line instance and use the following to start the build

dotnet build -c DebugGenerator --no-incremental

This will force the build to use the configuration that will trigger the debugger to launch. Usually when the compiler reaches the line to launch the debugger, you will be prompted to select an instance of Visual Studio. At this point, select the instance that you have just been editing.

Image of Just-in-Time Debugger Selector dialogAfter a short period of loading symbols, Visual Studio will break on the Debugger.Launch line. You can now set a breakpoint anywhere in your source generator project (if not already set) and use F5 to run to that line.

Note, I have used the –no-incremental switch to force a rebuild so that the debugger launch is triggered even if the code has not changed.

A Gotcha!

When I started playing with this new functionality, I loaded up an existing source generator that had been written by a colleague and found that the option to select Roslyn Component was not available, but worked when I created a new source generator project

After a few hours of trial and error by editing the project file, I found that the existing source generator had a reference to the Microsoft.Net.Compilers.Toolset NuGet package. Taking this out and restarting Visual Studio triggered the new functionality of VS to kick in.

If you look at the description of the package, it becomes clear where the problem arises. In short, it comes with its own set of compilers instead of using the default compiler. The current ‘live’ version is 3.9.0 which does not appear to support the IsRoslynComponent. The version required to work is still in pre-release – 3.10.0-3.final.

If you hit this snag, it is worth investigating why the package has been used and whether it can be removed given that

  • It is not intended for general consumption
  • It is only intended as a short-term fix when the out-of-the-box compiler is crashing and awaiting a fix from Microsoft

More details on why it should not be used can be found in this Stack Overflow answer from Jared Parsons.

Conclusion

Whilst not perfect due to the caching problem, the 16.10 release of Visual Studio has added some rich new functionality to help with writing and debugging source generators.

Using OpenApiReference To Generate Open API Client Code

This is not one of my usual blogs, but an aide-mémoire for myself that may be of use to other people who are using the OpenApiReference tooling in their C# projects to generate C# client code for HTTP APIs from Swagger/OpenApi definitions.

Background

I have been using Rico Suter’s brilliant NSwag for some time now to generate client code for working with HTTP API endpoints. Initially I was using the NSwag Studio application to create the C# code and placing the output into my project, but then I later found I could add the code generation into build process using the NSwag MSBuild task.

Recently though, I watched the ASP.NET Community Standup with Jon Galloway and Brady Gaster. In the last half hour or so, they discuss the Connected Services functionality in Visual Studio 2019 that sets up code generation of HTTP API endpoint clients for you.

This feature had passed me by and watching the video got my curiosity going as to whether the build chain that I have been using for the last couple of years could be simplified. Especially given that, behind the scenes, it is using NSwag to do the code generation.

Using Connected Services

Having watched the video above, I recommend reading Jon Galloway’s post Generating HTTP API clients using Visual Studio Connected Services As that post covers the introduction to using Connected Services, I won’t repeat the basics again here.

It is also worth reading the other blog posts in the series written by Brady Gaster:

One of the new things I learnt from the video and blog posts is to make sure that your OpenApi definitions in the source API include an OperationId (which you can set by overloads of  the HttpGet, HttpPost (etc) attributes on your Action method) to help the code generator assign a ‘sensible’ names to the calling method in the code generated client.

Purpose of This Post

Having started with using the Visual Studio dialogs to set up the Connected Service, the default options may not necessarily match with how you want the generated code to work in your particular project.

Having had a few years’ experience of using NSwag to generate code, I started to dig deeper into how to get the full customisation I have been used to from using the “full” NSwag experience but within the more friendly build experience of using the OpenApiReference project element.

One Gotcha To Be Aware Of!

If you use the Connected Services dialog in Visual Studio to create the connected service, you will hit a problem if you have used a Directory.Packages.props file to manage your NuGet packages centrally across your solution. The Connnected Services wizard (as at time of writing) tries to specific versions of NuGet packages.

This is part of a wider problem in Visual Studio (as at time of writing) where the NuGet Package Manager interaction clashes with the restrictions applied in Directory.Packages.props. However, this may be addressed in future versions as of the NuGet tooling and Visual Studio per this Wiki post.

If you are not familiar with using Directory.Packages.props, have a look at this blog post from Stuart Lang

Manually Updating the OpenApiReference Entry in your Project

There isn’t much documentation around on how to make adjustments to the OpenApiReference element that gets added to your csproj file, so hopefully this post will fill in some of the gaps until documentation is added to the Microsoft Docs site.

I have based this part of the post on looking through the source code at https://github.com/dotnet/aspnetcore/tree/main/src/Tools/Extensions.ApiDescription.Client and therefore some of my conclusions may be wrong, so proceed with caution if making changes to your project.

The main source of information is the Microsoft.Extensions.ApiDescription.Client.props file which defines the XML schema and includes comments that I have used here.

The OpenApiReference and OpenApiProjectReference Elements

These two elements can be added one or multiple times within an ItemGroup in your csproj file.

The main focus of this section is the OpenApiReference element that adds code generation to the current project for a specific OpenApi JSON definition.

The OpenApiProjectReference allows external project references to be added as well. More on this below,

The following attributes and sub-elements are the main areas of interest within the OpenApiReference element.

The props file makes references to other properties that live outside of the element that you can override within your csproj file.

As I haven’t used the TypeScript generator, I have focussed my commentary on the NSwagCSharp code generator.

Include Attribute (Required)

The contents of the Include attribute will depend on which element you are in.

For OpenApiReference this will be the path to the OpenApi/Swagger json file that will be the source the code generator will use.

For OpenApiProjectReference this will be the path to another project that is being referenced.

ClassName Element (Optional)

This is the name to give the class that will be generated. If not specified, the class will default to the name given in the OutputPath parameter (see below)

CodeGenerator Element (Required)

The default value is ‘NSwagCSharp’. This points to the NSwag C# client generator, more details of which below.

At time of writing, only C# and TypeScript are supported, and the value here must end with either “CSharp” or “TypeScript”. Builds will invoke a MSBuild target named “Generate(CodeGenerator)” to do actual code generation. More on this below.

Namespace Element (Optional)

This is the namespace to assign the generated class to. If not specified, the RootNamespace entry from your project will be used to put the class within your project’s namespace. You may choose to be more specific with the NSwag specific commands below.

Options Element (Optional)

These are the customisation instructions that will be passed to the code generator as command line options. See Customising The Generated Code with NSwag Commands below for details about usage with the NSwagCSharp generator

One of the problems I have been having with this element is that the contents are passed to the command line of the NSwagCSharp as-is and therefore you cannot include line breaks to make it more readable.

It would be nice if there was a new element that allows each command option to be listed as an XML sub-element in its own right that the MSBuild target concatenates and parses into the single command line to make editing the csproj file a bit easier.

Possible Options Declaration

OutputPath (Optional)

This is the path to place generated code into. It is up to the code generator as to whether to interpret the path as a filename or as a directory.

The Default filename or folder name is %(Filename)Client.[cs|ts].

Filenames and relative paths (if explicitly set) are combined with
$(OpenApiCodeDirectory). Final value is likely to be a path relative to
the client project.

GlobalPropertiesToRemove (OpenApiProjectReference Only – Optional)

This is a semicolon-separated list of global properties to remove in a ProjectReference item created for the OpenApiProjectReference. These properties, along with Configuration, Platform, RuntimeIdentifier and
TargetFrameworks, are also removed when invoking the ‘OpenApiGetDocuments’ target in the referenced project.

Other Property Elements

In the section above, there are references to other properties that get set within the props file.

The properties can be overridden within your csproj file, so for completeness, I have added some commentary here

OpenApiGenerateCodeOptions

The Options element above if not populated defaults to the contents of this element, which in of itself is empty by default.

As per my comment above for Options, this suffers the same problem of all command values needing to be on the same single line.

OpenApiGenerateCodeOnBuild

If this is set to ‘true’ (the default), code is generated for the OpenApiReference element and any OpenApiProjectReference items before the BeforeCompile target is invoked.

However, it may be that you do not want the generated called on every single build as you may have set up a CI pipeline where the target is explicitly invoked (via a command line or as a build target) as a build step before the main build. In that case, the value can be set to ‘false’

OpenApiGenerateCodeAtDesignTime

Similar to OpenApiGenerateCodeOnBuild above, but this time determines whether to generate code at design time as well as being part of a full build. This is set to true by default.

OpenApiBuildReferencedProjects

If set to ‘true’ (the default), any projects referenced in an ItemGroup containing one or many OpenApiProjectReference elements will get built before retrieving that project’s OpenAPI documents list (or generating code).

If set to ‘false’, you need to ensure the referenced projects are built before the current project within the solution or through other means (such as a build pipeline) but IDEs such as Visual Studio and Rider may get confused about the project dependency graph in this case.

OpenApiCodeDirectory

This is the default folder to place the generated code into. The value is interpreted relative to the project folder, unless already an absolute path. This forms part of the default OutputPath within the OpenApiReference above and the OpenApiProjectReference items.

The default value for this is BaseIntermediateOutputPath which is set elsewhere in your csproj file or is implicitly set by the SDK.

Customising The Generated Code with NSwag Commands

Here we get to the main reason I have written this post.

There is a huge amount of customisation that you can do to craft the generated code into a shape that suits you.

The easiest way to get an understanding of the levels of customisation is to use NSwag Studio to pay around with the various customisation options and see how the options affect the generated code.

Previously when I have been using the NSwag MSBuild task, I have pointed the task to an NSwag configuration json file saved from the NSwag Studio and let the build process get on with the job of doing the code generation as a pre-build task.

However, the OpenApiReference task adds a layer of abstraction that means that just using the NSwag configuration file is not an option. Instead, you need to pass the configuration as command line parameters via the <Options> element.

This can get a bit hairy for a couple of reasons.

  • Firstly, each command has to be added to one single line which can make your csproj file a bit unwieldy to view if you have a whole load of customisations that you want to make (scroll right, scroll right, scroll right again!)
  • Secondly, you need to know all the NSwag commands and the associated syntax to pass these to the Options element.

Options Syntax

Each option that you want to pass takes the form of a command line parameter which

  • starts with a forward slash
  • followed by the command
  • then a colon and then
  • the value to pass to the command

So, something like this: /ClientBaseClass:ClientBase

The format of the value depends on the value type of the command of which there are three common ones

  • boolean values are set with true or false. E.g. /GenerateOptionalParameters:true
  • string values are set with the string value as-is. E.g. /ClassStyle:Poco
  • string arrays are comma delimited lists of string values. E.g.
    /AdditionalNamespaceUsages:MyNamespace1,MyNamespace2,MyNamespace3

The following table is a GitHub gist copy from the GitHub repository I have set up for this and which I plan to update over time as I get a better understanding of each command and its effect on the generated code.

At time of writing, many of the descriptions have been lifted from the XML comments in the code from the NSwag repository on GitHub.

(Apologies the format of the imported markdown here is not great. I hope to make this a bit better later when I can find the time. You may want to go direct to the gist directly)

Conclusion

The new tooling makes the code generation build process itself a lot simpler, but there are a few hoops to jump through to customise the code generated.

I’ve been very impressed with the tooling and I look forward to seeing how to it progresses in the future.

I hope that this blog is of help to anyone else who wants to understand more about the customisation of the OpenApiReference tooling and plugs a gap in the lack of documentation currently available.