Could .NET Source Generator Attacks Be A Danger To Your Code?

In this post, I highlight the potential dangers of trusting third party .NET Source Generators and show ways to try and spot Supply Chain Attacks trying to inject malicious code into your code base.


This post is part of Dustin Moris Gorski’s .NET Advent Calendar. Go check out the other posts at https://dotnet.christmas/ 


Background

Since they were introduced in .NET 5, I have been a big fan of C# source  code generators. They are a powerful tool that (a) help avoid the need to write a lot of boiler plate code and (b) also help improve performance of your code by allowing for code to be generated at compile time for tasks that you would previously may have resorted to using reflection to perform.

If you are not familiar with source code generators, there are lots of places you can find out about them. Here are a few presentations on You Tube that you may find useful

Introduction to Source Generators 

Using Source Generators for Fun (and Maybe Profit)

Exploring Source Generators

These videos provide a lot of information as to why source generators are a great addition to .NET.  However …

With Great Power Comes Great Responsibility

I recently became aware of the potential dangers of using source code generators after reading two excellent blog posts.

The first is Mateusz Krzeszowiec’s VeraCode Blog which provides an overview of supply chain attacks and how source code generators can be used to generate potentially harmful code that gets baked into your software

The second is  Maarten Balliauw ‘s Blog Post that also describes the problem, but also shows how attributes can be used to try and hide the code from being inspected.

I highly recommend going and reading both of these before continuing, but if you want the TL;DR, here is my take on what these two blogs eloquently describe in great detail.

Supply Chain Attacks

As mentioned in these blogs, a supply chain attack uses what appears to be a harmless component or build process to generate code that is added as part of the Continuous Integration build pipeline. This injected code can later be used to scrape data from users of that software. The most notable recent attack was the SolarWinds exploit.

With source generators, you are potentially open to attack on two fronts.

The first is that the generator has the ability to inspect your code via the syntax tree and/or the semantic tree. Whilst unlikely, there is the risk that the analysis stage could keep a lookout for common coding patterns where usernames, passwords, connection details etc. could be scraped and logged.

However, the second is the more dangerous and that is to generate malicious code that ‘dials home’ information from users of your software. Maarten’s blog shows how this is done and what to look for in the generated code.

Defending Yourself Against Attacks

The NuGet team has a page that describes what to look for when assessing whether to use a NuGet package in general. However, given the risk of a rogue source generator creating malicious code, the packages need closer scrutiny.

In summary, here are a few things to consider before using a source generator from NuGet

  • Is the package from a reputable source? If the package is from a commercial vendor, this is fairly easy to verify. However, for open source projects, this is a bit harder, which leads to the next check
  • Is the source code available? If the package is open source, the source code will usually be available on GitHub. For source generators, it is always worth looking over the source code and checking what the generator is doing.
  • If the source code is not available, it may be worth considering using a tool such as JetBrains DotPeek to decompile the code. Two things to consider with this, though. The first is that this may be a breach of the license terms, so you need to tread carefully. The second is that the code may have been obfuscated, so will be harder to read and work out what it is doing.
  • For open source projects, can you be sure that the NuGet package published has been built from the code in the GitHub repository. Subject to licensing, you may want to consider forking the repo and creating your own build and manage within your own NuGet feed
  • Consider using tools to vet packages or static source code analysers to check on generated code.

Ensuring Generated Code Is Visible

Whilst ideally, you will have done the due diligence described above, one of the safeguards you can put in place is to make some changes to your csproj file to ensure that the generated code is visible, so that it can be viewed and tracked in source control.

Writing Compiler Generated Code into Files

By default, generated source code does not get written to files that you can see as the code is part of the Roslyn compile process. This makes it easy for bad actors providing code generators to hide under the radar.

To address this, there are two properties you can add to your project file that will make the generated code visible and trackable in source control

Project File With Code Generation Output using EmitCompilerGeneratedFiles and CompileGeneratedFileOutputPath

The first is the EmitCompilerGeneratedFiles property. This will write out any generated code files to the file system. It should be noted, that the contents of these files are effectively just a log of what has been added to the compilation process. The files themselves do not get used as part of the compilation.

By default, these files with be written to the obj folder in a path structure of

obj\BuildProfile\Platform\generated\generator assembly\generator namespace+name

Within this path there will be one or many files depending on how the source code generator has split the code into virtual files using the ‘hintname’ parameter to the AddSource method on the GeneratorExecutionContext at the end of the generator’s Execute method.

Whilst this gives some visibility, having the files in the obj folder is not really of help as this is usually excluded from source control.

This is where the second project property comes into play.

The CompilerGeneratedFilesOutputPath property allows you to specify a path. This will usually be relative to the consuming project’s path. I typically use a path of ‘GeneratedFiles’.

Beneath this path, the structure is \generator assembly\generator namespace+name, again with one or more files depending on how the generated divides the generated code.

As with the files within the obj folder, these files do not take part in the actual compilation. However, this is where things get a bit tricky as by default, any *.cs files within the project folder structure get included. This causes the compiler to generate a whole load of errors stating that there is duplicate definitions of code in the files that have been output.

To get around this, an extra section needs to be added to the project file.

Image showing the use of the Remove element within the Compile Element in the project fileAdding the this section excludes the generated files from compilation as they are there purely so that we can see the output in source control (as the actual generated code is already in the compilation pipeline in memory).

<ItemGroup>
<Compile Remove=”$(CompilerGeneratedFilesOutputPath)\**” />
</ItemGroup>

Now that we have the files in the project file structure, it can be added to source control and any changes tracked. This then means that the files form part of the code review process and can be checked for anything ‘dodgy’ going on in the code

View Generated Files in Visual Studio

Since VS2019 16.10 and now in VS2022, the Solution Explorer window can now drill down into generated files by expanding each source code generator listed under the Analyzers node in the Solution Explorer window.


You may be interested in my previous blog post where I show how to set up source generator debugging in Visual Studio 2019. Whilst a couple of screens differ in VS2022 (due to the new debugging profile window), the guide works as well


This feature does not require the EmitCompilerGeneratedFiles to be enabled as this uses the in-memory compiler generated code (which *should* be the same as the emitted files.

In the screen shot below, I enabled the ‘Show All Files’ button and expanded out both the compiler generated output in the Analyzers node and also the emitted output files in the GeneratedFiles folder

VS Solution Explorer showing generated files

One thing to also consider is that a sneaky attack may generate different code depending on whether the build is using a Debug or Release profile (as developers usually build locally in debug mode, whilst the CI build engine will be using release)

Demo Code

I have made the solution shown in the above image available in a GitHub repo at https://github.com/stevetalkscode/sourcegeneratorattacks

The solution illustrates the techniques described above using two source code generators.

The first is a simple generator that creates a class that has some of the danger signs to look out for based on Marten’s blog.

The second is a demonstration of the new System.Text.Json code generation that has been introduced with .NET 6 for generating serialisation and deserialisation code that would previously have been handled by reflection (and is still the default behaviour unless explicitly used as shown in the demo).

Conclusion

The above discussion may put you off using source generators. This should not be the case as this feature of .NET is incredibly powerful and is starting to be used by Microsoft in .NET 6 with the improvements made to Json serialisation, razor page compilation and logging.

But if you are using source generators that you (or your team) have not written, you need to be aware of the potential dangers of not verifying where the generator has come from and what it is doing to your code.

“Let’s be careful out there!”


While You Are Here …

If you have enjoyed this blog post, you may be interested in my talk that I presented at DDD East Midlands where I discuss my experiences with source code generation of various shades over the past 30 years.

Using OpenApiReference To Generate Open API Client Code

This is not one of my usual blogs, but an aide-mémoire for myself that may be of use to other people who are using the OpenApiReference tooling in their C# projects to generate C# client code for HTTP APIs from Swagger/OpenApi definitions.

Background

I have been using Rico Suter’s brilliant NSwag for some time now to generate client code for working with HTTP API endpoints. Initially I was using the NSwag Studio application to create the C# code and placing the output into my project, but then I later found I could add the code generation into build process using the NSwag MSBuild task.

Recently though, I watched the ASP.NET Community Standup with Jon Galloway and Brady Gaster. In the last half hour or so, they discuss the Connected Services functionality in Visual Studio 2019 that sets up code generation of HTTP API endpoint clients for you.

This feature had passed me by and watching the video got my curiosity going as to whether the build chain that I have been using for the last couple of years could be simplified. Especially given that, behind the scenes, it is using NSwag to do the code generation.

Using Connected Services

Having watched the video above, I recommend reading Jon Galloway’s post Generating HTTP API clients using Visual Studio Connected Services As that post covers the introduction to using Connected Services, I won’t repeat the basics again here.

It is also worth reading the other blog posts in the series written by Brady Gaster:

One of the new things I learnt from the video and blog posts is to make sure that your OpenApi definitions in the source API include an OperationId (which you can set by overloads of  the HttpGet, HttpPost (etc) attributes on your Action method) to help the code generator assign a ‘sensible’ names to the calling method in the code generated client.

Purpose of This Post

Having started with using the Visual Studio dialogs to set up the Connected Service, the default options may not necessarily match with how you want the generated code to work in your particular project.

Having had a few years’ experience of using NSwag to generate code, I started to dig deeper into how to get the full customisation I have been used to from using the “full” NSwag experience but within the more friendly build experience of using the OpenApiReference project element.

One Gotcha To Be Aware Of!

If you use the Connected Services dialog in Visual Studio to create the connected service, you will hit a problem if you have used a Directory.Packages.props file to manage your NuGet packages centrally across your solution. The Connnected Services wizard (as at time of writing) tries to specific versions of NuGet packages.

This is part of a wider problem in Visual Studio (as at time of writing) where the NuGet Package Manager interaction clashes with the restrictions applied in Directory.Packages.props. However, this may be addressed in future versions as of the NuGet tooling and Visual Studio per this Wiki post.

If you are not familiar with using Directory.Packages.props, have a look at this blog post from Stuart Lang

Manually Updating the OpenApiReference Entry in your Project

There isn’t much documentation around on how to make adjustments to the OpenApiReference element that gets added to your csproj file, so hopefully this post will fill in some of the gaps until documentation is added to the Microsoft Docs site.

I have based this part of the post on looking through the source code at https://github.com/dotnet/aspnetcore/tree/main/src/Tools/Extensions.ApiDescription.Client and therefore some of my conclusions may be wrong, so proceed with caution if making changes to your project.

The main source of information is the Microsoft.Extensions.ApiDescription.Client.props file which defines the XML schema and includes comments that I have used here.

The OpenApiReference and OpenApiProjectReference Elements

These two elements can be added one or multiple times within an ItemGroup in your csproj file.

The main focus of this section is the OpenApiReference element that adds code generation to the current project for a specific OpenApi JSON definition.

The OpenApiProjectReference allows external project references to be added as well. More on this below,

The following attributes and sub-elements are the main areas of interest within the OpenApiReference element.

The props file makes references to other properties that live outside of the element that you can override within your csproj file.

As I haven’t used the TypeScript generator, I have focussed my commentary on the NSwagCSharp code generator.

Include Attribute (Required)

The contents of the Include attribute will depend on which element you are in.

For OpenApiReference this will be the path to the OpenApi/Swagger json file that will be the source the code generator will use.

For OpenApiProjectReference this will be the path to another project that is being referenced.

ClassName Element (Optional)

This is the name to give the class that will be generated. If not specified, the class will default to the name given in the OutputPath parameter (see below)

CodeGenerator Element (Required)

The default value is ‘NSwagCSharp’. This points to the NSwag C# client generator, more details of which below.

At time of writing, only C# and TypeScript are supported, and the value here must end with either “CSharp” or “TypeScript”. Builds will invoke a MSBuild target named “Generate(CodeGenerator)” to do actual code generation. More on this below.

Namespace Element (Optional)

This is the namespace to assign the generated class to. If not specified, the RootNamespace entry from your project will be used to put the class within your project’s namespace. You may choose to be more specific with the NSwag specific commands below.

Options Element (Optional)

These are the customisation instructions that will be passed to the code generator as command line options. See Customising The Generated Code with NSwag Commands below for details about usage with the NSwagCSharp generator

One of the problems I have been having with this element is that the contents are passed to the command line of the NSwagCSharp as-is and therefore you cannot include line breaks to make it more readable.

It would be nice if there was a new element that allows each command option to be listed as an XML sub-element in its own right that the MSBuild target concatenates and parses into the single command line to make editing the csproj file a bit easier.

Possible Options Declaration

OutputPath (Optional)

This is the path to place generated code into. It is up to the code generator as to whether to interpret the path as a filename or as a directory.

The Default filename or folder name is %(Filename)Client.[cs|ts].

Filenames and relative paths (if explicitly set) are combined with
$(OpenApiCodeDirectory). Final value is likely to be a path relative to
the client project.

GlobalPropertiesToRemove (OpenApiProjectReference Only – Optional)

This is a semicolon-separated list of global properties to remove in a ProjectReference item created for the OpenApiProjectReference. These properties, along with Configuration, Platform, RuntimeIdentifier and
TargetFrameworks, are also removed when invoking the ‘OpenApiGetDocuments’ target in the referenced project.

Other Property Elements

In the section above, there are references to other properties that get set within the props file.

The properties can be overridden within your csproj file, so for completeness, I have added some commentary here

OpenApiGenerateCodeOptions

The Options element above if not populated defaults to the contents of this element, which in of itself is empty by default.

As per my comment above for Options, this suffers the same problem of all command values needing to be on the same single line.

OpenApiGenerateCodeOnBuild

If this is set to ‘true’ (the default), code is generated for the OpenApiReference element and any OpenApiProjectReference items before the BeforeCompile target is invoked.

However, it may be that you do not want the generated called on every single build as you may have set up a CI pipeline where the target is explicitly invoked (via a command line or as a build target) as a build step before the main build. In that case, the value can be set to ‘false’

OpenApiGenerateCodeAtDesignTime

Similar to OpenApiGenerateCodeOnBuild above, but this time determines whether to generate code at design time as well as being part of a full build. This is set to true by default.

OpenApiBuildReferencedProjects

If set to ‘true’ (the default), any projects referenced in an ItemGroup containing one or many OpenApiProjectReference elements will get built before retrieving that project’s OpenAPI documents list (or generating code).

If set to ‘false’, you need to ensure the referenced projects are built before the current project within the solution or through other means (such as a build pipeline) but IDEs such as Visual Studio and Rider may get confused about the project dependency graph in this case.

OpenApiCodeDirectory

This is the default folder to place the generated code into. The value is interpreted relative to the project folder, unless already an absolute path. This forms part of the default OutputPath within the OpenApiReference above and the OpenApiProjectReference items.

The default value for this is BaseIntermediateOutputPath which is set elsewhere in your csproj file or is implicitly set by the SDK.

Customising The Generated Code with NSwag Commands

Here we get to the main reason I have written this post.

There is a huge amount of customisation that you can do to craft the generated code into a shape that suits you.

The easiest way to get an understanding of the levels of customisation is to use NSwag Studio to pay around with the various customisation options and see how the options affect the generated code.

Previously when I have been using the NSwag MSBuild task, I have pointed the task to an NSwag configuration json file saved from the NSwag Studio and let the build process get on with the job of doing the code generation as a pre-build task.

However, the OpenApiReference task adds a layer of abstraction that means that just using the NSwag configuration file is not an option. Instead, you need to pass the configuration as command line parameters via the <Options> element.

This can get a bit hairy for a couple of reasons.

  • Firstly, each command has to be added to one single line which can make your csproj file a bit unwieldy to view if you have a whole load of customisations that you want to make (scroll right, scroll right, scroll right again!)
  • Secondly, you need to know all the NSwag commands and the associated syntax to pass these to the Options element.

Options Syntax

Each option that you want to pass takes the form of a command line parameter which

  • starts with a forward slash
  • followed by the command
  • then a colon and then
  • the value to pass to the command

So, something like this: /ClientBaseClass:ClientBase

The format of the value depends on the value type of the command of which there are three common ones

  • boolean values are set with true or false. E.g. /GenerateOptionalParameters:true
  • string values are set with the string value as-is. E.g. /ClassStyle:Poco
  • string arrays are comma delimited lists of string values. E.g.
    /AdditionalNamespaceUsages:MyNamespace1,MyNamespace2,MyNamespace3

The following table is a GitHub gist copy from the GitHub repository I have set up for this and which I plan to update over time as I get a better understanding of each command and its effect on the generated code.

At time of writing, many of the descriptions have been lifted from the XML comments in the code from the NSwag repository on GitHub.

(Apologies the format of the imported markdown here is not great. I hope to make this a bit better later when I can find the time. You may want to go direct to the gist directly)

Conclusion

The new tooling makes the code generation build process itself a lot simpler, but there are a few hoops to jump through to customise the code generated.

I’ve been very impressed with the tooling and I look forward to seeing how to it progresses in the future.

I hope that this blog is of help to anyone else who wants to understand more about the customisation of the OpenApiReference tooling and plugs a gap in the lack of documentation currently available.