Using OpenApiReference To Generate Open API Client Code

This is not one of my usual blogs, but an aide-mémoire for myself that may be of use to other people who are using the OpenApiReference tooling in their C# projects to generate C# client code for HTTP APIs from Swagger/OpenApi definitions.

Background

I have been using Rico Suter’s brilliant NSwag for some time now to generate client code for working with HTTP API endpoints. Initially I was using the NSwag Studio application to create the C# code and placing the output into my project, but I later found I could add the code generation into the build process using the NSwag MSBuild task.

Recently though, I watched the ASP.NET Community Standup with Jon Galloway and Brady Gaster. In the last half hour or so, they discuss the Connected Services functionality in Visual Studio 2019 that sets up code generation of HTTP API endpoint clients for you.

This feature had passed me by and watching the video got my curiosity going as to whether the build chain that I have been using for the last couple of years could be simplified. Especially given that, behind the scenes, it is using NSwag to do the code generation.

Using Connected Services

Having watched the video above, I recommend reading Jon Galloway’s post “Generating HTTP API clients using Visual Studio Connected Services”. As that post covers the introduction to using Connected Services, I won’t repeat the basics here.

It is also worth reading the other blog posts in the series written by Brady Gaster.

One of the new things I learnt from the video and blog posts is to make sure that your OpenApi definitions in the source API include an OperationId (which you can set via overloads of the HttpGet, HttpPost, etc. attributes on your Action method). This helps the code generator assign ‘sensible’ names to the calling methods in the generated client.

Purpose of This Post

Having started with using the Visual Studio dialogs to set up the Connected Service, the default options may not necessarily match with how you want the generated code to work in your particular project.

Having had a few years’ experience of using NSwag to generate code, I started to dig deeper into how to get the full customisation I have been used to from using the “full” NSwag experience but within the more friendly build experience of using the OpenApiReference project element.

One Gotcha To Be Aware Of!

If you use the Connected Services dialog in Visual Studio to create the connected service, you will hit a problem if you have used a Directory.Packages.props file to manage your NuGet packages centrally across your solution. The Connected Services wizard (as at time of writing) tries to add specific versions of NuGet packages.

This is part of a wider problem in Visual Studio (as at time of writing) where the NuGet Package Manager interaction clashes with the restrictions applied in Directory.Packages.props. However, this may be addressed in future versions of the NuGet tooling and Visual Studio per this Wiki post.

If you are not familiar with using Directory.Packages.props, have a look at this blog post from Stuart Lang.

Manually Updating the OpenApiReference Entry in your Project

There isn’t much documentation around on how to make adjustments to the OpenApiReference element that gets added to your csproj file, so hopefully this post will fill in some of the gaps until documentation is added to the Microsoft Docs site.

I have based this part of the post on looking through the source code at https://github.com/dotnet/aspnetcore/tree/main/src/Tools/Extensions.ApiDescription.Client and therefore some of my conclusions may be wrong, so proceed with caution if making changes to your project.

The main source of information is the Microsoft.Extensions.ApiDescription.Client.props file which defines the XML schema and includes comments that I have used here.

The OpenApiReference and OpenApiProjectReference Elements

These two elements can be added one or multiple times within an ItemGroup in your csproj file.

The main focus of this section is the OpenApiReference element that adds code generation to the current project for a specific OpenApi JSON definition.

The OpenApiProjectReference allows external project references to be added as well. More on this below.

The following attributes and sub-elements are the main areas of interest within the OpenApiReference element.

The props file makes references to other properties that live outside of the element that you can override within your csproj file.

As I haven’t used the TypeScript generator, I have focussed my commentary on the NSwagCSharp code generator.

Include Attribute (Required)

The contents of the Include attribute will depend on which element you are in.

For OpenApiReference this will be the path to the OpenApi/Swagger json file that will be the source the code generator will use.

For OpenApiProjectReference this will be the path to another project that is being referenced.

ClassName Element (Optional)

This is the name to give the class that will be generated. If not specified, the class will default to the name given in the OutputPath parameter (see below).

CodeGenerator Element (Required)

The default value is ‘NSwagCSharp’. This points to the NSwag C# client generator, more details of which below.

At time of writing, only C# and TypeScript are supported, and the value here must end with either “CSharp” or “TypeScript”. Builds will invoke an MSBuild target named “Generate(CodeGenerator)” to do the actual code generation. More on this below.

Namespace Element (Optional)

This is the namespace to assign the generated class to. If not specified, the RootNamespace entry from your project will be used to put the class within your project’s namespace. You may choose to be more specific with the NSwag specific commands below.

Options Element (Optional)

These are the customisation instructions that will be passed to the code generator as command line options. See Customising The Generated Code with NSwag Commands below for details about usage with the NSwagCSharp generator.

One of the problems I have been having with this element is that the contents are passed to the command line of the NSwagCSharp as-is and therefore you cannot include line breaks to make it more readable.

It would be nice if there was a new element that allows each command option to be listed as an XML sub-element in its own right that the MSBuild target concatenates and parses into the single command line to make editing the csproj file a bit easier.

Possible Options Declaration

OutputPath (Optional)

This is the path to place generated code into. It is up to the code generator as to whether to interpret the path as a filename or as a directory.

The default filename or folder name is %(Filename)Client.[cs|ts].

Filenames and relative paths (if explicitly set) are combined with $(OpenApiCodeDirectory). The final value is likely to be a path relative to the client project.
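As a worked illustration of that defaulting rule, the shell sketch below computes the path I would expect for a generated C# client. The function name default_output_path and the sample file names are made up for the example, and obj is only assumed as the usual value of BaseIntermediateOutputPath. The real logic lives in the MSBuild props and targets files, so treat this as a way of reading the rule rather than a copy of it.

```shell
# Hypothetical helper: mimic the documented default of
# $(OpenApiCodeDirectory) + %(Filename)Client.cs for a C# client.
default_output_path() {
    include=$1                 # value of the Include attribute
    code_dir=${2:-obj}         # assumed OpenApiCodeDirectory (obj by default)
    name=$(basename "$include")
    name=${name%.*}            # %(Filename) is the file name without extension
    printf '%s/%sClient.cs\n' "${code_dir%/}" "$name"
}
```

For example, default_output_path OpenAPIs/petstore.json prints obj/petstoreClient.cs.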

GlobalPropertiesToRemove (OpenApiProjectReference Only – Optional)

This is a semicolon-separated list of global properties to remove in a ProjectReference item created for the OpenApiProjectReference. These properties, along with Configuration, Platform, RuntimeIdentifier and TargetFrameworks, are also removed when invoking the ‘OpenApiGetDocuments’ target in the referenced project.

Other Property Elements

In the section above, there are references to other properties that get set within the props file.

The properties can be overridden within your csproj file, so for completeness, I have added some commentary here.

OpenApiGenerateCodeOptions

If the Options element above is not populated, it defaults to the contents of this element, which is itself empty by default.

As per my comment above for Options, this suffers the same problem of all command values needing to be on the same single line.

OpenApiGenerateCodeOnBuild

If this is set to ‘true’ (the default), code is generated for the OpenApiReference element and any OpenApiProjectReference items before the BeforeCompile target is invoked.

However, it may be that you do not want the generator called on every single build, as you may have set up a CI pipeline where the target is explicitly invoked (via a command line or as a build target) as a build step before the main build. In that case, the value can be set to ‘false’.

OpenApiGenerateCodeAtDesignTime

Similar to OpenApiGenerateCodeOnBuild above, but this time determines whether to generate code at design time as well as being part of a full build. This is set to true by default.

OpenApiBuildReferencedProjects

If set to ‘true’ (the default), any projects referenced in an ItemGroup containing one or many OpenApiProjectReference elements will get built before retrieving that project’s OpenAPI documents list (or generating code).

If set to ‘false’, you need to ensure the referenced projects are built before the current project within the solution or through other means (such as a build pipeline) but IDEs such as Visual Studio and Rider may get confused about the project dependency graph in this case.

OpenApiCodeDirectory

This is the default folder to place the generated code into. The value is interpreted relative to the project folder, unless already an absolute path. This forms part of the default OutputPath within the OpenApiReference above and the OpenApiProjectReference items.

The default value for this is BaseIntermediateOutputPath which is set elsewhere in your csproj file or is implicitly set by the SDK.

Customising The Generated Code with NSwag Commands

Here we get to the main reason I have written this post.

There is a huge amount of customisation that you can do to craft the generated code into a shape that suits you.

The easiest way to get an understanding of the levels of customisation is to use NSwag Studio to play around with the various customisation options and see how the options affect the generated code.

Previously when I have been using the NSwag MSBuild task, I have pointed the task to an NSwag configuration json file saved from the NSwag Studio and let the build process get on with the job of doing the code generation as a pre-build task.

However, the OpenApiReference task adds a layer of abstraction that means that just using the NSwag configuration file is not an option. Instead, you need to pass the configuration as command line parameters via the <Options> element.

This can get a bit hairy for a couple of reasons.

  • Firstly, each command has to be added to one single line which can make your csproj file a bit unwieldy to view if you have a whole load of customisations that you want to make (scroll right, scroll right, scroll right again!)
  • Secondly, you need to know all the NSwag commands and the associated syntax to pass these to the Options element.

Options Syntax

Each option that you want to pass takes the form of a command line parameter which

  • starts with a forward slash
  • followed by the command
  • then a colon and then
  • the value to pass to the command

So, something like this: /ClientBaseClass:ClientBase

The format of the value depends on the value type of the command, of which there are three common ones:

  • boolean values are set with true or false. E.g. /GenerateOptionalParameters:true
  • string values are set with the string value as-is. E.g. /ClassStyle:Poco
  • string arrays are comma delimited lists of string values. E.g.
    /AdditionalNamespaceUsages:MyNamespace1,MyNamespace2,MyNamespace3
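Because everything in the Options element has to sit on one line, one way I can see to keep things readable is to hold the options in a text file, one per line, and collapse them into a single line before pasting the result into the csproj file. The shell sketch below is only an illustration of that idea and is not part of the OpenApiReference tooling; the function name join_options and the file name nswag-options.txt are hypothetical.

```shell
# Hypothetical helper: read options one per line on stdin and print them
# as the single space-separated line that the Options element expects.
join_options() {
    awk '
        /^[ \t]*#/ { next }                  # skip comment lines
        /^[ \t]*$/ { next }                  # skip blank lines
        {
            gsub(/^[ \t]+|[ \t]+$/, "")      # trim surrounding whitespace
            printf "%s%s", sep, $0
            sep = " "
        }
        END { print "" }
    '
}
```

Running join_options < nswag-options.txt on a file containing the three example options above would print them on one line, separated by single spaces.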

The following table is a GitHub gist copy from the GitHub repository I have set up for this and which I plan to update over time as I get a better understanding of each command and its effect on the generated code.

At time of writing, many of the descriptions have been lifted from the XML comments in the code from the NSwag repository on GitHub.

(Apologies, the format of the imported markdown here is not great. I hope to make this a bit better later when I can find the time. You may want to go directly to the gist.)

Conclusion

The new tooling makes the code generation build process itself a lot simpler, but there are a few hoops to jump through to customise the code generated.

I’ve been very impressed with the tooling and I look forward to seeing how it progresses in the future.

I hope that this blog is of help to anyone else who wants to understand more about the customisation of the OpenApiReference tooling and plugs a gap in the lack of documentation currently available.

Merging Multiple Git Repositories Into A Mono-Repo with PowerShell (Part 2)

Background

Following on from Part 1 where I give the background as to the reasons that I wanted to move to a single Git repository (also known as a mono-repo), this post provides a walk-through of the PowerShell script that I created to do the job.

The full script can be found on GitHub in the MigrateToGitMonoRepo repository. The script makes use of three ‘dummy’ repos that I have also created there. In addition, it shows how to include repositories from other URLs by pulling in an archived Microsoft repository for MS-DOS.

Before Running the Script

There are a few things to be aware of when considering using the script.

The first is that I am neither a PowerShell nor Git expert. The script has been put together to achieve a goal I had and has been shared in the hope that it may be of use to other people (even if it is just my future self). I am sure there are more elegant ways of using both these tools, but the aim here was to get the job done as it is a ‘one-off’ process. Please feel free to fork the script and change it as much as you want for your own needs with my blessing.

The second thing to know is that the Git command line writes information to StdErr and therefore, when running, a lot of Git information will appear in red. All this ‘noise’ does make it hard to identify genuine errors. To this end, when developing and running the script, I used the PowerShell ISE to add breakpoints and step through the execution of code so I could spot when things were going wrong.
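If you would rather not step through everything in a debugger, another option is to judge each step by its exit code instead of by what is written to StdErr. The script itself does not do this; the shell sketch below, with the hypothetical name run_step, just shows the idea of capturing both output streams and only reporting them when the command actually fails.

```shell
# Hypothetical helper: run a command, hide its chatter on success,
# and show the captured output only when the exit code is non-zero.
run_step() {
    status=0
    output=$("$@" 2>&1) || status=$?     # capture stdout and stderr together
    if [ "$status" -ne 0 ]; then
        printf 'FAILED (%s): %s\n%s\n' "$status" "$*" "$output" >&2
    fi
    return "$status"
}
```

A call such as run_step git fetch origin would then stay silent unless Git reports a failure through its exit code.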

The last thing to be aware of is that there is no error handling within the script. For example, if a repo can’t be found or a branch specified for merging is not present, you may have unexpected results of empty temporary directories being rolled forward and then appearing as errors when Git tries to move and rename those directories.

With this said, the rest of the post will focus on how to use the script and some things I learnt along the way while writing it.

Initialising Variables

At the start of the script there are a number of variables that you will need to set.

The $GitTargetRoot and $GitTargetFolder refer to the file system directory structure. If you do not want a double nested directory structure, you can override this further down in the script. The reason I did this is that I like to have a single root for all my Git repos on the file system (C:\Git) and then a directory per repo under this.

The $GitArchiveFolder and $GitArchiveTags will be used as part of the paths in the target repo to respectively group all the existing branches and existing tags together so that there is less ‘noise’ when navigating to branches and tags created post-merge.

If all the existing repositories have the same root URL it can be set in the $originRoot variable. This can be overridden later on in the script to bring in repositories from disparate sources (in the script example, we pull in the MS-DOS archive repository from Microsoft’s GitHub account).

While the merge is in progress, it is important to avoid clashes with directory names on the file system and branch names in Git.

The $newMergeTarget and $TempFolderPrefix are used for the purpose of creating non-clashing versions of these. There is a clean up at the end of the script to rename temporary folders on the file system. The script does not automatically rename the target branch as this should be a manual process after the merge when ready to push to a new origin.

Define the Source Repositories and Merge Behaviour

The next stage in the script is to define all the existing repositories that you want to merge into a single repository. To keep the script agnostic in terms of PowerShell versions, I have used the pscustomobject type instead of using classes (supported from PowerShell 5 onwards).

In each entry, the following values should be set:

originRoot is usually left as an empty string to indicate that the root specified globally at the start of the script should be used. In the example, the last entry demonstrates pulling in a repo from a different origin.

repo is the repository within the origin. In the example I have three dummy repositories that I have created in my GitHub account that can be used as a trial run to get used to how the script works before embarking on using your own repositories.

folder is the file system directory that the contents of the repository will be moved to once the migration is complete. This is used to ensure that there are no clashes between directories of the same name within different repositories. You are free to change how the overall hierarchy is structured once the migration is complete.

subDirectory is usually an empty string, but if you have several repositories that you want to logically group together in the file system hierarchy, you can set folder to the same value (e.g. Archived) and then use subDirectory to define the target for each repo under that common area.

mergeBranch is the branch in the source repository that you want to merge into the common branch specified in $newMergeTarget. In most cases, this will be your ‘live’ branch with a name like main, master, develop or build. If left as an empty string, the repository will be included in the new mono-repo, but will effectively be orphaned into the archive branches and tags.

In my real-world case, the team had a few repositories that were created and had code committed, but the code never went anywhere, so it was not needed in the new main branch. However, we still wanted access to the contents for reference.

tempFolder is a belt-and-braces effort to ensure that there are no folder clashes if the new folder name in folder happens to exist in another repository while merging. The value here will be appended to the global $TempFolderPrefix with the intention of creating a unique directory name.

File System Clean Up

Before getting into the main process loop, the script does some cleaning up to ensure that previous runs of the script are deleted from the file system to ensure a clean run. If you want to compare results, you may want to change this so that previous runs are archived by renaming the folder.

Once cleaned up, a new Git repository is created and an initial commit is created in the new branch. This is required so that Git merges can take place from here on.
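As a rough equivalent in plain Git commands, the step above might look like the shell sketch below. The function name init_mono_repo is hypothetical, and using an empty commit is my assumption for the illustration; the PowerShell script may create its first commit differently.

```shell
# Hypothetical helper: create the target repository with one (empty)
# commit on the working branch so later merges have something to merge into.
init_mono_repo() {
    target_dir=$1        # e.g. the combination of $GitTargetRoot and $GitTargetFolder
    merge_branch=$2      # e.g. the value of $newMergeTarget
    git init "$target_dir" &&
    git -C "$target_dir" checkout -b "$merge_branch" &&
    git -C "$target_dir" commit --allow-empty -m "Initial commit for repository migration"
}
```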

The Main Loop

With the array of source metadata created, we move into the main loop. I won’t go into a line by line breakdown here, but instead give an overview of the process.

The first thing to do for each repository is to set it as an origin and pull down all the branches to the file system. An important thing to note about the Git Pull is the --allow-unrelated-histories switch. Without this, Git will complain that there are no common histories to merge.

As an aside, if your source repository is large, this may take some time. When developing the script, I thought something had gone wrong. It hadn’t; it was just slow.

With that done, we can then enter a loop of iterating through each branch and checking it out to its new branch name in the new repository (in effect, performing a logical move of the branch into an archive branch, but really this is just using branch naming conventions to create a logical hierarchy).

You may notice some pattern matching going on in this area of the script. The reason for this is that the Git branch -r command to list all the remote branches includes a line indicating where origin/HEAD is pointing. We do not need this as we are only interested in the actual branch names.

Screen shot of Git output when listing remote branches
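To illustrate the filtering and renaming described above, here is a minimal shell sketch. It is not the PowerShell from the script: the function names are hypothetical, the archive/repo/branch naming layout is an assumption, and git branch --no-track is used to create each archive branch without checking it out, which is a small variation on what the script does.

```shell
# Hypothetical helper: read `git branch -r` output on stdin and print the
# plain branch names for one remote, dropping the "origin/HEAD -> ..." line.
list_remote_branches() {
    remote=${1:-origin}
    sed -e 's/^[ *]*//' | grep -v -e ' -> ' | sed -e "s|^$remote/||"
}

# Hypothetical helper: create an archive branch for every remote branch.
archive_branches() {
    archive_folder=$1    # e.g. the value of $GitArchiveFolder
    repo=$2              # name of the repository being migrated
    git branch -r | list_remote_branches origin | while IFS= read -r branch; do
        git branch --no-track "$archive_folder/$repo/$branch" "origin/$branch"
    done
}
```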

Once all the branches have been checked out and renamed, we return to our common branch and remove the remote.

At this point, if we have specified a branch to merge into our common branch, the script will then

  • merge the specified branch, again using the --allow-unrelated-histories switch to let Git know that the merge has no common history to work with
  • create a temporary folder (as defined in the array of metadata) in the common branch
  • move the complete contents of the branch to that temporary folder

Care is needed in this last step once we have performed the first merge, as the common folder will include previously merged repositories in their temporary folders. Therefore, to avoid these temporary folders being moved, we build up a list of the temporary folders we have created on each iteration and add them to the exclude list that is fed into the Git mv command.
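The merge-then-move step, including the exclude list, can be sketched in shell as follows. This is an approximation of the idea rather than a translation of the script: merge_into_temp_folder is a hypothetical name, it is assumed to run from the root of the new repository, and the final commit is my addition for the illustration.

```shell
# Hypothetical helper: merge one branch into the current (common) branch,
# then move everything it brought in into its own temporary folder.
merge_into_temp_folder() {
    source_branch=$1     # branch to merge, e.g. archive/Repo1/main
    temp_folder=$2       # temporary folder for this repository
    shift 2              # remaining arguments: temp folders from earlier merges
    git merge --allow-unrelated-histories -m "Merge $source_branch" "$source_branch" || return 1
    mkdir -p "$temp_folder"
    for entry in * .[!.]*; do
        [ -e "$entry" ] || continue                  # glob pattern matched nothing
        if [ "$entry" = ".git" ] || [ "$entry" = "$temp_folder" ]; then
            continue
        fi
        skip=no
        for earlier in "$@"; do                      # the exclude list
            if [ "$entry" = "$earlier" ]; then skip=yes; fi
        done
        if [ "$skip" = no ]; then
            git mv "$entry" "$temp_folder/"
        fi
    done
    git commit -m "Move $source_branch into $temp_folder"
}
```

Each call passes the temporary folders created so far, so the second repository would be merged with its own folder name followed by the first repository's folder, and so on.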

At this point, an error can creep in if the branch name specified in the item metadata does not exist in the source repository. When writing the script I received Git errors indicating there were no files to move and ended up with empty temporary folders littered around the new repository.

Again, you may choose to put some error handling in or, on the other hand, just correct the branch name and repeat the process from the start again.

Before moving to the next item in the metadata array, the script copies all the tags to the logical folder of tags specified in $GitArchiveTags.
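In Git terms, a tag ‘folder’ is just a naming convention, so the copy can be sketched in shell like this. The function name archive_tags and the folder/repo/tag layout are hypothetical, and deleting the original tag after copying is my assumption (it keeps identically named tags from later repositories from clashing); the script may handle this differently.

```shell
# Hypothetical helper: copy each tag to an archive name and drop the original.
archive_tags() {
    archive_tags_folder=$1   # e.g. the value of $GitArchiveTags
    repo=$2                  # name of the repository being migrated
    git tag -l | grep -v "^$archive_tags_folder/" | while IFS= read -r tag; do
        # "^{}" resolves an annotated tag to the object it points at, so the
        # copy is a lightweight tag and the annotation message is not kept
        git tag "$archive_tags_folder/$repo/$tag" "$tag^{}"
        git tag -d "$tag"
    done
}
```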

The Post Migration Clean Up

Once the migration has completed, there is a bit of tidying up to do.

If you remember, to avoid clashes between directories while the migration takes place, we used temporary directory names. We now need to do a sweep through to rename those temporary directory names to the intended destination names.
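A minimal shell sketch of that rename, assuming it is run from the root of the new repository, is shown below. The function name finalise_folder is hypothetical, and its arguments correspond loosely to the tempFolder, folder and subDirectory values described earlier.

```shell
# Hypothetical helper: move a temporary folder to its intended destination.
finalise_folder() {
    temp_folder=$1           # temporary folder created during the merge
    folder=$2                # final top-level folder
    sub_directory=${3:-}     # optional grouping below the top-level folder
    if [ -n "$sub_directory" ]; then
        mkdir -p "$folder"   # the parent must exist before git mv is called
        git mv "$temp_folder" "$folder/$sub_directory"
    else
        git mv "$temp_folder" "$folder"
    fi
}
```

The renames still need to be committed afterwards, for example with a single git commit once every folder has been moved.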

At this point, we are ready with the final mono-repo.

If you have run the script ‘as-is’ using my demo values, when you look on your file system, it should look like this:

Screen shot of file system using the examples in the script

If you use a tool such as Atlassian SourceTree, you get a visual idea of what we have achieved with the merge process.

Screen shot of SourceTree view of the migrated repository using the examples in the script

Before Pushing to a Remote

With our migrated repository, we are now almost ready to push it up to a remote (be it GitHub, Azure DevOps, BitBucket et al).

However, at this point you may want to do some tidying up of renaming the __RepoMigration branch to main.

The repository is now in a state where you are ready to push it to a remote ‘as-is’. On the other hand, you may want to create an empty repository in the remote up front and then merge the migrated repository into it. If you do this, remember to use git pull --all --allow-unrelated-histories -v after adding the new remote.

At the end of the script, there is a commented out section that provides the commands I used to push up all the branches and tags created.
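For reference, the rename and push can be expressed with standard Git commands along the lines of the shell sketch below. It is not copied from the commented out section of the script; the function name publish_mono_repo is hypothetical and the remote URL is whatever new, empty repository you have created.

```shell
# Hypothetical helper: rename the working branch and push everything.
publish_mono_repo() {
    remote_url=$1                              # URL of the new remote
    git branch -m __RepoMigration main &&      # rename the migration branch
    git remote add origin "$remote_url" &&
    git push origin --all &&                   # every local branch
    git push origin --tags                     # every tag
}
```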

Alternatively, you may want to take manual control via the Git command line (or a GUI tool such as SourceTree).

Lessons Learnt

I have already mentioned earlier about problems with non-existent branches being specified, but there are other things to know.

My first piece of advice is to use the PowerShell Integrated Script Editor (ISE) to single step your way through the script using my dummy repositories to familiarise yourself with how the script works.

Once familiar, start with using one or two of your own repositories that are small and simple to migrate, to get a feel for how you want to merge branches into the new ‘main’ branch.

By single stepping, you will get instant feedback of errors occurring. As mentioned above, because Git writes to StdErr, it is hard to tease out the errors if running the script from start to finish.

Next, don’t automate pushing the results to your remote until you are happy that there is nothing missing and that the merges specified meet how you want to take the repository forward.

If you use a tool like SourceTree, don’t leave it running while the migration is taking place. Whilst it feels useful to graphically see what is happening while the script is running, it slows the process down and can in some cases cause the script to fail as files may become locked. Wait until the migration is complete and then open SourceTree to get a visual understanding of the changes made.

My last lesson is to have patience.

When I worked on this using real repositories, some of which had many years of history, there were some heart-stopping moments when the repositories were being pulled down and it felt like something had gone wrong. It hadn’t; it was just Git doing its thing, albeit slowly!

Moving Forward

One of the downsides of mono-repos is size. In my real-world scenario that inspired this script and blog, the final migrated repo is 1.4GB in size. This is not massive compared to the likes of some well known mono-repos that are in the hundreds of gigabytes in size.

Once you have pushed the repository up to a remote, my advice is to clone the repo into a different local directory and only check out the main branch (especially if you have a lot of orphaned archive branches that you don’t need to pull).
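One way to do that is with a single-branch clone, sketched in shell below. The function name clone_main_only is hypothetical and the branch name main is assumed from the rename described earlier.

```shell
# Hypothetical helper: clone only the main branch of the new mono-repo.
clone_main_only() {
    remote_url=$1      # URL of the pushed mono-repo
    target_dir=$2      # new local directory to clone into
    git clone --single-branch --branch main "$remote_url" "$target_dir"
}
```

The archive branches stay on the remote and can still be fetched individually later if they are needed.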

If disk size is still an issue, it is worth looking at the Git Virtual File System to limit the files that are pulled down to your local system.

Conclusion

I hope that the two posts and the script are of help to people.

There is a lot of debate about the relative merits of poly-repo vs. mono-repo that I haven’t gone into. My view is to do what fits best and enables your team to work with minimal friction.

The reason for the migration that inspired this post was having difficulties in coordinating a release for a distributed monolith that was spread across several repositories. If you have many repos that have very little to do with one another (being true microservices or completely unrelated projects), there is probably no benefit to moving to a mono-repo.

In summary, to use a well worn cliché, “it depends”.

Merging Multiple Git Repositories Into A Mono-Repo with PowerShell (Part 1)

Following on from my last blog about the problems I had setting up Octopus Deploy with a service account, this is another DevOps related post that describes the approach I have taken to merging multiple Git repositories into a single Git repository (commonly known as a mono-repo).

Disclaimer

To be clear, I am not going to provide a wide ranging discussion about the relative merits and disadvantages of using a mono-repo for source control vs. having one repository per project (poly-repo).

In the end it comes down to what works best for the team to manage the overall code base by reducing friction.

If you have lots of disparate projects that have no impact on each other’s existence or a true micro-service architecture where each service is managed within its own repository, there is very little point in bringing these into a mono-repo.

If, on the other hand, you have a distributed monolith where feature requests or bug fixes may be spread across several repositories and require synchronisation when negotiating their way through the CI/CD pipeline (or worse, having to jump through hoops to develop or test in concert while developing on your local machine), then a move to a mono-repo may be of benefit.

There is no ‘one size fits all’ and you may end up with a hybrid of some projects occupying their own repositories, whilst others live in one big repository.

Background

What prompted the need for a move to a mono-repo in my case was having to coordinate features within a distributed monolith where a feature request may span one, some or all of four key repositories and the only coordinating factor is a ticket number in the branch names used in each of the repositories.

This causes problems when having to context switch between multiple issues and making sure that

  • the correct branches are checked out in the repositories
  • configuration files are amended to point to appropriate local or remote instances of services
  • pull requests to branches monitored by TeamCity are coordinated, as these also trigger Octopus Deploy to deploy to our common development environment
  • version numbers for different projects are understood and the inter-relationships are documented.

Now moving to a mono-repo is not going to solve all these problems, but it is the first step on the road.

(Moving To) A Whole New World

As described in my previous blog post, the team I am currently working with is in the process of completely rebuilding the CI/CD pipeline with the latest versions of Team City and Octopus Deploy.

This has provided the ideal opportunity to migrate from the current poly-repo structure to a new mono-repo. But how should we approach it?

Approaches to Consider

At its simplest, we could just take a copy of the current ‘live’ code and paste it into a new repository. The problem with this would be the loss of the ability to look at the (decade long) history in the context of the current repository. Instead, this would require hopping over to the existing repositories to view the history of files. We could live with this, but it is not ideal as it introduces friction of a different kind.

So, somehow, we need to try to migrate everything to one place, but this comes with complications.

In each of the current repositories, the source code is held at the root of the repo, so merging the repositories ‘as-is’ would cause no end of merge conflicts and muddy the code base. Therefore, the first thing we will need to do is to move the source code down from the root into dedicated (uniquely named) folders.

This could be done within the existing repositories before we think about merging repositories. However, this will mean having to revisit all the existing Team City projects to repoint the watched projects to the new folders. This also causes disruption to any current work that is in progress. So this approach should be ruled out.

There is also the problem of what to do with all the branches and tags in the old repositories. Ideally we want to also bring them along into the new repository, but we have a similar problem regarding trying to avoid naming conflicts (as I mentioned above, the branch names are the coordinating factor that we currently use, so these will be the same in each of the repositories where code has changed for a particular feature), so these will need renaming as well.

I’ll Tell You What I Want (What I Really, Really Want)

With all the above in mind, we need a migration plan that can accommodate the following requirements:

  • No changes required to the existing repositories
  • The full history needs to be migrated
  • All live branches need to be migrated
  • All tags need to be migrated
  • Avoid clashes between migrated repositories when brought into a single structure
  • Allow for a pre-migration to be run so that the new Team City can be set up without impacting the existing repositories and existing CI/CD pipeline
  • The process must be repeatable with minimum effort so that any problems can be identified and corrected, but also so that the new CI/CD pipeline can be built in preparation for a low-impact cut-over.

At first this seemed like a tall order, but ultimately what this boils down to is creating a new repository and then repeating the following steps for each legacy repository to be merged:

  • Pull each legacy repository into the new repository
  • Rename the legacy branches and tags so they do not clash
  • Select a ‘live’ branch to merge into the main branch of the new repository and check it out
  • Move the content of the ‘live’ branch to a sub-folder that will not clash as other repositories are subsequently migrated
  • Merge the ‘live’ branch into the main branch

These steps can all be achieved by a combination of Git commands and file system commands which can be put together into a script.

In Part 2, I will show you how I created a PowerShell script to achieve the goal.