How to modernise Microsoft .NET applications – part II

Short summary from Part I

In Part I, I discussed how I analysed the application landscape and which criteria I used to prioritise the modernisation efforts. The key points were:

  • Upgrade to a Windows OS version with LTSC
  • Upgrade to a .NET Framework or .NET Core version with LTS
  • Classify your applications as systems of record, systems of differentiation or systems of innovation for better prioritisation
  • Investigate options for reducing operational overhead (move to a PaaS or SaaS solution, use containers, use DevOps)
  • Make an in-depth application assessment, looking for:
    • Source code availability
    • 3rd party components
    • Running versions of Windows OS, .NET, .NET Core
    • Type of application (desktop, web, Windows Service, IIS Service)
    • Technologies incompatible with moving the app into cloud
      • Local storage
      • Embedded logging
      • Embedded config parameters
      • State management
      • Databases
      • Hostname, DNS dependency, localhost dependency, etc
      • Rights for the application to be able to run properly
      • Application security (authentication and authorisation)
    • If you want to port the application, list the technologies it uses that are not compatible with .NET Core and look for alternatives
      • Windows Communication Foundation (WCF)
      • Windows Workflow Foundation (WF)
      • ASP.NET Web Forms
      • .NET Remoting
    • Check the use of Entity Framework

To further evaluate the applications, I’ve used a set of Microsoft tools that can provide an evaluation of the current application state. They are available as extensions to Visual Studio or standalone tools.

  • .NET Portability Analyser – a tool that analyses assemblies and provides a detailed report on the .NET APIs that are missing for the applications or libraries to be portable to the .NET platforms you target. The Portability Analyser is offered as a Visual Studio extension, which analyses one assembly per project.
  • .NET API Analyser – a Roslyn analyser, shipped as a NuGet package, that discovers potential compatibility risks for C# APIs on different platforms and detects calls to deprecated APIs.
  • .NET Framework Analyser – finds potential issues in your .NET Framework-based application code and suggests fixes for them. It can also highlight issues that need to be addressed before moving to a new version of .NET Framework or .NET Core.

2. Select a modernisation approach

The approach to modernisation and the prioritisation of investment (development effort) have to be dictated by business priorities (business strategy) and requirements. The data gathered from the in-depth assessment of the application also has to be taken into account when establishing a modernisation approach.

In my landscape, the architectural drivers for modernisation were:

  • Improved stability – critical applications had a high fault rate, and the maintenance effort was high due to deprecated technologies
  • New functionalities – we had requests from the business for new functionalities, which were cheaper (effort-wise) to implement if we first modernised the app and moved it to the cloud
  • Ability to respond to customers faster – we had requests from customers (through business demands) for new functionalities that couldn’t be delivered quickly enough to become a market advantage
  • Quicker bug fixes – as we own the majority of the source code for our landscape, this wasn’t a very important driver, but it can be added to the list
  • Improved scaling capabilities – this was an important driver for us, as N-tier applications are quite difficult to scale horizontally (taking advantage of cloud elasticity) and vertical scaling has its limits (both technically and financially)
  • New market challenges – again, N-tier based applications are not very agile, and keeping up with services offered by competitors is hard to do, especially in a dynamic sector like banking and finance

From an operational overhead point of view, the following things had to be improved by modernisation:

  • Applications with obsolete functionalities that require a large amount of support work – the landscape contains old applications of which only a fraction of the functionality is still used, but which are still critical and therefore require high SLAs and dedicated support staff
  • Skills – not every .NET developer we can hire still remembers .NET Remoting and how to work with it
  • Internal standards and IT strategy – as we are trying to move in an Agile direction (which also implies a change in our technical thinking: more APIs, REST services, containers, a DevOps approach), obsolete technologies are becoming more than technical debt; they are a barrier. Try running a .NET Remoting app in a container, or scaling it horizontally

Cloud migration options for applications

According to industry guidelines (Gartner and others), we have the following options when migrating applications to the cloud:

  • Lift-and-shift aka Rehosting
  • Revise
  • Replatform
  • Rearchitect
  • Refactor or Rebuild

Each of these methods has its trade-offs. Modernisation efforts fit into every one of the above-mentioned approaches (except lift-and-shift, which is not modernisation per se), as my long-term goal is to move most of the applications to the cloud.

Lift-and-shift – Basically, we took our apps from on-premises data center VMs to cloud VMs (plus some additional networking to support the infrastructure). This was the first wave of cloud migration in my landscape and it worked pretty well for the intended purpose, which was to lower operational costs and maintenance overhead. We also gained some uptime (it’s nice to have a VM Scale Set configured in a couple of minutes instead of hours or days), but that was not modernisation, merely tinkering around the edges of the application rather than making significant changes.

So, whenever you want to reduce the operational and support overhead and/or cannot make significant changes at application code level (maybe you don’t have the source code anymore, maybe the app makes extensive use of obsolete technologies, maybe the cost vs benefit ratio does not make the business case), a lift-and-shift approach will do just fine.

Revise – As a first step, we tried to get away with minimal changes to the applications, the goal being to move them to PaaS cloud solutions (Azure Web Apps) and, in the long term, to Windows Containers. For this we just updated .NET Framework to an LTS version, fixed any resulting incompatibilities and removed the technologies that are a no-go in the cloud (hard-coded config parameters, local file system usage).

Replatform – This is a middle ground between lift-and-shift and rearchitecting or refactoring. It involves some modifications to the code to take advantage of the new cloud infrastructure. For example, for the part of my landscape I decided to replatform, I made changes like:

  • Instead of internal message queues we switched to Azure Service Bus
  • No more local file system storage allowed, switched everything to storage accounts
  • Use of Azure Web Apps deployment slots for test/dev/staging environments

With this change, we achieved some cost reductions (compute density for Web Apps is better than for individual VMs anyway), better release management, increased SLA and lower operational overhead (no internal effort needed for Web Apps), and also laid the ground for the next wave of modernisation.

Rearchitect – This involves a complete rearchitecting of the app to better suit the cloud environment (Azure in my case), which means significant alterations to the app. It also has the advantage that we can target specific cloud services (like AKS, AWS Elastic Beanstalk or AWS ECS). We are in the process of doing just that with some of the most revenue-intensive applications, using a microservices approach.

Rebuild – Basically, you rewrite the app from scratch, throwing away the existing code base. In this case, first ensure there is a valid business case for it, and support (and funding) from the stakeholders. I have some candidates for this kind of work, but the business case does not justify it.

When choosing a modernisation approach, also evaluate whether you can replace the application with a SaaS solution. If the app supports business processes that add little value or are not a differentiator (think of HR, facility management and so on), then it makes a lot of sense to just replace it with a SaaS offering.

Another viable option is to just retire an application. If the business process has been updated and no longer requires it, or if the business is not using it anymore (or is using just 10% of it, maybe a report or two, while you still support it), then it’s a good idea to retire it. To quote Gregor Hohpe: “If you never kill anything, you will live among zombies”.
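The options above (retire, replace with SaaS, lift-and-shift, revise, replatform, rearchitect, rebuild) can be read as a rough decision ladder. The sketch below is only an illustration in Python: the `AppProfile` fields, the three-level `change_budget` scale and the order in which the rules are checked are assumptions made for this example, not a formal method, and a real decision involves far more business input.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    # All fields are hypothetical inputs for this illustration.
    name: str
    still_used: bool        # does the business still rely on the app?
    differentiating: bool   # does it support a differentiating process?
    can_change_code: bool   # source available and code changes affordable?
    change_budget: str      # assumed scale: "minimal", "some" or "significant"
    rewrite_funded: bool    # valid business case and funding for a rewrite?


def suggest_approach(app: AppProfile) -> str:
    """Map an application profile to one of the options discussed above."""
    if not app.still_used:
        return "retire"
    if not app.differentiating:
        return "replace with SaaS"
    if not app.can_change_code:
        return "lift-and-shift"
    if app.rewrite_funded:
        return "rebuild"
    # Remaining cases differ only in how much change we are willing to make.
    return {
        "minimal": "revise",
        "some": "replatform",
        "significant": "rearchitect",
    }[app.change_budget]


hr_portal = AppProfile("hr-portal", True, False, True, "minimal", False)
print(suggest_approach(hr_portal))  # replace with SaaS
```

The value of writing it down like this is mostly that it forces the same questions to be asked of every application, in the same order.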

Choose an Architectural Approach

As I advanced further with the application assessment, it became clear that I should choose at least one architectural approach for the modernisation efforts (actually, I chose two approaches, depending on the business value of the applications). I had a list of constraints like:

  • Increase delivery agility
  • Increase applications capability to further innovate and sustain change
  • Lower running costs

So I began looking into APIs and microservices, and I came up with a concept inspired by Gartner’s MASA proposal.

MASA stands for Mesh App and Service Architecture and is an agile architecture composed of decoupled apps, mediated APIs and services. It includes architectural principles like decoupling app components using APIs and creating services of optimal granularity (see Domain Driven Design), and it advocates designing fit-for-purpose apps. Each component has an API (or at least consumes one), and when all are connected, they form a mesh of interconnected services and applications.

MASA Concept

This approach allowed us to simplify the apps somewhat, to use .NET Framework or .NET Core in various apps, and to use polyglot persistence (with SQL and NoSQL).

We sliced the applications into something like microservices. I say “something like” because we didn’t follow all the guidelines and segregate the apps down to the lowest level; we stopped at what was comfortable for the development teams and made sense from a requirements perspective. We ended up with macroservices.

A good reason for this is that we are starting from monolithic applications, some of them quite big (think of a core banking app). In this case it is not practical to go directly to by-the-book microservices. Instead, we develop new functionalities as micro or macro services and, at the same time, extract existing functionalities and redevelop them as standalone services (see the strangler design pattern).
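The core of the strangler pattern is a routing facade that sits in front of the monolith and sends each request either to an already extracted service or to the old code. Here is a minimal sketch of that idea, in Python purely for illustration; the class, the handler functions and the paths are hypothetical, and in practice this role is usually played by an API gateway or reverse proxy rather than hand-written code.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class StranglerFacade:
    """Routes each request either to an extracted service or to the monolith."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._extracted: Dict[str, Handler] = {}

    def extract(self, path_prefix: str, service: Handler) -> None:
        """Register a standalone service that now owns everything under path_prefix."""
        self._extracted[path_prefix] = service

    def handle(self, path: str) -> str:
        # The longest matching prefix wins, so a narrow service can be carved
        # out of an area that was already extracted as a broader one.
        matches = [p for p in self._extracted if path.startswith(p)]
        if matches:
            return self._extracted[max(matches, key=len)](path)
        # Anything not yet extracted still goes to the monolith.
        return self._legacy(path)


def monolith(path: str) -> str:
    return f"monolith:{path}"


def payments_service(path: str) -> str:
    return f"payments-service:{path}"


facade = StranglerFacade(legacy=monolith)
facade.extract("/payments", payments_service)
print(facade.handle("/payments/42"))  # payments-service:/payments/42
print(facade.handle("/accounts/7"))   # monolith:/accounts/7
```

Callers only ever talk to the facade, so functionality can be moved out of the monolith one slice at a time without them noticing.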

Mediated APIs apply one or more mediators to manage communication between an API consumer and the service that implements the API. API mediation reduces the complexity of managing multiple back-end services and increases the choice of technologies and models used to build services. For this reason, another core component is an API Gateway.

What I can say is that going the macroservices (or microservices) way is not easy. We had long internal discussions about setting the scope boundary for each service, and sometimes we went back to the drawing board after realising that the initial scope was wrong (too broad or too narrow). Also, when you apply the strangler pattern and start slicing a monolith, data persistence and database consistency between the old monolith and a service that shares most of the same persistence repository are hard to deal with.

We started this modernisation process about 9 months ago and it is now in full swing. The initial assessment and in-depth analyses took about 3 months, and since then we have been in full development. As we started with some low-hanging fruit, we already see some benefits.

Costs for running applications that have been modernised and moved to Azure (with the full shebang of cloud functionalities) are definitely lower than before. Support and maintenance overhead is also lower, and horizontal scaling works like a charm when things are done properly.

We have improved our SLAs and decreased the number of incidents, meaning we now have a happier business that is willing to invest further in this modernisation.

We also had many obstacles. The first was deciding where to start, because in the previous landscape the monolithic apps were so tightly coupled that any change would break a lot of things.

Also, when modernising an app that depends on other apps, a lot of interim solutions must be provided until the other apps can be modernised too.

We had to implement an ODS (Operational Data Store) just for decoupling apps in the first stage of the modernisation process (the ODS was a good idea; we will keep it for good).

Overall, things are looking good: the investment was worth making and it has started to pay off.

How to modernise Microsoft .NET applications – part I

As an architect in a company that is basically a “Microsoft shop”, most of my landscape (if not all) is made up of custom applications based on Microsoft technologies and .NET Framework, with versions from .NET 2.0 up to .NET 4.5, running alongside enterprise Microsoft applications like SharePoint, Dynamics, Office 365 and so on.

The custom applications landscape has both desktop applications, built on a heavy client philosophy and making use of .NET Remoting (which is long deprecated), and more modern web applications, built using .NET 4.0.

As we are moving the landscape to the Microsoft Azure cloud, I’m facing some challenges in bringing old applications into the current century and making them ready for a cloud migration: deciding what will be migrated as lift-and-shift, where we will replatform and which applications will be refactored.

In any of the above-mentioned migration strategies, a major show stopper is, in many cases, the legacy technologies used by the applications. Old versions of .NET Framework or obsolete technologies that prevent us from even trying a lift-and-shift approach must be dealt with somehow, and a long-term approach to modernising and further supporting our business applications must be established.

Microsoft .NET Framework was released in 2002, and since then the company has invested heavily in the entire landscape, up to the point where about 70% of the business applications portfolio is based on .NET.

In 2019, Microsoft announced that its development effort would be focused on .NET Core, with the Windows .NET Framework moving to LTS (Long Term Support). The consequence is that only bug fixes and security patches will be provided, and no new features will be added to .NET Framework.

All of our applications still meet their business requirements, and the business relies heavily on them to perform and bring in revenue. At the same time, we are trying to accelerate application delivery and move to cloud-native architectures for the new applications we provide to the business, and also to decide how old applications are going to fit into the cloud landscape. We also intend to move away from the traditional n-tier application design and use modern architecture patterns like APIs, microservices and so on.

From a technical perspective, there are some standardised steps to be taken when reviewing the applications landscape.

Identify end of life platforms and .NET Framework versions for each application

When talking about application support and compatibility with newer .NET versions, I found the following criteria to be the most important:

  • .NET Framework and .NET Core version support
  • Windows OS end of life
  • Cloud migration

.NET Framework and .NET Core version support

I’ve summarised Microsoft’s official support policy for .NET Framework, as it stood at the time of writing, in the table below:

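| MS .NET Version | Currently supported | End of Life | LTS |
|---|---|---|---|
| .NET 1.0 and SPs | No | Yes, since 2009 | |
| .NET 2.0 | No | Yes, since 2011 | |
| .NET 3.0 | No | Yes, since 2011 | |
| .NET 3.5 | Yes | No | Yes, up to 2028 |
| .NET 4.0 | No | Yes, since 2016 | |
| .NET 4.5 | No | Yes, since 2016 | |
| .NET 4.5.1 | No | Yes, since 2016 | |
| .NET 4.5.2 | Yes | | |
| .NET 4.6 | Yes | | |
| .NET 4.6.1 | Yes | | |
| .NET 4.6.2 | Yes | | |
| .NET 4.7 | Yes | | |
| .NET 4.7.1 | Yes | | |
| .NET 4.7.2 | Yes | | |
| .NET 4.8 | Yes | | Ongoing LTS |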
Reference: https://support.microsoft.com/en-us/lifecycle/search?alpha=.net%20framework

Microsoft .NET Core versions support

| .NET Core Version | Currently supported | End of life | LTS |
|---|---|---|---|
| .NET Core 1.0 | No | Since 2019 | |
| .NET Core 1.1 | No | Since 2019 | |
| .NET Core 2.0 | No | Since 2018 | |
| .NET Core 2.1 | Yes | No | Yes, up to 2021 |
| .NET Core 2.2 | No | Since 2019 | |
| .NET Core 3.0 | No | Since March 2020 | |
| .NET Core 3.1 | Yes | | Yes, up to 2022 |
Reference: https://dotnet.microsoft.com/platform/support/policy/dotnet-core

After compiling the tables above, I decided that applications which use a non-LTS .NET Framework or .NET Core version must be migrated to an LTS version. In some cases this is just a matter of recompiling and redeploying the application (the fortunate cases where the company owns the source code). Others, where we have lost the knowledge about source code, requirements and so on, are perfect candidates for an assessment of business value, consolidation of functionalities into another application or platform, or a plain rewrite from scratch if the business revenue justifies it.
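The rule “anything not on an LTS version must be migrated” is easy to apply mechanically once the support tables are captured as data. The sketch below, in Python purely for illustration, hard-codes the two tables above as a snapshot from the time of writing; the application names in the inventory are hypothetical, and treating end-of-life runtimes as the most urgent case is an assumption of this example.

```python
from typing import Dict, Tuple

# (currently supported, LTS) per version, copied from the two tables above.
# This is a snapshot taken at the time of writing, not a live lookup.
SUPPORT: Dict[str, Dict[str, Tuple[bool, bool]]] = {
    ".NET Framework": {
        "1.0": (False, False), "2.0": (False, False), "3.0": (False, False),
        "3.5": (True, True),
        "4.0": (False, False), "4.5": (False, False), "4.5.1": (False, False),
        "4.5.2": (True, False), "4.6": (True, False), "4.6.1": (True, False),
        "4.6.2": (True, False), "4.7": (True, False), "4.7.1": (True, False),
        "4.7.2": (True, False), "4.8": (True, True),
    },
    ".NET Core": {
        "1.0": (False, False), "1.1": (False, False), "2.0": (False, False),
        "2.1": (True, True), "2.2": (False, False), "3.0": (False, False),
        "3.1": (True, True),
    },
}


def migration_status(platform: str, version: str) -> str:
    """Classify a runtime according to the 'LTS only' rule."""
    try:
        supported, lts = SUPPORT[platform][version]
    except KeyError:
        raise ValueError(f"unknown runtime: {platform} {version}") from None
    if lts:
        return "on LTS, no action"
    if supported:
        return "supported but not LTS, migrate"
    return "end of life, migrate first"


# Hypothetical inventory entries, for illustration only.
inventory = [
    ("branch-desktop", ".NET Framework", "4.0"),
    ("intranet-portal", ".NET Framework", "4.7.2"),
    ("pricing-api", ".NET Core", "3.1"),
]
for name, platform, version in inventory:
    print(f"{name}: {migration_status(platform, version)}")
```

Keeping the table in one place also means the whole inventory can be re-evaluated whenever Microsoft updates its support policy.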

The same applies to work-in-progress applications, where it is easier to switch to an LTS version of .NET Framework or .NET Core because we were using a newer version of .NET anyway.

For externally sourced applications, things are a bit trickier, because in some cases we have to wait for a new major version release from the vendor (if we are lucky enough that the vendor has something like this in its roadmap).

In other cases, the vendor may have a SaaS equivalent that might work for us (if all constraints allow the use of a SaaS solution instead of a hosted one).

If none of the above can be done, then it all comes down to a business decision.

Windows OS versions support

Currently, the picture for Windows OS support looks like this:

| Windows Version | Currently supported | End of life | LTSC |
|---|---|---|---|
| Windows Server 2019 | Yes, until January 2024, with Extended until January 2029. Available also as container image | No | Yes |
| Windows Server 2016 | Yes, until January 2022, with Extended until January 2027. Available also as container image | No | Yes |
| Windows Server 2012 R2 | Ended on October 2018. Extended until 2023 | Not yet | No |
| Windows Server 2012 | Ended on October 2018. Extended until 2023 | Not yet | No |
| Windows Server 2008 R2 | Ended on January 2015. Extended until January 2020 | Yes | No |
Reference: https://support.microsoft.com/en-us/lifecycle/search/1163

As you can see, Windows Server 2019 and 2016 are under support, with Long Term Servicing Channel, meaning we can rely on them for the next 5 to 8 years. Another important aspect is that they are also available as Windows container images, and we can take advantage of that by moving some applications to Windows Containers, Docker and Kubernetes on Azure.

Cloud Migration

My end goal is to have most of the applications prepared to run in the Azure cloud, in a hybrid environment. Of course, there will still be stuff on premises: stuff that can’t be moved to the cloud due to regulations, or that is simply too expensive to refactor and run in the cloud.

Also, as we operate in many countries (Europe, South America and Africa), another goal is data center consolidation, and for this to be accomplished efficiently, all applications have to be modernised to be monitoring and automation friendly (remember: never send a human to do a machine’s job).

Now that I had established a clear course of action regarding .NET platform and Windows OS versions, I had to come up with a prioritised list of applications and a modernisation method for each of them.

The first step towards this is to identify which modernisation approach works best for each application, a decision which is more than a technical one. First, you have to assess the business value of each application, then see which apps will benefit the most from modernisation and what kind of modernisation should be applied. Some high-value, high-revenue applications are worth investing in, and for them it makes sense to go down the microservices path, but others will do just fine with a rehosting approach. The modernisation effort should also be estimated realistically and aligned with budgeting and the projects portfolio (for the execution phase, TOGAF ADM might be a valuable tool).

In my case, for prioritising applications, I’ve used Gartner’s definition to create viewpoints by which the applications portfolio can be viewed, to identify potential modernisation candidates.

Gartner identifies three main application categories:

  • Systems of record: capabilities that have a clear focus on standardisation or operational efficiency.
  • Systems of differentiation: business capabilities that enable unique processes or industry specific capabilities.
  • Systems of innovation: new business capabilities to address emerging business needs, new business opportunities and modes.

Whatever applications fall under systems of differentiation or systems of innovation are sure candidates for modernisation effort.

Actually, the discussion here is a bit broader, as many applications from those two categories are tightly coupled with some system of record, which in turn must be touched to sustain the modernisation effort. For example, suppose a system of differentiation is a mobile banking application that is tightly coupled with the CRM, which is a system of record and an old monolithic application. To implement, let’s say, a real-time push of data from the CRM to mobile banking, the CRM must be modified or another layer added in between.
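The selection rule and the coupling caveat above can be sketched together: pick every system of differentiation or innovation as a candidate, then list the systems of record they are tightly coupled to as “impacted”. This is a minimal illustration in Python; the `App` structure and the sample portfolio are hypothetical, and the coupling information would in practice come from the application assessment.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Category(Enum):
    RECORD = "system of record"
    DIFFERENTIATION = "system of differentiation"
    INNOVATION = "system of innovation"


@dataclass
class App:
    name: str
    category: Category
    tightly_coupled_to: List[str] = field(default_factory=list)


def modernisation_scope(portfolio: List[App]) -> Tuple[List[str], List[str]]:
    """Return (candidates, impacted systems of record), both sorted by name."""
    known = {app.name for app in portfolio}
    candidates = {a.name for a in portfolio if a.category is not Category.RECORD}
    impacted = set()
    for app in portfolio:
        if app.name in candidates:
            # A coupled system that is not itself a candidate must still be touched.
            impacted.update(
                dep for dep in app.tightly_coupled_to
                if dep in known and dep not in candidates
            )
    return sorted(candidates), sorted(impacted)


portfolio = [
    App("mobile-banking", Category.DIFFERENTIATION, ["crm"]),
    App("crm", Category.RECORD),
    App("payroll", Category.RECORD),
]
print(modernisation_scope(portfolio))  # (['mobile-banking'], ['crm'])
```

The second list is the one that tends to be underestimated when budgeting: it holds the applications nobody planned to modernise but that have to change anyway.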

Another goal of modernisation was to reduce the operational overhead associated with application support, and here there are a couple of options:

  • Move to a PaaS solution, switching the overhead to a service provider
  • Migrate to a better cloud integrated platform like Azure Kubernetes Service – AKS and use all available functionalities for support, monitoring and maintenance
  • Long shot – adopt a DevOps approach. This wasn’t taken into account in the first phase of the modernisation as the organisation was not ready

To assess each application’s current state and to estimate the actions and effort needed for modernisation, I used a list of steps, as below.

1. In-depth application assessment

  • Review the current state of the application
    • Do we have the latest version of the source code? If yes, is it checked into a version control tool? Is it available and documented? Is there some knowledge about it in the dev teams?
    • Is the source code functional? Can a running version of the application be built from it?
    • Do we know all the 3rd party components and libraries used? Do we have them in a build-ready state?
    • Are the 3rd party components available for an LTS .NET Framework version? Are they available for .NET Core? Are they still maintained and supported?
    • Is the application using .NET Framework or .NET Core? Which version?
    • On what version of Windows is it currently running? Does that version match our LTS requirements?
    • What kind of application are we assessing? Is it a heavy client desktop application, a web app, a Windows Service or an IIS service?
  • Assess the use of technologies that are not compatible with a move to the Azure cloud.
    • Legacy applications often use technologies that were appropriate for on-premises intranet hosting, but those approaches no longer work when you try to move them to the cloud. The issues I’ve faced are listed below:
| Technology | Description | Mitigation |
|---|---|---|
| Local storage | Application stores files, either temporary or long term, on a local file system, on a hard-coded or configured path | Remove absolute paths and migrate to block storage (Azure storage accounts) |
| Embedded logging | Application uses a non-standard logging system and/or writes logs locally | Use a centralised logging system; use standard .NET logging components; use Azure Application Insights as much as possible |
| Embedded configuration parameters | Configuration parameters are stored in config files or hard-coded in the application | Use a centralised config repository and modify the app to read config parameters from it; use Azure Key Vault for secure storage of config parameters |
| State management (web applications) | State is not explicitly managed, leaving the web servers to handle it, or is stored in databases | To take advantage of cloud horizontal scaling and elasticity, state must be taken out of the application itself and stored and managed separately; use Azure Redis cache or other in-memory caching tools for state management |
| SQL databases | Application is using SQL databases | In some cases, if the application is not very complex or heavily transactional, it can make sense to change the database from an SQL-based one to a NoSQL DB like Cosmos DB for cloud hosting. A properly designed NoSQL data model can simplify the data access layer and can also reduce cloud hosting costs: instead of paying for an SQL instance, you just pay for Cosmos DB capacity. Of course, with such a shift in DB technology there are other things to consider (whether the app really needs a NoSQL DB, dev team skills, cost estimation and so on) |
| Multicast broadcasting | Sending one-to-many messages on a network segment. Haven’t met the case yet. | Change to message queues |
| Localhost IP addresses | Application uses localhost or 127.0.0.1 addresses | Ensure that the application can look up its own hostname |
| Hostnames or DNS dependencies | Hard-coded addresses or URLs are used | Same as for config parameters: use a centralised repository or service discovery |
| Full trust code or admin user | Application requires elevated privileges to run | Identify what needs them and why, and change it |
| Application security | Application uses a built-in login mechanism | Use single sign-on. If moving to the cloud is an option, use Azure AD or Azure AD B2C |
  • .NET Standard is a formal specification of .NET APIs that are intended to be available on all .NET implementations and are included in modern .NET implementations as part of the Base Class Library (BCL).
    • .NET Standard versioning follows two rules:
      • Additive: .NET Standard versions are logically concentric circles: higher versions incorporate all APIs from previous versions.
      • Immutable: Once shipped, .NET Standard versions are frozen.
    • I advise targeting the latest LTS version of .NET Core and also the latest LTS version of .NET Framework (4.8)
  • If we decide to rewrite the application for .NET Core, let’s see which components are not compatible with .NET Core and will never be added to it
    • Windows Communication Foundation (WCF): This is a framework for building service-oriented applications. Using WCF, you can send data as asynchronous messages from one service endpoint to another. A service endpoint can be part of a continuously available service hosted by IIS, or it can be a service hosted in an application
    • ASP.NET Web Forms
    • .NET Remoting: .NET Remoting is a framework where you can invoke or consume methods or objects on a remote computer (the server) from your computer (the client). .NET Remoting was superseded by WCF in later versions of .NET Framework, and only remains a component of .NET Framework for backward compatibility purposes
    • Windows Workflow Foundation (WF): Technology that provides an API, an in-process workflow engine, and a rehostable designer to implement long-running processes as workflows within .NET applications

If you are using one of the above technologies, then you must either find alternatives or plan a major rewrite of the application.
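When the source code is available, a first rough pass over it can flag some of the issues from the assessment above before anyone reads the code in detail. The sketch below, in Python purely for illustration, greps C# source text for the namespaces of the technologies listed above and for two of the cloud blockers (absolute Windows paths and localhost addresses). The set of checks and the regular expressions are assumptions of this example and are only heuristics: every hit needs manual review, and the Microsoft analysers mentioned in Part II do this far more thoroughly.

```python
import re
from typing import Dict, List

# Label -> pattern. The namespaces are the .NET Framework ones for each technology.
CHECKS: Dict[str, "re.Pattern[str]"] = {
    ".NET Remoting": re.compile(r"\bSystem\.Runtime\.Remoting\b"),
    "WCF": re.compile(r"\bSystem\.ServiceModel\b"),
    "ASP.NET Web Forms": re.compile(r"\bSystem\.Web\.UI\b"),
    "Windows Workflow Foundation": re.compile(r"\bSystem\.(?:Activities|Workflow)\b"),
    "Localhost address": re.compile(r"\blocalhost\b|\b127\.0\.0\.1\b"),
    # A drive letter followed by ":\" anywhere in the line, e.g. C:\temp.
    "Absolute local path": re.compile(r"\b[A-Za-z]:\\"),
}


def scan_source(text: str) -> Dict[str, List[int]]:
    """Return, for each check that fired, the 1-based line numbers of the hits."""
    findings: Dict[str, List[int]] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in CHECKS.items():
            if pattern.search(line):
                findings.setdefault(label, []).append(line_no)
    return findings


# Hypothetical C# fragment used only to demonstrate the scanner.
sample = "\n".join([
    "using System.Runtime.Remoting.Channels;",
    'var exportPath = @"C:\\temp\\export.csv";',
    'var serviceUrl = "http://localhost:8080/api";',
])
for label, lines in scan_source(sample).items():
    print(f"{label}: lines {lines}")
```

A scan like this is cheap enough to run across the whole portfolio, which makes it useful for ranking applications by how much cloud-blocking technology they contain.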

  • Check whether Entity Framework (or another O/RM) is used

Official Definition: “Entity Framework is an object-relational mapper (O/RM) that enables .NET developers to work with a database using .NET objects. It eliminates the need for most of the data-access code that developers usually need to write.”

If you are not already using it, I strongly suggest using Entity Framework (or another O/RM), because it saves a lot of time in the development process and also helps standardise data access and manipulation across your applications.

As per the above figure, Entity Framework fits between the business entities (domain classes) and the database. It saves the data stored in the properties of business entities, and also retrieves data from the database and converts it to business entity objects automatically.

Also Entity Framework is available for .NET Core, as per below table.

End of part I.

Go to part II

Integrating Azure API Gateway (Azure APIM) with Application Gateway and multiple custom domains

Use case

I have a bunch of APIs that I want to expose on the Internet in a controlled and secure fashion. At the same time, for business reasons, I might want them published on multiple custom domains, like api.client1.com, api.client2.com and so on. Because a Premium Tier Azure APIM is pretty expensive, the use case requires a single Azure APIM instance.

To expose APIs on the Internet in a secure, controlled and protected fashion, there is the option to run Azure APIM in internal VNET integration mode, meaning that APIs published on APIM are accessible only to requests coming from within the VNET.

For those APIs to also be available on the Internet, the Microsoft reference architecture states that an Application Gateway should be placed in front of Azure APIM, to handle incoming requests from the Internet and forward them to the internal (VNET integrated) APIM. Another advantage of an Application Gateway in front of APIM is that it has a WAF and can protect the APIs from DDoS attacks, SQL injection and other OWASP issues. Application Gateway also allows path-based routing rules (like api.client1.com/backend and api.client1.com/frontend, allowing or denying requests to published APIs). Last but not least, it allows more than one HTTPS listener, one for each custom domain we want to publish APIs for.

Microsoft reference architecture:

Having the above picture in mind, we can see that Azure APIM is running in internal VNET integration mode, with published APIs accessible from within the VNET. For outside access (from Internet), an Application Gateway is placed in front of it, allowing traffic to flow from Internet to APIs.

The solution

Now, let’s see what the proposed setup looks like:

First we have the VNET in which APIM is integrated to run, in internal mode.

  • Two subnets, one for APIM, one for Application Gateway
  • One Azure Private DNS Zone. Once integrated with a VNET in internal mode, APIM no longer has DNS resolution, and it does not answer by its IP address, only by name, so we need a private DNS zone for name resolution inside the VNET
  • Application Gateway with two or more HTTPS listeners, one for each domain we want to publish APIs for (we also need HTTPS listeners for the APIM Developer Portal and Management API)
  • Application Gateway provides WAF and API protection against OWASP vulnerabilities
  • APIs to be published

In this configuration, all calls going to the APIM service pass through Application Gateway (internal calls from within the VNET can also be made to pass through Application Gateway, using the private front-end). Application Gateway routes calls using IP addresses, but APIM responds only to host names. Since there will be only one virtual IP for all endpoints exposed by APIM, we will use the Azure Private DNS Zone to create the appropriate DNS A records for each endpoint (api, developer portal, etc).
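The name resolution described above boils down to two lookups: the public host name on an Application Gateway listener maps to the internal APIM host name, and that host name maps, through an A record in the private DNS zone, to the single APIM virtual IP. The sketch below models only this mapping as plain Python data, for illustration; it does not call any Azure API, and the zone name, endpoint names and the IP address 10.0.1.5 are hypothetical placeholders.

```python
import ipaddress
from typing import Dict, List, Tuple


def build_a_records(zone: str, endpoints: List[str], apim_private_ip: str) -> Dict[str, str]:
    """One A record per APIM endpoint, all pointing at the same private virtual IP."""
    ip = ipaddress.ip_address(apim_private_ip)
    if not ip.is_private:
        raise ValueError(f"{apim_private_ip} is not a private address")
    return {f"{name}.{zone}": str(ip) for name in endpoints}


def resolve_backend(
    public_host: str,
    listeners: Dict[str, str],
    records: Dict[str, str],
) -> Tuple[str, str]:
    """Return (host name APIM expects, private IP the gateway connects to)."""
    backend_host = listeners[public_host]
    return backend_host, records[backend_host]


# Hypothetical values for illustration only.
records = build_a_records(
    zone="xyz.com",
    endpoints=["api", "developer-portal", "api-management"],
    apim_private_ip="10.0.1.5",
)
# Several public custom domains can front the same internal APIM gateway endpoint.
listeners = {
    "api.client1.com": "api.xyz.com",
    "api.client2.com": "api.xyz.com",
}
print(resolve_backend("api.client1.com", listeners, records))  # ('api.xyz.com', '10.0.1.5')
```

The point of the model is that the IP address alone is never enough: the request that reaches APIM must carry the internal host name, otherwise APIM will not answer.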

The setup

First we will create a VNET with three subnets:

  • appgw-subnet, for Application Gateway
  • apim-subnet, for APIM
  • div-subnet, for other services, such as a test VM
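Before creating the subnets, it can help to write the address plan down. The following is only a minimal sketch: the 10.0.0.0/16 address space and the /24 subnet size are my assumptions (the article does not prescribe any), and plan_subnets is a hypothetical helper, not an Azure API.

```python
import ipaddress

# Assumed address space and subnet size; adjust to your own network plan.
VNET_CIDR = "10.0.0.0/16"
SUBNET_NAMES = ["appgw-subnet", "apim-subnet", "div-subnet"]


def plan_subnets(vnet_cidr, names, new_prefix=24):
    """Assign the first non-overlapping /new_prefix blocks of the VNET to the given names."""
    vnet = ipaddress.ip_network(vnet_cidr)
    blocks = vnet.subnets(new_prefix=new_prefix)  # generator of consecutive blocks
    return {name: next(blocks) for name in names}


if __name__ == "__main__":
    for name, network in plan_subnets(VNET_CIDR, SUBNET_NAMES).items():
        print(f"{name}: {network}")
```

Because the blocks are taken consecutively from the same address space, the three subnets cannot overlap, which is what the VNET requires.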

Once the VNET and subnets are created, we can proceed to create APIM from the portal (I advise sticking with the Developer tier for this example, as the Premium tier is pretty expensive and the rest of the tiers don’t have VNET integration).

Next, we need to set up the custom domain for APIM. By default, it is created as <name you provided on creation>.azure-api.net. Even though we could keep it like this and set up custom domains only at Application Gateway level, it is better to use a custom internal domain, like api.xyz.com.

For this, we need three SSL certificates, for the following domains:

  • api.xyz.com
  • developer-portal.xyz.com
  • api-management.xyz.com

When setting up a custom domain for APIM, we need SSL certificates. You can use App Service Certificates from Azure (they are not free: they are issued by GoDaddy and cost about 52 Euro/year) or a service like Let’s Encrypt or SSLForFree. I’ve also tried self-signed certificates, but that setup did not work, failing with some errors at APIM level (I don’t know why; I didn’t dig into this).

If you use Azure App Service Certificates:

  • Buy a certificate from Azure Portal (search in the search bar for SSL Certificates and choose the App Service Certificate option)
  • You will have to buy a certificate for each APIM endpoint (Gateway, Developer Portal, Management API and probably SCM)
  • After buying, store the certificate in a Key Vault and verify it, using the wizard provided by Azure
  • Now, to use this certificate in APIM and Application Gateway, we need the certificate as a PFX file, plus a password.
  • The following PowerShell script extracts the PFX and generates the password:
$appServiceCertificateName = "<Your certificate name>"
$resourceGroupName = "<Resource group name, where you saved the certificate>"
$subscriptionId = "<Your Azure Subscription ID>"

#login to Azure
Login-AzureRmAccount
Set-AzureRmContext -SubscriptionId $subscriptionId

#Look up the login name only after signing in, because the directory query needs an authenticated session
$azureLoginEmailId = (Get-AzureRmADUser -DisplayName "<Your Azure login display name>").UserPrincipalName

#Get the KeyVault Resource Url and KeyVault Secret Name where the certificate is stored
$ascResource = Get-AzureRmResource -ResourceName $appServiceCertificateName -ResourceGroupName $resourceGroupName -ResourceType "Microsoft.CertificateRegistration/certificateOrders" -ApiVersion "2015-08-01"
$keyVaultId = "<Your key vault ID, where you stored the certificate>"
$keyVaultSecretName = "<KeyVault secret name, corresponding to the certificate>"

$certificateProperties=Get-Member -InputObject $ascResource.Properties.certificates[0] -MemberType NoteProperty
$certificateName = $certificateProperties[0].Name
$keyVaultId = $ascResource.Properties.certificates[0].$certificateName.KeyVaultId
$keyVaultSecretName = $ascResource.Properties.certificates[0].$certificateName.KeyVaultSecretName

#Split the resource URL of KeyVault and get KeyVaultName and KeyVaultResourceGroupName
$keyVaultIdParts = $keyVaultId.Split("/")
$keyVaultName = $keyVaultIdParts[$keyVaultIdParts.Length - 1]
$keyVaultResourceGroupName = $keyVaultIdParts[$keyVaultIdParts.Length - 5]

#Only users who have the right RBAC permissions can set the access policy on the KeyVault; if the command fails, contact the owner of the KeyVault
Set-AzureRmKeyVaultAccessPolicy -ResourceGroupName $keyVaultResourceGroupName -VaultName $keyVaultName -UserPrincipalName $azureLoginEmailId -PermissionsToSecrets get

#Getting the secret from the KeyVault
$secret = Get-AzureKeyVaultSecret -VaultName $keyVaultName -Name $keyVaultSecretName
$pfxCertObject=New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 -ArgumentList @([Convert]::FromBase64String($secret.SecretValueText),"", [System.Security.Cryptography.X509Certificates.X509KeyStorageFlags]::Exportable)
$pfxPassword = -join ((65..90) + (97..122) + (48..57) | Get-Random -Count 50 | % {[char]$_})
$currentDirectory = (Get-Location -PSProvider FileSystem).ProviderPath
[Environment]::CurrentDirectory = (Get-Location -PSProvider FileSystem).ProviderPath
[io.file]::WriteAllBytes(".\appservicecertificate.pfx", $pfxCertObject.Export([System.Security.Cryptography.X509Certificates.X509ContentType]::Pkcs12, $pfxPassword))

Write-Host "Created an App Service Certificate copy at: $currentDirectory\appservicecertificate.pfx"

Write-Warning "For security reasons, do not store the PFX password. Use it directly from the console as required."

Write-Host "PFX password: $pfxPassword"
  • After you fill in the blanks in the script and run it (in a PowerShell console, with Run As Administrator), you will have a PFX file and a password for it. Save the password from the console in a text file; we will use it a bit later.
  • Repeat the above steps for each custom domain and for all the APIM endpoints you want to expose, and save the PFX files and passwords

If you use SSLForFree or Let’s Encrypt

  • After you obtain a certificate from SSLForFree, you can download it. The download contains three files: two .crt files and a .key file. We need a .PFX file and a password.
  • Using a command line tool like OpenSSL, the following commands are needed:
To create the PFX file from the .crt and .key files (OpenSSL will prompt for an export password; this is the PFX password):
openssl.exe pkcs12 -export -out api-xyz-com.pfx -inkey private.key -in certificate.crt
Do that for each custom domain you need.

To create a .cer file from the .crt file (it will be needed in Application Gateway):
openssl.exe x509 -inform pem -in certificate.crt -outform der -out api-xyz-com.cer

Now we have the certificates and .PFX files we need, and we can proceed to configure custom domains for APIM. I will not detail this task; I assume it’s clear how this is done using the Azure Portal (link to MS docs here).

The next step, now that Azure APIM has been configured for custom domains, is to create a Private DNS Zone and link it to the initially created VNET. Again, the exact steps are beyond the scope of this article; you can find the relevant MS documentation here.

One thing to bear in mind: the Private DNS Zone must be created for the same domain you used for the APIM custom domain. For example, if your APIM Gateway endpoint is api.xyz.com, the domain is xyz.com, so you will have to create the Private DNS Zone for the xyz.com domain. And don’t forget to link the Private DNS Zone to the VNET after it has been created.

So far, VNET, along with its subnets, is created, SSL certificates are present, APIM is created and configured with custom domains.

Next step is to integrate APIM with the VNET, in internal mode.

This is done by going to Azure Portal -> Your APIM -> Deployment and infrastructure -> Virtual Network, where you can choose the previously created VNET and apim-subnet as the subnet. After hitting Save, it will take some time to complete on the Developer tier.

Once APIM is joined to the VNET, it’s time to create DNS A records in the private DNS zone. For this, we first need the private IP that APIM received when it joined the VNET.

To find it, go to Azure Portal -> Your resource group -> Your APIM -> Properties and copy the Private Virtual IP from there.

Now, go to the created Private DNS Zone and create A records for each APIM endpoint you want to use (Gateway, Developer Portal, Management API). The end result should look like this:

Now, the APIM endpoints are accessible from within the VNET under the same custom domain. You can deploy a VM in the VNET and then try to access an API exposed by APIM using api.xyz.com as the host name. If it isn’t working, then something is wrong with the Private DNS zone.
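As a sanity check for the private DNS zone, the expected A records can be derived from the endpoint host names and compared with what a VM inside the VNET actually resolves. This is a minimal sketch: the IP 10.0.1.5 is a placeholder for your APIM Private Virtual IP, and a_records and unresolved are hypothetical helper names of mine.

```python
import socket


def a_records(hostnames, zone, private_ip):
    """Return {relative record name: IP} for every host name inside the zone."""
    suffix = "." + zone
    records = {}
    for host in hostnames:
        if not host.endswith(suffix):
            raise ValueError(f"{host} is not inside zone {zone}")
        # "api.xyz.com" in zone "xyz.com" becomes the record name "api"
        records[host[: -len(suffix)]] = private_ip
    return records


def unresolved(hostnames, expected_ip):
    """Return the host names that do not resolve to the expected IP (run this on a VM in the VNET)."""
    failures = []
    for host in hostnames:
        try:
            if socket.gethostbyname(host) != expected_ip:
                failures.append(host)
        except OSError:
            failures.append(host)
    return failures
```

All endpoints point to the same IP, because APIM exposes a single private virtual IP and tells the endpoints apart by host name.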

Remember that if you change the VNET network configuration after you joined APIM to the VNET (like changing DNS servers, switching to custom ones instead of Azure’s), then you have to submit a call to the Apply Network Configuration Updates operation of the APIM Management API. Details in this link and in this one.

Status: we have the VNET and subnets configured, APIM up and running and configured for custom domains, the Private DNS zone configured and linked to the VNET, and APIM joined to the VNET. Next, we set up the Application Gateway.

First, create a Public IP in the resource group you are working in. Choose the Standard SKU, because the Application Gateway will also be on the Standard V2 tier.

Next, start the Application Gateway creation, give it a name and choose the Standard V2 tier (or WAF V2; WAF can also be enabled later).

In the next step, Frontends, select Frontend IP Address Type as Public and select from the dropdown the above created Public IP.

In the backends step, select “Add a backend pool” and fill in the wizard like below.

Remark: If you have configured the Private DNS zone and linked it to the VNET, then the APIM endpoints are accessible by host name from Application Gateway (because Application Gateway is in the same VNET as APIM and uses the Private DNS zone for name resolution). If you decide not to use a Private DNS zone, bear in mind that the APIM backend will be reachable only by IP, so in the backend configuration window above you will have to fill in the APIM Private Virtual IP instead of the domain name.

Another consequence of not using a Private DNS zone is that other VMs or Azure components in the VNET cannot reach the APIM endpoints by host name (there is no name resolution), only by Private Virtual IP; at the same time, APIM does not answer requests addressed to its IP address, only to its host names. A way around this is to use the hosts file (on Linux or Windows VMs) or to build your own custom DNS server and use it as the DNS server for the entire VNET.

References on this topic: here and here.

Once the backends step is completed, go to the next one, Configuration, and add a routing rule.

  • Fill in a name for the routing rule
  • Fill in a name for the listener
  • Select “Public” for the Frontend IP
  • Protocol HTTPS, because we want our APIs to be available on the Internet with HTTPS access only
  • Once HTTPS is selected, a certificate must be uploaded to AppGw so it can perform its SSL wizardry. Upload the same pfx file you used for the custom domain configured in APIM (say that for the Gateway endpoint you uploaded to APIM a pfx certificate file named api.xyz.com.pfx; the same file is uploaded here, in AppGw)
  • Fill in a name for the certificate and the password (same password used in APIM custom domains)
  • Select multi site as the listener type, because we want to expose on the Internet not only api.xyz.com but also developer-portal.xyz.com and api-management.xyz.com
  • Next click on “Backend targets” tab
  • Select “Backend pool” as Target Type and select the backend created earlier from the “Backend target” dropdown
  • For HTTP settings click on “Add new”
  • Fill in a name for HTTP Setting
  • Select “Backend protocol” as HTTP (we will change it later)
  • On “Override with new host name” select “Yes” and:
    • If you have configured Private DNS zone, then select “Pick host name from backend target”
    • Else, select “Override …” and enter the specific endpoint address. If you don’t have any form of name resolution on the VNET, the backend is configured with the APIM Private Virtual IP; because APIM responds only to host names, not to its IP, we have to specify here the host name that Application Gateway will present (“mock”) when it makes the request to APIM
  • Select “yes” for the custom probe. We will come back later to this to fully configure it

Now click “Save” and then finish the wizard and create the Application Gateway.

After the Application Gateway has been created, open it in Azure Portal, go to the Health Probes section and open the probe created by default (there should be only one after Application Gateway creation).

In the “Path” setting, enter the following address:

/status-0123456789abcdef

This is a special path on which AppGw can ask Azure APIM whether it is healthy. Also, make sure that the “Host” setting is correct, pointing to the correct APIM endpoint. Test and save, and that’s about it so far.
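To illustrate why the “Host” setting matters, here is a rough sketch of what such a probe amounts to: a request sent to the backend IP that carries the APIM host name in the Host header. It is meant to be run from a test VM inside the VNET while the backend protocol is still HTTP. The IP 10.0.1.5 used in the usage note is a placeholder, build_probe_request and is_healthy are hypothetical helpers, and the 200-399 range is my assumed criterion for a healthy answer.

```python
import urllib.request

PROBE_PATH = "/status-0123456789abcdef"


def build_probe_request(backend_ip, host_name, scheme="http"):
    """Build a GET to the backend IP that carries the APIM host name in the Host header."""
    url = f"{scheme}://{backend_ip}{PROBE_PATH}"
    return urllib.request.Request(url, headers={"Host": host_name}, method="GET")


def is_healthy(request, timeout=5):
    """Send the probe; a status in the 200-399 range counts as healthy (assumed criterion)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:  # covers URLError, HTTPError and timeouts
        return False
```

For example, is_healthy(build_probe_request("10.0.1.5", "api.xyz.com")) sends the request to the IP but identifies the endpoint by name, which is the same idea as the host name override configured in the HTTP setting.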

The last thing that has to be done is to create a DNS A record in Azure Public DNS zones (or wherever you are hosting your domain) and point it to the public IP of the Application Gateway. See example below.

Right now we have a functional setup: the APIs exposed in APIM are available on the Internet, on a custom domain, api.xyz.com. There are still two problems.

1. No end to end SSL yet

We don’t have end to end SSL yet. Right now, SSL is enforced between the end user and Application Gateway. For the path between Application Gateway and APIM we haven’t configured SSL yet, so it is just plain HTTP. If you want to test your API now by calling it from the Internet, you first have to set it up in APIM to answer to both HTTP and HTTPS (by default, it’s HTTPS only; see below picture).

You can now run a test using curl and call the default API provided by APIM, using this syntax:

curl -v -X GET "https://api.xyz.com/echo/resource?param1=sample&param2=10" -H "Ocp-Apim-Subscription-Key: your api subscription key"

I will get back to the end to end SSL topic a bit later.

2. We have only one custom domain on Internet

Right now, we are exposing the APIM default example API on Internet under api.xyz.com. But what if we need to expose the same API, from the same APIM instance on another custom domain, api.abc.com?

Well, this is just a matter of creating another HTTPS listener for the second domain at Application Gateway level and binding it to the same HTTP setting and backend created for the initial domain. The only thing that must be created besides the listener is a second rule, where you actually bind the listener to the backend and HTTP setting created earlier.

So proceed by creating a new SSL certificate for the second domain (api.abc.com, let’s say), along with the pfx, crt and cer files, as we did in the initial setup.

Next, open the Application Gateway in Azure Portal, go to Listeners and create a new listener. Follow the same steps regarding the certificate and other settings as earlier in this article. What is different for the second domain is that, when choosing Multi site as the listener type, you have to enter the host name of the second domain (api.abc.com).

Now, the only other thing that has to be created for the second domain to work is a routing rule.

From the Rules menu, choose to create a new routing rule.

In the “Listener” tab, select from dropdown the newly created listener (the one for the second domain). In the “Backend targets” tab, select as backend target the same backend pool created for the first domain and the same HTTP setting created for the first domain.

Basically, for the second domain, we associate, at AppGw level, the same backend and HTTP setting as for the first domain (I assume that we are still working on the same APIM endpoint, api, so api.abc.com will use the same settings as api.xyz.com, meaning it will go to the same backend APIM endpoint).

Last step for this new domain: add a DNS A record in Azure Public DNS zones (or wherever you are hosting your domain), pointing it to the public IP of AppGw.

And so on, for each domain for the same APIM endpoint. If I remember correctly, Application Gateway V2 supports up to 100 SSL certificates (and listeners).

For the other APIM endpoints (developer portal, management API) the procedure is quite similar. For each endpoint, a new backend must be created in AppGw, pointing to the correct APIM endpoint (one for developer-portal.xyz.com, one for api-management.xyz.com), along with a new HTTP setting for each APIM endpoint.

And then create a new HTTPS listener for each endpoint and domain (developer-portal.xyz.com, developer-portal.abc.com) and the routing rules to match each listener with the correct backend and HTTP settings.
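The relation between listeners, rules, backends and HTTP settings is easier to keep in mind as a lookup table: each multi-site listener is keyed by its host name, and its rule points to one backend pool and one HTTP setting. The sketch below only models that idea; the pool and setting names are hypothetical, and this is not how Application Gateway is configured programmatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    backend_pool: str
    http_setting: str


# Hypothetical pool and setting names. Two domains share one route,
# while each APIM endpoint gets its own backend and HTTP setting.
ROUTES = {
    "api.xyz.com": Route("apim-gateway-pool", "apim-gateway-setting"),
    "api.abc.com": Route("apim-gateway-pool", "apim-gateway-setting"),
    "developer-portal.xyz.com": Route("apim-portal-pool", "apim-portal-setting"),
    "api-management.xyz.com": Route("apim-management-pool", "apim-management-setting"),
}


def resolve(host_header):
    """Return the route of the listener matching the Host header, or None if no listener matches."""
    host = host_header.lower().split(":")[0]  # ignore case and an optional port
    return ROUTES.get(host)
```

Adding another Internet-visible domain for the same endpoint is then one more key pointing to an existing route, which mirrors the new listener plus new rule described above.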

End to end SSL

So far, the setup uses SSL only partially. We have SSL only from the end user to the Application Gateway. For the path between AppGw and APIM we haven’t configured SSL; we are still on plain HTTP.

To configure SSL, we have to whitelist the SSL certificate of the APIM custom domain (api.xyz.com) at AppGw level. The trick here is that on the HTTPS setting, AppGw wants you to load a .cer file as the certificate, not a pfx file as we did for the listener.

As the .cer file obtained at the beginning of this article with the OpenSSL command is rejected by Application Gateway (I don’t know why; it throws an error when trying to save the HTTPS setting), we have to stick to the Microsoft way of doing this.

From File Explorer, right click on the pfx file used to configure the custom domain on APIM (in this example, api.xyz.com) and select “Install PFX”. You will need the password for the certificate; make sure that you tick the “Mark this key as exportable” option.

From this step forward, please follow the Microsoft guidelines on how to extract the needed .cer file, depending on what type of Application Gateway you use (V1 or V2). The guideline is here.

After you have obtained the .cer file, let’s go back to Application Gateway in Azure Portal and create a new HTTP setting.

When you create the new HTTP setting, this time select HTTPS in the “Backend protocol” option, load the .cer file as the certificate, set “Yes” for “Override with new host name”, select “Override with specific domain name” and use your domain. In this example the domain is api.xyz.com. For the moment, leave “No” selected for “Use custom probe”; we will create the probe immediately afterwards. Click “Save” and you are good to go (remember the name you have given to this new HTTP setting).

Next, go to “Health probe” and create a new probe. This new probe is mostly identical to the first one, with some differences.

  • First, select HTTPS for “Protocol”
  • For “Path”, don’t forget to add the specific path ( /status-0123456789abcdef )
  • In the last option, HTTP setting, select the name of the HTTP setting created above.
  • Click “Save”

Now, go back to the HTTP setting created above, select “Yes” in “Custom probe” and select from the dropdown the name of the HTTPS probe you have just created.

Now, the last step is to change the first routing rule to use the HTTPS setting we have just created, instead of the old HTTP one.

Go to “Rules” and select the routing rule to edit it. Once opened, in the “Backend targets” tab, in HTTP settings, select from the dropdown the second HTTP setting, the one created for HTTPS. Click “Save” and that is about all.

To check that the new HTTPS setup is working, go to “Backend health”; everything there should be green.

Conclusions

Now we have SSL working from the user who is calling the API up to APIM, so data in transit is protected by SSL.

Also, we have at least two different Internet-visible domains (api.xyz.com and api.abc.com) pointing to the same API, exposed by the same APIM instance.

Having an Application Gateway in front of APIM, we can enable WAF and have the APIs protected against various attacks and OWASP vulnerabilities (with some caveats, see the warnings in the documentation).

We can also expose the other APIM endpoints (developer portal, management API, SCM) using the same procedure, and this can also be done for multiple Internet-visible domains.

If you want further protection for your APIs, you can segregate them into external APIs (available from the Internet) and internal APIs (available only from within the VNET), using paths like api.xyz.com/external/api1 and api.xyz.com/internal/api2. Then configure Application Gateway to route only calls that have “external” in their URL path and to drop all calls that have “internal” in the path, making sure that no request originating from the Internet can reach your internal APIs.
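The routing decision behind this segregation can be sketched in a few lines. This is only an illustration of the rule, under one assumption of mine: it is stricter than the text above and denies by default, forwarding only paths whose first segment is “external”. In a real setup the rule lives in the Application Gateway path-based routing configuration, not in code.

```python
def is_allowed_from_internet(path):
    """Forward only paths whose first segment is "external"; drop everything else."""
    path_only = path.split("?")[0]  # ignore the query string
    segments = [segment for segment in path_only.split("/") if segment]
    return bool(segments) and segments[0] == "external"
```

With this rule, /external/api1 is forwarded, while /internal/api2 and any path that is not explicitly marked as external are dropped.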

References

When I first started to dig into this setup, the following links (blogs and Microsoft documentation) were useful to fully understand how AppGw and APIM work together.