Service Oriented Architecture (SOA) introduced a design paradigm built around highly decoupled services that talk to each other over the network using a standardized message format, in a technology-agnostic manner, irrespective of how each service is implemented. Each service has a well-defined, published service description or service interface. In practice, the message format was standardized with SOAP, an XML-based standard introduced by the W3C in the early 2000s; the service description was standardized with WSDL, another W3C standard; and service discovery was standardized with UDDI, an OASIS standard. These were the fundamentals of SOAP-based web services, and over time web services became synonymous with SOA, which led to its downfall as an architectural pattern. The base principles of SOA started to fade. The WS-* stack (WS-Security, WS-Policy, WS-SecurityPolicy, WS-Trust, WS-Federation, WS-SecureConversation, WS-ReliableMessaging, WS-AtomicTransactions, WS-BPEL, etc.), standardized mostly under OASIS, made SOA complex enough that an average developer found it hard to digest.

Now, after so many years, we have started a journey back in time to realize the base principles of SOA, and we call it microservices. A microservice provides a focused, scoped, and modular approach to application design.

Microservices is one of the most popular buzzwords today, along with the Internet of Things (IoT), containerization, and now blockchain. Everyone talks about microservices, and everyone wants microservices implemented. The term 'microservice' was first discussed at a software architects' workshop in Venice in May 2011, where it was used to describe a common architectural style the participants had been witnessing for some time.

Building Microservices by Sam Newman is an excellent book, and as you read through it you will realize that microservices is not just SOA done right. It is not just an architectural pattern, but a new culture built around an architectural pattern, driven by one primary goal: faster deployment, or speed to production.

There are multiple perspectives in securing microservices:

Secure Development Lifecycle and Test Automation: The key driving force behind microservices is speed to production. One should be able to introduce a change to a service, test it, and instantly deploy it to production. To make sure we do not introduce security vulnerabilities at the code level, we need a proper plan for static code analysis and dynamic testing, and most importantly those tests should be part of the continuous delivery (CD) process. Any vulnerability should be identified early in the development lifecycle, with short feedback cycles.

DevOps security: There are multiple microservices deployment patterns, but the most commonly used one is the service-per-host model. The host does not necessarily mean a physical machine; most probably it will be a container (for example, Docker). We need to worry about container-level security here: how do we isolate a container from other containers, and what level of isolation do we have between the container and the host operating system?

Application-level security: How do we authenticate users and control their access to microservices, and how do we secure the communication channels between microservices?

This blog post presents a security model to address the challenges we face in securing microservices at the application level.

Monolithic vs. Microservices

In a monolithic application, where all the services are deployed in the same application server, the application server itself provides session management features. The interactions between services are local calls, and all the services can share the user's login status. Each service (or component) does not need to authenticate the user itself; authentication is done centrally at an interceptor that intercepts all the service calls. Once authentication is completed, how the login context of the user is passed between the services (or components) varies from one platform to another. The following diagram shows the interactions between multiple components in a monolithic application.

In a Java EE environment the interceptor can be a servlet filter. This servlet filter intercepts all the requests coming to its registered context(s) and enforces authentication. The service invoker should carry either valid credentials or a session token that can be mapped to a user. Once the servlet filter identifies the user, it can create a login context and pass it to the downstream components. Each downstream component can identify the user from the login context to perform any authorization.
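The pattern above, a central interceptor that authenticates once and hands a login context to downstream components, can be sketched language-agnostically. The post describes a Java EE servlet filter; this is a minimal Python sketch, and the session store, token values, and service names are all made up for illustration:

```python
# Hypothetical in-memory session store: token -> login context.
SESSIONS = {"token-123": {"user": "alice", "roles": ["admin"]}}

def authenticated(service):
    """Interceptor: resolve the session token to a login context before the call."""
    def wrapper(request):
        login_context = SESSIONS.get(request.get("session_token"))
        if login_context is None:
            raise PermissionError("authentication failed")
        # Pass the login context to the downstream component.
        return service(request, login_context)
    return wrapper

@authenticated
def order_service(request, login_context):
    # The downstream component identifies the user from the login context.
    return f"order placed for {login_context['user']}"

print(order_service({"session_token": "token-123"}))  # order placed for alice
```

In a monolith this hand-off is a local call; the rest of the post is about what replaces it when the calls become remote.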

Security becomes challenging in a microservices environment. In the microservices world, services are scoped and deployed in multiple containers in a distributed setup. Service interactions are no longer local but remote, mostly over HTTP. The following diagram shows the interactions between multiple microservices.

The challenge here is how we authenticate the user and pass the login context between microservices in a symmetric manner, and then how each microservice authorizes the user.

Securing Service-to-Service Communication

In this blog post I discuss two approaches to securing service-to-service communication: one based on JWT and the other based on TLS mutual authentication.

JSON Web Token (JWT)

JWT (JSON Web Token) defines a container to transport data between interested parties. It can be used to:

Propagate one’s identity between interested parties.

Propagate user entitlements between interested parties.

Transfer data securely between interested parties over an unsecured channel.

Assert one’s identity, given that the recipient of the JWT trusts the asserting party.

A signed JWT is known as a JWS (JSON Web Signature) and an encrypted JWT is known as a JWE (JSON Web Encryption). In fact, a JWT does not exist by itself; it has to be either a JWS or a JWE. It is like an abstract class, with JWS and JWE as the concrete implementations.

The user context can be passed from one microservice to another in a JWS. Since the JWS is signed by a key known to the upstream microservice, it carries both the end-user identity (as claims in the JWT) and the identity of the upstream microservice (via the signature). To accept the JWS, a downstream microservice first needs to validate its signature against the public key embedded in the JWS itself. That alone is not enough; we also need to check whether we trust that key. Trust between microservices can be established in multiple ways. One way is to provision the trusted certificates, service by service, to each microservice, but it is easy to see that this would not scale in a microservices deployment. The approach I would suggest is to build a private certificate authority (CA), with intermediate certificate authorities for different microservices teams if the need arises. Then, rather than trusting each and every individual certificate, the downstream microservices only trust either the root certificate authority or an intermediary. That vastly reduces the overhead in certificate provisioning.
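To make the compact JWS format concrete, here is a minimal, self-contained sketch of creating and validating one. The post describes asymmetric signatures anchored in a private CA; for a runnable stdlib example this sketch uses a shared HMAC key (HS256) instead, and the claim values are invented:

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    # JWS uses base64url encoding without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def create_jws(claims: dict, key: bytes) -> str:
    """Build a compact-serialization JWS: header.payload.signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def validate_jws(token: str, key: bytes) -> dict:
    """Verify the signature and return the claims, or raise ValueError."""
    header, payload, sig = token.split(".")
    expected = b64url(hmac.new(key, f"{header}.{payload}".encode(),
                               hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("signature validation failed")
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

key = b"shared-secret"  # in practice: a key pair issued under the private CA
token = create_jws({"sub": "alice", "iss": "sts.facilelogin.com",
                    "exp": int(time.time()) + 300}, key)
print(validate_jws(token, key)["sub"])  # alice
```

With real deployments the signature check would use the upstream service's public key, and the trust decision would walk the chain up to the private root CA or an intermediary.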

Cost of JWT Validation

Each microservice has to bear the cost of JWT validation, which includes a cryptographic operation to validate the token signature. Caching the JWT at the microservice level, against the data extracted out of it, would reduce the impact of repetitive token validation. The cache expiration time must match the JWT expiration time; then again, the benefit of caching will be quite low if the JWT expiration time itself is quite low.

Identifying the User

The JWT carries a parameter called sub in its claims set, which represents the subject, that is, the user who owns the JWT. In addition to the subject identifier, the JWT can also carry user attributes such as first_name, last_name, and email. If any microservice needs to identify the user during its operations, sub is the attribute it should look into. The value of the sub attribute is unique only for a given issuer; if you have a microservice that accepts tokens from multiple issuers, the uniqueness of the user should be decided by the combination of the issuer and the sub attribute.

The aud parameter in the JWT claims set specifies the intended audience of the token. It can be a single recipient or a set of recipients. Prior to any other validation check, the token recipient must first see whether the particular JWT was issued for its use; if not, it should reject the token immediately. The token issuer should know, prior to issuing the token, who the intended recipient (or recipients) of the token is, and the value of the aud parameter must be a value pre-agreed between the token issuer and the recipient. In a microservices environment, one can use a regular expression to validate the audience of the token. For example, the value of aud in the token can be *.facilelogin.com, while each recipient under the facilelogin.com domain has its own aud value: foo.facilelogin.com, bar.facilelogin.com, and so on.
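The two checks above can be sketched together. This uses shell-style wildcard matching (fnmatch) rather than a full regular expression, and the issuer and audience values are the illustrative ones from the post:

```python
import fnmatch

def audience_ok(token_aud: str, my_aud: str) -> bool:
    """Accept the token only if its aud covers this service; wildcard patterns allowed."""
    return fnmatch.fnmatch(my_aud, token_aud)

def unique_user(claims: dict) -> str:
    """sub is unique only per issuer, so key the user on (iss, sub)."""
    return f"{claims['iss']}#{claims['sub']}"

print(audience_ok("*.facilelogin.com", "foo.facilelogin.com"))  # True
print(audience_ok("*.facilelogin.com", "foo.example.com"))      # False
print(unique_user({"iss": "sts.facilelogin.com", "sub": "alice"}))
```

Keying users on the (issuer, sub) pair means two different issuers can both hand out sub = alice without the microservice confusing them.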

TLS Mutual Authentication

In both TLS mutual authentication and the JWT-based approach, each microservice needs to have its own certificate. The difference between the two approaches is that in JWT-based authentication, the JWS can carry both the end-user identity and the upstream service identity, while with TLS mutual authentication the end-user identity has to be passed at the application level.
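As a configuration sketch, this is roughly what the client side of mutual TLS looks like with Python's stdlib ssl module. The file paths are placeholders (assumptions), and in this model the CA bundle would contain the private root CA or intermediary discussed earlier:

```python
import ssl

def mtls_client_context(ca_bundle: str, client_cert: str, client_key: str) -> ssl.SSLContext:
    """Client-side context for mutual TLS: verify the server against our
    private CA, and present this service's own certificate to the server."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.verify_mode = ssl.CERT_REQUIRED          # reject servers we cannot verify
    ctx.check_hostname = True
    ctx.load_verify_locations(cafile=ca_bundle)  # trust anchor: the private CA
    ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)  # our identity
    return ctx

# Hypothetical paths; provisioned per service in a real deployment.
# ctx = mtls_client_context("/etc/pki/private-ca.pem",
#                           "/etc/pki/foo-service.pem",
#                           "/etc/pki/foo-service-key.pem")
```

The server side mirrors this: it loads its own certificate chain and sets CERT_REQUIRED so that clients without a certificate trusted by the CA are rejected during the handshake.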

Certificate Revocation

In both of the approaches discussed above, certificate revocation is a bit tricky. Certificate revocation is a hard problem to solve, though there are multiple options available:

CRL (Certificate Revocation List / RFC 2459)

OCSP (Online Certificate Status Protocol / RFC 2560)

OCSP Stapling (RFC 6066)

OCSP Stapling Required (draft-hallambaker-muststaple-00)

The CRL is a technique not used very often anymore. The client that initiates the TLS handshake has to get the long list of revoked certificates from the corresponding certificate authority (CA) and then check whether the server certificate is in that list. Instead of doing that for each request, the client can cache the CRL locally, but then you run into the problem of making security decisions based on stale data. When TLS mutual authentication is used, the server also has to do the same certificate verification for the client. Eventually people recognized that CRLs were not going to work and started building something new: OCSP.

In the OCSP world, things were a little better than with CRLs. The TLS client can check the status of a specific certificate without downloading the whole list of revoked certificates from the certificate authority. In other words, each time the client talks to a new downstream microservice, it has to talk to the corresponding OCSP responder to validate the status of the server (or service) certificate, and the server has to do the same for the client certificate. That creates a massive amount of traffic on the OCSP responder. Once again, clients can still cache the OCSP decision, but that leads back to the same old problem of making decisions on stale data.

With OCSP stapling, the client does not need to go to the OCSP responder each time it talks to a downstream microservice. The downstream microservice gets the OCSP response from the corresponding OCSP responder and staples, or attaches, the response to its own certificate. Since the OCSP response is signed by the corresponding certificate authority, the client can accept it by validating the signature. This makes things a little better: instead of the client, now the service talks to the OCSP responder. But in a mutual TLS authentication model, this does not bring any additional benefit compared with plain OCSP, since the server still has to check the client certificate.

With OCSP must-staple, the service (the downstream microservice) gives the client (the upstream microservice) a guarantee that an OCSP response is attached to the service certificate it receives during the TLS handshake. If the OCSP response is not attached to the certificate, rather than failing soft, the client must immediately reject the connection.

Short-Lived Certificates

From the end-user perspective, short-lived certificates behave the same way normal certificates work today, except that short-lived certificates have a very short expiration. The TLS client need not worry about doing CRL or OCSP validation against short-lived certificates; it simply sticks to the expiration time stamped on the certificate itself.
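The validity check then collapses to a window comparison. A minimal sketch, with a hypothetical certificate record that keeps only the fields relevant here:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical short-lived certificate: only the validity window matters here.
cert = {
    "subject": "foo.facilelogin.com",
    "not_before": datetime.now(timezone.utc) - timedelta(minutes=5),
    "not_after": datetime.now(timezone.utc) + timedelta(hours=4),  # very short lifetime
}

def is_valid_now(cert) -> bool:
    """No CRL/OCSP lookup: trust the validity window stamped on the certificate."""
    now = datetime.now(timezone.utc)
    return cert["not_before"] <= now <= cert["not_after"]

print(is_valid_now(cert))  # True
```

Revocation is effectively replaced by refusal to renew: a compromised certificate simply ages out within hours instead of needing to appear on a revocation list.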

Netflix and Short-Lived Certificates

The challenge with short-lived certificates mostly lies in their deployment and maintenance. Automation comes to the rescue. Netflix suggests using a layered approach to build a short-lived certificate deployment: you have a system identity, or long-lived credentials, residing in a TPM (Trusted Platform Module) or SGX (Software Guard Extensions) enclave with a lot of security around it. You use those credentials to get a short-lived certificate, then use the short-lived certificate for your microservice, which is consumed by other microservices. Each microservice can refresh its short-lived certificate regularly using its long-lived credentials. Having short-lived certificates is not enough on its own; the underlying platform that hosts the service (or the TLS terminator) should support dynamic updates to the server certificate. Many TLS terminators out there support dynamically reloading server certificates, but in most cases not with zero downtime.

Edge Security

The common pattern for exposing a set of microservices to the rest of the world is the API Gateway pattern. With the API Gateway pattern, the microservices that need to be exposed outside have a corresponding API in the API Gateway. Not all microservices need to be exposed through the API Gateway.

The end user's access to the microservices (via an API) should be validated at the edge, that is, at the API Gateway. The most common pattern for securing APIs is OAuth 2.0.

OAuth 2.0

OAuth 2.0 is a framework for access delegation. It lets someone do something on behalf of someone else. OAuth 2.0 introduces multiple grant types. A grant type in OAuth 2.0 defines the protocol a client should follow to get the resource owner's consent to access a resource on his or her behalf. There are also grant types that define the protocol for getting a token on the client's own behalf (client_credentials); in other words, the client is also the resource owner. The following diagram explains the OAuth 2.0 protocol at a very high level. It describes the interactions between the OAuth client, the resource owner, the authorization server, and the resource server.

Whoever wants to access a microservice via the API Gateway must first get a valid OAuth token. A system can access a microservice just by being itself, or on behalf of another user. An example of the latter case would be a user logging into a web app, after which the web app accesses a microservice on behalf of the logged-in user.

Let’s see how the end-to-end communication works, as illustrated in the above figure:

The user logs into the web app/mobile app via the Identity Provider, which the web app/mobile app trusts via OpenID Connect (this can be SAML 2.0 too).

The web app gets an OAuth 2.0 access_token and an id_token. The id_token identifies the end user to the web app. If SAML 2.0 is used, the web app needs to talk to the token endpoint of the OAuth authorization server it trusts and exchange the SAML token for an OAuth access_token, following the SAML 2.0 grant type for OAuth 2.0.

The web app invokes an API on behalf of the end user, passing the access_token along with the API request.

The API Gateway intercepts the request from the web app, extracts the access_token, and talks to the Token Exchange endpoint (or the STS), which validates the access_token and then issues a JWT (signed by itself) to the API Gateway. This JWT also carries the user context. While validating the access_token, the STS talks to the corresponding OAuth authorization server via the introspection API.

The API Gateway passes the JWT, along with the request, to the downstream microservices.

Each microservice validates the JWT it receives, and then, for downstream service calls, it can create a new JWT signed by itself and send it along with the request. Another approach is to use a nested JWT, where the new JWT also carries the previous JWT.

With this approach, only the API calls coming from external clients go through the API Gateway. When one microservice talks to another, it need not go through the gateway. And from a given microservice's perspective, whether a request comes from an external client or from another microservice, what it gets is a JWT, so this is a symmetric security model.
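The per-hop re-signing and the nested-JWT variant can be sketched together. As before, this uses HMAC signing for a self-contained example (the post implies per-service key pairs), and the issuer names and keys are invented:

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(claims: dict, key: bytes) -> str:
    """Compact JWS: header.payload.signature (HS256 for the sketch)."""
    header = b64url(json.dumps({"alg": "HS256"}).encode())
    payload = b64url(json.dumps(claims).encode())
    sig = b64url(hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

# Hop 1: the gateway (via the STS) issues a JWT carrying the user context.
gateway_jwt = sign({"sub": "alice", "iss": "gateway"}, b"gateway-key")

# Hop 2: microservice foo re-signs for the next hop. The nested-JWT variant
# embeds the previous token as a claim, so the full call chain stays verifiable.
foo_jwt = sign({"sub": "alice", "iss": "foo", "prev": gateway_jwt}, b"foo-key")
```

The downstream service validates foo's signature first; if it also wants to audit the chain, it can pull out the prev claim and validate the gateway's token too.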

Access Control

Authorization is a business function. Each microservice can decide the criteria for allowing access to its operations. In the simplest form of authorization, we check whether a given user can perform a given action on a particular resource. The combination of an action and a resource is termed a permission. An authorization check evaluates whether a given user has the minimum set of permissions required to access a given resource. The resource can define who can perform which actions on it, and the declaration of the required permissions for a given resource can be done in multiple ways.
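In its simplest form, that check is a set lookup on (action, resource) pairs. A minimal sketch with a made-up policy table:

```python
# A permission is an (action, resource) pair; this policy table is hypothetical.
POLICY = {
    "alice": {("read", "orders"), ("write", "orders")},
    "bob": {("read", "orders")},
}

def is_authorized(user: str, action: str, resource: str) -> bool:
    """Check whether the user holds the required permission."""
    return (action, resource) in POLICY.get(user, set())

print(is_authorized("bob", "read", "orders"))   # True
print(is_authorized("bob", "write", "orders"))  # False
```

Real systems rarely stop at direct user-to-permission mappings; roles, attributes, and policy languages such as XACML (next section) generalize exactly this check.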

XACML (eXtensible Access Control Markup Language)

XACML is the de facto standard for fine-grained access control. It introduces a way to represent the set of permissions required to access a resource, in a very fine-grained manner, in an XML-based domain-specific language (DSL).

The above figure shows the XACML component architecture. The policy administrator first defines XACML policies via the PAP (Policy Administration Point), and those policies are stored in the policy store. To check whether a given entity has permission to access a given resource, the PEP (Policy Enforcement Point) intercepts the access request, creates a XACML request, and sends it to the XACML PDP (Policy Decision Point). The XACML request can carry any attributes that could help the decision-making process at the PDP; for example, it can include the subject identifier, the resource identifier, and the action the given subject is going to perform on the resource. The microservice that needs to authorize the user builds a XACML request by extracting the relevant attributes from the JWT and talks to the PDP. The PIP (Policy Information Point) comes into the picture when the PDP finds that certain attributes required for policy evaluation are missing from the XACML request. The PDP then talks to the PIP to find the missing attributes; the PIP connects to the relevant data stores, finds the attributes, and feeds them to the PDP.
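The "build a XACML request from the JWT" step can be sketched as a simple mapping. The structure below is loosely shaped after the JSON Profile of XACML, with attribute identifiers abbreviated for readability; both the helper and the claim values are hypothetical:

```python
def xacml_request_from_jwt(claims: dict, action: str, resource: str) -> dict:
    """Map JWT claims plus the attempted operation onto a XACML-style request.
    Attribute ids are abbreviated; a real request uses full XACML URNs."""
    return {
        "Request": {
            "AccessSubject": {"Attribute": [
                {"AttributeId": "subject-id", "Value": claims["sub"]},
            ]},
            "Action": {"Attribute": [
                {"AttributeId": "action-id", "Value": action},
            ]},
            "Resource": {"Attribute": [
                {"AttributeId": "resource-id", "Value": resource},
            ]},
        }
    }

req = xacml_request_from_jwt({"sub": "alice"}, "read", "orders")
print(req["Request"]["AccessSubject"]["Attribute"][0]["Value"])  # alice
```

The PEP would serialize this request and send it to the PDP; any attributes the policy needs beyond what the JWT carries are then resolved by the PDP through its PIPs.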

Embedded PDP

There are certain drawbacks in the remote PDP model that could easily violate base microservices principles:

Performance cost: Each time an access control check is required, the corresponding microservice has to talk to the PDP over the wire. With decision caching at the client side, the transport cost and the cost of policy evaluation can be cut down, but with caching we end up making security decisions based on stale data.

The ownership of Policy Information Points (PIPs): Each microservice should own its PIPs, which know where to bring in the data required for access control. With the above approach we are building a 'monolithic' PDP, which carries all the PIPs corresponding to all the microservices.

As illustrated in the above figure, the embedded PDP follows an eventing model, where each microservice subscribes to the topics it is interested in, gets the appropriate access control policies from the PAP, and updates its embedded PDP. You can have PAPs per microservices team, or perhaps one global PAP in a multi-tenanted mode. When a new policy is available, or when there is a policy update, the PAP publishes an event to the corresponding topic(s).
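The eventing model can be sketched with an in-memory publish/subscribe loop. The class names, topic names, and the tuple-shaped policies are all simplifications invented for the sketch; a real deployment would use a message broker and a proper policy language:

```python
class PAP:
    """Policy Administration Point: publishes policy updates to topics."""
    def __init__(self):
        self.subscribers = {}  # topic -> list of embedded PDPs

    def subscribe(self, topic, pdp):
        self.subscribers.setdefault(topic, []).append(pdp)

    def publish(self, topic, policy_id, policy):
        # Push the new or updated policy to every subscribed embedded PDP.
        for pdp in self.subscribers.get(topic, []):
            pdp.update(policy_id, policy)

class EmbeddedPDP:
    """Per-microservice PDP holding a local copy of the relevant policies."""
    def __init__(self):
        self.policies = {}

    def update(self, policy_id, policy):
        self.policies[policy_id] = policy

    def allowed(self, user, action, resource):
        # Local, in-process evaluation: no over-the-wire call per check.
        return (user, action, resource) in self.policies.values()

pap = PAP()
orders_pdp = EmbeddedPDP()
pap.subscribe("orders-policies", orders_pdp)
pap.publish("orders-policies", "p1", ("alice", "read", "orders"))
print(orders_pdp.allowed("alice", "read", "orders"))  # True
```

Each access check is now a local lookup, which addresses the performance-cost drawback, and each service subscribes only to its own policy topics, which addresses the PIP-ownership drawback.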

This approach also does not violate the 'immutable server' concept in microservices. Immutable server means you build servers or containers directly out of configuration loaded from a repository at the end of the continuous delivery process, and you should be able to build the same container again and again from the same configuration. So we would not expect anyone to log into a server and make any configuration changes there. With the embedded PDP model, even though the server loads the corresponding policies while it is running, if we spin up a new container it too gets the same set of policies.

Before we wind up, there is one more important question to answer: what is the role of the API Gateway in the context of authorization? We can have globally applicable access control policies, the ones that apply to the end user, enforced at the gateway, but not the service-level policies. The service-level policies must be enforced at the service level.

Update (10/14/2017): There is an interesting discussion on this blog, on Hacker News: https://news.ycombinator.com/item?id=15460645. Please check that out too!