The same can be achieved with the way we are currently getting our secrets from Vault. For instance, all the services make a REST call to get temporary credentials for the database. We could potentially abstract this part into a dedicated Lambda function that we include as a step.

Continuous Delivery

There is also an opportunity to simplify the continuous delivery flow and make it faster. By moving away from containers, we remove a level of complexity and can focus on delivering only the executable JAR.

Action Plan

Like with all major changes, establishing a well-defined plan is crucial to ensure success. Ours was pretty straightforward:

1. Migrate one service at a time, starting with Tourist
2. Set up the local development environment
3. Create a new CI/CD pipeline
4. Experiment in both testing and UAT environments
5. Learn, measure and improve
6. Repeat for the other services when an acceptable level of confidence is reached

First Attempt

Our first, admittedly naive, approach was the following:

1. Add a ‘handleFunction’ entry point to the code of one of our services (this function will be triggered by API Gateway)
2. Upload the JAR to AWS Lambda
3. See what happens
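A minimal sketch of such an entry point (class and method names are hypothetical): the Lambda Java runtime can invoke a plain method whose event is deserialized into a Map, so no AWS-specific interface is strictly required.

```kotlin
// Hypothetical entry point. With API Gateway's proxy integration, the event map
// carries the HTTP details ("resource", "httpMethod", "body", ...), and the
// returned map is serialized back into an HTTP response.
class Handler {
    fun handleFunction(event: Map<String, Any?>): Map<String, Any?> {
        val resource = event["resource"] as? String ?: "/"
        val method = event["httpMethod"] as? String ?: "GET"
        // Echo back a minimal API Gateway-style response.
        return mapOf(
            "statusCode" to 200,
            "body" to "Handled $method $resource"
        )
    }
}
```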

The result wasn’t as bad as expected: the function was triggered. However, this is a completely different execution model. How were we supposed to run an app that relies on an embedded Jetty server and SpringBoot from a simple function?

Turns out there is a workaround to start a SpringBoot app programmatically, yay!

As you might expect, there is a catch… At runtime, Spring scans your code to perform dependency injection of the beans and to apply the rest of the configuration. Guess what? It uses reflection, and that takes time.

It took 10 seconds from the moment the function was triggered (cold start) to the moment the app was ready. Way too long for a Lambda function.

Additionally, what is the point of keeping SpringBoot if we aren’t using its nice features? None.

Second Attempt

The other solution, my favorite, meant more work: cleaning the service from unnecessary code and libraries (including, but not limited to, SpringBoot).

Jetty & SpringBoot

With the function now being triggered by API Gateway, we obviously no longer needed the Jetty web server.

From handling different routes to automatically validating the DTOs using the Java Validation framework, Spring does all the heavy lifting for you. Unfortunately, these things will need to be done more or less manually from now on.

Regarding the route handling issue, API Gateway will forward the resource (e.g. /v1/tourist/{touristId}), so we can easily identify the code to execute in the Lambda. Writing an entry point in charge of binding the right resource to its code is no big deal.
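A sketch of what such an entry point could look like (routes and handler bodies are illustrative): API Gateway forwards the matched resource template and HTTP method, which we map to the code to run.

```kotlin
// Hypothetical dispatcher: API Gateway forwards the matched resource template
// (e.g. "/v1/tourist/{touristId}") and the HTTP method; we look up the handler.
object Router {
    private val routes: Map<Pair<String, String>, (Map<String, String>) -> String> = mapOf(
        ("GET" to "/v1/tourist/{touristId}") to { params -> "tourist ${params["touristId"]}" },
        ("POST" to "/v1/tourist") to { _ -> "created" }
    )

    fun dispatch(method: String, resource: String, pathParams: Map<String, String>): String =
        routes[method to resource]?.invoke(pathParams)
            ?: "404: no handler for $method $resource"
}
```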

When it comes to validating the annotated DTOs, using the Java API ourselves is pretty straightforward and only requires a dozen lines of code.
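Assuming the `javax.validation` API plus a provider such as `hibernate-validator` on the classpath, the manual validation could be sketched like this (the DTO is illustrative):

```kotlin
import javax.validation.Validation
import javax.validation.constraints.NotNull
import javax.validation.constraints.Size

// Illustrative DTO; @field: use-site targets put the annotations on the
// backing fields so the validator provider can see them.
class TouristDto(
    @field:NotNull val id: String?,
    @field:Size(min = 1, max = 100) val name: String = "placeholder"
)

// Roughly the "dozen lines": bootstrap a Validator once, then collect violations.
object DtoValidator {
    private val validator = Validation.buildDefaultValidatorFactory().validator

    fun errorsFor(dto: Any): List<String> =
        validator.validate(dto).map { "${it.propertyPath}: ${it.message}" }
}
```

With this in place, `DtoValidator.errorsFor(TouristDto(id = null))` would report a violation on `id`, and a valid DTO yields an empty list.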

Hibernate & JPA

Prior to the switch, we were using Hibernate coupled with JPA (the Java Persistence API) for the persistence layer. This is a very common solution, and while it did the job, there were a few things we disliked about it.

Hibernate is a smart ORM (Object-Relational Mapping) that relies on a lot of “magic”, but it is also heavy: it keeps references to related objects and leans heavily on JPA annotations. This is fine, as long as you aren’t looking for something light (which is exactly what you are looking for when running Lambda functions).

Don’t get me wrong, it’s also what you want when working with micro-services in containers, just not as much.

It works, but we felt like we didn’t really need to carry all the smart black magic to the serverless world.

We tested different ORMs — Exposed, Ebean and ActiveJDBC.

While the first two didn’t seem to be mature enough, the mindset behind ActiveJDBC and the fact that it seems to be more of a modern ORM than Hibernate convinced us to adopt it.

ActiveJDBC is a pass-through ORM, meaning it uses JDBC directly and more efficiently; the result is a faster, simpler and leaner framework.

Shadow

Shadow is a Gradle plugin that allows us to build what we call “fat jars”. Basically a JAR with a flat structure including the dependencies on the same level as the code of your function.

This plugin is necessary to ensure compatibility with AWS Lambda. Shadow is the equivalent of Shade for Maven.
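A minimal Gradle Kotlin DSL sketch of the setup (plugin and Kotlin versions shown are assumptions; check the current releases):

```kotlin
// build.gradle.kts — assumes the Shadow plugin; versions are illustrative.
plugins {
    kotlin("jvm") version "1.2.71"
    id("com.github.johnrengelman.shadow") version "2.0.4"
}

tasks.withType<com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar> {
    // Produces a fat JAR with the dependencies flattened alongside our own
    // classes, which is the layout AWS Lambda expects.
    archiveName = "function.jar"
}
```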

Takeaways

- The Java Validation Framework was previously pulled in by SpringBoot, so it had to be added back as an explicit dependency. And while ConstraintValidator is part of the Java Validation Framework, it still needs a validator provider. The most common one is HibernateValidator, meaning that even after getting rid of the Hibernate ORM, we had to keep this module.
- As of now, ActiveJDBC doesn’t seem to be ready for Kotlin. The instrumentation task (which runs after compilation) assumes that static members are inherited much like in Java, but Kotlin doesn’t work that way. Luckily, the workaround (while not ideal) is pretty simple: create a Java module and write the entities in Java. As Kotlin and Java are 100% interoperable, this isn’t an issue.
- We were using PostgreSQL enums (not SQL standard). For some reason, we couldn’t find a way to insert enums into the database using ActiveJDBC. Here, the workaround was to go back to the character varying (varchar) type.

In the end, some of the workarounds are not ideal, but it’s important to keep the big picture in mind and ensure none of these small issues prevent us from moving forward and reaching our goal.

Local Environment

With Containers

Our requirements regarding the local development environment are simple:

- As similar as possible to the testing, UAT and production environments
- Fully working offline

No mystery here: since we were using Docker containers, docker-compose was the logical choice. Our docker-compose setup builds and deploys six services, in addition to a PostgreSQL database and a container running HashiCorp Vault.

With Lambdas

We chose to use SAM (Serverless Application Model) to deploy the function via CloudFormation, and SAM Local for local development.

In a nutshell, SAM Local will start a Docker container simulating API Gateway by using the SAM template used for deployment on AWS.

As we still need to build and start containers for the database and Vault, we keep using docker-compose for these. We then make sure SAM Local runs its container in the same Docker network as these two.
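In practice, the local loop looks roughly like this (the network, service and template names are assumptions):

```shell
# Start PostgreSQL and Vault via docker-compose (attached to a "backend" network here).
docker-compose up -d postgres vault

# Run SAM Local against the deployment template, attaching its container
# to the same Docker network so the function can reach the database and Vault.
sam local start-api --template template.yaml --docker-network backend
```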

Not a huge change in terms of usage there, which is good news.

Monitoring

Logging

Amazon provides CloudWatch, a very complete logging solution that works across a wide range of AWS products. It is supported by Lambda and works out of the box: there is no need to use an SDK or anything similar, as it automatically captures natively printed logs (at least with the Java runtime).

Tracing

When it comes to tracing functions, there is AWS X-Ray, which provides end-to-end performance tracking. However, being a young tool (its preview was announced at re:Invent 2016), X-Ray seems to be pretty restrictive in terms of features.

The good news is that enabling it in Lambda is as easy as adding one line in the SAM deployment template.
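For reference, this is the relevant property in the SAM template (resource names and handler are illustrative):

```yaml
# Illustrative SAM snippet: "Tracing: Active" enables X-Ray for the function.
Resources:
  TouristFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: com.example.Handler::handleFunction
      Runtime: java8
      Tracing: Active
```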

Thundra

OpsGenie is currently working on a complete serverless monitoring solution named Thundra and we are following this really closely.

This is a topic we need to dig deeper into, and we expect to learn a lot more during step 5 of our action plan (learn, measure and improve).

Overall, it seems that the global offering around Lambda monitoring is not as good as for EC2 instances, which is not surprising.

The Leaner The Better

Deploying lean functions is a recipe for successful usage of AWS Lambda.

Lambda Lifecycle

To understand the impact of an optimized code-base on the performances, it’s important to understand the lifecycle of a function:

Lambda function lifecycle.

We will experience a cold start if the function hasn’t been invoked for roughly 15 minutes, or right after the function has been updated.

There are ways of keeping your functions warm by regularly pinging them. In my opinion, this is more of a hack than an actual solution.

By optimizing the code, we are able to make a difference in at least 2 of the 4 steps above.

Code Optimization

Optimizing your code starts with simple things such as instantiating only what you need, when you need it.

For instance, you can enjoy the benefits of lazy initialization principles.

In Kotlin, it’s as simple as:

val touristResource: TouristResource by lazy { TouristResource() }

instead of:

val touristResource = TouristResource()

The TouristResource class will only be initialized if used.

If you are not convinced, remember: with Lambda, you pay for what you use. Don’t overlook smart coding principles.

Trimming The Fat

Being meticulous with dependencies really helps produce leaner functions. In this case, the dependencies removed were everything related to SpringBoot, Hibernate, JPA as well as some other minor libraries.

Results

Build Time

Decreasing build time has been a good side effect of this pivot.

As I mentioned above, our build tool of choice is Gradle. By using the build-scan tool, we could compare the build time in two different scenarios — cold and warm build:

Before — cold: 18 seconds | warm: 6 seconds

After — cold: 9 seconds | warm: 4 seconds

This is mainly due to the fact that we drastically cut down the number of project dependencies (from 109 down to 43).

Fat Jar on a Diet

Reducing the dependencies obviously means reducing the overall size of the JAR. Indeed, we went from 40MB to 13.5MB.

Whilst this is extremely satisfying, once all our micro-services are Lambda functions, we will be able to trim the size even more (by removing unused dependencies from our common library) and go below the 10MB mark.

Why is this important?

Coming from a mobile engineering background, where the size of the final executable matters, I personally attach massive importance to these details.

In this case, it will allow us to reduce costs by lowering our storage and bandwidth requirements, on S3 and Glacier.

Furthermore, it will implicitly contribute to making our continuous integration and delivery faster.

Definitely a non-negligible bonus, I’d say.

Integration & Delivery

Another massive difference in our continuous integration flow is that we can get rid of all the ceremony around Docker images: building the image, pushing it to our registry, and finally retrieving the digest of the latest image in order to update the docker-compose template of the given environment.

From now on, all we need to do is store the artifact on S3 and update the respective SAM template with the new version.

Refactoring

The flip side of this massive change involving code refactoring is that we are likely to introduce new bugs or undesired side effects.

While not relying on well-tested libraries might end up backfiring, we are pretty confident that the code is well-written enough to be reliable. So far, so good.

However, it would be naive to expect a completely seamless migration.

Onboarding

Moving to Lambda is also a chance to condense the onboarding process for new engineers. Everything is just simpler, easier to digest.

What’s Next?

Other Services

Once the Tourist function’s robustness is proven, the next natural step will be to migrate every other service, one by one. Our hope is to be able to do this before releasing our production environment.

Aurora Serverless

We were in the front row when Amazon introduced Aurora Serverless during the last re:Invent and we were all impressed and excited by the announcement.

Even though RDS is working as expected, the techie inside me cannot help but think about pushing it even further.

In terms of coherence and continuity, if our entire backend ends up being serverless, it would make sense for the database to be as well. The rationale is pretty much the same: high scalability, maintenance reduction and pay-per-use pricing.

Scalability is an issue with RDS — if we run out of space, we have to manually change the size of the instance.

Not ideal by any means. Aurora Serverless would scale on-demand. Another big step forward to “NoOps”.

Obviously, there is a catch. Aurora Serverless is only in preview as of now and only supports MySQL. PostgreSQL support is announced for “later” in 2018. Hopefully, we will have an up-and-running production environment by then…

API Gateway Optimisation

When most of the requests are coming from the same region, there is a way to optimize API Gateway.

The idea is to create a regional API endpoint and therefore reduce the latency.

Because of the nature of the project we are working on, that could be something to consider.