I'm writing this mostly because I've finally settled on an approach that I don't hate. I've been writing JavaScript serverless functions for quite some time now, mostly commercially, so I thought some folks might find it useful if I shared my experiences and tips. I'll briefly touch on some of the core principles and concepts, and elaborate on each. Hopefully you can find some of this useful!

Architecture

This one's super important, probably the most important, because when done well it gives you options and the freedom to experiment. A more modular architecture separates your core functionality from the rest of your application, and treats things such as databases and delivery types (HTTP etc.) as mere details. That means everything becomes very plug and play. This was probably my most important finding, and I wrote about it in more detail here.

As I mentioned in the article linked above, I've opted to apply Uncle Bob's clean architecture principles to my services, splitting them down into five main components: deliveries (HTTP, Lambda, GraphQL, etc.), use cases (your actual business logic), entities (data objects which represent a business idea), repositories (used for moving data between entities and the database), and finally services (which represent an external service call, house some business logic, or group several other services and repositories).

Your delivery will extract any arguments or data passed to it by the caller, such as request body arguments or GraphQL input arguments. It will then scrub any knowledge of the transport type from the data, and pass the data into the correct use case. Separating the business logic from the delivery means you can quickly and easily create new deliveries which integrate with other client methods. If I wanted to expose my serverless service over gRPC, I'd just write a new delivery which calls the same use case, with the same business logic.

The use case takes its dependencies through dependency injection, along with any runtime arguments it needs to complete its task. Use cases are often made up of calls to repositories, services, and sometimes other use cases. A use case is completely unaware of what called it or which databases are used, and has no direct knowledge of any external service calls it's making.
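To make the split concrete, here's a minimal sketch of one use case served by two deliveries. The names (`findUser`, `findById`, the file paths in the comments) are made up for illustration, and mapping every error to a 404 is a simplification, so treat it as a rough shape rather than a finished implementation.

```javascript
// usecases/find-user.js (hypothetical path): business logic only.
// Dependencies are injected first; runtime arguments come second.
const findUser = (logger, userRepository) => async (id) => {
  logger.info(`finding user ${id}`);
  const user = await userRepository.findById(id);
  if (!user) {
    throw new Error(`user ${id} not found`);
  }
  return user;
};

// deliveries/http.js (hypothetical path): knows about Express-style req/res,
// and strips that knowledge away before calling the use case.
const httpDelivery = (usecase) => async (req, res) => {
  try {
    const user = await usecase(req.params.id);
    res.status(200).json(user);
  } catch (e) {
    // Simplification: every failure is reported as "not found".
    res.status(404).json({ error: e.message });
  }
};

// deliveries/lambda.js (hypothetical path): same use case, different transport.
const lambdaDelivery = (usecase) => async (event) => {
  try {
    const user = await usecase(event.pathParameters.id);
    return { statusCode: 200, body: JSON.stringify(user) };
  } catch (e) {
    return { statusCode: 404, body: JSON.stringify({ error: e.message }) };
  }
};
```

Both deliveries receive the same `findUser(logger, userRepository)` function, so adding another transport only means writing another small adapter like these.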

Container registry

Abiding by good architecture principles, and aiming to have code that's easy to test, means dealing with dependencies correctly. Gone are the days of just instantiating dependencies directly within your Lambda function:

const handler = () => { const s3 = new AWS.S3(); ... };

I like to create an abstraction for each resource. So instead of having an instance of S3 and interacting with it directly in my use case or business logic, I'll create a class to wrap around a particular bucket. If I have a documents bucket, for example, I'll create a class called Documents which represents that bucket, and I'll inject that into the constructor of another class which relies on 'documents' as a concept. After a while, you build up a pattern of injecting dependencies in various ways, and it becomes apparent that you need to manage this in a sensible way. I found myself writing the same dependency trees in all my deliveries, which was tedious. Then I stumbled upon Awilix. I was familiar with the concept of dependency injection, but I'd never really utilised it in JavaScript.
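As a rough sketch of that idea, here's what a Documents wrapper might look like. It assumes the injected `s3` client follows the AWS SDK v2 style, where `getObject(...)` and `putObject(...)` return a request with a `.promise()` method. The `InvoiceService` class and the key layout are hypothetical and only there to show the injection.

```javascript
// Wraps one bucket so callers deal with "documents", not S3.
class Documents {
  // `s3` is assumed to be an AWS SDK v2 style client (or a test double).
  constructor({ s3, bucketName }) {
    this.s3 = s3;
    this.bucketName = bucketName;
  }

  // Returns the stored object as a UTF-8 string.
  async fetch(key) {
    const result = await this.s3
      .getObject({ Bucket: this.bucketName, Key: key })
      .promise();
    return result.Body.toString('utf-8');
  }

  // Stores the contents and returns the key it was stored under.
  async store(key, contents) {
    await this.s3
      .putObject({ Bucket: this.bucketName, Key: key, Body: contents })
      .promise();
    return key;
  }
}

// A hypothetical class that relies on "documents" as a concept.
// It never sees a bucket name or an S3 client.
class InvoiceService {
  constructor({ documents }) {
    this.documents = documents;
  }

  async archive(invoice) {
    const key = `invoices/${invoice.id}.json`;
    return this.documents.store(key, JSON.stringify(invoice));
  }
}
```

Because `InvoiceService` only knows about `documents`, swapping the bucket for something else, or for an in-memory fake in a test, doesn't touch the business logic.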

I began using Awilix to abstract the dependency tree, meaning I could resolve a class by name and get back an instance, already configured and with everything it needs.

// container.js
const awilix = require('awilix');
...
const container = awilix.createContainer();

container.register({
  datastore: awilix.asValue(datastore),
  logger: awilix.asValue(logger),
  // Registered as `userRepository` so it matches the name resolved in the deliveries
  userRepository: awilix.asClass(UserRepository),
});

module.exports = container;

In my deliveries:

// deliveries/express/find-user.js
const container = require('../../container');
const usecase = require('../../usecases/find-user')(
  container.resolve('logger'),
  container.resolve('userRepository')
);

const delivery = async (req, res) => {
  const { id } = req.body;
  // Send the result back, otherwise the request never completes
  const user = await usecase(id);
  res.json(user);
};

Awilix will also automatically wire dependencies by name into the constructor of other classes registered in your container, which saves a lot of time and effort plumbing things together.

class UserRepository {
  // Awilix passes registered dependencies in by name
  constructor({ datastore }) {
    this.datastore = datastore;
  }
}

This approach gave me a lot of flexibility, because you only need to change one file if you want to change the datastore for every repository, or the logger for every single delivery.

Paradigms

I'm sick of the OOP vs functional debate. I've realised over time that there isn't a correct or incorrect answer. If one was objectively 'more correct' than the other, it's likely everyone would have gone all in on it by now. It also entirely depends on the language you're using. JavaScript sort of does both, partially, so that's how I tend to use it.

I could be completely wrong on this, but over the years I've found that OO is great for mapping concepts and domain models together, and for holding state. Therefore I tend to use classes to give a name to a concept and map data to it. For example, I tend to create classes for entities, repositories, and services: things which deal with data and state. Deliveries and use cases, on the other hand, I tend to treat functionally. The way this ends up looking is that I have functions which have instances of classes injected through a higher-order function. The functional code then interacts with the various objects and classes passed into it, in a functional manner. I may fetch a list of items from a repository class, map through them, filter them, and pass the results into another class which will store them somewhere, or put them in a bucket.

So, classes and objects become the data layer, which functional code taps into. This isn't foolproof, but I've found it to be useful, getting the best of both worlds.
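Here's a small sketch of what that mix can look like. Every name in it (`Order`, `OrderRepository`, `ReportBucket`, `reportUnpaidOrders`) is invented for the example, and the in-memory `rows` array just stands in for a real datastore.

```javascript
// Classes give a name to a concept and hold the data or state.
class Order {
  constructor({ id, total, paid }) {
    this.id = id;
    this.total = total;
    this.paid = paid;
  }
}

class OrderRepository {
  // `rows` stands in for a real datastore in this sketch.
  constructor({ rows }) {
    this.rows = rows;
  }

  async findAll() {
    return this.rows.map((row) => new Order(row));
  }
}

class ReportBucket {
  constructor() {
    this.saved = [];
  }

  async save(report) {
    this.saved.push(report);
    return report;
  }
}

// The use case is a higher-order function: class instances are injected
// first, then the returned function works on the data with filter/map.
const reportUnpaidOrders = (orderRepository, reportBucket) => async (minimumTotal) => {
  const orders = await orderRepository.findAll();
  const lines = orders
    .filter((order) => !order.paid && order.total >= minimumTotal)
    .map((order) => ({ id: order.id, total: order.total }));
  return reportBucket.save({ count: lines.length, lines });
};
```

The classes never call each other directly here. The function in the middle is the only place that knows the order of operations.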

Logging

Logging has been a thorn in my side for quite some time. Initially we fired requests off to an Elasticsearch cluster, but this became a bottleneck, and wasn't asynchronous in nature. We also tended to log things via console.log or console.error to stdout when required, which meant trawling through CloudWatch logs in a non-uniform manner. This was painful and tedious. Our Kibana stack was useful, but came with its own stack of problems, which I won't get into here.

It turns out one of the key things we were missing was a way to standardise and format our stdout logs, so that they were easy to search and contained enough information to be properly useful.

Having a predictable JSON format meant we could use CloudWatch Logs Insights to create queries which did all the trawling and searching for us. This was a hugely useful find. It may not replace our Kibana stack entirely, but it's certainly made life easier.

I opted to use log4js for structured logging. It's modelled on the very popular Java library, log4j.

const usecase = (repository, logger) => async (id) => {
  logger.addContext('id', id);
  logger.info('fetching item');
  try {
    // `await` here so a rejected promise is caught and logged below
    return await repository.fetch(id);
  } catch (e) {
    logger.error(`error fetching item: ${e.message}`);
    throw e;
  }
};

log4js config:

// Format logs into JSON
log4js.addLayout('json', config => logEvent => JSON.stringify(logEvent) + config.separator);

log4js.configure({
  appenders: {
    out: { type: 'stdout', layout: { type: 'json', separator: ',' } },
  },
  categories: {
    default: { appenders: ['out'], level: logLevels[STAGE] },
  },
});

Then in Logs Insights I can run queries such as:

fields @timestamp, @message
| filter level.levelStr = "ERROR"
| sort @timestamp desc
| limit 50

Logs Insights will automatically discover the fields of any JSON it finds in your logs, which allows you to query against those individual fields. Which is really cool. You can also save those queries and add them to dashboards, alerts, etc. You can query across several functions, so you can span an entire service with your query, which is massively useful. You can also export the data in CSV format if you want to issue reports and share the errors with others.

log4js also supports multiple output types, so you can integrate with central logging, as well as standard out. Nice!

Testing

For testing, I tend to write unit tests which test the core business logic. I use Jest. Because I separate the business logic from deliveries, repositories, etc., I can pass mocked versions of those into my use case, and test the core business logic very easily. Somewhat of a trivial example, but you get the picture:

// Mock only what the use case touches
const logger = { info: jest.fn(), error: jest.fn() };
const repository = { findAll: jest.fn() };
const mockUsers = [{ id: 1, name: 'Test User' }];

it('should fetch all users', async () => {
  repository.findAll.mockResolvedValue(mockUsers);
  const users = await usecase(repository, logger)();
  expect(users).toBe(mockUsers);
});

Running locally

Because the transport layer is split into deliveries, it's really easy to write new types of deliveries. I attempted to use various AppSync emulators etc., but found them to be clunky. I now use Apollo Server locally, and write a delivery which maps the Apollo resolvers to the use cases. For non-GraphQL endpoints/services, I use serverless-offline.
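A resolver delivery along those lines could be sketched like this. The `createResolvers` helper, the field names, and the `findUser`/`createUser` use cases are assumptions for the example. The only thing taken from Apollo is the standard resolver signature of `(parent, args, context, info)`.

```javascript
// deliveries/graphql/resolvers.js (hypothetical path)
// Builds a resolver map from already-wired use cases. Each resolver pulls
// the GraphQL arguments out and hands plain data to the use case.
const createResolvers = ({ findUser, createUser }) => ({
  Query: {
    // (parent, args) => result; context and info are unused here.
    user: (_parent, { id }) => findUser(id),
  },
  Mutation: {
    createUser: (_parent, { input }) => createUser(input),
  },
});
```

The resulting object would then be handed to Apollo Server alongside your type definitions, while the use cases themselves stay exactly as they are for the Lambda deliveries.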

Deployments

For personal projects, I quite often use CircleCI. I create a build pipeline for each service, which for the most part, for serverless projects, just does $ yarn && yarn test && sls deploy 🤷🏻‍♂️. The more AWS-native solution I tend to use is for each project to contain a CloudFormation stack, which creates a CodePipeline and CodeBuild stack with the appropriate git triggers set up.

Infrastructure

I use Serverless very heavily. It does everything I need, and allows me to avoid a lot of CloudFormation scripting. Of course you still need to do some of that, so for stateless dependencies, such as queues and SNS topics, I'll create those under Resources within my serverless.yml. But for stateful resources, such as S3 or DynamoDB, I'll create separate infrastructure CloudFormation stacks, or use Terraform, which is what I tend to go for in my personal projects. I much prefer writing actual code to represent infrastructure, rather than millions of lines of YAML. I've begun looking into the AWS CDK (Cloud Development Kit), which actually lets you write code to build your infrastructure. Which is really cool!

Service Discovery

If you have a lot of services, and resources connecting those services, the last thing you want to do is hard-code ARNs all over the place. In order to allow for maximum modularity, you should consider using some kind of service discovery. In essence, service discovery is just a big, distributed look-up of where all your services are, with friendly names. I guess a bit like 118118 or 192 but for your microservices.
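To show the idea at its simplest, here's a toy in-memory registry. It isn't how any real discovery service is implemented. The `Registry` class is invented for the example, and the ARN and queue URL are made-up placeholders.

```javascript
// A toy look-up table: friendly name in, location and type out.
class Registry {
  constructor() {
    this.entries = new Map();
  }

  register(name, { type, address }) {
    this.entries.set(name, { type, address });
    return this; // allow chaining
  }

  lookup(name) {
    const entry = this.entries.get(name);
    if (!entry) {
      throw new Error(`no service registered as "${name}"`);
    }
    return entry;
  }
}

// Placeholder addresses, for illustration only.
const registry = new Registry()
  .register('platform.users', {
    type: 'lambda',
    address: 'arn:aws:lambda:eu-west-1:123456789012:function:users',
  })
  .register('platform.new-user-queue', {
    type: 'sqs',
    address: 'https://sqs.eu-west-1.amazonaws.com/123456789012/new-user-queue',
  });

// Callers only ever use the friendly name.
const { type, address } = registry.lookup('platform.users');
```

The calling code never contains an ARN, so a service can move or be replaced by updating a single registration.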

For this I started using Cloud Map very heavily. Cloud Map is AWS's service discovery offering, which allows you to register not just your services (Lambda, ECS containers, Fargate containers, etc.), but also bits of infrastructure, such as SQS queues and DynamoDB tables. This is really powerful, and it allowed me to write some useful tools and abstractions on top of the different types of interactions, centred around the service discovery, to make life really easy.

const user = await Discover.request('platform.users->create', newUser);
await Discover.queue('platform.new-user-queue', user);
await Discover.publish('platform.new-user', user);

We used this library to standardise the way we find and interact with other services, which are made up of disparate services and technologies.

You can find those libraries here as we open sourced them!

Conclusion

The most important thing I've learnt, a few years after creating my first Lambda, is that serverless isn't exceptional, in the sense that it doesn't mean you no longer have to worry about architecture. The tendency is for people to just write functions like reams of scripts, without any architectural adherence. That's not how you should see serverless. You should see it as a new way of executing your code, a new way of interacting with code. But good architectural principles remain a constant. Whether your runtime is a container or a Lambda, it doesn't change the way you should construct your programs.

Finally, get your structured logging correct early on, you'll save lots of time and headache further down the line.

Hat-tip

The author of a quite frankly stunning framework emailed me after the first post I wrote on this subject to show me what he'd done. I think it's a fantastic example of applying clean architecture principles to serverless, so go check this out.
