At Lab Digital we often create bespoke e-commerce solutions for our clients. For most projects we use Django Oscar in combination with Wagtail to create solutions that integrate nicely into existing IT landscapes. Since the beginning of last year, however, we have also been using Commercetools for a number of high-profile cases.

When we started our first projects using Commercetools we ran into two gaps that we needed to fill ourselves. First, there was no SDK for Python, our language of choice. Second, there was no easy and repeatable way to manage the Commercetools configuration.

We already wrote about how we solved these two issues (see the blogs about the Python SDK and the Terraform provider for Commercetools), so in this blog we want to focus on how we started to use code generation.

The first thing we did was create a proof of concept for the Terraform provider. Before you start writing a Terraform provider, it is best practice to create an SDK for the upstream API you want to use, and then leverage that SDK in the provider code. Since Terraform is written in Go, we needed to create that SDK in Go (and also learn Go :-)).

While writing the code from the Commercetools API documentation, we were quickly bothered by how repetitive it was. It was mostly defining structs plus some logic to handle discriminator objects (for polymorphism). Since Commercetools is rapidly adding new features, we also had to keep up with new additions to the API. This made us realise pretty quickly that for long-term maintainability we needed to generate the code instead of writing it ourselves. This feeling was confirmed by the Commercetools team when we did a hackathon with them.

Existing solutions

The Commercetools API specification is written in RAML (RESTful API Modeling Language). It can be seen as an alternative to Swagger / OpenAPI. Below is an example of a type definition using the RAML syntax.

Type definition in RAML (extract)

Commercetools uses a hook to automatically convert the RAML files to Swagger, so our first approach was to use one of the readily available Swagger code generation utilities to generate the SDK from that. It turned out, however, that the generated Swagger files were not valid and that existing code generators couldn't process them. It also turned out that most RAML code generators only work with RAML 0.8, while the specifications were written in RAML 1.0 (this seems to be a pretty common issue in the RAML ecosystem). So no luck with any of the available methods, which led us to investigate the option of just doing everything ourselves for this one specific use case.

Code generation using Python

Before generating the code we needed to decide which code we wanted to generate. This is one of the benefits of doing everything yourself; you can generate the code exactly to your liking. I’m a big fan of both the attrs and the marshmallow libraries, so that was an easy decision. The attrs library allows you to easily define data classes and marshmallow handles the serialisation and deserialisation between JSON and the data classes.

Example target code

The next step was parsing the two RAML files provided by Commercetools. The RAML definitions are spread out over multiple files and referenced with the !include operator. This was easily solved with the yaml.add_constructor method, and we ended up with the complete structure in memory. This allowed us to loop through all the types defined in the RAML and create internal objects for each type, with references to each other.
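A minimal sketch of how such include handling can look, assuming PyYAML is installed. Only yaml.add_constructor comes from the text above; the RamlLoader class, the load_raml helper and the two demo files are made up for this illustration and are not taken from the actual generator.

```python
import os
import tempfile

import yaml


class RamlLoader(yaml.SafeLoader):
    """SafeLoader that remembers the directory of the file it is parsing."""

    def __init__(self, stream):
        # File objects expose .name; fall back to the current directory.
        self.base_dir = os.path.dirname(getattr(stream, "name", "")) or "."
        super().__init__(stream)


def include(loader, node):
    """Replace an `!include path` node with the parsed content of that file."""
    path = os.path.join(loader.base_dir, loader.construct_scalar(node))
    return load_raml(path)


# Register the handler on our own loader only, not on PyYAML's defaults.
yaml.add_constructor("!include", include, Loader=RamlLoader)


def load_raml(path):
    """Parse one RAML file, following !include references recursively."""
    with open(path) as fh:
        return yaml.load(fh, Loader=RamlLoader)


def demo():
    """Write a tiny two-file specification (hypothetical) and load it."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "types"))
        with open(os.path.join(root, "api.raml"), "w") as fh:
            fh.write(
                "#%RAML 1.0\n"
                "title: Example API\n"
                "types:\n"
                "  Money: !include types/Money.raml\n"
            )
        with open(os.path.join(root, "types", "Money.raml"), "w") as fh:
            fh.write(
                "#%RAML 1.0 DataType\n"
                "type: object\n"
                "properties:\n"
                "  centAmount: integer\n"
                "  currencyCode: string\n"
            )
        return load_raml(os.path.join(root, "api.raml"))
```

Calling demo() returns one nested dictionary in which the Money entry already contains the content of the included file, so the rest of the generator never has to know that the specification was split up.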

Now we were ready to actually generate the code from the data objects we had. The first approach was to use Jinja2 templates, but this quickly resulted in quite complex templates which were hard to maintain. The second approach was to build the code as a syntax tree with Python's ast module and use astunparse to convert that AST to Python source code.

ast module in combination with astunparse
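To give an idea of what generating code through an AST looks like, here is a small sketch. It is not the code from the generator: the build_class helper and the Money fields are invented for the example, and it uses ast.unparse from the standard library (Python 3.9 and newer) as a stand-in for astunparse, so that it runs without extra dependencies.

```python
import ast


def build_class(name, fields):
    """Build a ClassDef node with one annotated attribute per field.

    `fields` is a list of (attribute name, type name) pairs.
    """
    body = [
        ast.AnnAssign(
            target=ast.Name(id=field_name, ctx=ast.Store()),
            annotation=ast.Name(id=type_name, ctx=ast.Load()),
            value=None,
            simple=1,
        )
        for field_name, type_name in fields
    ]
    return ast.ClassDef(
        name=name,
        bases=[],
        keywords=[],
        body=body or [ast.Pass()],  # a class body may not be empty
        decorator_list=[],
    )


def render(classes):
    """Wrap the class nodes in a module and turn it into source code."""
    module = ast.Module(body=list(classes), type_ignores=[])
    ast.fix_missing_locations(module)
    return ast.unparse(module)


source = render(
    [build_class("Money", [("currency_code", "str"), ("cent_amount", "int")])]
)
print(source)
```

The printed result is ordinary Python source for a Money class with two annotated attributes. Compared to templates, the generator only manipulates nodes, and producing syntactically valid output is left to the unparser.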

For each type defined in the specification we needed to generate two Python classes: the marshmallow Schema and the attrs class definition.

The hardest part of this project was getting the discriminator fields to work correctly. Marshmallow has a Nested field type which allows you to specify the schema for a nested data object. We created a custom Field class (Discriminator) that uses the same approach as the Nested field, but instead of hardcoding the Schema it selects one based on a specific field value (the discriminator field). This resulted in the following code:

Marshmallow Discriminator field
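The selection logic at the heart of such a field can be sketched without marshmallow. The snippet below is a library-free illustration of the idea, not the actual field from the SDK: the Discriminator class here is a plain Python class, and the two discount types and their loader functions are hypothetical. In the real field the mapping would point to Schema classes and loading would be delegated to the selected schema.

```python
from dataclasses import dataclass


@dataclass
class AbsoluteDiscount:
    cent_amount: int


@dataclass
class RelativeDiscount:
    permyriad: int


def load_absolute(data):
    return AbsoluteDiscount(cent_amount=data["centAmount"])


def load_relative(data):
    return RelativeDiscount(permyriad=data["permyriad"])


class Discriminator:
    """Pick a loader based on the value of one field in the raw data."""

    def __init__(self, discriminator_field, loaders):
        self.discriminator_field = discriminator_field
        self.loaders = loaders  # discriminator value -> loader callable

    def deserialize(self, data):
        try:
            kind = data[self.discriminator_field]
            loader = self.loaders[kind]
        except KeyError as exc:
            # Either the discriminator field is missing or its value is unknown.
            raise ValueError(f"cannot select a schema for {data!r}") from exc
        return loader(data)


discount_value = Discriminator(
    "type", {"absolute": load_absolute, "relative": load_relative}
)
print(discount_value.deserialize({"type": "relative", "permyriad": 1500}))
```

The same input shape therefore ends up as a different Python object depending on the value of the "type" key, which is what the polymorphic types in the specification require.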

One other point that is worth mentioning is that we needed to create the fields in a specific order so that schema base classes are defined before the subclasses (in case of type inheritance).
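One way to get such an ordering is a topological sort over the inheritance relations. The sketch below uses graphlib from the standard library (Python 3.9 and newer); we are not claiming this is how the generator does it, and the definition_order helper and the type names are made up for the example.

```python
from graphlib import TopologicalSorter


def definition_order(parents):
    """Order type names so that every base type precedes its subtypes.

    `parents` maps a type name to the name of its base type, or to None
    when the type has no base.
    """
    sorter = TopologicalSorter()
    for name, base in parents.items():
        if base is None:
            sorter.add(name)
        else:
            # `base` is a predecessor of `name`, so it is emitted first.
            sorter.add(name, base)
    return list(sorter.static_order())


order = definition_order(
    {
        "RelativeDiscountValue": "DiscountValue",
        "AbsoluteDiscountValue": "DiscountValue",
        "DiscountValue": None,
    }
)
print(order)
```

Whatever order the types appear in the specification, DiscountValue comes out before both of its subtypes, so the generated subclasses can always refer to a base class that already exists. A cycle in the inheritance graph would make static_order raise graphlib.CycleError.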

The final code, which generates the types.py and schemas.py files, can be viewed at https://github.com/labd/commercetools-python-sdk/tree/master/codegen. It isn't the cleanest code I've ever written, and part of that is due to the experimental nature of the project. So one of the next steps for us should be refactoring the code and adding a bunch of unit tests. But for now it works, and when the RAML files are updated and we run make generate to rewrite the code, it feels like magic :-)

Code generation using Go

With what we learned from creating the Python code generator, and having experienced its benefits, we decided to also generate the Go code (replacing the handwritten code we had before). The nice thing about Go is that serialisation/deserialisation is part of the standard library. That means that the equivalent Go code for the earlier Price example is only the following:

We used the same approach as in Python: we read the RAML files in Go and create internal data objects before calling the code generation functions.

For generating the code we used the jennifer package, which I found a joy to work with. For example, writing a valid Go file is as simple as the following:

The hard part in the case of Go was mimicking the type inheritance used in Commercetools in combination with the discriminator fields. The solution we found was to implement a custom UnmarshalJSON method and use the mapstructure library to unmarshal the nested object based on the discriminator field.

A simplified example of the solution:

Example target code in Go

Creating the code generator for Go was considerably less work. The initial version took less than two days to create, and we have already earned that time back when adding new features.

Conclusion

Creating a tool to generate code based on API specifications is actually really rewarding work. The time you save compared to writing all class/struct definitions yourself from the documentation, and then keeping them all up to date, is alone worth the investment. But even more important is the fact that writing a code generator is more fun than writing thousands of struct definitions by hand :-)

All the code we wrote for this is open source and available on our GitHub account. If you enjoy working with Python or Go and love to work on stuff like this then we are of course hiring :-). Drop me an e-mail or visit https://vacatures.labdigital.nl/?lang=en