HTTP API Design Part 3: Bodies

This is the third of four articles on HTTP API Design. These articles are based on content from my recent book Advanced Microservices. This article looks at the formatting of request and response bodies.

JSON (JavaScript Object Notation)

JSON is the preferred serialization format for most popular HTTP-based APIs today. The specification for JSON is extremely simple; after a few minutes with it you'll know most of what there is to know. The format itself is quite terse and is easy to serialize to and deserialize from native objects in many languages. Essentially, a document consists of objects of key/value pairs, where values can be strings, numbers, booleans, null, or arrays/objects of more values.

This ease of use and terseness contrasts with other data formats such as XML, which often require complex serialization logic and verbose syntax for querying data. The following is an example JSON document:

{
  "strings": "value",
  "numbers": 12,
  "moreNumbers": -0.1,
  "booleans": true,
  "nullable": null,
  "array": [1, "abc", false],
  "object": { "very": "deep" }
}
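To illustrate how directly JSON maps onto native objects, here is a minimal sketch using Python's standard json module. The choice of Python and the variable names are mine, not part of the original article; the input is the example document above.

```python
import json

# The example document from above, as a JSON string.
document = (
    '{ "strings": "value", "numbers": 12, "moreNumbers": -0.1, '
    '"booleans": true, "nullable": null, "array": [1, "abc", false], '
    '"object": { "very": "deep" } }'
)

# Deserialize: objects become dicts, arrays become lists,
# true/false become True/False, and null becomes None.
data = json.loads(document)

# Serialize back into a compact JSON string.
encoded = json.dumps(data, separators=(",", ":"))
```

Round-tripping the encoded string through json.loads again yields an object equal to the first one.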

Attribute Name Casing

When using JSON, or really any format, one consideration that comes up and is often overlooked is which attribute name case to choose. The three choices you have are as follows:

snake_case: This format uses the most bytes but is quickest to read

PascalCase: This format uses the most shift presses

camelCase: This format uses slightly fewer shift presses

None of these formats is actually better or worse than the others. Some are more common in certain settings: PascalCase is fairly common in the Microsoft .NET world, while snake_case and camelCase are more prevalent in open source projects.

Whatever format you choose to support, it is important to be consistent throughout your requests and responses. Never mix cases, as developers will struggle to remember which format to use in which situation. Even if your service is a façade for several different services and those services mix cases, your service will need to choose a single format and translate properties. It's common for projects to match the case of their external API to their internal variable or property names; however, there is no benefit to the consumer in doing so.
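As a rough sketch of what translating properties in a façade could look like, the following Python helpers rename snake_case keys to camelCase throughout a decoded JSON structure. The function names are hypothetical, and the sketch assumes upstream keys are plain lowercase snake_case without leading underscores.

```python
import re


def snake_to_camel(name: str) -> str:
    """Convert a snake_case attribute name to camelCase."""
    return re.sub(r"_([a-z0-9])", lambda m: m.group(1).upper(), name)


def camelize_keys(value):
    """Recursively rename dict keys inside nested dicts and lists."""
    if isinstance(value, dict):
        return {snake_to_camel(k): camelize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camelize_keys(item) for item in value]
    # Strings, numbers, booleans, and None pass through unchanged.
    return value
```

Applying camelize_keys to each upstream response before it is returned lets consumers see a single casing regardless of what the internal services use.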

Booleans

When naming booleans, always use happy / positive words. Switching between positive and negative counterparts is confusing and will force developers to frequently consult the documentation. Here are some examples:

Use enabled instead of disabled

Use public instead of private

When naming boolean properties, it's usually overkill to signify in the name that a property is boolean. As an example, there's no need to call a property iscool or coolflag; simply calling it cool will suffice. If you do choose to use such a prefix or suffix, apply it consistently throughout the API.

Timestamps

There are a few different standards for serializing a timestamp. However, one format rules above all of them, and that format is ISO 8601. Here are a few examples of this format:

"2017-06-15T04:23:46+00:00": Using a numeric offset from UTC

"2017-06-15T04:23:46Z": Using “Zulu” UTC time

"2017-06-15T04:23:46.987Z": Variable precision for milliseconds

This format is always represented as a string. One nice feature is that, assuming each timestamp uses the same timezone and the same precision, sorting them alphabetically also sorts them chronologically. They are also very easy for humans to read, and yet are fairly terse.
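The sorting property can be sketched in Python as follows. The to_iso8601 helper is a hypothetical name of my own; it assumes you normalize every timestamp to UTC with millisecond precision so that the strings have a uniform shape.

```python
from datetime import datetime, timezone


def to_iso8601(moment: datetime) -> str:
    """Format an aware datetime as a UTC ISO 8601 string with milliseconds."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


stamps = [
    to_iso8601(datetime(2017, 6, 15, 4, 23, 46, 987000, tzinfo=timezone.utc)),
    to_iso8601(datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc)),
    to_iso8601(datetime(2017, 6, 15, 4, 23, 46, tzinfo=timezone.utc)),
]

# A plain string sort puts the timestamps in chronological order.
ordered = sorted(stamps)
```

Fixing the precision matters here: a string without fractional seconds would not reliably sort correctly against one that has them.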

One common alternative is the Unix Epoch, which is an unreadable and ambiguous format. An example of this format is 1493268311123, which includes millisecond precision, unlike 1493268311, which does not. It's impossible to know the precision without out-of-band information, such as stating the precision ahead of time in documentation.
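To make the ambiguity concrete, this small Python sketch decodes the second-precision example from the text twice: once correctly as seconds, and once under the mistaken assumption that it holds milliseconds. The variable names are illustrative only.

```python
from datetime import datetime, timezone

raw = 1493268311  # the second-precision example from the text

# Read as seconds since the epoch, the value falls in 2017.
as_seconds = datetime.fromtimestamp(raw, tz=timezone.utc)

# Read as milliseconds since the epoch, the same digits land in January 1970.
as_millis = datetime.fromtimestamp(raw / 1000, tz=timezone.utc)
```

Both calls succeed without complaint, which is the problem: nothing in the number itself tells the consumer which reading is intended.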

Identifiers

Always represent identifiers as strings. Even if your identifier is treated as an integer internally, e.g. it's an always-incrementing number, still transmit this value as a string. The following is a tweet from chess.com. Their API transmitted IDs as integers, which eventually led to downtime in their iOS application:

“32-bit iOS devices are experiencing issues due to limitations interpreting game IDs over 2,147,483,647. Fix should be out in 48 hours :)” -- @chesscom
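The limit mentioned in the tweet is the largest value a signed 32-bit integer can hold. The following Python sketch imitates a 32-bit client using the struct module and then shows the string-based alternative; the game_id value and the payload shape are hypothetical and are not taken from the chess.com API.

```python
import json
import struct

MAX_INT32 = 2_147_483_647  # largest signed 32-bit integer


def fits_in_int32(value: int) -> bool:
    """Return True if value can be packed into a signed 32-bit integer."""
    try:
        struct.pack("<i", value)
        return True
    except struct.error:
        return False


game_id = MAX_INT32 + 1  # hypothetical ID just past the limit

# Transmitting the ID as a string avoids numeric limits in the client.
payload = json.dumps({"id": str(game_id)})
```

A client that treats the ID as an opaque string never needs to parse it into a fixed-width number, so growth of the ID space cannot break it in this way.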

Essentially, there are two classes of identifiers to choose from, and each class has its pros and cons. Feel free to choose the best ID type based on your use case on a per-collection basis; mixing and matching should not have ill consequences in the client (a string is a string).

Incremental (e.g. Integer, Base62): Efficient to store

Random (e.g. UUID): Difficult to guess values or count collection size

Random IDs have an additional property that can make them appealing: in highly distributed situations where you don't want a central location to keep track of IDs and increment them, any node can simply generate an ID and be done. If the ID has enough entropy then collisions should be rare.
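As one possible illustration of coordination-free ID generation, Python's standard uuid module can produce random (version 4) UUIDs, which carry 122 random bits. The new_id helper name is my own.

```python
import uuid


def new_id() -> str:
    """Generate a random (version 4) UUID, serialized as a string."""
    return str(uuid.uuid4())


# Each call is independent: no shared counter or central service is involved.
ids = {new_id() for _ in range(1000)}
```

Any node can call new_id on its own, and the result is already a string, which fits the advice above about transmitting identifiers.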

Versioning

When constructing and maintaining an API it is common to make changes which will cause incompatibilities in consumers. When these changes are made it's vital that we allow old clients to continue working with the old version of the API while supporting a new version in parallel. This allows clients to migrate to the new version when an opportune time comes.

Here are three popular methods for versioning an HTTP API:

URL Segment (LinkedIn, Google+, Twitter): https://api.example.org/v1/*

Accept Header (GitHub): Accept: application/json+v1

Custom Header (Joyent CloudAPI): X-Api-Version: 1
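The three methods above could be read on the server side roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the header and path formats are the simplified ones from the list rather than any vendor's exact scheme, and the headers mapping is assumed to already use these exact key names (real HTTP header names are case-insensitive).

```python
import re
from typing import Mapping, Optional


def version_from_request(path: str, headers: Mapping[str, str]) -> Optional[int]:
    """Return the requested API version, or None if the request names none."""
    # 1. URL segment, e.g. /v1/users
    match = re.match(r"^/v(\d+)(/|$)", path)
    if match:
        return int(match.group(1))

    # 2. Accept header, e.g. application/json+v1
    match = re.search(r"\+v(\d+)", headers.get("Accept", ""))
    if match:
        return int(match.group(1))

    # 3. Custom header, e.g. X-Api-Version: 1
    match = re.fullmatch(r"\d+", headers.get("X-Api-Version", "").strip())
    if match:
        return int(match.group(0))

    return None
```

In practice an API would normally pick one of these methods rather than accept all three; checking them in sequence here is only meant to show how each is parsed.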

A new version of an API only needs to be made when operations that work with the old version would otherwise fail. As an example, adding a new optional property to resources or adding a new collection shouldn't require a version bump. Changing a value's type, adding a required parameter, removing a parameter or collection, and changing default filtering options (e.g. from a default of Infinity to 100 results per page) are all examples of backwards-breaking changes.

When you do deprecate an old API version, give consumers ample time and notice. Try to proactively notify them of the change and even provide a migration guide if possible. Always give a deadline if the old API will be disabled entirely.

This article is based on content from my book Advanced Microservices. There's also an accompanying HTTP API Design Presentation.