
Intro

This post operates on a few shared assumptions. We need to state them explicitly, or you will read things that are more or less rational but appear to be garbage.

APIs are good

Many APIs are web APIs

Many web APIs consume and produce JSON

JSON is good

JSON is better if you know what will be in it

So, JSON Schema is a way to increase the number of times in your life that JSON is better in that way, therefore making you happier.

So, let's do a quick intro on JSON Schema. You can always read a much longer and surely better one, from which I stole most of the examples, at Understanding JSON Schema. Do it later (or right now, it's your time, lady, I am not the boss of you).

Schemas

So, a JSON Schema describes your data. Here is the simplest schema, which matches anything:

{ }

Scary, huh? Here's a more restrictive one:

{ "type": "string" }

That means "a thing, which is a string." So this is valid: "foo", and this isn't: 42.

Usually, APIs exchange JSON objects (dictionaries, for you pythonistas), so this is more like what you will see in real life:

    {
      "type": "object",
      "properties": {
        "street_address": { "type": "string" },
        "city": { "type": "string" },
        "state": { "type": "string" }
      },
      "required": ["street_address", "city", "state"]
    }

That means "it's an object" that has "street_address", "city" and "state" inside it, each of them a string, and all of them required.
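If it helps to see that reading spelled out, here is a toy checker for just the three keywords used so far. It is a sketch, not how any real validator works: the `matches` function and the `ADDRESS_SCHEMA` name are made up for this illustration, and everything else in JSON Schema is simply ignored.

```python
# Toy illustration only: understands "type" (string/object),
# "properties" and "required". Real validators cover far more.
ADDRESS_SCHEMA = {
    "type": "object",
    "properties": {
        "street_address": {"type": "string"},
        "city": {"type": "string"},
        "state": {"type": "string"},
    },
    "required": ["street_address", "city", "state"],
}


def matches(schema, value):
    """Return True if value satisfies this tiny subset of JSON Schema."""
    expected = schema.get("type")
    if expected == "string" and not isinstance(value, str):
        return False
    if expected == "object" and not isinstance(value, dict):
        return False
    if isinstance(value, dict):
        # Every required key must be present.
        if any(key not in value for key in schema.get("required", [])):
            return False
        # Properties that are present must match their own sub-schema.
        for key, subschema in schema.get("properties", {}).items():
            if key in value and not matches(subschema, value[key]):
                return False
    return True


print(matches({}, 42))                          # the empty schema accepts anything
print(matches({"type": "string"}, 42))          # 42 is not a string
print(matches(ADDRESS_SCHEMA, {"city": "bar"}))  # two required keys are missing
```

Note that extra keys are fine here: nothing in the schema forbids them.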

Let's suppose that's all we need to know about schemas. Again, before you actually use them in anger you need to go and read Understanding JSON Schema. For now, just assume there is a thing called a JSON Schema, that it can be used to define what your data is supposed to look like, and that it's defined something like what we saw here, in JSON. Cool?

Using schemas

Of course, schemas are useless if you don't use them. You will use them as part of the "contract" your API promises to fulfill. So, now you need to validate things against them. For that, in Python, we can use jsonschema.

It's pretty simple! Here is a "full" example.

    import jsonschema

    # The contract: an object with three required string properties.
    schema = {
        "type": "object",
        "properties": {
            "street_address": {"type": "string"},
            "city": {"type": "string"},
            "state": {"type": "string"},
        },
        "required": ["street_address", "city", "state"],
    }

    # Returns None when the instance is valid, raises ValidationError otherwise.
    jsonschema.validate({
        "street_address": "foo",
        "city": "bar",
        "state": "foobar",
    }, schema)

If the data doesn't validate, jsonschema will raise an exception, like this:

    >>> jsonschema.validate({
    ...     "street_address": "foo",
    ...     "city": "bar",
    ... }, schema)
    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
      File "jsonschema/validators.py", line 541, in validate
        cls(schema, *args, **kwargs).validate(instance)
      File "jsonschema/validators.py", line 130, in validate
        raise error
    jsonschema.exceptions.ValidationError: 'state' is a required property

    Failed validating 'required' in schema:
        {'properties': {'city': {'type': 'string'},
                        'state': {'type': 'string'},
                        'street_address': {'type': 'string'}},
         'required': ['street_address', 'city', 'state'],
         'type': 'object'}

    On instance:
        {'city': 'bar', 'street_address': 'foo'}

Hey, that is a pretty nice description of what is wrong with that data. That is how you use a JSON schema. Now, where would you use it?
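If you would rather get every problem at once instead of a single exception, the library also has validator classes with an `iter_errors` method. A minimal sketch, assuming Draft 7 is the draft you want (the schema here does not say) and using a made-up `problems` helper:

```python
import jsonschema

schema = {
    "type": "object",
    "properties": {
        "street_address": {"type": "string"},
        "city": {"type": "string"},
        "state": {"type": "string"},
    },
    "required": ["street_address", "city", "state"],
}

# Assumption: Draft 7 semantics are fine for this schema.
validator = jsonschema.Draft7Validator(schema)


def problems(instance):
    """Return a sorted list of error messages; empty means valid."""
    return sorted(error.message for error in validator.iter_errors(instance))


# "state" is missing and "city" has the wrong type, so two messages come back.
for message in problems({"street_address": "foo", "city": 42}):
    print(message)
```

That list is handy when you want to send all the complaints back to the caller in one go.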

Getting value out of schemas

Schemas are useless if not used. They are worthless if you don't get value out of using them.

These are some ways they add value to your code:

You can use them in your web app endpoint, to validate things.

You can use them in your client code, to validate you are not sending garbage.

You can use a fuzzer to feed data that is technically valid to your endpoint, and make sure things don't explode in interesting ways.
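The first and third items can be sketched without committing to any web framework. Here `create_address` is a hypothetical handler that takes the raw request body as a string and returns a status code plus a response, and `random_valid_address` is a toy stand-in for a real fuzzer that only knows how to produce strings for this one schema:

```python
import json
import random
import string

import jsonschema

ADDRESS_SCHEMA = {
    "type": "object",
    "properties": {
        "street_address": {"type": "string"},
        "city": {"type": "string"},
        "state": {"type": "string"},
    },
    "required": ["street_address", "city", "state"],
}


def create_address(raw_body):
    """Hypothetical endpoint handler: returns (status_code, response_dict)."""
    try:
        payload = json.loads(raw_body)
        jsonschema.validate(payload, ADDRESS_SCHEMA)
    except json.JSONDecodeError as error:
        return 400, {"error": f"invalid JSON: {error.msg}"}
    except jsonschema.ValidationError as error:
        return 400, {"error": error.message}
    # From here on the payload is known to have the promised shape.
    return 201, {"stored": payload}


def random_valid_address(rng):
    """Toy fuzzer: a random string for every required property."""
    def text():
        length = rng.randint(0, 20)
        return "".join(rng.choice(string.printable) for _ in range(length))
    return {key: text() for key in ADDRESS_SCHEMA["required"]}


# Feed 100 technically valid payloads and collect the status codes seen.
rng = random.Random(0)
statuses = {
    create_address(json.dumps(random_valid_address(rng)))[0]
    for _ in range(100)
}
print(statuses)
```

A client can call `jsonschema.validate` with the same schema before sending, which covers the second item.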

But here is the most value you can extract from JSON schemas:

You can discuss the contract between components in unambiguous terms and enforce the contract once it's in place.

We are devs. We discuss via branches, and comments in code review. JSON Schema turns a vague argument about documentation into a fact-based discussion of data. And we are much, much better at doing the latter than we are at doing the former. Discuss the contracts.

Since the document describing (this part of) the contract is actually used as part of the API definitions in the code, the document can never be left behind. Every change in the code that changes the contract is obvious and requires an explicit renegotiation. You can't break the API by accident, and you can't break the API and hope nobody will notice. Enforce the contracts.
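One way to make the schema part of the API definition, sketched under the assumption that handlers are plain functions taking an already-parsed payload. The `accepts` decorator is hypothetical, not something jsonschema ships:

```python
import functools

import jsonschema


def accepts(schema):
    """Hypothetical decorator: validate the payload before the handler runs."""
    def decorate(handler):
        @functools.wraps(handler)
        def wrapper(payload):
            # Raises jsonschema.ValidationError if the contract is broken.
            jsonschema.validate(payload, schema)
            return handler(payload)
        # Keep the schema on the function so docs can be generated from it.
        wrapper.schema = schema
        return wrapper
    return decorate


@accepts({
    "type": "object",
    "properties": {
        "street_address": {"type": "string"},
        "city": {"type": "string"},
        "state": {"type": "string"},
    },
    "required": ["street_address", "city", "state"],
})
def format_address(payload):
    """Handler body can rely on all three keys being present strings."""
    return f"{payload['street_address']}, {payload['city']}, {payload['state']}"
```

Because the schema sits right on top of the handler, any change to it shows up in the diff and gets discussed in review.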

Finally, you can version the contract. Use that along with API versioning and voilà, you know how to manage change! Version your contracts.
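A sketch of what that could look like, assuming the API version arrives as a string such as "v1". The `zip_code` property, the `SCHEMAS` registry and `validate_for` are all invented for the example:

```python
import jsonschema

ADDRESS_V1 = {
    "type": "object",
    "properties": {
        "street_address": {"type": "string"},
        "city": {"type": "string"},
        "state": {"type": "string"},
    },
    "required": ["street_address", "city", "state"],
}

# Hypothetical v2 of the contract: same as v1 plus a required "zip_code".
ADDRESS_V2 = {
    **ADDRESS_V1,
    "properties": {**ADDRESS_V1["properties"], "zip_code": {"type": "string"}},
    "required": ADDRESS_V1["required"] + ["zip_code"],
}

SCHEMAS = {"v1": ADDRESS_V1, "v2": ADDRESS_V2}


def validate_for(version, payload):
    """Validate payload against the contract for the given API version."""
    try:
        schema = SCHEMAS[version]
    except KeyError:
        raise ValueError(f"unknown API version: {version!r}") from None
    jsonschema.validate(payload, schema)


old_style = {"street_address": "foo", "city": "bar", "state": "foobar"}
validate_for("v1", old_style)  # fine: v1 clients keep working
```

The same payload sent to "v2" fails validation, which is exactly the explicit renegotiation you want.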

Discuss your contracts

Enforce your contracts

Version your contracts

So now you can stop worrying and love JSON Schema as well.