First of all, we looked around and found that there was already a Python library for this, swagger-conformance, but it seemed to be abandoned. We needed support for Open API and more flexibility in data generation strategies than swagger-conformance offers. We also found a recent library, hypothesis-jsonschema, built by one of the Hypothesis core developers, Zac Hatfield-Dodds. I am very grateful to the people who took the time to develop these libraries. Thanks to their efforts, testing in Python has become more exciting, inspiring and enjoyable.

Since Open API is based on JSON Schema, hypothesis-jsonschema was a close match, but not exactly what we needed. With these findings, we decided to build our own library on top of Hypothesis, hypothesis-jsonschema and pytest, targeting the Open API and Swagger specifications.

This is how the Schemathesis project started a few months ago in our Testing Platform team at Kiwi.com. The idea is to:

Convert Open API & Swagger definitions to JSON Schema;

Use hypothesis-jsonschema to get proper Hypothesis strategies;

Use these strategies in CLI & hand-written tests.

Schemathesis generates test data that conforms to the schema, makes the corresponding network call to a running app, and checks whether the app crashes and whether the received response conforms to the schema.
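To make the generate, call, and check loop more concrete, here is a minimal standard-library sketch. It is not how Schemathesis is implemented: the toy `generate` function stands in for real Hypothesis strategies (no shrinking, only flat objects), `fake_app` stands in for the network call, and the schema and its field names are hypothetical.

```python
import random
import string

# Hypothetical schema for illustration; the field names are assumptions.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "seats": {"type": "integer"},
    },
    "required": ["name"],
}


def generate(schema, rng):
    """Build one random object that conforms to a flat object schema."""
    required = set(schema.get("required", []))
    result = {}
    for key, sub in schema["properties"].items():
        # Optional properties are randomly left out, as the schema allows.
        if key not in required and rng.random() < 0.5:
            continue
        if sub["type"] == "integer":
            result[key] = rng.randint(-(2**40), 2**40)
        elif sub["type"] == "string":
            length = rng.randint(0, 10)
            result[key] = "".join(rng.choices(string.printable, k=length))
        elif sub["type"] == "boolean":
            result[key] = rng.random() < 0.5
    return result


def fake_app(body):
    """Stand-in for the network call; returns an HTTP-like status code."""
    try:
        body["name"][0].upper()  # hidden bug: IndexError on an empty name
        return 201
    except IndexError:
        return 500


def find_failure(schema, app, attempts=500, seed=0):
    """Return the first generated input that makes the app answer 5xx."""
    rng = random.Random(seed)
    for _ in range(attempts):
        body = generate(schema, rng)
        if app(body) >= 500:
            return body
    return None
```

Calling `find_failure(SCHEMA, fake_app)` returns a schema-conforming body with an empty `name`, the kind of valid-but-unexpected input that hand-written tests tend to miss.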

We still have a lot of interesting things to implement such as:

Generating invalid data;

Schema generation from other specifications;

Schema generation from WSGI applications;

Application-side instrumentation for better debugging;

Targeted data fuzzing based on code coverage or other parameters.

Even now, it has helped us improve our applications and fight certain classes of defects. Let me show you a few examples of how Schemathesis works and what errors it can find. For this purpose, I have created a sample project that implements a simple API for bookings; the source code is at https://github.com/Stranger6667/schemathesis-example. It has defects that are not obvious at first glance, and we will find them with Schemathesis.

There are two endpoints:

POST /api/bookings: creates a new booking

GET /api/bookings/{booking_id}/: gets a booking by ID

For the rest of the article, I assume this project is running on 127.0.0.1:8080.

Schemathesis can be used as a command-line application or in Python tests. Both options have their own advantages and disadvantages, and I will mention them in the next few paragraphs.

Let’s start with the CLI and try to create a new booking. The booking model has just a few fields, which are described in the following schema and database table:

Open API 3 definition of Booking model

Definition of the relevant table in the database

Handlers & models:

Have you spotted a flaw that might crash the application with an unhandled error?

We need to run Schemathesis against our API’s specific endpoint:



$ schemathesis run -M POST -E /bookings/ http://0.0.0.0:8080/api/openapi.json

These two options, --method and --endpoint, allow you to run tests only on the endpoints you are interested in.

The Schemathesis CLI composes a simple Python snippet so you can reproduce the error easily, and it remembers the failing case in the Hypothesis internal database so that it is reused in subsequent runs. A traceback in the server output reveals the troublesome parameter:

File "/example/views.py", line 13, in create_booking

request.app["db"], booking_id=body["id"], name=body["name"], is_active=body["is_active"]

KeyError: 'id'

The fix is simple. We need to make id and the other parameters required in the schema:
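As a rough illustration of that change, the schema can be written as a Python dict. This is a sketch, not the project's actual definition: the property names come from the traceback above, while the types and the `missing_required` helper are assumptions added for illustration.

```python
# Sketch of the Booking schema as a Python dict. The property names come
# from the traceback above; the types are assumptions for illustration.
BOOKING_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "is_active": {"type": "boolean"},
    },
    # Listing the fields here is the fix: conforming data must include them.
    "required": ["id", "name", "is_active"],
}


def missing_required(schema, body):
    """Return the required property names that are absent from the body."""
    return [name for name in schema.get("required", []) if name not in body]
```

With the fields marked as required, schema-conforming test data always contains them, so a lookup like `body["id"]` no longer fails for valid input. A payload such as `{"name": "x"}` is now invalid by definition, and `missing_required` reports `["id", "is_active"]` for it.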

Let’s re-run the last command and check if everything is OK:

Again! The exception on the application side:

asyncpg.exceptions.UniqueViolationError: duplicate key value violates unique constraint "bookings_pkey"

DETAIL: Key (id)=(0) already exists.

It seems like I didn’t consider that the user can try to create the same booking twice! However, things like that are common in production: double-clicking, retrying on failure, etc.
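One possible way to handle the duplicate is to catch the conflict and answer with an explicit client error instead of letting the exception escape. The sketch below uses a plain dict in place of the bookings table, and both the `DuplicateBooking` exception and the 409 status code are assumptions; the sample project may resolve this differently.

```python
class DuplicateBooking(Exception):
    """Raised when a booking with the same ID already exists."""


def create_booking(db, booking_id, name, is_active):
    """Insert a booking into `db`, a dict standing in for the bookings table."""
    if booking_id in db:
        # The real table enforces this with a primary-key constraint.
        raise DuplicateBooking(booking_id)
    db[booking_id] = {"name": name, "is_active": is_active}


def handle_create(db, body):
    """Return an HTTP-like status code instead of letting the error escape."""
    try:
        create_booking(db, body["id"], body["name"], body["is_active"])
    except DuplicateBooking:
        return 409  # Conflict: an expected outcome, not a server crash
    return 201
```

Submitting the same body twice now yields 201 and then 409, and the stored booking is left unchanged by the second call.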

We often can’t imagine how our applications will be used after deployment, but PBT can help with discovering what logic is missing in the implementation.

Alternatively, Schemathesis provides a way to integrate its features into regular Python test suites. The other endpoint may seem straightforward (take an object from the database and serialize it), but it also contains a mistake.

Open API 3 definitions

The central element of Schemathesis in-code usage is a schema instance. It provides test parametrization, endpoint selection and other configuration options.

There are multiple ways to create the schema, and all of them can be found under the schemathesis.from_<something> pattern. Usually, it is much better to have the application as a pytest fixture so it can be started on demand (schemathesis.from_pytest_fixture helps with that), but for simplicity, I will stick to my assumption that the application is running locally on port 8080:

Each test with the schema.parametrize decorator should accept a case fixture that contains the attributes required by the schema and the extra information needed to make the corresponding network request. It could look like this:

Case.call makes a request with this data to the running app via requests.

The tests can be run with pytest (unittest from the standard library is supported as well):

$ pytest test_example.py -v

Server-side exception:

asyncpg.exceptions.DataError: invalid input for query argument $1: 2147483648 (value out of int32 range)
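The value in the error message is exactly one more than the largest signed 32-bit integer. The small check below shows the boundary; the helper name is hypothetical and is not part of the sample project.

```python
# Bounds of a signed 32-bit integer, the range named in the error message.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_int32(value):
    """Return True if `value` can be stored as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX
```

Here `fits_int32(2147483648)` is false while `fits_int32(2147483647)` is true, so an integer parameter that the schema leaves unbounded can legitimately carry values that a 32-bit query argument rejects.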