We have more ways to manage dependencies in Python applications than ever. But how do they fare in production? Unfortunately this topic turned out to be quite polarizing and was at the center of a lot of heated debates. This is my attempt at an opinionated review through a DevOps lens.

2018 is drawing to a close and a lot has happened since the best and only way to pin your dependencies was running

$ pip freeze >requirements.txt

pip-tools have been around for a while and automated that very same process, including updates.

However, in the past year, two new contenders entered the scene that tried to fix packaging for good: Pipenv and Poetry.

With pip-tools still working fine and being maintained by Jazz Band, we now have inevitably more alternatives to manage our dependencies than ever. But how do they compare if your job is to run Python applications on servers?

Update from one year later (2019-11-06): My upcoming criticism of Pipenv and Poetry has stood the test of time: Poetry hasn’t added the feature I need and Pipenv hasn’t even released a new version since then. So minor wordsmithing aside, I’ve only updated the part on how I use pip-tools, which I love more than ever.

So feel free to jump ahead, if all you want is my approach to modern application dependency management.

My Context, or: Putting Python Apps on Servers in 60s

Whether we like it or not, the best Python build artifact for server platforms is the good old venv – possibly compressed using tools like pex or shiv.

Such a build artifact can, for example, be built and packaged using Docker’s multi-stage builds or just tar’ed up and moved to the target server.

Thinking you can cheat yourself out of virtual environments using Docker leads to huge containers that ship with build environments, version conflicts with your system Python libraries, or wonky attempts to move around whole Python installations or user directories that will eventually fall over because some CPython internals changed or pip install --user was never intended for something like that.

Of course, with some dedication you can work around all of the above, but ask yourself why you’re so keen to ditch a reliable and blessed way – and how much bloat, complexity, and sketchiness attaining that goal is worth to you.

Docker is a great way to distribute virtual environments – but not a replacement for them.

Requirements

What all of this means is that whatever solution I pick, it needs to provide the following features:

1. Let me specify my immediate dependencies (e.g. Django) for at least two environments (production and development, the latter usually being production plus test tools),
2. resolve the dependency tree for me (i.e. recursively determine my dependencies’ dependencies) and lock all of them with their versions and ideally hashes,
3. update all dependencies in one go and update all locks automatically and independently for each environment,
4. integrate somehow with tox so I can run my tests and verify that the update didn’t break anything,
5. and finally allow me to install a project with all its locked dependencies into a virtual environment of my choosing.

So for example I use dockerized build servers to create a virtual environment at /app, install the application with all its dependencies into it, and then

COPY --from=build /app /app

it into a minimal production container.
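The build-stage half of that flow might look like the following minimal Dockerfile sketch. The base images, paths, and the final `CMD` (including the module name `name_of_the_app`) are illustrative assumptions, not a copy of my actual setup:

```dockerfile
# Build stage: create the virtual environment and install the app into it.
FROM python:3.7-slim AS build
RUN python -m venv /app
COPY . /src
RUN /app/bin/pip install /src

# Runtime stage: copy only the finished virtual environment.
FROM python:3.7-slim
COPY --from=build /app /app
CMD ["/app/bin/python", "-m", "name_of_the_app"]
```

Because only the venv is copied over, the build tools and caches from the first stage never reach the production image.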

When I’m building Debian packages, I’ll do the exact same thing in the same build container, except the target path of the virtual environment becomes something like /vrmd/name-of-the-app and the whole thing gets packaged using fpm and uploaded to our package servers.

DISCLAIMER: The following technical opinions are mine alone and if you use them as a weapon to attack people who try to improve the packaging situation you’re objectively a bad person. Please be nice.

Pipenv

I usually try to follow community best practices and so I looked at the PyPA’s Pipenv first. One notable feature is that it introduces Pipfile and Pipfile.lock as means to declare and lock dependencies for the first time in a mainstream package.

And Pipenv tries really hard to do everything for you. It doesn’t just manage your dependencies, it also takes care of your virtual environments, and along the way tries to guess what you might want (like detecting whether you’re in an active virtual environment, installing missing interpreter versions using pyenv, or checking for preexisting requirements.txt files).

If you look into Pipenv’s patched and vendor directories, you’ll realize how it achieves that: they took what’s battle tested, put a nice interface in front of it, and patched what didn’t work. In a way, it carries the burden of many years of Python packaging in its guts. And yes: there’s a patched pip-tools inside of Pipenv.

While being user-friendly and relying on mature work sounds great, it’s also its biggest weakness: Pipenv took a much bigger bite than the maintainers can realistically swallow and it grew so complex that no mortal has a chance to understand – let alone control – it. If you look at their changelog, maintenance mostly means intense fire fighting.

I have used Pipenv for almost a year and still remember the dread of updating it. There were times when each release introduced a new breaking regression – including the last two releases as of this writing.

And those were the final two straws that broke this camel’s back. I have really tried to make this work and I have the utmost respect and sympathy for the maintainers who have to fight a hydra of complexity which – if anything – is growing more necks. But I personally have lost faith that this project will ever become stable enough to trust my sleep to it.

On backchannels I’ve heard from friends at various $BIG_CORPS that they also had to back away because they ran into blocking bugs that sometimes just got closed without ceremony.

As it stands today, I unfortunately have to disagree with the decision to use Pipenv as the example in the Python packaging guide which effectively is read as a blessing by an authority even though it never was meant as one. It is a project that looks great on paper but cannot hold its own weight. I’m afraid a rewrite from the ground up is what it would take to put it on solid footing.

If you’re more optimistic than I am and want to try it for yourself, Pipenv actually fulfills all needs outlined above:

1. Install immediate dependencies using `pipenv install Django` or `pipenv install --dev pytest`. They are locked automatically on installation.
2. `pipenv update --dev`
3. While Pipenv has suggestions on direct usage with tox, I prefer the old way and transform my Pipfiles into old-school `requirements.txt`s.
4. Updating/re-locking all dependencies then looks like this:

   ```
   $ pipenv update --dev
   ...
   $ pipenv lock -r >requirements/main.txt
   ...
   $ pipenv lock --dev -r >requirements/dev.txt
   ...
   ```

   Which can be used in tox as usual:

   ```
   [testenv]
   deps =
       -rrequirements/main.txt
       -rrequirements/dev.txt
   commands =
       coverage run --parallel -m pytest {posargs}
       coverage combine
       coverage report
   ```

5. Thanks to the approach taken in step 4, you don’t have to touch your build/deploy system at all, and possible Pipenv bugs only affect you in development, which is usually a lot less critical.

I appreciate a lot that Pipenv lets me export the state into a common standard and doesn’t force me to use it throughout all of my workflows.

Poetry

The next project I tried was Poetry which has a very different approach. Instead of reusing what’s already there, it makes a self-written dependency resolver its core. That may sound like a straightforward problem but in reality it’s anything but.

Poetry also embraces the standard pyproject.toml file for both immediate and intermediate dependencies. Additionally it offers first class support for packaging Python libraries and uploading them to PyPI. It emphatically does not use setuptools to achieve any of that.

Given that it’s modern and tries to make a clean cut in Python packaging, I am intrigued by this in my opinion under-appreciated project. I’m also a big fan of it not insisting on managing my virtual environments, and found Pipenv’s use of pew – which uses sub-shells – annoying.

Unfortunately Poetry fails point 5 of my requirements. As of writing (Poetry 0.12.17 and 1.0.0b3), there is no way to tell Poetry to install the current package along with its dependencies into an explicitly specified virtual environment. Mind you, if you run poetry install within an active virtual environment, it will do the right thing. However, that is problematic on build servers and in CI.

Since pyproject.toml is just TOML, it would be trivial to extract a requirements.txt from it (and Poetry 1.0 will support that out of the box!). However, it’d feel like going against the grain. More importantly, if you go all in on Poetry, you also drop your setup.py. If your Python applications aren’t packages, you’re not gonna care, but mine are, and they become uninstallable.

All of this is possible to work around, but I’m not comfortable building core workflows around kludges. Unlike Pipenv, though, I can see Poetry becoming my tool of choice: it only lacks one feature, and the rest appears to be very solid with a slick UX. Furthermore, its scope is narrow enough to give me confidence that the complexity will not get out of hand anytime soon.

So I hope Sébastien will either add a way to install into explicitly specified virtual environments or even offer some first class support for bundling applications into virtual environments, shivs, etc. Then I will check back and report.

pip-tools

Turns out, pip-tools still work great and are actively maintained with regular releases!

I couldn’t get pip-sync to do what it’s supposed to do, but as long as you keep a few things in mind, pip-compile does its job perfectly. So if you want to take the low-tech, lipstick-on-a-pig approach and don’t mind adding your dependencies to a file by hand, you’re golden.

I especially like how it gives me complete freedom over how I structure my dependencies and how many types of dependencies I have: not just dev and main. To keep my project root directories light, I create a dedicated directory for my requirements files:

```
requirements/
├── dev.in
├── dev.txt
├── main.in
└── main.txt
```

The .in files are my dependencies and the .txt files are the pinned and hashed results of pip-compile. I stole this approach from warehouse, which has many more requirements files than the two in this case.
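To make the split concrete, a hypothetical pair could look like this – the package names and pins are examples, and the hashes are elided:

```
# requirements/main.in – immediate dependencies only
django

# requirements/main.txt – generated by pip-compile, don’t edit by hand
django==2.1.3 \
    --hash=sha256:...
pytz==2018.7 \
    --hash=sha256:...
    # via django
```

The .in file stays tiny and human-edited; everything transitive (like pytz here, pulled in via Django) only ever appears in the compiled .txt.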

To update the compiled .txt files, I use two Makefile targets:

```
update-deps:
	pip install --upgrade pip-tools pip setuptools
	pip-compile --upgrade --build-isolation --generate-hashes --output-file requirements/main.txt requirements/main.in
	pip-compile --upgrade --build-isolation --generate-hashes --output-file requirements/dev.txt requirements/dev.in

init:
	pip install --editable .
	pip install --upgrade -r requirements/main.txt -r requirements/dev.txt
	rm -rf .tox

update: update-deps init

.PHONY: update-deps init update
```

Important: pip-compile has no proper resolver. Thus it has to run under the same Python version as the project it’s locking and in the same environment, or else conditional dependencies will not work correctly.
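A concrete case where this bites is a dependency guarded by an environment marker. Assume a hypothetical .in entry like:

```
# requirements/main.in (hypothetical entry)
# backport needed only on old interpreters:
dataclasses; python_version < "3.7"
```

Compiled under Python 3.6, the backport gets pinned into main.txt; compiled under Python 3.7, it silently disappears – the marker is evaluated against the environment doing the compiling, not the one that installs the result later. Which is exactly why the compile has to happen in the same kind of environment the pins are for.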

Which is why I install pip-tools into my development virtual environments and not globally, as you can see in the update-deps target.

Between “breaks regularly” and “close but no cigar”, pip-tools are still the best. And while they’re not perfect, if they’re good enough for all the companies I know of, they’re probably also good enough for you.

I for one am not holding my breath for new tools for these kinds of workflows.