There have been a few recent articles reflecting on the current status of the Python packaging ecosystem from an end user perspective, so it seems worthwhile for me to write-up my perspective as one of the lead architects for that ecosystem on how I characterise the overall problem space of software publication and distribution, where I think we are at the moment, and where I'd like to see us go in the future.

For context, the specific articles I'm replying to are:

These are all excellent pieces considering the problem space from different perspectives, so if you'd like to learn more about the topics I cover here, I highly recommend reading them.

My core software ecosystem design philosophy Since it heavily influences the way I think about packaging system design in general, it's worth stating my core design philosophy explicitly: As a software consumer, I should be able to consume libraries, frameworks, and applications in the binary format of my choice, regardless of whether or not the relevant software publishers directly publish in that format

As a software publisher working in the Python ecosystem, I should be able to publish my software once, in a single source-based format, and have it be automatically consumable in any binary format my users care to use This is emphatically not the way many software packaging systems work - for a great many systems, the publication format and the consumption format are tightly coupled, and the folks managing the publication format or the consumption format actively seek to use it as a lever of control over a commercial market (think operating system vendor controlled application stores, especially for mobile devices). While we're unlikely to ever pursue the specific design documented in the rest of the PEP (hence the "Deferred" status), the "Development, Distribution, and Deployment of Python Software" section of PEP 426 provides additional details on how this philosophy applies in practice. I'll also note that while I now work on software supply chain management tooling at Red Hat, that wasn't the case when I first started actively participating in the upstream Python packaging ecosystem design process. Back then I was working on Red Hat's main hardware integration testing system, and growing increasingly frustrated with the level of effort involved in integrating new Python level dependencies into Beaker's RPM based development and deployment model. Getting actively involved in tackling these problems on the Python upstream side of things then led to also getting more actively involved in addressing them on the Red Hat downstream side.

The key conundrum When talking about the design of software packaging ecosystems, it's very easy to fall into the trap of only considering the "direct to peer developers" use case, where the software consumer we're attempting to reach is another developer working in the same problem domain that we are, using a similar set of development tools. Common examples of this include: Linux distro developers publishing software for use by other contributors to the same Linux distro ecosystem

Web service developers publishing software for use by other web service developers

Data scientists publishing software for use by other data scientists In these more constrained contexts, you can frequently get away with using a single toolchain for both publication and consumption: Linux: just use the system package manager for the relevant distro

Web services: just use the Python Packaging Authority's twine for publication and pip for consumption

Data science: just use conda for everything For newer languages that start in one particular domain with a preferred package manager and expand outwards from there, the apparent simplicity arising from this homogeneity of use cases may frequently be attributed as an essential property of the design of the package manager, but that perception of inherent simplicity will typically fade if the language is able to successfully expand beyond the original niche its default package manager was designed to handle. In the case of Python, for example, distutils was designed as a consistent build interface for Linux distro package management, setuptools for plugin management in the Open Source Application Foundation's Chandler project, pip for dependency management in web service development, and conda for local language-independent environment management in data science. distutils and setuptools haven't fared especially well from a usability perspective when pushed beyond their original design parameters (hence the current efforts to make it easier to use full-fledged build systems like Scons and Meson as an alternative when publishing Python packages), while pip and conda both seem to be doing a better job of accommodating increases in their scope of application. This history helps illustrate that where things really have the potential to get complicated (even beyond the inherent challenges of domain-specific software distribution) is when you start needing to cross domain boundaries. For example, as the lead maintainer of contextlib in the Python standard library, I'm also the maintainer of the contextlib2 backport project on PyPI. That's not a domain specific utility - folks may need it regardless of whether they're using a self-built Python runtime, a pre-built Windows or Mac OS X binary they downloaded from python.org, a pre-built binary from a Linux distribution, a CPython runtime from some other redistributor (homebrew, pyenv, Enthought Canopy, ActiveState, Continuum Analytics, AWS Lambda, Azure Machine Learning, etc), or perhaps even a different Python runtime entirely (PyPy, PyPy.js, Jython, IronPython, MicroPython, VOC, Batavia, etc). Fortunately for me, I don't need to worry about all that complexity in the wider ecosystem when I'm specifically wearing my contextlib2 maintainer hat - I just publish an sdist and a universal wheel file to PyPI, and the rest of the ecosystem has everything it needs to take care of redistribution and end user consumption without any further input from me. However, contextlib2 is a pure Python project that only depends on the standard library, so it's pretty much the simplest possible case from a tooling perspective (the only reason I needed to upgrade from distutils to setuptools was so I could publish my own wheel files, and the only reason I haven't switched to using the much simpler pure-Python-only flit instead of either of them is that that doesn't yet easily support publishing backwards compatible setup.py based sdists). This means that things get significantly more complex once we start wanting to use and depend on components written in languages other than Python, so that's the broader context I'll consider next.

Platform management or plugin management? When it comes to handling the software distribution problem in general, there are two main ways of approaching it: design a plugin management system that doesn't concern itself with the management of the application framework that runs the plugins

design a platform component manager that not only manages the plugins themselves, but also the application frameworks that run them This "plugin manager or platform component manager?" question shows up over and over again in software distribution architecture designs, but the case of most relevance to Python developers is in the contrasting approaches that pip and conda have adopted to handling the problem of external dependencies for Python projects: pip is a plugin manager for Python runtimes. Once you have a Python runtime (any Python runtime), pip can help you add pieces to it. However, by design, it won't help you manage the underlying Python runtime (just as it wouldn't make any sense to try to install Mozilla Firefox as a Firefox Add-On, or Google Chrome as a Chrome Extension)

conda, by contrast, is a component manager for a cross-platform platform that provides its own Python runtimes (as well as runtimes for other languages). This means that you can get pre-integrated components, rather than having to do your own integration between plugins obtained via pip and language runtimes obtained via other means What this means is that pip, on its own, is not in any way a direct alternative to conda. To get comparable capabilities to those offered by conda, you have to add in a mechanism for obtaining the underlying language runtimes, which means the alternatives are combinations like: apt-get + pip

dnf + pip

yum + pip

pyenv + pip

homebrew (Mac OS X) + pip

python.org Windows installer + pip

Enthought Canopy

ActiveState's Python runtime + PyPM This is the main reason why "just use conda" is excellent advice to any prospective Pythonista that isn't already using one of the platform component managers mentioned above: giving that answer replaces an otherwise operating system dependent or Python specific answer to the runtime management problem with a cross-platform and (at least somewhat) language neutral one. It's an especially good answer for Windows users, as chocalatey/OneGet/Windows Package Management isn't remotely comparable to pyenv or homebrew at this point in time, other runtime managers don't work on Windows, and getting folks bootstrapped with MinGW, Cygwin or the new (still experimental) Windows Subsystem for Linux is just another hurdle to place between them and whatever goal they're learning Python for in the first place. However, conda's pre-integration based approach to tackling the external dependency problem is also why "just use conda for everything" isn't a sufficient answer for the Python software ecosystem as a whole. If you're working on an operating system component for Fedora, Debian, or any other distro, you actually want to be using the system provided Python runtime, and hence need to be able to readily convert your upstream Python dependencies into policy compliant system dependencies. Similarly, if you're wanting to support folks that deploy to a preconfigured Python environment in services like AWS Lambda, Azure Cloud Functions, Heroku, OpenShift or Cloud Foundry, or that use alternative Python runtimes like PyPy or MicroPython, then you need a publication technology that doesn't tightly couple your releases to a specific version of the underlying language runtime. As a result, pip and conda end up existing at slightly different points in the system integration pipeline: Publishing and consuming Python software with pip is a matter of "bring your own Python runtime". This has the benefit that you can readily bring your own runtime (and manage it using whichever tools make sense for your use case), but also has the downside that you must supply your own runtime (which can sometimes prove to be a significant barrier to entry for new Python users, as well as being a pain for cross-platform environment management).

Like Linux system package managers before it, conda takes away the requirement to supply your own Python runtime by providing one for you. This is great if you don't have any particular preference as to which runtime you want to use, but if you do need to use a different runtime for some reason, you're likely to end up fighting against the tooling, rather than having it help you. (If you're tempted to answer "Just add another interpreter to the pre-integrated set!" here, keep in mind that doing so without the aid of a runtime independent plugin manager like pip acts as a multiplier on the platform level integration testing needed, which can be a significant cost even when it's automated)