[Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

Here's a quick summary of the main things that are going to happen in Distutils and Distribute, plus a few words on virtualenv and pip. (There is much, much more work going on, but I don't want to drown people in details.)

= Distutils =

Distutils is a package manager and competes with OS package managers. This is a good thing because, unless you are developing a library or an application that will only run on one specific system that has its own packaging system, like Debian, you will be able to reach many more people. Of course the goal is to avoid making the work of a Debian packager (or a packager for any other OS that has a package manager) too hard. In other words, re-packaging a Distutils-based project should be easy and Distutils should not get in their way (or as little as possible).

But right now Distutils is incomplete in many ways and we are trying to fix them.

== What's installed? What's the installation format? How to uninstall? ==

First, it's an incomplete package manager: you can install a distribution using it, but there's no way to list installed distributions. Worse, you can't uninstall a distribution.

PEP 376 resolves this, and once it's finished, the goal is to include the APIs described there in Distutils itself and in the pkgutil module in the stdlib. Note that there's an implementation at http://bitbucket.org/tarek/pep376 that is kept up to date with PEP 376 so people can see what we are talking about.

Another problem that has popped up over the last few years is that, in the same site-packages, you can have different installation formats, depending on the tool that was used to install a distribution and on whether that distribution uses Distutils or Setuptools. End-users end up with zipped eggs (one file), unzipped eggs (one self-contained format in a directory) and regular Distutils installs (packages and modules in site-packages). And the Metadata are also located in many different places depending on the installation format used. That can't be.
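
To make the mix of formats concrete, here is a rough sketch of how a tool might tell them apart today. It is an illustration only, not part of PEP 376: it assumes that eggs are entries ending in ".egg" (a file when zipped, a directory when unzipped) and that flat installs leave an entry ending in ".egg-info" next to the packages. The helper name classify_entries is made up for this example.

```python
import os


def classify_entries(path):
    """Group the entries of one directory by the install format they suggest.

    Assumption for this sketch: only the entry suffix and whether the entry
    is a directory are inspected; nothing is opened or imported.
    """
    found = {"zipped egg": [], "unzipped egg": [], "egg-info metadata": []}
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if entry.endswith(".egg"):
            # Same suffix, two layouts: a single file or a directory.
            kind = "unzipped egg" if os.path.isdir(full) else "zipped egg"
            found[kind].append(entry)
        elif entry.endswith(".egg-info"):
            # Metadata sitting beside packages installed flat in the directory.
            found["egg-info metadata"].append(entry)
    return found
```

Calling classify_entries() on a site-packages directory returns a dict of lists; more than one non-empty list means several formats live side by side in that directory.
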
There's no point in keeping various installation formats in the *same* site-packages directory.

PEP 376 also resolves this by describing a *unique* format that works for all. Once this is finished, Distutils will implement it by changing the install command accordingly.

- Work left to do in PEP 376: restrict its scope to a disk-based, file-based site-packages.
- Goal: 2.7 / 3.2

== Dependencies ==

The other feature that makes a packaging system nice is dependencies, i.e. a way to list in a distribution the distributions it requires to run. As a matter of fact, PEP 314 introduced new fields in the Metadata for this purpose ("Requires", "Provides" and "Obsoletes").

So, you can write things like "Requires: lxml >= 2.2.1", meaning that your distribution requires lxml 2.2.1 or a newer version to run. But these were just descriptive fields and Distutils did not provide any feature based on them. In fact, no third-party tool provided a feature based on those fields either.

Setuptools provided "easy_install", a script that looks for the dependencies and installs them by querying the Python Package Index (PyPI). But this feature was implemented with its own metadata: you can add an "install_requires" option in the setup() call in setup.py, and it will end up at installation time in a "requires.txt" file that is located alongside the Metadata for your distribution.

So the goal is to review PEP 314 and update the Metadata w.r.t. the setuptools feedback and community usage. Once it's done, Distutils will implement this new metadata version and promote its usage. Promoting its usage means that Distutils will provide some APIs to work with these fields, like a version comparison algorithm.

And while we're at it, we need to work out some inconsistency between the "Author" and "Maintainer" fields. (The latter doesn't exist in the Metadata but exists on the setup.py side.)

- Work left to do in PEP 314: finish PEP 386, finish the discussion on the "maintainer" field.
- Goal: 2.7 / 3.2

== Version comparison ==

Once you provide dependency fields in the metadata, you need to provide a version scheme: a way to compare two versions. Distutils has two version comparison algorithms that are not used in its own code, and are used in only one place in the stdlib, where they could be removed painlessly. One version scheme is "strict" and one is "loose". And Setuptools has another one, which is more heuristic (it will deal with any version string and compare it, whether it's wrong or not).

The goal of PEP 386 is to describe a version scheme that can be used by all. If we can reach a consensus there, we can move on and add it as a reference in the update done to PEP 314, alongside the dependency fields.

Then, in Distutils, we can deprecate the existing version comparison algorithms, provide a new one based on PEP 386, and promote its usage.

One very important point: we will not force the community to use the scheme described in PEP 386, but *there is* already a de-facto convention on version schemes at PyPI if you use Pip or easy_install, so let's have a documented standard for this, and a reference implementation in Distutils.

There's an implementation at http://bitbucket.org/tarek/distutilsversion that is kept up to date with PEP 386.

- Work left to do in PEP 386: another round with the community
- Goal: 2.7 / 3.2

== The fate of setup.py, and static metadata ==

setup.py is a CLI to create distributions, install them, etc. You can also use it to retrieve the metadata of a distribution. For example, you can call "python setup.py --name" and the name will be displayed. That's fine. That's great for developers.

But there's a major flaw: it's Python code.
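
To make that flaw concrete, here is a minimal sketch of the kind of logic a setup.py can contain. Everything in it is hypothetical (the project name, the helper build_metadata, the dependency list); it only illustrates that the resulting metadata is not known until the code has actually run.

```python
import sys


def build_metadata(platform=None):
    """Hypothetical stand-in for what a setup.py computes before setup()."""
    if platform is None:
        platform = sys.platform
    requires = ["lxml >= 2.2.1"]
    if platform == "win32":
        # This dependency exists only on one platform, so nobody can read
        # it from the file; the code has to be executed to find out.
        requires.append("pywin32")
    return {"name": "example", "version": "0.1", "requires": requires}
```

Reading the "requires" value for a given platform means calling build_metadata(), that is, executing whatever else the file happens to do along the way.
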
It's a problem because, depending on the complexity of this file, an OS packager who just wants to get the metadata for the platform he's working on will run arbitrary code that might do unwanted things (or that might not even work).

So we are going to separate the metadata description from setup.py and put it in a static configuration file that can be opened and read by anyone without running any code. The only problem with this is that some metadata fields might depend on the execution environment. For instance, once "Requires" is re-defined and re-introduced via PEP 314, we will have cases where "pywin32" will be a dependency only on win32 systems.

So we've worked on that lately in Distutils-SIG and came up with a micro-language, based on a ConfigParser file, that allows writing metadata fields that depend on sys.platform etc. I won't detail the syntax here, but the idea is that this file can be interpreted with a vanilla Python without running arbitrary code.

In other words: we will be able to get the metadata for a distribution without having to install it or to run any setup.py command. One use case is the ability to list all dependencies a distribution requires for a given platform, just by querying PyPI.

So I am adding this in Distutils for 2.7.

Of course setup.py stays, and this is backward compatible.

- Work left to do: publish the final syntax, and do the implementation
- Goal: 2.7 / 3.2

== The fate of bdist_* commands ==

During the last PyCon summit we said that we would remove commands like bdist_rpm because Python is unable, due to its release cycle, to do a good job there. Here's an example: from time to time I get cryptic issues in the issue tracker from Fedora people (or people on any rpm-based system), and for these very specific problems I have all the pain in the world coming up with a proper fix unless some RPM expert helps out. And by the time a problem is detected and then fixed, it can be year(s) before the fix is available on their side.
That's why, depending on the communities, commands like bdist_rpm are just totally ignored, and OS packagers have their own tools.

So the best way to handle this is to ask these communities to build their own tools and to encourage them to use Distutils as a basis for that.

This does not concern bdist_* commands for win32, because those are very stable and don't change much: Windows doesn't have a package manager that would require these commands to evolve with it.

Anyway, when we said that we would remove bdist_rpm, this was very controversial because some people use it and love it.

So what is going to happen is a status quo: no bdist_* command will be removed, but no new bdist_* command will be added. That's why I've encouraged Andrew and Garry, who are working on a bdist_deb command, to keep it in the "stdeb" project, and eventually we will refer to it in the Distutils documentation if this bdist_deb complies with the Distutils standard. It doesn't right now, because it uses a custom version of the Distribution class (through Setuptools) that doesn't behave like Distutils' one anymore.

For Distutils, I'll add some documentation explaining this, and a section that will list community-driven commands.

- Work left to do: write the documentation
- Goal: 2.7 / 3.2

= Distribute =

I won't explain again here why we have forked; I think it's obvious to anyone here now. I'll rather explain what we are planning in Distribute and how it will interact with Distutils.

Distribute has two branches:

- 0.6.x: provides a Setuptools-0.6c9 compatible version
- 0.7.x: will provide a refactoring

== 0.6.x ==

Not "much" is going to happen here: we want this branch to be helpful to the community *today* by addressing the 40-or-so bugs that were found in Setuptools and never fixed. This should happen soon because development is fast: there are up to 5 committers working on it very often (and the number grows weekly).

The biggest issue with this branch is that it provides the same packages and modules Setuptools does, and this requires some bootstrapping work where we make sure that, once Distribute is installed, all distributions that require Setuptools will continue to work. This is done by faking the metadata of Setuptools 0.6c9. That's the only way we found to do this.

There's one major thing though: thanks to the work of Lennart, Alex and Martin, this branch supports Python 3, which is great to have to speed up Py3 adoption.

The goal of 0.6.x is to remove as many bugs as we can, and to try, if possible, to remove the patches done on Distutils. We will support 0.6.x maintenance for years and we will promote its usage everywhere instead of Setuptools.

Some new commands are added there, when they are helpful and don't interact with the rest. I am thinking about "upload_docs", which lets you upload documentation to PyPI. The goal is to move it to Distutils at some point, if the documentation feature of PyPI stays and starts to be used.

== 0.7.x ==

We've started to refactor Distribute with this roadmap in mind (and no, contrary to what someone said, it's not vaporware; we've done a lot already):

- 0.7.x can be installed and used with 0.6.x
- easy_install is going to be deprecated! Use Pip!
- the version system will be deprecated, in favor of the one in Distutils
- no more Distutils monkey-patching that happens once you use the code (things like 'from distutils import cmd; cmd.Command = CustomCommand')
- no more custom site.py (that is: if something is missing in Python's site.py, we'll add it there instead of patching it)
- no more namespaced packages system, if PEP 382 (namespace packages support) makes it to 2.7
- The code is split into many packages and might be distributed under several distributions:

  - distribute.resources: that's the old pkg_resources, but reorganized in clean, PEP 8 modules. This package will only contain the query APIs and will focus on being PEP 376 compatible.
    We will promote its usage and see if Pip wants to use it as a basis. And maybe PyPM once it's open source? (<hint> <hint>). It will probably shrink a lot though, once the stdlib provides PEP 376 support.
  - distribute.entrypoints: that's the old pkg_resources entry points system, but on its own. It uses distribute.resources.
  - distribute.index: that's package_index and a few other things: everything required to interact with PyPI. We will promote its usage and see if Pip wants to use it as a basis.
  - distribute.core (might be renamed to main): that's everything else, and uses the other packages.

Goal: a first release before (or when) Python 2.7 / 3.2 is out.

= Virtualenv and the multiple version support in Distribute =

(I am not saying "We" here because this part has not been discussed with everyone yet.)

Virtualenv allows you to create an isolated environment to install some distributions without polluting the main site-packages, a bit like a user site-packages.

My opinion is that this tool exists only because Python doesn't support the installation of multiple versions of the same distribution. But if PEP 376 and PEP 386 support are added in Python, we're not far from being able to provide multiple version support with the help of importlib.

Setuptools provided multiple version support, but I don't like its implementation and the way it works.

I would like to create a new site-packages format that can contain several versions of the same distribution, along with:

- a special import system using importlib that would automatically pick the latest version, thanks to PEP 376
- an API to force a specific version at runtime (it would be located at the beginning of all imports, like __future__)
- a layout that is compatible with the way OS packagers work with Python packages

Goal: a prototype asap (one was started under the "VSP" name (virtual site-packages) but is not finished yet)

Regards
Tarek

--
Tarek Ziadé | http://ziade.org | オープンソースはすごい! | 开源传万世，因有你参与