A guide to Python packaging

Front and center of a successful open source project is the packaging. A key ingredient to good packaging is versioning. Because the project is open source, you will want to publish your package to realize the many benefits the open source community offers. Different platforms and languages have different mechanisms for packaging, but this article focuses specifically on Python and its packaging ecosystem. The article discusses packaging mechanics to give you a foundation on which to grow and provides enough practical examples to get you started immediately.

Why worry about packaging?

Beyond just being the right thing to do, there are three practical reasons to package your software:

Ease of use

Stability (with versioning)

Distribution

It's an act of consideration to your users to make your application as effortless as possible to install. Packaging makes your software more accessible and easier to install. If it's easier to install, it will be easier for users to start using your software. By publishing your package on the Python Package Index (PyPI), you'll make it easily accessible through utilities like pip or easy_install. (See Related topics for links to more information on these tools.)

In addition, by versioning your packages, you enable your users to "pin" the dependency in their project on your software to a particular version. For example, pinning Pinax to the 0.9a2.dev1017 version would be expressed as:

Pinax==0.9a2.dev1017

This would enforce that the project used the 0.9a2.dev1017 release of Pinax.
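To make the pin concrete, here is a minimal Python sketch of how a tool might split such a line into a name and an exact version, then check an installed version against it. The function names parse_pin and satisfies are hypothetical, and the sketch assumes only the == operator, not the full requirement syntax that pip understands.

```python
def parse_pin(requirement):
    """Split a 'Name==version' line into (name, version)."""
    name, operator, version = requirement.strip().partition("==")
    if not operator or not name or not version:
        raise ValueError("not an exact pin: %r" % requirement)
    return name.strip(), version.strip()


def satisfies(requirement, installed_version):
    """Return True when the installed version is exactly the pinned one."""
    _, pinned_version = parse_pin(requirement)
    return installed_version == pinned_version


name, version = parse_pin("Pinax==0.9a2.dev1017")
print(name, version)
```

An exact string comparison is enough here because a pin names one specific release; ranges and ordering would need real version parsing.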

Versioning ensures greater stability should you release changes to your software later that might have breaking interfaces. It allows your users to know exactly what they are getting and makes it easier for them to track differences in releases. Furthermore, project developers can know exactly what they are coding against.

A common method for publishing packages to PyPI (or your own distribution server) is to create a source distribution to upload. A source distribution is a standard way of packaging the source of your project as a distributable unit. There are ways to create binary distributions, but for the sake of open source, it makes sense also to distribute your source. Creating source distributions makes it easy for people to use tools that will look up the software on the Internet, download it, and install it all automatically. This process helps not only with local development but also with deployments of your software.

So, by making it easier for users to integrate and install your software, using good versioning that allows a reliable pinning technique, and then publishing your package for greater distribution, you will have a greater chance of your project being successful and gaining wider adoption. Wider adoption may lead to more contributors—something every open source developer surely desires.

Anatomy of a setup.py file

One of the purposes of the setup.py script is to serve as the executable you can run to package your software and upload it to distribution servers. The setup.py script can vary quite a bit in content as you browse around popular Python repositories. This article focuses on the basics. See the Related topics section to explore further on your own.

You can use the setup.py file for many different tasks, but here you create one that will enable you to run the following commands:

python setup.py register
python setup.py sdist upload

The first command, register, takes the information supplied in the setup() function within the setup.py script and creates an entry on PyPI for your package. It won't upload anything; rather, it creates the metadata about your project so that you can subsequently upload and host releases there. The next two commands are chained together: sdist upload builds a source distribution, and then uploads it to PyPI. There are a few prerequisites, however, such as setting up your .pypirc configuration file and actually writing the contents of setup.py.

First, configure your .pypirc file. This should reside in your home directory, which will vary depending on your operating system. On UNIX®, Linux®, and Mac OS X, you can get there by typing cd ~/. The file should contain your PyPI credentials, as shown in Listing 1.

Listing 1. A typical .pypirc file

[distutils]
index-servers = pypi

[pypi]
username:xxxxxxxxxxxxx
password:xxxxxxxxxxxxx

Next, go to PyPI and register for an account (don't worry: it's free). Put the same user name and password you created on PyPI in your .pypirc file, and make sure the file is named ~/.pypirc.
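Because the file in Listing 1 is INI-style, you can sanity-check it with Python's standard configparser module before the first upload. The following is a small sketch, not part of any packaging tool: the missing_credentials helper is hypothetical, and the sample text simply repeats the placeholder values from Listing 1.

```python
import configparser

SAMPLE_PYPIRC = """\
[distutils]
index-servers = pypi

[pypi]
username:xxxxxxxxxxxxx
password:xxxxxxxxxxxxx
"""


def missing_credentials(pypirc_text):
    """Return the index servers that lack a username or a password."""
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    # index-servers may list several names separated by whitespace.
    servers = parser.get("distutils", "index-servers").split()
    return [
        server
        for server in servers
        if not (
            parser.has_option(server, "username")
            and parser.has_option(server, "password")
        )
    ]


print(missing_credentials(SAMPLE_PYPIRC))  # prints []
```

To check your real file, you could pass in the text read from the path returned by os.path.expanduser("~/.pypirc").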

Now, in writing your setup.py script, you have to decide what you want displayed on the PyPI index page and what you want to name your project. Start by copying a template for setup.py that I use for projects (see Listing 2). Skipping the imports and functions, look at the bottom of the template and what you need to change to suit your project. See Related topics for a link to the full script.

Listing 2. The setup.py template

PACKAGE = ""
NAME = ""
DESCRIPTION = ""
AUTHOR = ""
AUTHOR_EMAIL = ""
URL = ""
VERSION = __import__(PACKAGE).__version__

setup(
    name=NAME,
    version=VERSION,
    description=DESCRIPTION,
    long_description=read("README.rst"),
    author=AUTHOR,
    author_email=AUTHOR_EMAIL,
    license="BSD",
    url=URL,
    packages=find_packages(exclude=["tests.*", "tests"]),
    package_data=find_package_data(
        PACKAGE,
        only_in_packages=False
    ),
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Environment :: Web Environment",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: BSD License",
        "Operating System :: OS Independent",
        "Programming Language :: Python",
        "Framework :: Django",
    ],
    zip_safe=False,
)

First, notice that this template expects your project to have two different files. The first is used for the long_description: it reads the contents of the README.rst file that's in the same directory as setup.py and passes the contents as a string to the long_description parameter. This file populates the landing page on PyPI, so it's a good idea to briefly describe the project and show examples of usage in this file. The second file is the package's __init__.py file. It's not explicitly mentioned here, but the line that sets the VERSION variable imports your package; and when it does, Python needs an __init__.py file and expects a variable defined in that module called __version__. For now, just set it as a string:

# __init__.py
__version__ = "0.1"
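To see why the VERSION line in the template depends on that variable, the following sketch builds a throwaway package in a temporary directory and reads its version the same way the template does. The package name dogs_demo is made up for this illustration, and adding the directory to sys.path is an assumption that stands in for running setup.py from the project root.

```python
import os
import sys
import tempfile

PACKAGE = "dogs_demo"  # hypothetical package name used only for this sketch

with tempfile.TemporaryDirectory() as project_dir:
    package_dir = os.path.join(project_dir, PACKAGE)
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as init_file:
        init_file.write('__version__ = "0.1"\n')

    # Make the project directory importable, as it is when setup.py runs there.
    sys.path.insert(0, project_dir)
    try:
        # Same expression as in the template: import the package, read the attribute.
        VERSION = __import__(PACKAGE).__version__
    finally:
        sys.path.remove(project_dir)

print(VERSION)  # prints 0.1
```

One consequence of this design is that setup.py executes your package's __init__.py, so that module should stay importable without any dependencies that are not yet installed.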

Now, let's look at the rest of the inputs:

PACKAGE is the Python package in your project. It's the top-level folder containing the __init__.py module that should be in the same directory as your setup.py file. For example:

/-
  |- README.rst
  |- setup.py
  |- dogs
     |- __init__.py
     |- catcher.py

So, dogs would be your package here.

NAME is usually similar to or the same as your PACKAGE name but can be whatever you want. The NAME is what people will refer to your software as, the name under which your software is listed in PyPI and, more importantly, under which users will install it (for example, pip install NAME).

DESCRIPTION is just a short description of your project. A sentence will suffice.

AUTHOR and AUTHOR_EMAIL are what they sound like: your name and email address. This information is optional, but it's good practice to supply an email address if people want to reach you about the project.

URL is the URL for the project. This URL may be a project website, the GitHub repository, or whatever URL you want. Again, this information is optional.

You may want to provide the license and classifiers also. For more information on creating a setup.py file, check out the Python documentation. (See Related topics.)
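The template's packages argument relies on find_packages to discover directories such as dogs. As a rough illustration of the idea only, the sketch below walks a project directory and reports every directory that contains an __init__.py file. The helper find_simple_packages is a simplified, hypothetical stand-in written for this article; it is not the setuptools implementation and ignores options such as exclude.

```python
import os
import tempfile


def find_simple_packages(root):
    """Return dotted names of directories under root that hold __init__.py."""
    packages = []
    for current_dir, subdirs, files in os.walk(root):
        if current_dir != root and "__init__.py" not in files:
            subdirs[:] = []  # not a package, so do not descend further
            continue
        subdirs.sort()  # deterministic traversal order
        if current_dir != root:
            relative = os.path.relpath(current_dir, root)
            packages.append(relative.replace(os.sep, "."))
    return packages


# Recreate the example layout from above in a temporary directory.
with tempfile.TemporaryDirectory() as project_dir:
    os.mkdir(os.path.join(project_dir, "dogs"))
    for relative_path in (
        "README.rst",
        "setup.py",
        os.path.join("dogs", "__init__.py"),
        os.path.join("dogs", "catcher.py"),
    ):
        open(os.path.join(project_dir, relative_path), "w").close()
    found = find_simple_packages(project_dir)

print(found)  # prints ['dogs']
```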

Versioning

Versioning could easily be a topic unto itself, but it is worth mentioning in the context of packaging, as good packaging involves proper versioning. Versioning is a form of communication with your users: It allows your users to build more stability and reliability into their products, as well. Through versioning, you are telling your users that you have changed something and are giving explicit boundaries for where those changes occurred.

You can find a standard for versioning Python packages in Python Enhancement Proposal (PEP) 386, which has since been superseded by PEP 440. (See Related topics.) It spells out rules that are pragmatic. Even if you don't read and understand or even agree with the PEP, it would be wise to follow it, as it's what more and more Python developers are used to seeing.

In addition, versioning is not just for stable releases that you upload to PyPI but is also useful for development releases using the devNN suffix. It's not typically good to upload these dev versions to PyPI, but you can still make them publicly available by setting up your own public (or private) distribution server; then, users who want to use the bleeding-edge version can reference that in their pip requirements.txt file. Here are a few examples of versioning:

1.0.1        # 1.0.1 final release
1.0.2a       # 1.0.2 Alpha (for Alpha, after Dev releases)
1.0.2a.dev5  # 1.0.2 Alpha, Dev release #5
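The ordering implied by these examples (a dev release comes before the alpha it leads up to, and the alpha comes before the final release) can be sketched as a sort key. This is a deliberately small illustration covering only the forms shown above; the version_key function and its regular expression are assumptions made for this sketch, not the full rules of the PEP.

```python
import re

# release numbers, optional a/b/rc marker with optional number, optional .devN
_VERSION_RE = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)"
    r"(?:(?P<pre>a|b|rc)(?P<pre_num>\d*))?"
    r"(?:\.dev(?P<dev>\d+))?$"
)
# A final release (no marker) sorts after every pre-release of the same number.
_PRE_ORDER = {"a": 0, "b": 1, "rc": 2, None: 3}


def version_key(version):
    """Return a tuple that sorts version strings from oldest to newest."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = tuple(int(part) for part in match.group("release").split("."))
    pre = match.group("pre")
    pre_num = int(match.group("pre_num") or 0)
    dev = match.group("dev")
    # A dev release sorts before the same version without the dev suffix.
    dev_key = (0, int(dev)) if dev is not None else (1, 0)
    return (release, _PRE_ORDER[pre], pre_num, dev_key)


versions = ["1.0.2a", "1.0.1", "1.0.2a.dev5", "1.0.2"]
print(sorted(versions, key=version_key))
# prints ['1.0.1', '1.0.2a.dev5', '1.0.2a', '1.0.2']
```

Comparing tuples rather than raw strings avoids surprises such as "1.10" sorting before "1.9".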

Publishing

People are not generally going to find and install your software without it being published. Most of the time, you will want to publish your packages on PyPI. After you set up your .pypirc configuration file, the upload command you pass to setup.py transmits your package to PyPI. Typically, you do so in conjunction with building a source distribution:

python setup.py sdist upload

If you are using your own distribution server, add a section for authorization in your .pypirc file for this new location, and refer to it by name when uploading:

python setup.py sdist upload -r mydist
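As a sketch of what that extra .pypirc section might contain, the following Python snippet renders a file with both the default pypi entry and a second server named mydist. The build_pypirc helper, the repository URL http://dist.example.com/, and the placeholder credentials are all assumptions for illustration; substitute the values for your own server.

```python
import configparser
import io


def build_pypirc(servers):
    """Render .pypirc text for a mapping of server name -> settings dict."""
    parser = configparser.RawConfigParser()
    parser.add_section("distutils")
    # List every server name on its own continuation line.
    parser.set("distutils", "index-servers", "\n" + "\n".join(servers))
    for name, settings in servers.items():
        parser.add_section(name)
        for key, value in settings.items():
            parser.set(name, key, value)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


pypirc_text = build_pypirc(
    {
        "pypi": {"username": "xxxxxxxxxxxxx", "password": "xxxxxxxxxxxxx"},
        "mydist": {
            "repository": "http://dist.example.com/",  # placeholder URL
            "username": "xxxxxxxxxxxxx",
            "password": "xxxxxxxxxxxxx",
        },
    }
)
print(pypirc_text)
```

The section name is what you pass to -r, so it must match one of the names listed under index-servers.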

Set up your own distribution server

The primary reason for using your own distribution server in open source is to provide a place to publish dev releases, as PyPI should really just consist of stable releases. For example, you probably want:

pip install MyPackage

... to install the latest stable release found on PyPI. However, if you add later dev releases, that command will end up installing the latest release, period, which means your dev release. It's generally good always to pin a release, but not all users will do this. Therefore, ensure that not specifying a version number always returns the latest stable release.

One way to have your cake (only expose stable releases for default use of pip) and to eat it too (enable users to install packaged dev releases) is to host your own distribution server. The Pinax project does this for all its dev releases at http://dist.pinaxproject.com. (See Related topics.)
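The goal stated above, that an unpinned install should resolve to the newest stable release, can be expressed as a small selection rule. The sketch below is an illustration of that rule only, under the assumption that a stable version consists purely of dot-separated numbers; latest_stable is a hypothetical helper and does not describe how pip itself chooses a release.

```python
import re

_STABLE_RE = re.compile(r"\d+(?:\.\d+)*")


def latest_stable(versions):
    """Return the highest purely numeric version, or None if there is none."""
    stable = [v for v in versions if _STABLE_RE.fullmatch(v)]
    if not stable:
        return None
    return max(stable, key=lambda v: tuple(int(part) for part in v.split(".")))


available = ["1.0.1", "1.0.2a", "1.0.2a.dev5"]
print(latest_stable(available))  # prints 1.0.1
```

Keeping dev releases off the default index makes this filtering unnecessary for your users, which is the motivation for the separate server described here.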

The distribution server is just an index served up over Hypertext Transfer Protocol (HTTP) of files on your server. It should have the following file structure:

/index-name/package-name/package-name-version.tar.gz

You can then make the server private if you so desire by configuring Basic-Auth on your web server. You may want to add some facility to upload source distributions, as well. To do so, you need to add code to handle the upload, parse the file name, and create the directory paths to match the scheme above. This structure is in place for the Pinax project, which hosts several repositories.
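The upload handling described above (parse the file name, then build a path that matches the scheme) might look like the following sketch. It assumes that the version follows the last hyphen in the file name and that uploads are .tar.gz source distributions; the destination_path function and the sample file name are hypothetical and are not taken from the Pinax server code.

```python
from pathlib import PurePosixPath

SDIST_SUFFIX = ".tar.gz"


def destination_path(index_name, filename):
    """Map an uploaded sdist file name to /index-name/package-name/filename."""
    if "/" in filename or "\\" in filename:
        raise ValueError("file name must not contain a path: %r" % filename)
    if not filename.endswith(SDIST_SUFFIX):
        raise ValueError("expected a %s file: %r" % (SDIST_SUFFIX, filename))
    stem = filename[: -len(SDIST_SUFFIX)]
    # Assumption: the version is everything after the last hyphen.
    package_name, separator, version = stem.rpartition("-")
    if not separator or not package_name or not version:
        raise ValueError("cannot find a version in %r" % filename)
    return PurePosixPath("/", index_name, package_name, filename)


print(destination_path("fresh-start", "Pinax-0.9a2.dev1017.tar.gz"))
# prints /fresh-start/Pinax/Pinax-0.9a2.dev1017.tar.gz
```

Rejecting path separators in the uploaded name keeps a client from writing outside the index directory.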

pip and virtualenv

Although this article has focused primarily on packaging, this section describes consuming packages, providing a bit of appreciation for what good packaging and versioning give your users.

pip is a tool that you can install directly, but I recommend using it as part of virtualenv. (See Related topics.) I recommend using virtualenv for everything related to Python, as it keeps your Python environments clean. Much as a virtual machine can allow you to run multiple operating systems side by side, virtualenvs allow you to run multiple Python environments side by side. I don't install anything in my system Python but rather create a new virtualenv for each new project or utility I work on.

Now that you have installed virtualenv (along with virtualenvwrapper, which provides the mkvirtualenv command), you can play for a moment:

$ mkvirtualenv --no-site-packages testing
$ pip install Pinax
$ pip freeze|grep Pinax
$ pip uninstall Pinax
$ pip install --extra-index-url=http://dist.pinaxproject.com/fresh-start/ Pinax==0.9a2.dev1017
$ pip freeze|grep Pinax

Notice that the first pip installation downloaded and installed Pinax from PyPI. pip freeze shows the versions of all packages installed in your current virtualenv. pip uninstall does exactly what you think it does: it removes the package from the virtualenv. Next, you install a dev version from the fresh-start repository at http://dist.pinaxproject.com to get the development version of Pinax, version 0.9a2.dev1017.

No going to websites, downloading tarballs, and symlinking code into a site-packages directory. (That is how I used to do it, and it caused many problems.) Your users get all this as a result of good packaging, publishing, and versioning of your project.

Conclusion

The bottom line is that it's well worth your while to spend some time learning the art and science of packaging. You'll gain greater adoption by users because of the ease of installation and stability that versioning your packages gives them. Using the template setup.py provided in Related topics and covered in this article, you should be able to add packaging to your project quickly and easily. Communicating to your users through proper versioning is considerate of your users, making it easy for them to track changes from release to release. Finally, as pip and virtualenv gain wider adoption, reliance on published packages—whether on PyPI or on your own distribution servers—increases. Therefore, make sure you publish the projects that you want to share with the world.

I hope that this article has provided enough to get you started. The Related topics should provide documentation to help you dive deeper. If you have questions, don't hesitate to hop onto Freenode and find me in such chat rooms as #pinax and #django-social (with the nickname "paltman") or on Twitter (@paltman).

Related topics