If you listen to the podcast episode mentioned earlier, you’ll realize how critical the PyPI and Read the Docs systems really are. The packaging index alone is ferrying somewhere in the vicinity of 40TB of data a month, with an infrastructure cost of $40k/month, donated by Rackspace. If Rackspace decided to stop funding this project, or PyPI were to go down in any way, what would the rest of us do? Yes, there are contingencies, but I don’t like resting one of the most basic Python features on the goodwill of just one donor. No offense to Rackspace, but there are always unforeseen circumstances, and I’m sure they would appreciate some partners.

These folks have been looking at possible sources of donations and income for a while, but there are no silver bullets, especially when you consider that some of the typical solutions for these types of projects (like advertisements) could be detrimental to the main use cases of the systems themselves.

If a company could pay to have its packages featured or prioritized in some way, or if you went to PyPI looking for an Airbnb module and found a set of ads that followed you around the internet trying to sell you on a Caribbean vacation, you might find it annoying and avoid visiting the site in the future. Any possible solution will require careful consideration of its impact on the users.

Given they are open to suggestions, I wanted to add what I could to the discussion, so below is an idea to help monetize PyPI without the use of advertising.

The Idea

I’ve worked in large corporations for the majority of my career, becoming acutely aware not only of the infrastructure needs of these businesses, but also of the security and compartmentalization requirements that keep trade secrets and intellectual property private and secure, as well as protect the business from ill-intended intruders that want to wreak havoc wherever they can.

In my experience over the many years of provisioning, testing and deploying, a lot of work and effort has gone into a number of package management systems — especially within operating systems — that provide varying degrees of functionality, all with the same purpose of centralizing distribution of software (usually in compiled binary form) across large swaths of systems, along with some implementation of dependency management.

However, it wasn’t until a different episode from Talk Python — one of the earlier ones, I think Episode 23 — that I heard of someone using pip for internal deployments. Ever since then, I’ve done my best to push that in my organizations, and boy oh boy has it made my life easier.

Spin up a Docker container with a server (I use the pypiserver package) and, after setting up a few config files in your home directory, just run python setup.py sdist upload -r your_server to push a package and pip install --extra-index-url your_server to install it wherever you’re deploying.
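As a rough sketch of what those home directory config files and the install command might look like, the Python below generates a minimal .pypirc entry and the matching pip invocation. The server name internal, the username deployer and the URL http://localhost:8080 (which I believe is pypiserver's default port) are placeholders made up for illustration, so adjust them to your own setup.

```python
import configparser
import io


def build_pypirc(server_name, repository_url, username):
    """Return the text of a minimal .pypirc registering one private index.

    All three arguments are placeholders chosen by the caller.
    """
    config = configparser.ConfigParser()
    # The [distutils] section lists the index servers known to the upload command.
    config["distutils"] = {"index-servers": server_name}
    # One section per server, holding the upload URL and the account to use.
    config[server_name] = {"repository": repository_url, "username": username}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


def pip_install_command(package, extra_index_url):
    """Build the argument list for installing a package from an extra index."""
    return ["pip", "install", "--extra-index-url", extra_index_url, package]


if __name__ == "__main__":
    # Hypothetical local pypiserver instance; not a real deployment.
    print(build_pypirc("internal", "http://localhost:8080", "deployer"))
    print(" ".join(pip_install_command("mypackage", "http://localhost:8080/simple/")))
```

With a file like that saved as ~/.pypirc, the upload step would name the server section (here, -r internal), while pip is pointed at the server's /simple/ index URL.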

Of course, this is fairly simple, insecure and limiting. In big business, it’s usually only accepted for smaller segregated teams and less important / non-critical projects. If you can get away with it at all.

Enter what I’m calling the Trusted Package Index. Businesses always have a need for an on-premises, secure, encrypted and highly available distribution mechanism for compiled binaries. With setuptools providing various install capabilities that can cover non-Python code just as well, it seems like we could put together a decent product. Something like a Docker Trusted Registry.

Take Warehouse, add LDAP and Active Directory support for authenticating both package uploads and downloads, provide relevant monitoring statistics, backend storage encryption, high availability and a replication mechanism, plus the ability to integrate webhooks from GitHub Enterprise, GitLab or any other source control system that can trigger package rebuilds. I think you would end up with a fairly decent appliance, with enough appeal that a number of businesses would be willing to pay a modest yearly fee for it, especially if you can find a way to provide a support plan along with it.
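To make the webhook piece a little more concrete, here is a minimal sketch of how such an appliance might decide whether an incoming notification should trigger a rebuild. It assumes GitHub-style webhooks, where the request carries an HMAC-SHA256 signature of the body in the form sha256=<hexdigest> and a push payload includes repository.full_name; other source control systems authenticate their hooks differently. The function name rebuild_target and the sample secret are hypothetical and not part of Warehouse.

```python
import hashlib
import hmac
import json


def rebuild_target(secret, payload, signature_header):
    """Return the repository to rebuild, or None if the signature is wrong.

    secret:           shared secret configured on both sides (bytes)
    payload:          raw request body exactly as received (bytes)
    signature_header: value of the signature header, "sha256=<hexdigest>"
    """
    expected = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, signature_header):
        return None
    event = json.loads(payload)
    return event["repository"]["full_name"]


if __name__ == "__main__":
    # Made-up secret and payload, purely for illustration.
    secret = b"not-a-real-secret"
    body = json.dumps({"repository": {"full_name": "acme/widgets"}}).encode()
    header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    print(rebuild_target(secret, body, header))
```

A real appliance would sit behind an HTTP endpoint and queue the rebuild rather than return a name, but the verification step would look much the same.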

This is just a suggestion. Developing it will take time and effort that may be hard to find, but beyond serving as a source of funding, it may also help develop a few infrastructure advancements to distribute network load, like balancing the real PyPI across multiple volunteer sites running replication.