Python 3 adoption

Please consider subscribing to LWN Subscriptions are the lifeblood of LWN.net. If you appreciate this content and would like to see more of it, your subscription will help to ensure that LWN continues to thrive. Please visit this page to join up and keep LWN on the net.

"If you can't measure it, you can't migrate it" was the title of the next presentation, which came from Glyph Lefkowitz, who is the maintainer of the Twisted event-driven network framework. "I famously have issues with Python 3", he said, but what he mainly wants is for there to be one Python. If that is Python 2.8 and Python 3 is dropped, that would be fine, as would just having Python 3.

Nobody is working on Python 2 at this point. Interested developers cannot just go off and do the work themselves, since the core developers (which he and others often refer to as "python-dev" after the main CPython-development mailing list) are actively resisting any significant improvements to Python 2 at this point. That is because of a "fiat decision" by python-dev, not because there are technical reasons why it couldn't be done.

Beyond that, "nobody uses Python 3", at least for some definition of "nobody". There are three versions of Python in use at this point: 2.6, 2.7, and everything else. Based on some Python Package Index (PyPI) data that he gathered (which was a few months old; he also admitted the methodology he used was far from perfect), Python 2.7 makes up the majority of the downloads, while 2.6 has a significant but far smaller chunk, as well. All the other Python versions together had a smaller slice than even 2.6.

He pointed to the "Can I use Python 3?" web site, which showed 9,900 of 55,000 packages ported. But he typed in one he uses (Sentry) and it is blocked by nine dependencies that have not yet been ported to Python 3. There is a "long tail problem", in that there are lots of packages that need porting and, because of interdependencies, plenty of things won't work until most of those packages are ported. But there are lots of small packages that are essentially unmaintained at this point; they work fine for Python 2 so the developers aren't putting out updates, much less porting them to Python 3.

Lefkowitz said he spends a "lot of time worrying about Python 3", but other maintainers are just giving up. New programmers who are learning Python 3 get "really mad about packages that don't work". They go on Reddit and Hacker News to scream about it. That causes maintainers to just quietly drop out of the community sometimes. He talked to several who did not want to be identified that were burnt out by the continual harassment from those who want Python 3 support. The problem is "not being healed from the top", he said.

There are a number of examples where the project is not communicating well about deprecated features or packages, he said. PIL (Python Imaging Library) is still mentioned in official documentation, even though it has been officially supplanted by Pillow for years. That fact is not unambiguously communicated to users, who then make their projects dependent on an outdated (and unavailable for Python 3) package.

He also has some ideas on things that could be done to make Python users happier. To start with, there needs to be a better story for build artifacts. The Go language has a great support for building and sharing programs. Basically, any Go binary that is built can be copied elsewhere and just run. But static linking as the solution for binary distribution is an idea that has been around for a long time, one attendee said.

Lefkowitz noted that 6th graders who are building a game using Pygame just want to be able to share the games they make with their friends. Users don't want a Python distribution, they just want some kind of single file they can send to others. Guido van Rossum asked if these 6th-grade Pygame programmers were switching to Go, and Lefkowitz said that they weren't. Mostly they were switching to some specialized Java teaching tool that had some limited solution to this problem (it would allow sharing with others using that same environment). "We could do that for Python", he said, so that users can share a Pygame demo without becoming experts in cross-platform dynamic linking.

Performance is another area where Python could improve. People naively think that when moving to Python 3 they will suddenly get 100x the performance, which is obviously not the case. He suggested making PyPy3 the default for Python 3 so that people "stop publishing benchmarks that make Python look terrible".

Better tools should be on the list as well, he said. The Python debugger (pdb) looks "pretty sad compared to VisualVM".

But it doesn't actually matter if the Python community adopts any of those ideas. There is no way to measure if any of them have any impact on adoption or use. To start with, python-dev needs to admit that there is a problem. In addition, the harassment of library maintainers needs to stop; it would be good if some high-profile developers stepped in once in a while to say that on Reddit and elsewhere. In terms of measurement, the project needs to decide on what "solved" looks like (in terms of metrics) then drive a feedback loop for that solution.

Another thing the project should do is to release at least ten times as often as it does now (which is 18-24 months between major releases and around six months between minor releases). The current release cadence comes from a "different geologic era". Some startups are releasing 1000x as often as the current Python pace.

The problem with too few reviewers may be alleviated by a faster release cycle, Lefkowitz said. Twisted went from one release every two years to one release per quarter, so he has direct experience with increasing the frequency of releases. What Twisted found was that people move more quickly from contributors to reviewers to core developers when those cycles are shorter.

It requires more automation and more process, in faster, lighter-weight forms. The "boring maintenance fixes" will come much faster under that model. That allows new contributors to see their code in a release that much more quickly. The "slower stuff" (new features and so on) can still come along at the same basic rate.

He offered up a few simple metrics that could be used to measure and compare how Python 3 is doing. He would like to see python-dev come to some consensus on which metrics make sense and how they should be measured. For example, the PyPI numbers might be a reasonable metric, though they may be skewed by automated CI systems constantly downloading particular versions.

Another metric might be to measure the average number of blockers as reported by caniusepython3.com. The number of projects ported per month might be another. The project could even consider user satisfaction surveys to see if people are happy with Python 3. He would like to see further discussion of this on the python-dev mailing list.

Coghlan noted that one other factor might be where users are getting their Python. Since the Linux distributions are not shipping Python 3 by default (yet, mostly), that may be holding Python 3 back some in the Linux world.

Several others wanted to discuss the packaging issue. Thomas Wouters noted that there is a place for python-dev to do something about packaging, but that any such effort probably ought to include someone who is teaching 6th graders so that their perspective can be heard. Brett Cannon pointed to the Education Summit that was scheduled for the next day as a possible place to find out what is needed. Lefkowitz said that was important to do, because many have ideas on how to create some kind of Python executable bundle, but it requires knowledge from core developers to determine which of those ideas are viable.

That is the essence of the problem, Van Rossum said. The people who know what needs to be done and the people who can do it are disjoint sets. That is as true for Language Summit attendees as it will be for Education Summit attendees. Beyond that, the Distutils special interest group (SIG) is "the tar pit of SIGs".

People are already doing similar things using tools like py2exe and others, Lefkowitz said. It would be good to get them together to agree that there is a distribution problem for Python programs. Each of the solutions has its own way of tracking imports, collecting them up, and copying them around, so it would be good to come up with something common.

Barry Warsaw described a Twitter tool and format called PEX that takes a "well-written setup.py " and turns that into a kind of executable. It contains the Python interpreter, shared libraries, and imported modules needed to run the program. It "seems like the right direction" for packaging and distributing Python programs.

Łukasz Langa said that Facebook has something similar. It is "hacky but it works". It collects all of the shared library files into a single file, collects the imported modules, zips all of that up, and prepends a Bash script onto the front so that it executes like any other program. Startup time is kind of long, however. Google also has a tool with the same intent, Wouters said.

Lefkowitz concluded by saying that he thought python-dev should provide some leadership or at least point a finger in the right direction. Getting a widely adopted solution could drive the adoption of Python 3, he said. Van Rossum suggested that someone create an informational PEP to start working on the problem.