Improving performance in Python 2.7

Backporting a major performance improvement from Python 3 to Python 2 might seem to be in the "no-brainer" category, but things are not quite that simple. Python 2 (in the form of Python 2.7.x) is in a "no new features" mode that would normally preclude large changes, but 2.7 will be around for a lot longer than was previously envisioned. That makes the core development team more willing to consider this kind of patch, especially since it seems to come with a promise of more contributions. As with any topic that touches on the 2 vs. 3 question, though, it led to a long discussion and some dissent.

Vamsi Parasa of the Server Scripting Languages Optimization team at Intel posted a patch to change the switch statement that executes Python bytecode in the CPython interpreter to use computed gotos instead. GCC has support for the feature, which has already been used as an optimization in Python 3. As Eli Bendersky explained in a 2012 blog post, the enormous (2000+ line) switch statement for interpreting Python bytecodes can be made 15-20% faster by changing it to use computed gotos. There are two reasons for that: computed gotos avoid a bounds check that is required by the C99 standard for switch and, perhaps more significantly, the CPU can do better branch prediction, which reduces expensive pipeline flushes.
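
To make the difference concrete, here is a minimal sketch of the two dispatch styles for a toy accumulator machine. It is not CPython's code: the opcodes and the function names run_switch() and run_goto() are invented for illustration, and the second function relies on the "labels as values" extension that GCC (and Clang) provide, so it is not standard C.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy accumulator machine (not CPython's). */
enum { OP_HALT, OP_INC, OP_DEC, OP_DOUBLE };

/* Conventional dispatch: every opcode funnels back through the single
 * indirect branch generated for the switch, which also has to cope with
 * out-of-range values (handled here by the default case). */
int run_switch(const unsigned char *code)
{
    int acc = 0;
    size_t pc = 0;

    for (;;) {
        switch (code[pc++]) {
        case OP_HALT:   return acc;
        case OP_INC:    acc++;    break;
        case OP_DEC:    acc--;    break;
        case OP_DOUBLE: acc *= 2; break;
        default:        return acc;  /* unknown opcode: stop */
        }
    }
}

/* Computed-goto dispatch using the GCC "labels as values" extension.
 * The table is indexed directly by the opcode, with no range check, so
 * this version assumes the bytecode contains only valid opcodes. Each
 * handler ends with its own indirect jump, which gives the branch
 * predictor a separate history per opcode. */
int run_goto(const unsigned char *code)
{
    /* Order must match the opcode enum above. */
    static void *const targets[] = {
        &&do_halt, &&do_inc, &&do_dec, &&do_double
    };
    int acc = 0;
    size_t pc = 0;

#define DISPATCH() goto *targets[code[pc++]]

    DISPATCH();

do_inc:    acc++;    DISPATCH();
do_dec:    acc--;    DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;

#undef DISPATCH
}
```

Given the program { OP_INC, OP_INC, OP_DOUBLE, OP_DEC, OP_HALT }, both functions return 3. The only difference is how control gets from one opcode handler to the next, which is where the two savings Bendersky describes come from.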

Several core developers spoke up in favor of the patch, but Berker Peksağ was not so sure. He complained that "performance improvements are not bug fixes" and that it would be better to spend time making Python 3 better. But there is more to this patch than meets the eye, Nick Coghlan explained:

Internal performance improvements, by contrast, don't hurt end users at all beyond the stability risks, and in this case, the request to make the change is being accompanied by the offer to assist with ongoing maintenance (including engaging an experienced core developer to help coach Intel contributors through the contribution process). So when folks ask "What changed?" in relation to this request, what changed is the fact that it isn't expected to be a one off contribution, but rather part of a broader effort focused on improving the performance of both Python 2 and Python 3, including contributions to ongoing maintenance activities.

As was noted in an email exchange tacked onto Parasa's post, Intel has hired Python core developer David Murray's company to help it navigate the Python development process for the contributions it wants to make. But those contributions "came with one string attached: that the Python 2.7 branch be opened up for performance improvements in addition to bug fixes", Coghlan said. Because the proposal is "backed by a credible offer of ongoing contributions to CPython maintenance and support", it differs from other performance enhancements (or features) offered up for Python 2.7 along the way, he continued.

But accepting the offer does signal something of a shift in the way Python 2.7 will be maintained going forward. Encouraging more developers who are being paid to do the "boring" parts of Python maintenance (bug fixes and performance enhancements for 2.7, mostly) will allow volunteers to concentrate on the more interesting bits. Coghlan again:

Giving the nod to an increased corporate developer presence in Python 2 maintenance should eventually let volunteers stop worrying about even Python 2.7 bug fix changes with a clear conscience, confident that as volunteer efforts drop away redistributors and other folks with an institutional interest will pick up the slack with paid development time. "Do the fun stuff for free, figure out a way to get paid for the boring-but-necessary stuff (or leave those tasks to someone else that's getting paid to handle them)" is a good sustainable approach to open source development, while trying to do it *all* for free is a fast path to burnout.

Python's benevolent dictator for life (BDFL) Guido van Rossum was "strongly in favor" of the patch, noting that it could save companies like his employer, Dropbox, "a lot of money". Dropbox has been slow to move to Python 3 but regularly updates to the latest Python 2.7 release, he said. But Victor Stinner wondered if it made more sense to put that effort into Python 3 in the hopes of getting more migration to that version of Python.

Van Rossum was clearly not happy with the idea of crippling Python 2 to somehow promote Python 3:

However this talk of "wasting our time with Python 2" needs to stop, and if you think that making Python 2 less attractive will encourage people to migrate to Python 3, think again. Companies like Intel are *contributing* by offering this backport up publicly.

As Larry Hastings pointed out, though, the question had already become moot. Python 2.7 release manager Benjamin Peterson had merged the patch for 2.7.11, which is slated for release in December.

The rate of Python 3 adoption and the best way to encourage it are often somewhat contentious subjects in Python circles. Though many have bemoaned the lack of new features in Python 2.7, a large part of the decision to freeze the feature set has been driven by a lack of core developers interested in continuing to work on that branch. Adding more paid developers should help alleviate some of those concerns. It seems unlikely to lead to any wholesale changes to 2.7, but there may be enhancements that can be made in the next few years. Meanwhile, new features like async/await, type hints, and others will continue to be added to Python 3 in the hope of providing a carrot that draws more and more users to that version of the language. Or at least that is the plan ...