Python moratorium and the future of 2.x


On November 9, Python BDFL ("Benevolent Dictator For Life") Guido van Rossum froze the Python language's syntax and grammar in their current form for at least the upcoming Python 2.7 and 3.2 releases, and possibly for longer still. This move is intended to slow things down, giving the larger Python community a chance to catch up with the latest Python 3.x releases.

The idea of freezing the language was originally proposed by Van Rossum in October on the python-ideas list and discussed on LWN. There are three primary arguments for the freeze, all described in the original proposal:

Letting alternate implementations, IDEs, catch up: [...] frequent changes to the language cause pain for implementors of alternate implementations (Jython, IronPython, PyPy, and others probably already in the wings) at little or no benefit to the average user [...]

Encouraging the transition to Python 3.x: The main goal of the Python development community at this point should be to get widespread acceptance of Python 3000. There is tons of work to be done before we can be comfortable about Python 3.x, mostly in creating solid ports of those 3rd party libraries that must be ported to Py3k before other libraries and applications can be ported.

Redirecting effort to the standard library and the CPython implementation: Development in the standard library is valuable and much less likely to be a stumbling block for alternate language implementations. I also want to exclude details of the CPython implementation, including the C API from being completely frozen — for example, if someone came up with (otherwise acceptable) changes to get rid of the [Global Interpreter Lock] I wouldn't object.

The proposal turned into PEP 3003, "Python Language Moratorium", which is more definite about what cannot be changed:

New built-ins

Language syntax

The grammar file essentially becomes immutable apart from ambiguity fixes.

General language semantics

The language operates as-is with only specific exemptions ...

New __future__ imports

These are explicitly forbidden, as they effectively change the language syntax and/or semantics (albeit using a compiler directive).
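To make the "compiler directive" point concrete, here is a minimal sketch of how a __future__ import changes what the same source text means. It is an illustration rather than anything taken from the PEP: it assumes Python 3.7 or later and uses the annotations feature, which postdates the moratorium, because the older features such as division are already the default on Python 3. The helper name annotation_of_x is hypothetical.

```python
# Sketch only: the same function definition, compiled with and without
# a __future__ import, ends up with different annotation semantics.
PLAIN = "def f(x: int): pass\n"
WITH_FUTURE = "from __future__ import annotations\n" + PLAIN


def annotation_of_x(source):
    """Compile and run `source`, then return the annotation stored for x."""
    namespace = {}
    # dont_inherit=True keeps any future flags of this file out of the demo.
    code = compile(source, "<demo>", "exec", dont_inherit=True)
    exec(code, namespace)
    return namespace["f"].__annotations__["x"]


plain = annotation_of_x(PLAIN)              # the int class itself
with_future = annotation_of_x(WITH_FUTURE)  # the string "int"
```

The function body is identical in both cases; only the directive at the top of the module differs, which is why the PEP treats new __future__ imports as language changes.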

Adding a new method to a built-in type will still be open for consideration, and so is changing language semantics that turn out to be ambiguous or difficult to implement. Python's C API can be changed in any way that doesn't impose grammar or semantic changes, and the modules in the standard library are still fair game for improvement.

The duration of the freeze is given in the PEP as "a period of at least two years from the release of Python 3.1." Python 3.1 was released on June 27, 2009, so the freeze would extend until at least June 2011. Van Rossum later clarified the duration on python-dev, writing "In particular, the moratorium would include Python 3.2 (to be released 18-24 months after 3.1) but (unless explicitly extended) allow Python 3.3 to once again include language changes."

Most responses to the moratorium idea were favorable, but those who had objections felt those objections very strongly. Steven D'Aprano wrote:

A moratorium isn't cost-free. With the back-end free to change, patches will go stale over 2+ years. People will lose interest or otherwise move on. Those with good ideas but little patience will be discouraged. I fully expect that, human nature being as it is, those proposing a change, good or bad, will be told not to bother wasting their time, there's a moratorium on at least as often as they'll be encouraged to bide their time while the moratorium is on. A moratorium turns Python's conservativeness up to 11. If Python already has a reputation for being conservative in the features it accepts — and I think it does — then a moratorium risks giving the impression that Python has become the language of choice for old guys sitting on their porch yelling at the damn kids to get off the lawn. That's a plus for Cobol. I don't think it is a plus for Python.

The 2-to-3 transition

One of the reasons for the moratorium is the developers' increasing concern at the slow speed of the user community's transition away from Python 2.x. The moratorium thread led to a larger discussion of where Python 3.x stands.

Progress on the transition can be roughly measured by looking at the third-party packages available for Python 3.x. Only about 100 of the 8000 packages listed on the Python Package Index claim to be compatible with Python 3, and many significant packages (Numeric Python, MySQLdb, PyGTK) have not yet been ported, making it impossible for users who depend on them to port their in-house code or applications. Few Linux distributions have even packaged a Python 3.x release yet.

For the Python development community, it's tempting to nudge the users toward Python 3 by discouraging them from using Python 2. The Python developers have been dividing their attention between the 2.x and 3.x branches for a few years now, and a significant number of them would like to refocus their attention on a single branch. Given the slow uptake of Python 3, though, it's difficult to know when Python 2 development can stop. The primary suggestions in the recent discussion were:

Declare Python 2.6 the last 2.x release.

Declare Python 2.7 the last 2.x release.

After Python 2.7, continue with a few more releases (2.8, 2.9, etc.).

Declare the 3.x branch an experimental version, call it dead, and begin back-porting features to the 2.x branch.

Abandoning the 3.x branch had very few supporters. Retroactively declaring 2.6 the final release was also not popular, because people have been continuing to apply and backport improvements on the assumption that there was going to be a 2.7 release.

As Skip Montanaro phrased it:

2.6.0 was released over a year ago and there has been no effort to suppress bug fix or feature additions to trunk since then. If you call 2.6 "the end of 2.x" you'll have wasted a year of work on 2.7 with about a month to go before the first 2.7 alpha release. If you want to accelerate release of 2.7 (fewer alphas, compressed schedule, etc) that's fine, but I don't think you can turn back the clock at this point and decree that 2.7 is dead.

A significant amount of work has already been committed to the 2.7 branch, as can be seen by reading "What's New in Python 2.7" or the more detailed NEWS file. New features include an ordered dictionary type, support for using multiple context managers in a single with statement, more accurate numeric conversions and printing, and several features backported from Python 3.1.
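Two of those features are easy to show in a few lines. The following is a small sketch, not an excerpt from the "What's New" document; the dictionary contents and variable names are made up for illustration. It should run unchanged on Python 2.7 and on current Python 3 releases.

```python
from collections import OrderedDict
import io

# An ordered dictionary remembers the order in which keys were inserted.
features = OrderedDict()
features["ordered dict"] = "collections.OrderedDict"
features["multiple context managers"] = "with a, b:"
features["numeric conversions"] = "repr(float)"
feature_order = list(features)

# Two context managers in a single with statement, instead of two nested
# blocks. The u"" prefix is needed by io.StringIO on 2.7 and is harmless
# on Python 3.
with io.StringIO(u"moratorium") as source, io.StringIO() as target:
    target.write(source.read().upper())
    copied = target.getvalue()
```

Neither feature needs new syntax beyond what 2.7 already accepts, so code like this is unaffected by the moratorium.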

Clearly a 2.7 release will happen, and release manager Benjamin Peterson's draft release schedule projects a 2.7 final release in June 2010. There's no clear consensus on whether to continue making further releases after 2.7. Post-2.7 releases could continue to bring 2.x and 3.x into closer compatibility and improve porting tools such as the 2to3 script, while keeping existing 2.x users happy with bugfixes and a few new features, but this work does cost effort and time. Brett Cannon stated his case for calling an end with 2.7:

[...] I think a decent number of us no longer want to maintain the 2.x series. Honestly, if we go past 2.7 I am simply going to stop backporting features and bug fixes. It's just too much work keeping so many branches fixed.

Raymond Hettinger argued that imposing an end-of-life is unpleasant for users:

I do not buy into the several premises that have arisen in this thread. [First premise:] For 3.x to succeed, something bad has to happen to 2.x. (which in my book translates to intentionally harming 2.x users, either through neglect or force, in order to bait them into switching to 3.x).

Hettinger is unmoved by the argument that maintaining 2.x takes up a lot of time, arguing that backporting a feature is relatively quick compared to the time required to implement it in the first place. He's also concerned that 3.x still needs more polishing, and concludes:

In all these matters, I think the users should get a vote. And that vote should be cast with their decision to stay with 2.x, or switch to 3.x, or try to support both.

Assessment

Declaring such a long-term freeze on the language's evolution is a surprising step, and not one that developer groups often choose. Languages defined by an official standard, such as C, C++, or Lisp, are forced to evolve very slowly because of the slow standardization process, but Python is not so minutely specified. D'Aprano makes a good point that the developers are already pretty conservative; most suggestions for language changes are rejected. On the other hand, switching to Python 3.x is a big jump for users and book authors; temporarily halting further evolution may at least give them the sense they're not aiming for a constantly shifting target.

It's probably premature to call the transition to Python 3.x a failure, or even behind schedule. These transitions invariably take a lot of time and proceed slowly. Many Linux distributions have adopted Python for writing their administrative tools, making the interpreter critical to the release process. Distribution maintainers will therefore be very conservative about upgrading the Python version. It's a chicken-and-egg problem; third-party developers who stick to their distribution's packages can't use Python 3 yet, which means they don't port their code to Python 3, which gives distributions little incentive to package it. Eventually the community will switch, but it'll take a few years. The most helpful course for the Python developers is probably to demonstrate and document how applications can be ported to Python 3, as Martin von Löwis has done by experimentally porting Django to Python 3.x, and where possible get the resulting patches accepted by upstream.

It remains to be seen if a volunteer development group's efforts can be successfully redirected by declaring certain lines of development to be unwelcome. Volunteers want to work on tasks that are interesting, or amusing, or relevant to their own projects. The moratorium may lead to a perception that Python development is stalled, and developers may start up DVCS-hosted branches of Python that contain more radical changes, or move on to some other project that's more entertaining.

The nearest parallel might be the code freezes for versions 2.4 and 2.6 of the Linux kernel. The code freeze for Linux 2.4 was declared in December 1999, and 2.5.0 didn't open for new development until November 2001, nearly two years later. The long duration of the freeze led to a lot of pressure to bend the rules to get in one more feature or driver update.

Python's language freeze will be of similar length and there may be similar pressure to slip in just one little change. However, freezing the language still leaves lots of room to improve the standard library and the CPython implementation, enhance developer tools, and explore other areas not covered by the moratorium. Perhaps these tasks are enough of an outlet for creative energy to keep people interested.

