Guido on Python


Guido van Rossum's EuroPython 2015 keynote was billed as part prepared remarks, part Q&A, but he changed all that when he stepped up on the stage. Instead, it would all be Q&A, but he did prime the pump with some questions (and answers) of his own before taking audience questions. Topics included Python 3 (and 3.5), why there won't be a 2.8, why there are so many open bugs, PyPy, and what he hates about Python.

Django Girls

Van Rossum's first question to himself was about what he thought of Django Girls—the subject of the previous day's keynote. It was a great talk, he said, and he loved the storytelling. There will be none of that in his talk, nor any "pretty slides". He was stunned when he "heard that Ola ... or Ola ... had drawn the squirrels and badgers" for those slides.

Another aspect that he liked was their statement that they didn't know what they were doing. It reminded him of when he started working on Python 25 years ago. He didn't know what he was doing then, either. For example, he had no idea that a programming language needed a community with lots of different roles.

He was also quite impressed by the "strong brand" they have created in one year. "I predict that Ola and Ola, and Django Girls, will go really far."

Python versions

Shifting gears, his next query was on why developers would switch to Python 3. "Why can't you just give up on Python 3?", he asked himself rhetorically. But he is not saying that people should switch. He does want them to, but it is "a lot of hard work that could be spent on other things", such as features for their application or web site. Python 2.7 is not dead yet and will be supported with security fixes and, perhaps, security features for the next five years. Porting to Python 3 is a lot of grunge work, so why bother?

Python 3 is a "much better language" than Python 2, for one thing. It is much easier to teach. For example, the Django Girls workshops are all based on Python 3. That wouldn't have been possible if the Django developers hadn't done the grunge work to port the framework. That makes for a more pleasant first experience with the language (and the framework).

It is also getting better over time. There is "lots of cool new stuff" in Python 3.5, for example. Python 2 is a fine language and will remain exactly what it is, which is kind of asymptotically approaching 2.7-perfect, he said. The only way to benefit from all of the work that the core developers are doing is by moving to Python 3.

The perennial question of why not make a Python 2.8 release was next up, though Van Rossum noted that maybe that particular question was getting somewhat dated at this point. Python 2.8 wouldn't solve any of the problems that make people want it. Either it would have no new features, which means there would be no reason not to just stay at 2.7, or the floodgates get opened for backports from Python 3. That would make porting to 2.8 nearly as much work as porting to 3.

Unicode is, of course, the big hurdle to moving to Python 3. But, "enough is enough". So Python 2 is in a state where it will get no new features. That allows the core developers to focus their energy on making Python 3 better.

He then moved on to Python 3.5, which is due in September. He would have trouble choosing a favorite feature from the release because there are "way too many cool things" in it. For example, the performance improvement coming from os.scandir() is great, but it probably won't even be noticed by most. There is a small group of Python users that will be happy to see the new matrix multiplication operator. Numeric packages like NumPy and others will be able to start using it, which will allow writing matrix multiplication in a "much more natural fashion than calling a function".
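As a rough illustration of those two features, the sketch below sums file sizes with os.scandir() and defines the method behind the new @ operator. The total_file_size() helper and the tiny Matrix class are hypothetical names invented for this example; real numeric code would use a package such as NumPy rather than nested lists.

```python
import os


def total_file_size(directory):
    """Sum the sizes of the regular files directly inside `directory`."""
    total = 0
    # Each entry carries file-type information gathered while listing the
    # directory, which is where the saving over os.listdir() comes from.
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
    return total


class Matrix:
    """Hypothetical minimal matrix type, only here to show the @ operator."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # `a @ b` calls a.__matmul__(b); this is a plain row-by-column product.
        columns = list(zip(*other.rows))
        return Matrix(
            [[sum(x * y for x, y in zip(row, col)) for col in columns]
             for row in self.rows]
        )


product = Matrix([[1, 2], [3, 4]]) @ Matrix([[5, 6], [7, 8]])
```

With the operator, a chain of products can be written as a @ b @ c instead of as nested function calls.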

Perhaps his favorite 3.5 feature should be type hints, since it is a PEP he worked on himself. It took a lot of work for the PEP to get accepted, which is a little bizarre since he is the benevolent dictator for life (BDFL) and could accept his own PEP. But he wanted to have independent review and acceptance of the PEP, which Mark Shannon was graciously willing to provide as the BDFL delegate, he said.
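For readers who have not seen the syntax, here is a minimal sketch of what type hints (PEP 484) look like. The word_lengths() function is a made-up example; the hints are annotations that the interpreter stores but does not enforce, with checking left to external tools such as mypy.

```python
from typing import Dict, List


def word_lengths(words: List[str]) -> Dict[str, int]:
    """Map each word to its length."""
    return {word: len(word) for word in words}


# The hints are kept on the function object for tools to inspect; calling
# the function with the "wrong" types is not rejected at run time.
hints = word_lengths.__annotations__
```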

If you caught him unaware, though, he probably would name the new async and await keywords for coroutines as his favorite. It was the last PEP accepted for 3.5 and it provides a more natural way of specifying coroutines.
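A small sketch of the new syntax (PEP 492) follows. The coroutine names are hypothetical, and asyncio.sleep() stands in for real I/O; the event loop is created explicitly so that the example does not depend on helpers added after 3.5.

```python
import asyncio


async def fetch_value(delay, value):
    """Pretend to wait on I/O, then hand back a value."""
    await asyncio.sleep(delay)  # suspends this coroutine, not the whole thread
    return value


async def gather_values():
    # Both coroutines run concurrently; results come back in argument order.
    return await asyncio.gather(
        fetch_value(0.02, "slow"),
        fetch_value(0.01, "fast"),
    )


def run_demo():
    """Drive the coroutines to completion on a fresh event loop."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(gather_values())
    finally:
        loop.close()
```

Before 3.5, the same thing was written with generator-based coroutines and "yield from", which is what the new keywords are meant to make more readable.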

Open bugs

Someone recently asked him about all of the open bugs in the Python bug tracker. If you pick a random open bug, you will find that it probably has patches attached to it, a bunch of discussion, and even renowned core developers saying that the patches need to go in, but the bug is still not fixed. Is it because of an old boys' network or lame core developers? What needs to be done to get those patches applied?

The situation is the same for any large project, he said. Reports that stem from things like misread documentation can be closed right away; the bugs that can't be closed quickly tend to pile up. They can be hard to reproduce due to the hardware or environment, for example. But those kinds of bugs don't have patches attached.

There are also bugs that are feature proposals that do have patches attached, but there is a general hesitation to accept changes like that because there is concern that they aren't useful, won't mesh with other similar language features, or that they will cause backward incompatibilities. It is hard to take patches without breaking things all the time.

In addition, the core developers all have a lot on their plates and no one is paid to merge patches for core Python. So if none of the core team cares about a particular patch or feature, they may not find the time to shepherd it through the merging process.

In a company, things are a little different. People are paid to do some amount of grunge work, but with open source you have to volunteer to do unpleasant tasks. Some core developers have been doing that for so long that they want a break from that kind of grunge work. These are some of the many reasons that there are lots of open bugs with long histories in the bug tracker.

Lastly, there is a statistical effect that many overlook. If you pick any bug at random, including closed bugs, you would likely get a closed bug. Many bugs are closed quickly and bugs that are easy to fix tend to get fixed quickly. But the average lifetime of an open bug grows linearly with the age of the project, he said.

The GIL

Someone from the audience asked about the global interpreter lock (GIL), looking for more insight into the problem and how it is being addressed. Van Rossum asked back with a grin: "How much time have you got?" He gave a brief history of how the GIL came about. Well after Python was born, computers started getting more cores. When threads are running on separate cores, there are race conditions when two or more try to update the same object, especially with respect to the reference counts that are used in Python for garbage collection.
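The reference counts mentioned here can be observed from Python code. The sketch below is specific to CPython, and refcount_delta() is a hypothetical helper written for this example; it shows only that every new reference changes a counter stored on the object, which is the update that two unsynchronized threads could corrupt.

```python
import sys


def refcount_delta():
    """Return how much an object's reference count rises when a list
    takes two new references to it (CPython-specific)."""
    obj = object()
    before = sys.getrefcount(obj)
    holders = [obj, obj]  # two new references to the same object
    after = sys.getrefcount(obj)
    del holders  # dropping the list releases both references again
    return after - before
```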

One possible solution would be for each object to have its own lock that would protect its data from multiple access. It turns out, though, that even when there is no contention for the locks, doing all of the locking and unlocking is expensive. Some experiments showed a 2x performance decrease for single-threaded programs that didn't need the locking at all. That means there are only benefits when three or more threads and cores are being used.
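The per-object locking idea can be imitated in pure Python, as in the sketch below. This is only an analogy under assumed names (LockedCounter and count_with_threads() are invented here); the experiments described above changed the interpreter itself, not user-level code.

```python
import threading


class LockedCounter:
    """An object that carries its own lock to protect its own data."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # The lock is acquired and released on every update, even when
        # no other thread is competing for it.
        with self._lock:
            self.value += 1


def count_with_threads(num_threads, increments_per_thread):
    """Increment one shared counter from several threads."""
    counter = LockedCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

The acquire and release in increment() happen on every call whether or not there is contention, which is the kind of per-operation cost that made fine-grained locking expensive for single-threaded programs.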

So, the GIL was born (though that name came about long after it was added to the interpreter). It is a single lock that effectively locks all objects at once, so that all object accesses are serialized. The problem is that now, 10 or 15 years later, there are multicore processors everywhere and people would like to take advantage of them without having to do multiprocessing (i.e. separate communicating processes rather than threads).

If you were to design a new language today, he said, you would make it without mutable (changeable) objects, or with limited mutability. From the audience, though, came: "That would not be Python." Van Rossum agreed: "You took the words out of my mouth." There are various ongoing efforts to get around the GIL, including the PyPy software transactional memory (STM) work and PyParallel. Other developers are also "banging their head against that wall until it breaks". If anyone has ideas on how to remove the GIL but still keep the language as Python, he (and others) would love to hear about it.

PyPy

He was asked about PyPy, whether he used it and whether it might someday become the default interpreter. He does not use PyPy, but he does download it once in a while, plays with it for a few minutes, and likes what he sees. He uses Python in two modes, either writing a short little script to get something done, for which he just uses one of the interpreters he already has built on his system, or as a Dropbox engineer deploying Python code to its cluster.

The Dropbox cluster runs a modified Python 2.7, he said, which elicited audience laughter. "I said it, it is no secret", he said. Some parts of Dropbox do use PyPy where it is faster, but the company is worried that some subtle incompatibility will produce weird bugs that are hard to track down. "We have enough of those already".

PyPy shows that you can execute Python faster than CPython. It also provides a testbed where interesting ideas like STM can be tried. But conservative principles lead people to only use PyPy when they know they need the speed. The problem is that by the time you find that out, you have already deployed to enough machines that it makes switching hard. In that, it is like the problem with switching to Python 3.

Dropbox has a lot of third-party dependencies, some of which cannot even be rebuilt from the sources that it has. That is generally true of any company that has millions of lines of Python in production; he also saw it at Google. That also makes switching hard.

In summary, PyPy is a "really cool project", but there are checkboxes that it needs to satisfy that "are hard to check off". He half-jokingly suggested that perhaps PyPy needed to hire Ola and Ola from Django Girls to help create a bigger community around the project.

Favorites

Five short questions about his favorites were up next. Favorite web framework? He said he only writes one web app in any framework and that the last he tried was Flask. Favorite testing library? He mostly just uses unittest and mock from the standard library. Editor? He uses Emacs, but started out with vi (both of which were greeted with applause, presumably by different sets of audience members). He still uses vi (or Vim) occasionally, but if he does that for five minutes, it takes him fifteen minutes to get used to Emacs again.

What is his favorite language besides Python? He used to say C, but "that's kind of boring". People he trusts tell him that modern C++ is a good language. He likes Go, but hasn't written anything significant in it. From talking with the designers, he likes the looks of Swift, which stole a bunch of things from Python. It is easy to steal bad stuff from languages you like and end up with an incoherent pile of features, but Swift's designers appear not to have done that. Finally, favorite exception? He chuckled and answered KeyboardInterrupt to more applause and laughter.

And what he hates

The final question was about what he hates in Python. "Anything to do with package distribution", he answered immediately. There are problems with version skew and dependencies that just make for an "endless mess". He dreads it when a colleague comes to him with a "simple Python question". Half the time it is some kind of import path problem and there is no easy solution to offer.
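When the question does turn out to be an import path problem, a common first step is to ask the interpreter where it would load a module from. The sketch below is one way to do that with the standard library; where_is() is a hypothetical helper, and it only handles top-level module names.

```python
import importlib.util


def where_is(module_name):
    """Return the origin the import system reports for a top-level module,
    or None if the module cannot be found on the current import path."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin
```

Comparing the result for the same module name in two environments, alongside the contents of sys.path, often shows which copy of a package is being picked up.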

With that, his time was up. The EuroPython organizers had a gift for each of the keynote speakers: a Basque beret and bandana. They were presented to Van Rossum at the end of the talk (seen above at right).

