Python "standard" library

Python is often mentioned in the same breath with the phrase "batteries included", which refers to the breadth of its standard library. But there is an effort underway to trim back the standard library by removing some unloved modules. In addition, there has been persistent talk of a major restructuring of the library, into a fairly minimal core as described in Amber Brown's talk at this year's Python Language Summit, or in other ways as discussed on the python-dev mailing list in January (though it has come up many times before that as well). A mid-July python-ideas mailing list thread picked up on some of that; it ended up showing, once again, that there is no real consensus on what the standard library is—or should be.

A fairly simple idea for a Python enhancement was posted by Abdur-Rahmaan Janhangeer; the discussion likely went in directions he was not expecting. He suggested adding a stdlib module, akin to the existing builtins module, that would provide a way to discover all of the modules in the standard library. Later in the thread, he expanded on the idea:

Like right now if you want to know the stdlib functions, you have to go through the docs. While teaching programming, i find the builtins module very convenient as students can discover by themselves. Similarly, a stdlib module when inspected, shows what you can import right out of the box. Inspection goes a long way in making learning fun, going through the docstrings etc.
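
As a rough illustration of the kind of inspection being described, the sketch below lists the public names in builtins and, separately, the names of standard library modules. It relies on sys.stdlib_module_names, which was only added in Python 3.10 (well after this thread), so it is a stand-in for the proposed module rather than the proposal itself; the public_names() helper is a hypothetical name used only for this example.

```python
import builtins
import sys


def public_names(namespace):
    """Return the sorted public (non-underscore) names found in a module."""
    return sorted(name for name in dir(namespace) if not name.startswith("_"))


# What students can already discover today by inspecting builtins.
builtin_names = public_names(builtins)

# Assumption: Python 3.10 or later, where sys.stdlib_module_names exists.
# On older interpreters the attribute is absent, so fall back to an empty list.
stdlib_names = sorted(getattr(sys, "stdlib_module_names", ()))

print(len(builtin_names), "public names in builtins")
print(len(stdlib_names), "standard library module names")
```

Note that this only produces names; nothing is imported, which is the simpler of the two designs raised later in the thread.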

Andrew Barnert thought the suggestion had merit, but that it could go even further:

Is it just the names that are in stdlib, or some kind of lazy imports for the modules themselves, so you can do "from stdlib import pprint"? The latter is a bit more complicated, but it seems like it would make the feature a lot more useful—it's a way to guarantee that you get the standard pprint even if it's been shadowed by a local file, a way to make sure you get an early error if your distributor decided to leave pprint out of the distribution, and so on. And that would be similar to the builtins module (builtins.print is the builtin print, even if you've shadowed it with a module global).
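
The "lazy imports" variant could be sketched roughly as follows. This is not Barnert's implementation; StdlibNamespace and the fallback list are hypothetical, and the set of allowed names again comes from sys.stdlib_module_names (Python 3.10+). One important caveat: the sketch still imports through the normal sys.path machinery, so it does not provide the shadowing guarantee he describes; it only shows the import-on-first-access part.

```python
import importlib
import sys


class StdlibNamespace:
    """Hypothetical lazy namespace: attribute access imports the module."""

    def __init__(self, allowed):
        self._allowed = frozenset(allowed)

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails,
        # which here means the module has not been loaded through us yet.
        if name.startswith("_") or name not in self._allowed:
            raise AttributeError(f"{name!r} is not a standard library module")
        module = importlib.import_module(name)
        # Cache on the instance so later lookups bypass __getattr__.
        setattr(self, name, module)
        return module


# Assumption: fall back to a tiny hand-written list before Python 3.10.
_FALLBACK = ("json", "pprint")
stdlib = StdlibNamespace(getattr(sys, "stdlib_module_names", _FALLBACK))

# The module is imported on first attribute access.
print(stdlib.pprint.pformat({"a": 1}))
```

An unknown name raises AttributeError immediately, which is roughly the "early error" behavior mentioned in the quote, though a module missing from a particular distribution would surface as an ImportError instead.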

He pointed Janhangeer at his stdlib project from a year ago, which allows getting names from standard library modules without remembering which modules define them:

    >>> import stdlib
    >>> stdlib.ETree
    <module 'xml.etree.ElementTree' from ... >

from stdlib ...

Barnert suggested Janhangeer implement the feature as a Python Package Index (PyPI) module so that people could try it out. Christopher Barker agreed, but wanted to consider going even further: "This would be an opportunity to clearly define the 'standard library' as something other than 'all the stuff that ships with cPython'".

Steven D'Aprano wondered if there was a real use case for a stdlib module, however. builtins is often used to ensure that the code refers to the built-in function and not some shadowed name. For example, a module that defines its own open() would use builtins.open() to access the "real" function. Shadowing standard library names is probably not used all that often, so the need for stdlib is somewhat dubious: "[...] it isn't clear to me that shadowing parts of the std lib is useful or common (except by accident, which is a problem to fix not a feature to encourage)".
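
The builtins use case D'Aprano describes looks something like the sketch below: a module defines its own open() and reaches the original through builtins.open(). The wrapper's behavior (forcing UTF-8 for text mode) and the roundtrip() helper are invented for illustration only.

```python
import builtins
import os
import tempfile


def open(path, mode="r", **kwargs):
    """Module-level open() that shadows the built-in and defaults to UTF-8 text."""
    if "b" not in mode:
        kwargs.setdefault("encoding", "utf-8")
    # builtins.open is still the original function, even though the bare
    # name "open" in this module now refers to this wrapper.
    return builtins.open(path, mode, **kwargs)


def roundtrip(text):
    """Write text through the wrapper, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        with open(path, "w") as f:
            f.write(text)
        with open(path) as f:
            return f.read()


print(roundtrip("héllo"))
```

Without the builtins module, the wrapper would have no clean way to call the function it replaces; that is the need that is less obvious for ordinary standard library modules.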

It turns out that there are a lot of corner cases in any real definition of what is contained in the standard library, however, and thus what would appear in a hypothetical stdlib module (or namespace). Barnert asked a series of questions about whether to include platform-specific modules (e.g. framebuf for MicroPython or Apple's PyObjC shipped with Python on macOS), platform-specific removals (e.g. Linux distributions that ship separate packages for parts of the "standard library"), language-internal modules (e.g. __future__, and C-language "modules" like _datetime or _compression), and more. The overarching question would seem to be: is the standard library the same everywhere or is it tuned to a particular environment?
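
One way to see how environment-dependent the answer is would be to probe a given interpreter directly. The sketch below checks whether a few of the module names mentioned above can be found, without importing them; is_importable() and the candidate list are assumptions for this example, and the results will differ between implementations, versions, and distributions, which is rather the point.

```python
import importlib.util
import sys


def is_importable(name):
    """Report whether a top-level module can be located, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec() can raise for some oddly configured modules.
        return False


# Names taken from the discussion; availability varies by environment.
candidates = ["__future__", "_datetime", "_compression", "framebuf"]

for name in candidates:
    # Modules compiled into the interpreter itself are listed separately.
    compiled_in = name in sys.builtin_module_names
    print(f"{name}: importable={is_importable(name)} compiled_in={compiled_in}")
```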

Barker had a set of answers for those questions but, as might be guessed, others differed. Beyond that, though, D'Aprano objected to the "scope-creep" inherent in any attempt to define the standard library more precisely. CPython provides the reference implementation for the language and other implementations should, in general, strive to ship everything that comes with CPython—unless there is a good reason not to, he said. "'Standard' doesn't mean 'available everywhere, in every version of every implementation'." He noted that getting a PEP written and approved for a stdlib namespace would be "hard enough" without adding other battles into the process.

D'Aprano had a fairly straightforward definition of what should be in the stdlib namespace: everything that is documented on the "Python Standard Library" page. But even that has exceptions in his mind:

If there's some special case (let's say, the HovercraftFullOfEels module, which we want to document but for some reason we don't want to be accessible in the stdlib namespace), then its fine for it to be left out (and documented as such).

But Barker is not afraid of widening the scope to refine the definition for the standard library. He sees it as an opportunity, and one that would be "far more useful than simply providing a new namespace for all the cruft that is already in there". Perhaps reusing the name "standard library" is not the best way forward; he suggested perhaps using "common library" to denote the subset of what ships with CPython but is expected to be available "everywhere". It is a subject with lots of potential arguments, however:

As someone said -- there is room for a lot of bike shedding around the edges -- so be it. If someone takes up the mantle and makes a test implementation and starts a PEP. then we will have a framework for that bike shedding.

That "someone" will not be him and Janhangeer has not indicated any interest in the larger idea. Overall, Barker's goal seems to be clearly delineating the "things that are really designed to be generically useful and counted on everywhere" versus those that are simply shipped with CPython. He noted the discussions about removing modules from the standard library and thought his idea might provide a middle ground of sorts; things could be moved from the common library to the standard library as part of the deprecation-signaling process. "The fact is that Python is almost entirely platform agnostic, which is a really great feature -- I'm suggesting it would be a tad better if the non-standard parts were more clearly labelled, that's all."

Chris Angelico pointed out some areas where it would be difficult to tease apart the platform specificity in the standard library and that's about where the conversation ended. Part of the problem with any attempt to tackle the standard library is the historical accretion it has undergone. If Python were somehow magically rebooted today, the standard library would likely look much different. It might well, in fact, look a lot like what Barker is advocating, though probably even more minimal than that. Would the Python core developers still want to design a "batteries included" standard library if they started over?

The other place where reworking the standard library runs aground is the problem of backward compatibility. After the huge mess that the Python 3 transition caused, which is, of course, still being felt today and likely for the next five years or more, one would guess the core developers will be extremely careful about breaking things moving forward. That makes it hard to do much more than slowly deprecate a fairly small number of modules over the coming years; a true rework of the standard library seems like something we will not be seeing anytime soon—if ever.