[Python-Dev] Python startup time

On 7/19/2017 12:15 PM, Larry Hastings wrote:
>
>
> On 07/19/2017 05:59 AM, Victor Stinner wrote:
>> Mercurial startup time is already 45.8x slower than Git whereas tested
>> Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial
>> developers, with a startup time 2x - 3x slower...
>
> When Matt Mackall spoke at the Python Language Summit some years back, I
> recall that he specifically complained about Python startup time. He
> said Python 3 "didn't solve any problems for [them]"--they'd already
> solved their Unicode hygiene problems--and that Python's slow startup
> time was already a big problem for them. Python 3 being /even slower/
> to start was absolutely one of the reasons why they didn't want to upgrade.
>
> You might think "what's a few milliseconds matter". But if you run
> hundreds of commands in a shell script it adds up. git's speed is one
> of the few bright spots in its UX, and hg's comparative slowness here is
> a palpable disadvantage.
>
>
>> So please continue efforts for make Python startup even faster to beat
>> all other programming languages, and finally convince Mercurial to
>> upgrade ;-)
>
> I believe Mercurial is, finally, slowly porting to Python 3.
>
> https://www.mercurial-scm.org/wiki/Python3
>
> Nevertheless, I can't really be annoyed or upset at them moving slowly
> to adopt Python 3, as Matt's objections were entirely legitimate.

I just now found this thread when searching the archive for threads
about startup time. And I was searching for threads about startup time
because Mercurial's startup time has been getting slower over the past
few months and this is causing substantial pain.

As I posted back in 2014 [1], CPython's startup overhead was >10% of the
total CPU time in Mercurial's test suite. And when you factor in the
time to import modules that get Mercurial to a point where it can run
commands, it was more like 30%!

Mercurial's full test suite currently runs `hg` ~25,000 times.
Using Victor's startup time numbers of 6.4ms for 2.7 and 14.5ms for
3.7/master, Python startup overhead contributes ~160s on 2.7 and ~360s
on 3.7/master. Even if you divide this by the number of available CPU
cores, we're talking dozens of seconds of wall time just waiting for
CPython to get to a place where Mercurial's first bytecode can execute.
And the problem is worse when you factor in the time it takes to import
Mercurial's own modules.

As a concrete example, I recently landed a Mercurial patch [2] that
stubs out zope.interface to prevent the import of 9 modules on every
`hg` invocation. This "only" saved ~6.94ms for a typical `hg`
invocation. But this decreased the CPU time required to run the test
suite on my i7-6700K from ~4450s to ~3980s (~89.5% of original) - a
reduction of almost 8 minutes of CPU time (and over 1 minute of wall
time)!

By the time CPython gets Mercurial to a point where we can run useful
code, we've already blown most of or past the time budget where humans
perceive an action/command as instantaneous. If you ignore startup
overhead, Mercurial's performance compares quite well to Git's for many
operations. But the reality is that CPython startup overhead makes it
look like Mercurial is non-instantaneous before Mercurial even has the
opportunity to execute meaningful code!

Mercurial provides a `chg` program that essentially spins up a daemon
`hg` process running a "command server" so the `chg` program [written in
C - no startup overhead] can dispatch commands to an already-running
Python/`hg` process and avoid paying the startup overhead cost. When you
run Mercurial's test suite using `chg`, it completes *minutes* faster.
`chg` exists mainly as a workaround for slow startup overhead.

Changing gears, my day job is maintaining Firefox's build system. We use
Python heavily in the build system. And again, Python startup overhead
is problematic.
I don't have numbers offhand, but we invoke likely a few hundred Python
processes as part of building Firefox. It should be several thousand.
But, we've had to "hack" parts of the build system to "batch" certain
build actions in single process invocations in order to avoid Python
startup overhead. This undermines the ability of some build tools to
formulate a reasonable understanding of the DAG, and it causes a bit of
pain for build system developers. It also makes it difficult to achieve
"no-op" and fast incremental builds: we're always invoking certain
Python processes because we've had to move DAG awareness out of the
build backend and into Python. At some point, we'll likely replace
Python code with Rust so the build system is more "pure" and easier to
maintain and reason about.

I've seen posts in this thread and elsewhere in the CPython development
universe that challenge whether milliseconds in startup time matter.
Speaking as a Mercurial and Firefox build system developer,
*milliseconds absolutely matter*. Going further, *fractions of
milliseconds matter*. For Mercurial's test suite with its ~25,000 Python
process invocations, 1ms translates to ~25s of CPU time. With 2.7,
Mercurial can dispatch commands in ~50ms. When you load common
extensions, it isn't uncommon to see process startup overhead of
100-150ms! A millisecond here. A millisecond there. Before you know it,
we're talking *minutes* of CPU (and potentially wall) time in order to
run Mercurial's test suite (or build Firefox, or ...).

From my perspective, Python process startup and module import overhead
is a severe problem for Python. I don't say this lightly, but in my mind
the problem causes me to question the viability of Python for popular
use cases, such as CLI applications. When choosing a programming
language, I want one that will scale as a project grows.
Vanilla process overhead has Python starting off significantly slower
than compiled code (or even Perl) and adding module import overhead into
the mix makes Python slower and slower as projects grow. As someone who
has to deal with this slowness on a daily basis, I can tell you that it
is extremely frustrating and it does matter.

I hope that the importance of the problem will be acknowledged
(milliseconds *do* matter) and that creative minds will band together to
address it. Since I am disproportionately impacted by this issue, if
there's anything I can do to help, let me know.

Gregory

[1] https://mail.python.org/pipermail/python-dev/2014-May/134528.html
[2] https://www.mercurial-scm.org/repo/hg/rev/856f381ad74b
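
The test-suite overhead figures quoted in the message can be checked
with a few lines of Python. This is only a sketch of the arithmetic: the
invocation count and per-process timings are the approximate numbers
given above, and the constant and function names are made up for
illustration, not taken from Mercurial or CPython.

```python
# Approximate figures quoted in the message; names are illustrative only.
HG_INVOCATIONS = 25_000  # rough number of `hg` runs in the full test suite
STARTUP_MS = {"2.7": 6.4, "3.7/master": 14.5}  # Victor's startup timings


def suite_overhead_seconds(per_process_ms, invocations=HG_INVOCATIONS):
    """CPU seconds spent on a fixed per-process cost across all invocations."""
    return per_process_ms * invocations / 1000.0


if __name__ == "__main__":
    for version, ms in STARTUP_MS.items():
        print(f"Python {version}: {suite_overhead_seconds(ms):.1f}s of startup overhead")

    # The "1ms translates to ~25s" rule of thumb from the message.
    print(f"1 ms per process: {suite_overhead_seconds(1.0):.1f}s")
```

The 3.7/master case works out to 362.5s, which the message rounds to
~360s. These are CPU-time totals; wall time depends on how many test
processes run in parallel.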