Python 3 at Facebook

Python 3 adoption has clearly picked up over the last few years, though there is still a long way to go. Big Python-using companies tend to have a whole lot of Python 2.7 code running on their infrastructure and Facebook is no exception. But Jason Fried came to PyCon 2018 to describe what has happened at the company over the last four years or so—it has gone from using almost no Python 3 to it becoming the dominant version of Python in the company. He was instrumental in helping to make that happen and his talk [YouTube video] may provide other organizations with some ideas on how to tackle their migration.

Fried started working at Facebook in 2011 and he quickly found that he needed to teach himself Python because it was much easier to get code reviewed if it was in Python. At some point later, he found that he was the driving force behind Python 3 adoption at Facebook. He never had a plan to do that; it just came about as he worked with Python more and more.

He started out by being active in the internal Python group. He was often the first to answer questions that came up. He eventually became famous ("or maybe infamous") with the Pythonistas at Facebook because, when he saw a problem with how the language was being used, he didn't ask permission, he simply fixed it. That works at Facebook because there is no real top-down hierarchy of control; everyone has as much power to back out a change you make as you have to make the change to begin with. Over time, his changes built up credibility within the Facebook Python community that would serve him well in the migration process.

Changing something like the Python language version at "Facebook scale" was going to take some time and a lot of diplomacy, he said. He wanted to tell the "story about how I and [a] couple of engineers used our free time, with no authority whatsoever, and made Python 3 the dominant version at Facebook."

In 2013, there was rudimentary support for Python 3.3 at Facebook. It was there as part of a task for adding Python 3 support to the build system. But that task was blocked on Python 3 support in the Facebook libraries, which was in turn blocked by the lack of Python 3 support in the build system. It was something of a catch-22; Python 3 was "available" but nothing in the Facebook environment supported it.

In addition, there was lots of negative sentiment about Python 3 at Facebook in 2013. The overall thinking was that the company would simply stay on Python 2.7 forever. There was talk of jumping ship to another language entirely. Even he said (in an internal group) that Python 3 would never happen at Facebook. Only one person challenged him on that statement and suggested that he do something about it; at the time, he ignored the suggestion, but it did stick in his head.

Some hope

There was, actually, some hope, he said. In January 2013, the four imports from __future__ (print_function, division, absolute_import, and unicode_literals) were required by a "linter" that was being used. They were added in an attempt to extend the life of the Python 2 code base. They were added everywhere in order to quiet the linter, which ended up making it easier to convert modules to Python 3.

The Apache Thrift framework for serialization and remote procedure calls is "used everywhere" at Facebook. Since it was Python 2-only, it was a core blocker. But adding Python 3 support was popular in a poll for new Thrift features that the Facebook Thrift group had run. He voted for it, but not because he was on the Python 3 bandwagon at that point; he thought the Python 2 interface needed a refactor as it looked like it had come from Java.

His thinking started to switch when he saw Guido van Rossum give a talk at Yelp in San Francisco on something called "Tulip", which is what eventually became the asyncio module. He had always been a fan of asynchronous programming in Python, but found that it was fragmented because of the differences between the frameworks (e.g. Twisted, gevent) that provided it. Tulip looked like it would make asynchronous I/O interoperable rather than fragmented. Before that talk was even over, he was communicating with the Facebook Thrift team, suggesting that Thrift should simply support Tulip for Python 3, rather than wait for Twisted, gevent, and others to port to Python 3. A few days later, the Thrift team published a roadmap that showed Python 3 and Tulip support coming.

Both of those arrived in early 2014, but then nothing happened for six months; users did not show up, they had no plans to show up, and they, in fact, did not know about the changes at all.

A new project

In August 2014, he started a project to rewrite a service that he had inherited. He started planning to do it using gevent and Python 2, but then realized it would be obsolete by the time it was written if he did so. In order for something to change, someone needs to be the first one; for Facebook and Python 3, that was him. "For Python 3 in your organization, I think that person should be you."

So he started his project using Python 3 and "everything was broken"; it was no wonder that no one was using Python 3. The build system would not even build his code and all of the third-party wheel packages were only available for Python 2. When he finally fixed enough things to allow his service to be built, it would immediately fail when it was run—someplace deep in the guts of the code that sets up service entry points in the Facebook system.

So in order to get his code running, he had to fix everything else; he rebuilt hundreds of third-party wheels so that they would work with both Python versions and he had to make the internal libraries 2/3 compatible. Every day, though, someone would commit a Python 2-only change into one of his dependencies. Not surprisingly, he got tired of fixing regressions. One solution would be to force Python 3 compliance within the organization, but Facebook is not a place where that is possible. But, if you act like you have some authority, people will start to believe that you do, indeed, have that authority.

He used up a lot of his social capital to add Pyflakes linting into the build process. He was able to justify adding it because there already was a PEP 8 linter, but Pyflakes would address other code quality issues; in addition, Pyflakes had few false positives so it did not overly irritate the developers. He set things up so that Pyflakes would run on all code that was put up for review, first for Python 2 and then for Python 3. That helped spread the job of keeping Python 3 compatibility out to all of the developers and not just him, which allowed him to make progress with his project.
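The kind of check that catches a Python 2-only commit at review time can be sketched with the standard library alone. This is not Pyflakes and not Facebook's tooling; it is a minimal stand-in, assuming the check runs under a Python 3 interpreter, that reports only syntax the Python 3 parser rejects. Pyflakes reports such syntax errors too, along with problems like undefined names and unused imports. The function name python3_syntax_errors and the "<review>" filename are hypothetical.

```python
import ast


def python3_syntax_errors(source, filename="<review>"):
    """Return a list of (line_number, message) pairs for syntax that the
    running Python 3 parser rejects; an empty list means the source parses."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return [(exc.lineno, exc.msg)]
    return []


if __name__ == "__main__":
    py2_only = "print 'hello'\n"        # Python 2 print statement
    compatible = "print('hello')\n"     # valid on Python 3
    print(python3_syntax_errors(py2_only))
    print(python3_syntax_errors(compatible))
```

The parser stops at the first error, so this reports at most one problem per file; a real linter run on every proposed change gives much broader coverage.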

Early on, he had to be responsive to help people understand that "no, the linter is not wrong" and that there was value to making the code work with Python 3. If the developers had started believing that moving to Python 3 was difficult, they would fall back on the "let's stay with Python 2 forever" mindset. He made it easy for developers to do the right thing with respect to keeping the code running on Python 3. It was easier to just "shut the linter up" and, by extension, him, than it was to complain about it, so most developers just did so.

Education

With all that in place, he had stopped the bleeding, but little or no progress toward running more Python 3 at Facebook was being made. He joined the team that did training on Python programming for new employees at Facebook. The linters already complained if the code was not compatible with 2 and 3, but he wanted to get to a point where 2/3 compatible code was only written for legacy projects and new code was written in Python 3. Once again he took matters into his own hands: in 2015, he changed the slides for the new employee Python class to make that statement. The idea was that at some unknown point in the future, Facebook will want to switch to Python 3, so writing Python 2-only code at this point makes no sense since it will have to be rewritten someday. He taught new hires that all of this should just work with the Facebook infrastructure and build systems and that if it didn't, they should file a bug or try to fix it themselves. "Strangely enough, that's what happened."

In January 2015, he "finally shipped" his project. He spent most of the rest of the year telling people how much better it was and why they should switch to using Python 3 where they could. Over the year, various allies in the effort to switch to Python 3 at Facebook made themselves known.

One of those allies was Łukasz Langa, who had "somehow convinced Instagram to move to Python 3". In 2016, he and Langa formed a brand new team in Facebook to shepherd Python within the company, which they dubbed "The Ministry of Silly Walks". Because they were "the Python team", the "perceived authority" he mentioned earlier worked; people assumed they could make decisions about Python at Facebook.

In 2016, he was seeing slow but steady growth in the amount of Python 3 that was being run at the company. There was mention of it in meetings and he regularly heard of new projects that were using it. The tide of opinion had changed at Facebook even though Python 3 was not the default and projects needed to actively choose to use it. In May 2016, he signaled his intention to switch the build system default to Python 3; the idea was overwhelmingly supported, so he made the switch a few days later, with no ill effects.

Toward the end of 2016, there was a post from a project team that reported its results in switching to Python 3. The developers simply ran 2to3 on the code and fixed a few things that it complained about. When they ran the resulting code, they found it was 40% faster and used half the memory. This points to a persistent myth that Fried has heard: Python 3 is slower than Python 2. That may have been true for earlier releases of Python 3, but it is definitely not true now, he said.

Nice things

In early 2017, Instagram finished its migration to Python 3 and Facebook was reaping the benefits of this "glorious future where you can have nice things". Upgrades of Python versions were not particularly scary and brought new features that could be used. Facebook developers now focus on problems like using the new static typing features or migrating services to use asyncio. "Python at Facebook is fun again."

The problem now is that everyone is asking when Python 2 support can be retired. When there are regressions in the Python 2 support for a library or module, it is common to hear developers ask if those users can simply move to Python 3. It is the reverse of the problem he had a few years prior. "Oh what a wonderful world in which I live."

He showed a graph of the Facebook service entry points in Python over time, starting in Q3 of 2015, when there were four Python 3 entry points in total. When the default switched to Python 3 in mid-2016, 4% of Facebook's entry points were already Python 3. In March 2018, the share crossed the 50% line; in mid-May, when he gave the talk, 55% of the "tens of thousands of Facebook entry points" were running Python 3. At Facebook it is now embarrassing to have code that only runs on Python 2, Fried said.

He then reviewed the process. He noted that you have to do more than just build something new; you have to lead developers to it by "being the change you want to see". You should get other people to help, even if they don't know they are helping, which is where linters and unit tests come into play. It is important to educate new hires for where you are heading. Once you get there, or some of the way there, celebrate by enjoying the "nice things": write some "awesome stuff in Python 3". Seeing how the new features can be used will make others want to convert.

He fielded some questions from the audience. One asked how they might make this happen in a more traditional, hierarchical organization. Fried thought that might actually be easier since, instead of thousands of developers that need to be convinced, it should be possible to work up the management chain starting with a manager who recognizes the benefits. It could also be harder if the culture is conservative, but focusing on code quality improvements may help there. Another question focused on code that is monolithic, rather than broken up into multiple entry points; for that Fried suggested looking at the Instagram keynote [YouTube video] from PyCon 2017.

There was a lot in the talk that other organizations can use, but it is clear that having an advocate and shepherd with a lot of perseverance will be important. Companies that are planning a conversion of this sort will likely want to have someone like Fried on board.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for assistance in traveling to Cleveland for PyCon.]

