During last year’s F8 developer conference, Facebook announced the 1.0 launch of PyTorch, the company’s open-source deep learning platform. At this year’s F8, the company launched version 1.1. The small increase in version numbers belies the importance of this release, which focuses on making the tool more appropriate for production usage, including improvements to how the tool handles distributed training.

“What we’re seeing with PyTorch is an incredible moment internally at Facebook to ship it and then an echo of that externally with large companies,” Joe Spisak, Facebook AI’s product manager for PyTorch, told me. “Make no mistake, we’re not trying to monetize PyTorch […] but we want to see PyTorch have a community. And that community is starting to shift from a very research-centric community — and that continues to grow fast — into the production world.”

So with this release, the team and the more than 1,000 open-source committers who have worked on the project are addressing the shortcomings of the earlier release as users continue to push its limits. Those users include Microsoft, which is using PyTorch for its language models that scale to a billion words, and Toyota, which is using it for some of its driver assistance features.

As Spisak told me, one of the most important new features in PyTorch 1.1 is support for TensorBoard, Google’s visualization tool for TensorFlow that helps developers evaluate and inspect models. Spisak noted that Google and Facebook worked together very closely on building this integration. “Demand from developers has been incredible and we’re going to contribute back to TensorBoard as a project and bring new capabilities to it,” he said.
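For readers curious what that integration looks like in practice, here is a minimal sketch of logging a training metric from PyTorch so TensorBoard can plot it. It assumes both the `torch` and `tensorboard` packages are installed. The `train/loss` tag and the decaying loss values are invented for illustration and don't come from Facebook or Google.

```python
import math
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

# Write event files to a throwaway directory (an assumption for this sketch;
# a real project would pick a stable path such as "runs/experiment_1").
log_dir = tempfile.mkdtemp()
writer = SummaryWriter(log_dir=log_dir)

for step in range(100):
    # Stand-in for a real training loss: a made-up, steadily decaying value.
    fake_loss = math.exp(-step / 20.0)
    writer.add_scalar("train/loss", fake_loss, step)

# close() flushes pending events to disk.
writer.close()

event_files = os.listdir(log_dir)
```

Pointing the `tensorboard` command's `--logdir` option at that directory should then show the curve in the browser.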

Also new are improvements to the PyTorch just-in-time compiler, which now supports dictionaries, user classes and attributes, among other things, as well as new APIs that support Boolean tensors and custom recurrent neural networks.
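To make the compiler change more concrete, the sketch below compiles a small function that takes a dictionary of tensors, then builds a Boolean mask. The function name `sum_values` and the sample data are hypothetical. The code is written against a recent PyTorch release, where comparisons return `torch.bool` tensors, so the exact behavior in 1.1 itself may differ.

```python
from typing import Dict

import torch


@torch.jit.script
def sum_values(tensors: Dict[str, torch.Tensor]) -> torch.Tensor:
    # Iterate over the dictionary's values inside compiled TorchScript code.
    total = torch.zeros(1)
    for value in tensors.values():
        total = total + value.sum()
    return total


# Hypothetical inputs: two tensors of ones, five elements in total.
result = sum_values({"a": torch.ones(2), "b": torch.ones(3)})

# In recent releases, a comparison yields a Boolean tensor (dtype torch.bool).
mask = torch.tensor([1.0, -2.0, 3.0]) > 0
```

The type annotation on the argument is what tells the compiler it is dealing with a dictionary rather than a single tensor.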

What’s most important for many production users, though, are the improvements the team made to PyTorch’s distributed training capabilities. These include the ability to split large models across GPUs and various other tweaks that’ll make training large models faster when you have access to a cluster of machines.
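As a rough illustration of what splitting a model across GPUs means, the sketch below places the two halves of a toy network on different devices and moves the activations between them. The class name `TwoDeviceNet` and the layer sizes are made up, and the code falls back to the CPU when fewer than two GPUs are present. It shows only the splitting idea, not the distributed training wrapper that coordinates multiple machines.

```python
import torch
import torch.nn as nn


class TwoDeviceNet(nn.Module):
    """Toy model whose two layers can live on different devices."""

    def __init__(self, first_device, second_device):
        super().__init__()
        self.first_device = first_device
        self.second_device = second_device
        self.front = nn.Linear(16, 32).to(first_device)
        self.back = nn.Linear(32, 4).to(second_device)

    def forward(self, x):
        # Run the first half where its weights live, then hand the
        # intermediate activations over to the second device.
        hidden = torch.relu(self.front(x.to(self.first_device)))
        return self.back(hidden.to(self.second_device))


# Use two GPUs when available; otherwise keep everything on the CPU.
if torch.cuda.device_count() >= 2:
    devices = (torch.device("cuda:0"), torch.device("cuda:1"))
else:
    devices = (torch.device("cpu"), torch.device("cpu"))

model = TwoDeviceNet(*devices)
output = model(torch.randn(8, 16))
```

The appeal of this layout is that neither GPU has to hold the entire model in memory, at the cost of copying activations from one device to the other on every forward pass.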