A foundation for scikit-learn at Inria

We have just announced that a foundation will be supporting scikit-learn at Inria: scikit-learn.fondation-inria.fr

Growth and sustainability

This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to secure employment for some existing core contributors and to hire more people onto the team. The goal is to help sustain quality (more frequent releases?) and to tackle some ambitious features.

A foundation? What and why?

Open source lives and thrives by its base, the community of developers. And scikit-learn is a fantastic example of these dynamics. Because of its grassroots origins, it has focused on features that matter for the small and the many, such as ease of use and statistical models that work well in data-poor situations. Over the years, decisions have been based on technical merit rather than on the importance of displaying a list of trendy features. A consequence of the breadth of contributors with different backgrounds is that the library tends to be well suited for many applications, including some models that are less mainstream.

People with dedicated time to support the community

That said, over time there is an increasing need for a core team of maintainers. As the library gets bigger, it is more and more difficult to have a full view of what is happening. Integration of new features, quality assurance, and releases are best done by developers who can dedicate a large amount of time to the library. Also, ambitious changes to the library, such as improving the parallel computing engine, need long efforts.

For many years, we have had people with dedicated time to support the community. In France, we were jumping through hoops to find public money to fund them. As someone who has made this effort, I can tell you that it is a complicated one. The ability to receive money from sponsors will enable us to scale up our operations.

I was initially worried that we would have difficulty finding partners willing to give us money without asking for control over the project. However, I was proven wrong, and we have found a small set of great partners.

What will people work on? How will decisions be made?

It can be a difficult exercise to balance how money is used in a community-driven project. The project should not lose its drive, in which the community of developers is important. The interests of the sponsors should not take precedence over the interests of the user base.

We will make sure that the money the foundation receives is invested in the interest of the community. We have a technical committee that supervises the activity of the foundation. Its decisions will be informed by the community. For this, we have an advisory board composed of core contributors of scikit-learn. Besides the advisory board, the technical committee also comprises a delegate from each sponsor. I am excited about the input that our partners will provide on their priorities, as they represent various industries. Voting power will be spread so that sponsors and community have the same voting power.

Why not an existing foundation such as NumFOCUS, or the PSF?

There are several reasons why we chose this particular legal vessel. Our endeavor is slightly different from the prominent foundations in our ecosystem, NumFOCUS and the PSF (Python Software Foundation).

The first important aspect is that we want to employ full-time developers. Different countries have very different legal frameworks, and it is really hard to transfer money overseas in a non-profit. Having a physical presence, such as employing people or owning real estate, is even harder. We needed something in France. And there might be a need for something else in another country at some point.

Another reason to be embedded in the Inria foundation is that it is giving us a really good deal. We basically get legal advice, accounting, office space, and IT support for an 8% overhead. This is an excellent deal and is part of the sponsoring efforts that Inria will keep making.

Last, we feel that a foundation specifically targeting scikit-learn can raise money from different people than other foundations can. I think that there is value in having multiple foundations seeking money for open-source software. Indeed, a foundation builds a case and an image to convince donors. Different donors require a different case and a different image. For instance, the president of NumFOCUS argues for a name less focused on numerics. Yet too wide a scope can dilute the image.

We have in mind to make it easy for other foundations to support scikit-learn. We have major contributors at leading institutions, such as Andreas Mueller at Columbia or Joel Nothman at the University of Sydney. It is important that these institutions can easily gather donations too, in the legal framework suited to their country. Hence the name reflects that the foundation is embedded at Inria, leaving room for other initiatives.