Open source software is becoming crucial in the design and testing of quantum algorithms. Many of the tools are backed by major commercial vendors with the goal of making it easier to develop quantum software: this mirrors how well-funded open machine learning frameworks enabled the development of complex models and their execution on equally complex hardware. We review a wide range of open source software for quantum computing, covering all stages of the quantum toolchain, from quantum hardware interfaces through quantum compilers to implementations of quantum algorithms, as well as all quantum computing paradigms, including quantum annealing and discrete and continuous-variable gate-model quantum computing. The evaluation of each project covers characteristics such as documentation, licence, the choice of programming language, compliance with norms of software engineering, and the culture of the project. We find that while the diversity of projects is mesmerizing, only a few attract external developers, and even many commercially backed frameworks have shortcomings in software engineering. Based on these observations, we highlight the best practices that could foster a more active community around quantum computing software, one that welcomes newcomers to the field but also ensures high-quality, well-documented code.

Funding: This research was supported by Perimeter Institute for Theoretical Physics. Research at Perimeter Institute is supported by the Government of Canada through Industry Canada and by the Province of Ontario through the Ministry of Economic Development and Innovation.

Data Availability: This paper is accompanied by a live website (qosf.org). The code for the website is available at https://github.com/qosf/qosf.org, and the code used in the paper at https://gitlab.com/qosf/quantum-bench.

Copyright: © 2018 Fingerhuth et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

The outcome of the varied motivations is a plethora of tools, libraries, and even languages that are hosted in open source repositories. This is the raison d’être of this survey: we would like to raise awareness of open source projects in quantum computing, give credit to contributors, attract new developers to the field, and highlight the best aspects of some projects. To achieve this, we divide the projects into categories that correspond to different levels of the quantum software stack, compare the projects, and highlight best practices that lead to success and more recognition. We include contemporary, maintained projects according to a set of well-defined criteria, which also means that we had to exclude some seminal works, such as Quipper [ 9 ], libquantum [ 10 ] and Liquid [ 11 ], which are no longer actively developed. An accompanying website ( https://qosf.org/ ) will receive automated updates on the projects to ensure that our work will continue to serve as a reference well after the publication of this paper. This website is hosted in an open source repository and we invite the community to join the effort and keep the information accurate and up-to-date.

Open source software is a natural fit for scientific thinking and advancement, and scientists have long embraced it, the TeX typesetting system being a prime example. More recently, commercial entities started backing, or even taking a leading role in, open source software in science. An example is machine learning: fairly complex mathematical models must be tested and deployed on hardware that is difficult to program and use to its full potential, for instance, graphical processing units. By providing high-quality open source frameworks such as TensorFlow [ 4 ] or PyTorch [ 5 ], commercial entities attract the best developers to their ecosystems. Parallel to these developments, quantum computers started to leave the labs and commercial entities began to sell computing time on this new type of hardware. Thus, with a shared scientific and commercial interest in quantum computing, a tapestry of motivations emerges as to why the open source model is attractive for developing and distributing software in this domain. Let us highlight some of these motivations:

In the late 1990s, the term “open source” was coined to reflect the development model alone, and it was soon made mainstream by comparing the difficulties of monolithic software engineering to this new model. The former model is referred to as “the cathedral”, with a rigid development structure that may or may not meet user expectations. This contrasts with the “bazaar” model of open source, where user needs drive the development, often in a haphazard fashion [ 1 ]. The majority of open source contributors were volunteers, whose motivations ranged from intrinsic (e.g., altruism or community identification) to extrinsic (e.g., career prospects) [ 2 ]. Many paid programmers contribute to open source projects as part of their job, which is another clear indication that open source is a software engineering paradigm as well as a business model [ 3 ].

Source code has been developed and shared among enthusiasts since the early 1950s. It took a more formal shape with the rise of proprietary software that intentionally hid the code. To counter this development, Richard Stallman announced the GNU Project in 1983 and started the Free Software Foundation. Among other objectives, the aim of the project has been to allow users to study and modify the source code of the software they use. This in turn formalized the concept of collaborative development of software products. The term “free”, as in freedom of speech, has had philosophical and political connotations, but the model of massively distributed code development was interesting in its own right. Collaborative communities around open source projects started emerging in the 1980s, with the GNU Compiler Collection (GCC) and the Linux kernel as notable examples. Soon thereafter, widespread access to the internet enabled many developer communities to thrive and coordinate their efforts.

Open source software in quantum computing covers all paradigms and all stages of expressing a quantum algorithm. The software comes in diverse forms, implemented in different programming languages, each with their own vocabulary, or occasionally even defining a domain-specific programming language. However, to provide a representative, but still useful study of quantum computing languages and libraries, we limited ourselves to projects that satisfy certain criteria.

The different steps in the quantum algorithm workflow outlined above mostly refer to the (continuous and discrete) gate models. However, useful analogies can be made for the quantum annealing paradigm. As shown in Fig 2 , having chosen quantum annealing as the quantum algorithm to tackle the Travelling Salesman Problem, the next step is to construct an Ising-type Hamiltonian that represents the problem at hand. This is equivalent to constructing a discrete quantum circuit in the gate model. The connectivity of the actual QPU that performs the annealing seldom matches the interaction pattern of the Hamiltonian. For instance, the quantum annealing processors produced by D-Wave Systems currently have a particular graph topology—the so-called Chimera architecture—that has four local and two remote connections for each qubit. Thus, the previously generated problem graph must be mapped to the hardware graph by finding a minor graph embedding. Finding the optimal graph minor is itself an NP-hard problem, which, in practice, requires the use of heuristic algorithms to find suitable embeddings [ 14 , 15 ]. Finding a graph minor is analogous to quantum compilation, and the size of the graph minor can be seen as the direct analogue of quantum circuit depth in the gate-model paradigm. In-depth analyses of quantum annealing performance in Refs. [ 16 , 17 ] have revealed a clear dependence between the quality of minor graph embeddings and QPU performance. Lastly, the embedded graph can either be solved on a QPU or with a classical solver. The latter is similar to using a quantum computer simulator in the gate-model paradigm. When obtaining samples from a quantum annealer, it is common to further postprocess the results with classical algorithms to optimize solution quality [ 18 ].
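The first stage of this workflow can be made concrete with a few lines of plain Python: the sketch below encodes a toy Ising-type Hamiltonian as a graph of biases and couplings, then plays the role of a classical solver by brute-forcing the ground state. The coefficients are hypothetical and chosen only for illustration.

```python
from itertools import product

# Toy Ising Hamiltonian H(s) = sum_i h_i s_i + sum_{i<j} J_ij s_i s_j
# with spins s_i in {-1, +1}.  The coefficients below are hypothetical,
# chosen only to illustrate the "problem graph" stage of the workflow.
h = {0: 0.5, 1: -1.0, 2: 0.0}       # linear biases on the nodes
J = {(0, 1): 1.0, (1, 2): -1.0}     # pairwise couplings (the graph edges)

def energy(spins):
    e = sum(h[i] * s for i, s in enumerate(spins))
    e += sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())
    return e

# A classical solver standing in for the annealer: brute-force the ground state.
best = min(product((-1, 1), repeat=3), key=energy)
print(best, energy(best))           # the lowest-energy spin configuration
```

On real hardware, the graph defined by `J` would first have to be minor-embedded into the chip topology before such low-energy states could be sampled.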
In both the gate-model and annealing paradigm, we define a full-stack library as software that covers the creation, compilation / embedding, simulation and execution of quantum instructions as illustrated in Figs 1 and 2 .

QPUs and some simulators usually implement only a restricted set of quantum gates, which requires compilation of the quantum circuit. Compilation connects the abstract quantum circuit description to the actual hardware or the simulator: it is the process of mapping the quantum gate set G in a quantum circuit C to a different quantum gate set G* resulting in a new quantum circuit C*. As an intuitive example, many quantum circuits use two-qubit gates between arbitrary pairs of qubits, even though those qubits might not be physically connected on the quantum processor. Hence, the quantum compiler will swap qubits with each other until the required two qubits are neighbours, so that the desired two-qubit gate can be implemented. After applying the two-qubit gate, the swaps are reversed to restore the original configuration. The swaps require several extra gates. For this reason, quantum circuits often increase in depth when being compiled.
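The swap-insertion step described above can be sketched as follows. Assuming a hypothetical device whose qubits are connected in a line, the routine emits the SWAPs needed to make two distant qubits adjacent, applies the two-qubit gate, and then reverses the swaps; note how the gate count, and hence the depth, grows with the distance between the qubits. Gate names here are illustrative and not tied to any particular framework.

```python
def route_on_line(a, b):
    """Compile a two-qubit gate between distant qubits a < b on a
    hypothetical device whose qubits sit on a line with only
    nearest-neighbour connectivity."""
    ops = []
    while b - a > 1:                 # move qubit a rightwards until adjacent to b
        ops.append(("SWAP", a, a + 1))
        a += 1
    ops.append(("CZ", a, b))         # the desired two-qubit gate
    for swap in reversed(ops[:-1]):  # undo the swaps to restore the layout
        ops.append(swap)
    return ops

# Qubits 0 and 3: two SWAPs in, the gate, two SWAPs out.
print(route_on_line(0, 3))
```

A single logical gate thus costs five hardware operations on this toy topology, which is exactly why compiled circuits grow in depth.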

If the scale of the quantum system is still classically simulable, the resulting quantum circuit can be simulated directly with one of the available open source quantum computer simulators on a classical computer. The terminology is confusing: hardware-based quantum simulators form a quantum computing paradigm, as classified above, yet classical numerical algorithms that model some quantum physical system of interest, for instance a quantum many-body system, are also often called quantum simulators. To avoid confusion, we will always use the term quantum computer simulator to reflect that a quantum algorithm is simulated on classical hardware.
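To make the distinction concrete, a minimal quantum computer simulator can be written in a few lines of plain Python: the 2^n amplitudes of an n-qubit register are stored classically and gates act on them as linear maps. This is a bare-bones sketch for illustration only; real simulators optimise this representation heavily.

```python
import math

def apply_gate(state, gate, target, n):
    """Apply a 2x2 gate to the `target` qubit of an n-qubit state vector
    (qubit 0 is the least significant bit of the basis-state index)."""
    new = [0j] * (1 << n)
    for i, amp in enumerate(state):
        bit = (i >> target) & 1
        for out in (0, 1):
            j = i ^ ((bit ^ out) << target)   # index with target bit set to `out`
            new[j] += gate[out][bit] * amp
    return new

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]                         # Hadamard gate

state = [1, 0, 0, 0]                          # two qubits in |00>
state = apply_gate(state, H, 0, 2)            # Hadamard on qubit 0
# CNOT (control = qubit 0, target = qubit 1) permutes the basis states:
state = [state[i ^ ((i & 1) << 1)] for i in range(4)]
# state is now the Bell state (|00> + |11>)/sqrt(2)
```

Preparing a Bell state this way already shows why classical simulation is limited: the state vector doubles in size with every added qubit.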

Some of the open source projects require the user to manually define the quantum circuit of gates that represents the chosen algorithm for the given problem definition and quantum computing paradigm. Other projects add another level of abstraction by allowing the user to simply define the graph X and a starting point A, which encapsulate the Travelling Salesman Problem, and then automatically generate the quantum circuit for the chosen algorithm. Note that we explicitly distinguish a quantum algorithm from a quantum circuit. A quantum circuit is a quantum algorithm implemented within the gate-model paradigm, whereas our notion of a quantum algorithm also includes the quantum annealing protocol.

First, the problem is defined at a high level and encoded into an Ising-type Hamiltonian, which can be visualized as a graph. Next, the problem Hamiltonian needs to be embedded into the quantum hardware graph via minor graph embedding. Finally, either a quantum annealer or a classical solver is used to sample low-energy states corresponding to (near-)optimal solutions to the original problem.

First, the problem is defined at a high level and a suitable quantum algorithm is chosen based on the nature of the problem. Next, the quantum algorithm is expressed as a quantum circuit, which in turn needs to be compiled to a specific quantum gate set. Finally, the quantum circuit is either executed on a quantum processor or simulated with a quantum computer simulator.

It remains challenging to understand which kind of problem can be solved efficiently by which paradigm and corresponding quantum algorithm. A typical quantum algorithm workflow on a gate-model quantum computer is shown in Fig 1 , whereas Fig 2 shows a typical workflow when using quantum annealing. Both start with a high-level problem definition such as ‘solve the Travelling Salesman Problem on graph X’. The first step is to decide on a suitable quantum algorithm for the problem at hand. We define a quantum algorithm as a finite sequence of steps for solving a problem whereby each step can be executed on a quantum computer. In the case of the Travelling Salesman Problem we face a discrete optimization problem. Thus, to find the optimal solution, the user can consider, for example, the quantum approximate optimization algorithm [ 13 ], which was designed for noisy discrete gate-model quantum computers, or quantum annealing.

To further complicate the picture, when we talk about quantum computing we talk about several different paradigms. Some of these paradigms are barely abstracted away from the underlying physical implementation, which increases the difficulty of learning them for a computer scientist or a software engineer. We define four paradigms:

Experimental quantum computing is still a relatively new discipline and comparable to the early days of classical computers in the 1950s. Similar to the manual programming of a classical computer with punch cards or an assembler, today’s quantum computers require the user to specify a quantum algorithm as a sequence of fundamental quantum logic gates. Thus, implementing a quantum algorithm on actual quantum hardware requires several steps at different layers of abstraction.

There are several open source projects for quantum annealing, with most projects being supported by D-Wave Systems. We already mentioned Qbsolv together with the other quantum computer simulators, but would like to highlight two additional software projects. The package dimod is a Python API for solving QUBO problems with various backends, including D-Wave’s quantum processors [ 71 ]. One of its unique contributions is the introduction of the binary quadratic model, which unifies the Ising (±1) and QUBO (0/1) formalisms. Lastly, dwave-system is a GitHub project that provides a simple Python API to connect with the D-Wave Ocean software stack [ 72 ]. It allows the user to define QUBO problems as well as embed them onto a given quantum chip topology and optimize the minor embedding [ 14 , 15 ]. This project comes closest to a quantum full-stack library in the quantum annealing realm.
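The unification rests on the change of variables s = 2x - 1 between spins and binary variables. The hand-rolled sketch below illustrates the conversion an Ising model undergoes to become an equivalent QUBO plus a constant energy offset; it shows the underlying algebra only and is not the dimod API itself.

```python
def ising_to_qubo(h, J):
    """Convert an Ising model (spins s in {-1,+1}) into a QUBO
    (binary x in {0,1}) via s = 2x - 1.  Returns (Q, offset) such that
    sum Q[i,j]*x[i]*x[j] + offset equals sum h[i]*s[i] + sum J[i,j]*s[i]*s[j].
    A hand-rolled sketch of the conversion, not the dimod API itself."""
    Q, offset = {}, 0.0
    for i, hi in h.items():
        # h*s = h*(2x - 1) = 2h*x - h, and x*x = x for binary variables
        Q[(i, i)] = Q.get((i, i), 0.0) + 2 * hi
        offset -= hi
    for (i, j), Jij in J.items():
        # J*s_i*s_j = 4J*x_i*x_j - 2J*x_i - 2J*x_j + J
        Q[(i, j)] = Q.get((i, j), 0.0) + 4 * Jij
        Q[(i, i)] = Q.get((i, i), 0.0) - 2 * Jij
        Q[(j, j)] = Q.get((j, j), 0.0) - 2 * Jij
        offset += Jij
    return Q, offset

Q, off = ising_to_qubo({0: 1.0}, {(0, 1): -1.0})
print(Q, off)
```

Because the two energy functions agree configuration by configuration, a solver for one formalism automatically serves the other, which is the point of a unified binary quadratic model.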

Lastly, Google recently released their full-stack library Cirq [ 67 ]. It is written in Python and specifically aimed at the creation, compilation and execution of noisy intermediate-scale quantum (NISQ [ 68 ]) circuits. Cirq has a parallelizable simulator backend that requires prior compilation to Google’s preferred quantum hardware architecture. The popular open source project OpenFermion, which generates fermionic Hamiltonians for quantum chemistry simulations, provides a plugin to Cirq, thereby completing the stack [ 69 , 70 ]. Note that OpenFermion is generally hardware-agnostic and can also integrate with other quantum full-stack libraries such as Forest and ProjectQ.

In contrast to quantum full-stack libraries that focus on the discrete gate model, Strawberry Fields is a Python-based library for the continuous gate model, developed by the startup Xanadu [ 65 , 66 ]. It is based on the Blackbird quantum programming language and it is the only quantum software project built on top of a deep learning library: its computational backend for simulations is written in TensorFlow [ 4 ]. Strawberry Fields’ repository contains example implementations of quantum algorithms, including quantum teleportation, boson sampling and several quantum machine learning algorithms.

IBM was the first to provide public cloud access to its five-qubit quantum processor in 2016. Since then, it has built a community of developers around its quantum software library Qiskit [ 57 , 58 ]. It is a full-stack library and consists of two separate projects, Qiskit Terra and Aqua [ 59 , 60 ]. Similar to Rigetti’s Forest, Terra is the base library that allows the user to define, compile and simulate quantum circuits, whereas Aqua is a collection of quantum algorithms implemented with Terra. Furthermore, Qiskit provides the user with tools for quantum compilation and has a quantum computer simulator module as well as two freely accessible QPUs [ 61 ]. Many of the algorithms in Aqua were outlined in Ref. [ 62 ], and there is an additional project, called Qiskit Tutorials, which contains many Jupyter notebooks with example code for programming in Qiskit [ 63 ]. In addition, Qiskit.js is a JavaScript clone of Qiskit providing the same functionality as discussed above [ 64 ].

Rigetti Computing has open sourced most of its quantum full-stack library Forest [ 53 ]. Forest combines two separate open source projects, pyQuil [ 54 ] and Grove [ 55 ]. pyQuil is an extensive Python library for the generation of Quil programs, where Quil is a quantum assembly language developed by Rigetti [ 54 ]. Quil programs can be compiled using Rigetti’s proprietary quantum compiler, which is not available under an open source licence. Compiled Quil programs can then either be executed on their QPUs or simulated using Rigetti’s Quantum Virtual Machine in the cloud or the reference implementation (reference-qvm) mentioned earlier [ 27 ]. Grove is the corresponding quantum algorithms library, also written in Python [ 55 ]. It contains implementations of popular quantum algorithms such as the quantum approximate optimization algorithm [ 13 ], the variational quantum eigensolver [ 56 ] and the quantum Fourier transform.

The startup company Artiste-qb has open sourced its full-stack library Qubiter [ 46 , 47 ]. This library is mostly implemented in Python, with some parts written in C++. In addition to writing, compiling and simulating quantum circuits, Qubiter provides integration for the quantum processors of all major hardware providers. Lastly, Quantum Fog is a separate open source project to generate and analyze quantum Bayesian networks [ 52 ]. The authors plan to integrate it with Qubiter in the near future.

Next, XACC, which stands for eXtreme-scale ACCelerator, is an extensive C++ quantum full-stack library developed with the support of the Oak Ridge National Laboratory [ 44 , 45 ]. It is a software framework that allows the integration of QPUs into traditional high-performance computing workflows. XACC has its own open source quantum compiler and supports execution on quantum chips from a wide range of quantum hardware companies as well as on their respective simulators. There is also an open source plugin that enables the use of a tensor network quantum virtual machine as a backend [ 49 , 50 ]. Finally, the project XACC VQE provides implementations of quantum chemistry algorithms for XACC [ 51 ].

Several open source projects exist that move beyond isolated quantum computer simulation or quantum compilation and provide a full-stack approach to quantum computing as defined in Fig 1 . ProjectQ [ 41 – 43 ], XACC [ 44 , 45 ] and Qubiter [ 46 , 47 ] are the three quantum full-stack libraries that made all parts of their respective stacks available under open source licences. ProjectQ was developed by researchers at ETH Zurich and is mostly written in Python [ 41 – 43 ]. It allows the user to define, compile and simulate quantum circuits using an expressive syntax. Additionally, ProjectQ can be used to interface with IBM’s quantum processors through the cloud and support for other QPU backends is anticipated. The FermiLib project completes the ProjectQ stack by providing a Python library to generate and manipulate fermionic Hamiltonians for quantum chemistry [ 48 ].

Quantum computer simulators usually do not restrict the set of quantum gates (except Cliffords.jl) and allow two-qubit gates between any two simulated qubits. However, when implementing quantum algorithms on actual hardware, the circuits need to be compiled to the restricted topology of the particular quantum chip used for execution. Only a few standalone quantum compilers satisfied our selection criteria: many quantum compilers are either absorbed into full-stack libraries, or they are proprietary and closed-source, developed by quantum hardware companies. One of the few open source quantum compilers is ScaffCC, which translates quantum circuits expressed in the Scaffold quantum programming language to quantum assembly format (QASM) [ 34 , 35 ]. It also allows researchers to analyse the quantum circuit depth of quantum algorithm implementations on hypothetical future quantum chips. Next, QGL.jl is a performance-oriented quantum compiler for Quantum Gate Language (QGL) written in Julia [ 36 , 37 ]. Lastly, PyZX is a Python-based quantum compiler that uses the ZX calculus developed in Refs. [ 38 ] and [ 39 ] to rewrite and optimize quantum circuits [ 40 ].

All quantum computer simulators discussed so far focus on the simulation of gate-model quantum computers. Most of these simulators are used to develop and test quantum algorithms before implementing them on actual quantum chips, or to verify results obtained from a QPU. Analogously, the project Qbsolv is used to develop and verify the results obtained from quantum annealing devices [ 32 ]. Technically, it is not a quantum computer simulator as previously defined, since it uses a classical algorithm unrelated to the physics of quantum annealing. Yet we include it in this discussion because it is the closest analogue to a simulator for quantum annealing devices. It is a C library that finds the minimum values of large quadratic unconstrained binary optimization (QUBO) problems. To achieve this, the QUBO problem is first decomposed into smaller problems, which are then solved individually using tabu search, a metaheuristic algorithm based on local neighbourhood search [ 33 ].
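The inner loop of tabu search is easy to sketch: repeatedly flip the single bit that yields the lowest energy among the moves that are not 'tabu', while remembering recent flips for a few iterations so the search can escape local minima. The toy version below illustrates the idea only; it is not Qbsolv's actual implementation, and all names are ours.

```python
import random

def tabu_qubo(Q, n, iters=200, tenure=5, seed=0):
    """Toy tabu search for minimising sum Q[i,j]*x[i]*x[j] over x in {0,1}^n.
    Greedily flips the best non-tabu bit; recently flipped bits stay
    'tabu' for `tenure` iterations so the search can leave local minima."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    energy = lambda y: sum(v * y[i] * y[j] for (i, j), v in Q.items())
    best, best_e = x[:], energy(x)
    tabu = {}                          # bit index -> iteration when it frees up
    for it in range(iters):
        moves = [i for i in range(n) if tabu.get(i, 0) <= it]
        if not moves:
            continue                   # every bit is tabu this round
        def flipped_e(i):
            y = x[:]
            y[i] ^= 1
            return energy(y)
        i = min(moves, key=flipped_e)  # best available flip, even if it worsens
        x[i] ^= 1
        tabu[i] = it + tenure
        e = energy(x)
        if e < best_e:                 # remember the best state seen anywhere
            best, best_e = x[:], e
    return best, best_e

# Toy QUBO with minima at x = (1,0) and (0,1), both with energy -1.
Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}
print(tabu_qubo(Q, 2))
```

Qbsolv additionally decomposes large QUBOs into subproblems before applying such a local search, which is what makes it practical at scale.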

Quantum++ is a high-performance simulator written in C++ [ 23 , 24 ]. Most quantum computer simulators only support two-dimensional qubit systems, whereas this software library also supports the simulation of more general quantum processes. Qrack is another C++ based simulator that comes with additional support for Graphics Processing Units (GPUs) [ 25 ]. The developers of Qrack put special emphasis on performance by supporting parallelization over multiple CPU or GPU cores. A more educational and less performance-oriented quantum computer simulator is Quirk [ 26 ]. It is a JavaScript-based simulator that can simulate up to 16 qubits in a modern web browser. Quirk provides a visual user experience by allowing beginners and experts to construct quantum circuits via simple drag-and-drop operations. Next, Rigetti Computing, a hardware startup focused on superconducting circuits for the discrete gate model, has open sourced the project reference-qvm [ 27 ]. This is a reference implementation of the Quantum Virtual Machine (QVM), synonymous with quantum computer simulator, used in their full-stack library Forest. It is a purely Python-based simulator meant for rapid prototyping of quantum circuits. All quantum computer simulators mentioned so far simulate any quantum circuit up to a certain depth, which implies that they support Clifford as well as non-Clifford quantum gates. In contrast, the project Cliffords.jl restricts itself to quantum gates from the Clifford group [ 28 , 29 ]. It is widely known that Clifford circuits can be simulated efficiently on a classical computer [ 30 ]. Written in the high-performance programming language Julia, Cliffords.jl allows for fast and efficient calculations by making use of the tableau representation [ 31 ].
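The reason for this efficiency is that Clifford gates map Pauli operators to Pauli operators under conjugation, so a simulator only needs to track how a small set of stabilizer generators transforms rather than 2^n amplitudes. A quick sanity check of this property with explicit 2 x 2 matrices in plain Python:

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

X = [[0, 1], [1, 0]]                     # Pauli X
Z = [[1, 0], [0, -1]]                    # Pauli Z
s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]                    # Hadamard, a Clifford gate

HXHd = matmul(matmul(H, X), dagger(H))   # conjugation: H X H^dagger
# HXHd equals Z (up to float rounding): the Hadamard maps X to Z,
# so tracking "which Pauli" suffices instead of tracking amplitudes.
```

A non-Clifford gate such as T would map X to a sum of Paulis, which is exactly where this compact bookkeeping, and hence efficient classical simulation, breaks down.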

Overview of the projects and how their features align with the typical quantum algorithm workflows shown in Figs 1 and 2 . Note that the workflow is different in the quantum annealing paradigm, as indicated by the reassigned column headings. Postprocessing is an additional feature used in quantum annealing to improve solution quality [ 18 ]. Data obtained in August 2018.

Let us briefly give an overview of all the open source projects considered in this review. For a project to be considered, it had to fulfil the criteria outlined in Fig 3 . At this point, we would like to give an honourable mention to the quantum software project Q#. Most of it is licenced under custom licence terms, which are not recognized as an open source licence by the Open Source Initiative (OSI), and therefore the project had to be excluded. Table 1 lists all the selected projects and provides high-level information such as taglines, programming languages and supported operating systems. Different projects cover different parts of the typical quantum algorithm workflows shown in Figs 1 and 2 . Table 2 illustrates this by clearly defining each project’s range of applicability within the workflow. We start at the bottom of the outlined technology stack with quantum computer simulators and compilers.

In this section, we outline the criteria and explain our reasons for selecting them. A concise overview of the selection process is depicted in Fig 3 . In the second part of this section, we provide a brief outline for each of the selected projects.

Evaluation

Our aim in this section is to evaluate the surveyed projects based on a set of selected criteria, which emerge from best practices in open source software development. This set of best practices is not set in stone. As with any evolving system, it is subject to change, especially as new methods for software development and project management emerge, as was the case when distributed version control systems largely replaced the centralized approach to software development. Hence, it is inevitable that there is some arbitrariness to our choice of criteria. We anchor these choices in the literature and practice of software engineering to minimize arbitrariness. For a thorough and practical introduction to building open source projects, please refer to Ref. [73], which provides the reader with an introduction on how to start projects and establish communities around them.

The structure of the section revolves around domain areas (or “best practices”) in open source software development. Background and reasoning for each domain area is concisely provided along with the criteria designed to evaluate the surveyed projects in this domain. Furthermore, we provide quantitative and qualitative evaluation results in the form of summary tables. As our data sources, we use static analysis of the source code, metadata from software repository hosting sites and in-depth analysis of the documentation to evaluate the included projects.

Documentation

The first resource that enables users to get familiar with a new software project is its documentation. Writing good documentation is a skill that is not necessarily aligned with being a good developer. On the contrary, the authors and main developers of a project might be the wrong people to write its documentation, as they cannot easily take the perspective of a new user [73]. In a commercial software development setting, this duty is often fulfilled by a technical writer. Documentation also gets outdated easily—even software engineers in the commercial sphere often struggle to keep documentation up to date [74]. Developers of open source projects have arguably even less incentive. However, even bad or outdated documentation is better than none, as developers find incomplete or outdated documentation still useful [74]. The guidelines described in the following paragraphs are a set of conventions that the community of open source developers has converged upon.

Good user documentation starts with a well-written README file, which serves as the initial point of contact. For most users, it is the first piece of documentation they encounter. Repository hosting websites like GitHub, GitLab and Bitbucket even display the contents of the README file directly on the project’s homepage. As such, README files should provide a concise but thorough overview of the project, and serve as a hub to more detailed parts of the documentation. As a minimum, the README file should clearly state the mission statement of the project, even if it may seem obvious from the project’s name. Next, it should communicate the capabilities (feature list) of the project at a high level, provide the build and setup instructions (or a link to the part of the documentation that describes them) and list the major requirements of the project.
Adding an example of how to use the project, or a screenshot of it in a practical setting, can help users grasp what the project is offering. In addition, it is always helpful to state the licence, to clearly communicate the terms associated with using the code, and to mention the main contributors and maintainers of the project to properly give credit where it is due. Depending on their level of familiarity with the project, users come to consult documentation with two basic purposes in mind: newcomers seek information about how to actually use the project, while more experienced users might be interested in an explanation of more advanced features. Good user-level documentation accommodates both of these expectations. Hence, in our analysis, we differentiate between user documentation and detailed tutorials. To get first-time users familiar with the project, step-by-step tutorials are often used, as they guide users through the process of using the library. By user documentation, we mean concise per-feature documentation. This is the type of documentation mostly consulted by experienced users, since it provides an efficient and concise overview of a project’s classes, functions and variables. For example, a particular function and its arguments are described briefly and its usage is outlined with a small code example. In many evaluated projects this is combined with automatically generated documentation as part of the source code documentation. In general, missing pieces and deficiencies should also be mentioned explicitly, as it is hard for users to tell which features are present but not documented without consulting the source code. Existing users need to be informed about recent changes and bug fixes in new releases, such that they can adapt their usage of the code accordingly. This is best done in the form of changelogs that keep track of versioning and list new features as well as minor changes for each software release.
Changelogs should also give credit to developers who contributed to a particular release and ideally thank users who reported crucial bugs. Fig 4 shows the detailed results of our qualitative documentation analysis in the form of a colour-coded heatmap with scores ranging from 1 (bad) to 5 (good). The detailed rubric used for scoring each of the five aspects can be found in S1 Table.


Fig 4. Heatmap of documentation analysis results. The heatmap shows the evaluation results for source code documentation, README files, changelogs, user documentation and tutorials on a scale from 1 (bad) to 5 (good). The evaluation rubric used for scoring can be found in S1 Table. Data was obtained in August 2018. https://doi.org/10.1371/journal.pone.0208561.g004

User-centric discussion channels

It is understandable that users adopting a new project sometimes face the need to ask questions, even in the presence of high-quality documentation. Unlike commercial software, open source software usually comes with no official support channels, and hence this function is often voluntarily performed by members of the wider user community of the project. Yet, researchers in Ref. [75] have shown that despite the lack of direct funding, community-driven support can outperform the support of commercially developed software in terms of quality. However, users cannot engage in the practice of answering each other’s questions without a forum dedicated to this purpose. Traditionally, this function is performed by user-centric mailing lists (or, more historically, USENET groups), dedicated Q&A sites, or forums. Interactive communication channels, like IRC, Slack, Riot or Keybase, are also useful, since a chat-based interface lowers the barrier to asking questions and provides an easier, interactive debugging experience. We have surveyed all projects with respect to their user-centric discussion channels and the results can be found in Table 3.


Table 3. Evaluation results for the community analysis. For each project, we indicate if a public development roadmap exists and if the software is published in form of releases. Additionally, we report the GitHub community profile score, the total number of contributors, the type of user- and developer-centric discussion channel and the type of public code review process, specifically if it applies to internal (I) and/or external (E) contributors. Data obtained in August 2018. https://doi.org/10.1371/journal.pone.0208561.t003

Developer documentation

The needs of a project's users are substantially different from the needs of its developers when it comes to documentation. While users need to get familiar with the outward-facing parts of the project, namely its API, requirements and licence, the developers are interested in code documentation, exhaustive references and the overall system architecture. Appropriately addressing the expectations of developers, such as maintaining a high ratio of comments as a form of developer documentation, increases the probability of obtaining an external contribution and thus converting a user into a member of the project's development community. It also decreases the maintenance load on the existing developers [76].
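As an illustration of the comment-ratio idea mentioned above, the following sketch counts the fraction of physical lines that carry a comment. The function name and the exact definition of the ratio are our own illustrative choices, not the metric used in the paper's analysis.

```python
import io
import tokenize

def comment_ratio(source: str) -> float:
    """Fraction of physical lines in a Python source string that carry a comment."""
    comment_lines = set()
    # tokenize reports each comment token with its (row, column) start position
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comment_lines.add(tok.start[0])
    total = max(1, len(source.splitlines()))
    return len(comment_lines) / total

sample = "x = 1  # initialise counter\ny = x + 1\n# final result\nprint(y)\n"
print(round(comment_ratio(sample), 2))  # 2 commented lines out of 4 -> 0.5
```

A real analysis would also need to account for docstrings, which `tokenize` reports as string tokens rather than comments.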

Developer-centric discussion channels

External contributors to open source projects usually undergo an evolution in which they transition from passive users to external developers in multiple stages [77]. To encourage this transition, the existing developer community should be open enough that passive users can see a clear path to participating in the development process. This includes not only a process for accepting patches or pull requests, but also a public forum where developer and design discussions take place. The majority of the design work in open source projects is conducted on developer-centric discussion channels [78]. In addition, these channels often host code reviews, a practice that is not only fundamental for maintaining code quality, but also increases code ownership and encourages knowledge transfer within the development team [79]. A dedicated channel for developer discussions is not only useful for evolving contributors, but also benefits the existing, often distributed team of developers: it has been shown that developers in distributed teams need to maintain awareness of each other [80]. Developer mailing lists and more dynamic communication platforms, such as IRC channels or group chat applications, help maintain this awareness, even when not all discussions are code-related [81]. Table 3 shows the discussion platforms used for these purposes by the analysed projects.

Issue tracking system

When familiarizing themselves with a new piece of software, users commonly hit roadblocks, and even seasoned users encounter bugs and rare problems when exploring the more advanced parts of a code base. Seeking help, users need a way to reach out to more experienced users or the project developers, and in the absence of a better solution they end up using the project maintainer's email, if available. This often overloads the maintainer with repetitive questions and bug reports. An issue tracking system helps avoid these problems: it provides a central overview of all issues and bugs related to the project, together with their status. Some issues may turn out to be false positives, i.e. problems that are actually on the user's side; in these cases the issue tracking system serves as a knowledge base for other users experiencing the same problems. Finally, we want to emphasize how a project deals with questions, issues and pull requests. If an issue or pull request has seen no response from a core contributor (or an employee of the commercial entity backing the project) within 30 days, we consider it ignored. For this review we defined a core contributor as a developer whose contributions sum to either 10% of all line additions or deletions, or 15% of all commits. We then define the attention rate as 1 − I, where I is the fraction of ignored issues and pull requests with respect to the total number of issues and pull requests. An ideal project never ignores any of its users' or developers' questions or contributions and thus has an attention rate of 1.0. Lastly, we also compute the average time it takes a core contributor (or an employee of the company hosting the project) to respond to issues or pull requests.
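The attention-rate and response-time definitions above can be sketched as follows. The data layout and field names (`opened`, `first_core_response`) are illustrative assumptions, not the authors' actual analysis pipeline.

```python
from datetime import datetime, timedelta

def attention_metrics(items, window_days=30):
    """items: dicts with 'opened' (datetime) and 'first_core_response'
    (datetime of first reply by a core contributor, or None).
    Returns (attention_rate, average_response_days)."""
    ignored = 0
    response_days = []
    for item in items:
        resp = item["first_core_response"]
        # an item is ignored if no core contributor replied within the window
        if resp is None or resp - item["opened"] > timedelta(days=window_days):
            ignored += 1
        if resp is not None:
            response_days.append((resp - item["opened"]).days)
    attention_rate = 1 - ignored / len(items)
    avg_response = sum(response_days) / len(response_days) if response_days else None
    return attention_rate, avg_response

issues = [
    {"opened": datetime(2018, 8, 1), "first_core_response": datetime(2018, 8, 3)},
    {"opened": datetime(2018, 8, 1), "first_core_response": None},                   # ignored
    {"opened": datetime(2018, 8, 1), "first_core_response": datetime(2018, 9, 15)},  # > 30 days
    {"opened": datetime(2018, 8, 1), "first_core_response": datetime(2018, 8, 11)},
]
rate, avg = attention_metrics(issues)
print(rate)  # 0.5: two of four items were ignored
```

Note that a late response still counts towards the average response time even though the item counts as ignored, mirroring the two separate metrics defined in the text.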

Roadmap

A well-defined vision of the product is one of the best predictors of success in software development projects [82]. Since the teams behind open source projects are not confined to one geographical location but distributed, the project's vision needs to be clearly communicated within the community. A concise representation of the project's long-term vision is often realized in the form of a development roadmap. Users benefit from the presence of a roadmap because the project's future development becomes clearer, and developers are more encouraged to contribute since they can see their contributions in context. Additionally, newcomers to the project can identify features for which they have the skills and interest. Our community analysis in Table 3 indicates which quantum software projects make use of development roadmaps. Examples of successful open source projects outside of quantum computing that leverage (often community-defined) roadmaps are OpenStack [83] and OpenNebula [84].

Outlining contribution process

Many open source projects have, over time, developed their own processes for accepting contributions. These comprise not only technical requirements but often also conditions a new contribution should adhere to, such as using the same code styling as the rest of the repository or submitting the contribution in a specific way (e.g. via a pull request, or a patch file sent to the developer mailing list). It is beneficial if these conditions and an overall description of the process are documented, so that expectations are set correctly and the risk of upsetting either side is minimized. In Ref. [85], the authors concluded that newcomers to open source projects who follow established conventions when introducing themselves to the community and providing their contributions are more likely to be accepted. Hence, it is safe to conclude that explicitly stating the conventions leads to faster and more successful contributions. Many projects address this by providing a set of instructions in the README file, or in a separate CONTRIBUTING file. A good contributing file describes the expected way of interacting with other developers, outlines the expected process for reporting bugs and submitting patches or pull requests, and touches upon the governance structure of the project. Since all projects considered in this review are hosted on GitHub, we chose one of GitHub's own metrics, the "community profile", as a quantitative measure of the quality of the community contribution process [86, 87]. The community profile evaluates seven aspects that GitHub considers best practices for encouraging contributions. These include the existence of a short project description, a README file and a code of conduct that sets standards for how to engage with the community of developers.
Furthermore, a project should have a CONTRIBUTING file outlining how users can contribute to the project and a licence file stating how the code can be used. Templates for issues and pull requests are also required for a complete community profile, since they help streamline the process of issue tracking. In our evaluation, we express the community profile as the fraction X/7, where X is the number of satisfied community profile requirements (see Table 3).
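A minimal sketch of such a community-profile check against a local repository checkout follows. The filename variants are a simplification of GitHub's actual matching rules (which also accept other locations and extensions), and `community_profile` is our own illustrative helper, not GitHub's implementation.

```python
import os
import tempfile

# The seven items the community profile checks, as listed in the text.
# None marks the project description, which lives in repository metadata
# rather than in a file.
CHECKS = {
    "description":     None,
    "README":          ["README.md", "README.rst", "README"],
    "code of conduct": ["CODE_OF_CONDUCT.md"],
    "contributing":    ["CONTRIBUTING.md"],
    "licence":         ["LICENSE", "LICENSE.md", "COPYING"],
    "issue template":  [".github/ISSUE_TEMPLATE.md"],
    "PR template":     [".github/PULL_REQUEST_TEMPLATE.md"],
}

def community_profile(repo_path, has_description=True):
    """Return the X/7 fraction of satisfied community profile requirements."""
    satisfied = 1 if has_description else 0
    for candidates in CHECKS.values():
        if candidates is None:
            continue
        if any(os.path.exists(os.path.join(repo_path, c)) for c in candidates):
            satisfied += 1
    return f"{satisfied}/7"

with tempfile.TemporaryDirectory() as repo:
    open(os.path.join(repo, "README.md"), "w").close()
    open(os.path.join(repo, "LICENSE"), "w").close()
    score = community_profile(repo)
print(score)  # description + README + licence satisfied -> "3/7"
```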

Usage of version control systems

Versioning the code is an essential practice for any software development team of more than one person, especially for teams that do not share the same geographical location, as is often the case with open source community projects. The particular type of version control system in use influences many aspects of the development process: the concurrent development of features, the review process, and the likelihood of new contributions. Based on their topology, version control systems are categorized as centralized or decentralized. Popular examples of centralized version control systems are Subversion [88] and CVS [89]. Decentralized version control systems are nowadays the more common industry practice, which is explained by a number of perceived advantages from the developer's point of view. First, they treat all developers equally, as every developer can commit locally and hence maintain revisions. Additionally, they are often able to perform automated merges, simplify workflows for experimental branches, and support work on the repository without an internet connection [90].

Licence

The source code of a software project is considered creative work and as such, in the absence of other arrangements, default copyright laws apply [21]. Simply making the source code publicly available, e.g. by publishing it on a code hosting site such as GitHub, neither releases the project into the public domain [91] nor makes it open source. On the contrary, code that is made public without a licence is still considered proprietary and, as such, is not free to be used, shared or modified, even for non-commercial or research purposes [92]. Therefore the act of including a licence with the code, formally referred to as releasing the code under a given licence, is what grants users and developers a set of rights to use, modify and share the project's source code. Consequently, the presence of a licence under which the code repository is released is a fundamental part of the definition of an open source project. In general, open source software licences are divided into two groups: so-called permissive and copyleft licences. Permissive licences tend not to restrict users and developers, and allow the inclusion of the licenced code within commercial software; some impose only mild restrictions, such as preserving attribution (e.g. the Apache 2.0 Licence [93]). Copyleft licences, on the other hand, require the authors of derivative works to redistribute their work under the same, or a compatible, copyleft licence. The advantage of a copyleft licence is that it enforces open access even for works that extend or otherwise build upon the original; however, this may be seen as restrictive, especially in commercially driven settings. For a more thorough guide to software licencing, we recommend Ref. [94]. We provide an overview of the open source licences associated with the surveyed projects in Table 1.

Code readability

The readability of the code in an open source project is an important factor for maintainability, and it increases the probability of new developers contributing, since both current and new developers need to read parts of the existing codebase in order to contribute. The act of reading code is considered the most time-consuming component of software maintenance [95]. However, the notion of which properties make code readable easily becomes subjective. Suggestions such as improving variable or method naming, deduplicating code, and simplifying loops, conditions and structures are common, universal improvements [96]. More subjective attributes, such as the indentation style or camel-casing, need to be preserved across the project for consistency. Some projects, organizations and languages deal with this issue by imposing project-, company- or even language-wide code styling conventions (see for example Python's PEP8 [97]). To help quantify the notion of code readability, several metrics have been suggested [98, 99]. However, these are not widely used in practice, and code readability is often addressed as part of the code review process. For the related notion of code complexity, a popular metric is the cyclomatic complexity [100], a quantitative measure of the number of linearly independent paths through the source code. A lower score (corresponding to lower complexity) is considered better, as it signifies a less convoluted codebase. Cyclomatic complexity was only extracted from the Python projects, which constitute the majority, since the tool radon allows for easy extraction of this metric; other programming languages such as Julia, JavaScript or C++ offer no comparably simple way of computing it. The results for the Python projects are captured in Table 4. Additionally, we conduct a qualitative assessment of the source code comments in Fig 4 (see Table for interpretation details).
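To illustrate what cyclomatic complexity measures, the following simplified sketch counts branching constructs using Python's standard `ast` module. This approximation is our own: radon, the tool mentioned above, implements the full metric, including correct handling of boolean operators with more than two operands and per-function scoring.

```python
import ast

# Simplified cyclomatic complexity: 1 (the straight-line path) plus one
# for each construct that introduces an alternative path through the code.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))

straight_line = "x = 1\ny = x + 2\n"
branchy = """
def clamp(v, lo, hi):
    if v < lo:
        return lo
    if v > hi:
        return hi
    return v
"""
print(cyclomatic_complexity(straight_line))  # 1: a single path, no branches
print(cyclomatic_complexity(branchy))        # 3: each if adds one path
```

The lower score for the straight-line snippet reflects the interpretation given above: fewer linearly independent paths correspond to a less convoluted codebase.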


Table 4. Evaluation results for the static analysis of each project and its source code. We report the version control and issue tracking systems as well as the total number, attention rate and average response time for all open and closed issues and pull requests (PRs). Next, we analyze the existence of a test suite and report the resulting code coverage for most projects. Code complexity is only reported for projects written in Python since other languages do not allow for fast retrieval of this metric. Data obtained in August 2018. https://doi.org/10.1371/journal.pone.0208561.t004