One third of CPAN distributions (33.1%) have a GitHub repository, but which distributions are they, and are distributions more likely to have a repo if they're further up the CPAN River? This is a quick post to record the stats for future comparison.

At the QA Hackathon in Berlin in April 2015, we discussed how development practices should mature as a distribution moves up river. One of the points was that by the time a dist has reached the middle of the river, it's a good idea to have a public source code repository. This makes it easier for other people to contribute, and leaves a clear master source should you disappear for some reason.

The following table summarises how many distributions on each stage of the River have a repo listed in the distribution's metadata.

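| Number of downstream dependents | 10k+ | 1k - 9999 | 100 - 999 | 10 - 99 | 1 - 9 | 0 |
|---|---|---|---|---|---|---|
| # dists | 45 | 195 | 570 | 1589 | 8210 | 21250 |
| No repo | 17 | 62 | 190 | 737 | 4575 | 15748 |
| % with no repo | 37.8% | 31.8% | 33.2% | 46.3% | 55.6% | 74.0% |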
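For anyone repeating this comparison later, here is a minimal sketch of the arithmetic behind the table. The counts are copied from the table; the variable and function names are my own, and the recomputed percentages may differ from the table in the last digit depending on how you round.

```python
# Counts copied from the table above, ordered from the head of the river down.
stages = ["10k+", "1k - 9999", "100 - 999", "10 - 99", "1 - 9", "0"]
dists = [45, 195, 570, 1589, 8210, 21250]
no_repo = [17, 62, 190, 737, 4575, 15748]


def share_without_repo(missing, total):
    """Percentage of dists at one stage with no repo listed in their metadata."""
    return 100.0 * missing / total


per_stage = {
    stage: share_without_repo(missing, total)
    for stage, missing, total in zip(stages, no_repo, dists)
}

# Overall figure across every stage of the river.
total_dists = sum(dists)
total_no_repo = sum(no_repo)
overall_with_repo = 100.0 - share_without_repo(total_no_repo, total_dists)

for stage, pct in per_stage.items():
    print(f"{stage:>10}: {pct:.1f}% without a listed repo")
print(f"   overall: {overall_with_repo:.1f}% with a listed repo")
```

The overall figure is where the "one third" at the top of this post comes from.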

As with the water quality metrics, the percentage of dists with no repo listed falls as you go up river, until you get to the head of the river (10k+ dependents), at which point it gets worse again.

The reality is that there are a lot of CPAN distributions that have a GitHub repo, but the repo isn't listed in the distribution's metadata. The first thing I plan to do is work down from the top of the river, looking for these cases, and submitting pull requests. Feel free to help!
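If you want to check a dist yourself, the repo normally lives under `resources` / `repository` in its `META.json` (or `META.yml`). The following is a rough sketch of that check, not the script used to produce the numbers above. It assumes you have already parsed the metadata file into a dict; the function name, the dist name and the URL in the example are made up for illustration.

```python
import json


def listed_repo_url(meta):
    """Return the repository URL listed in parsed dist metadata, or None."""
    resources = meta.get("resources") or {}
    repo = resources.get("repository")
    if isinstance(repo, str):
        # Older META.yml style: the repository is a bare URL string.
        return repo or None
    if isinstance(repo, dict):
        # META.json (meta-spec version 2) style: a map with url and/or web keys.
        return repo.get("url") or repo.get("web")
    return None


# Hypothetical metadata for two made-up dists, one with a repo and one without.
with_repo = json.loads("""
{
  "name": "Example-Dist",
  "resources": {
    "repository": {
      "type": "git",
      "url": "https://github.com/example/example-dist.git",
      "web": "https://github.com/example/example-dist"
    }
  }
}
""")
without_repo = json.loads('{"name": "Other-Dist", "resources": {}}')

print(listed_repo_url(with_repo))
print(listed_repo_url(without_repo))
```

A dist where this returns nothing, but which you know has a public repo, is exactly the kind of case worth a pull request.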
