GitHub just released a new feature: you can search for specific messages in commits. More info here.

We used this feature to search for commits involving refactoring operations. Instead of searching for the generic term "refactoring", which returns too many results, we searched for specific refactorings, such as extract method, rename class, move method, etc.
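The same kind of query can also be issued programmatically. The following is a minimal sketch, assuming GitHub's REST commit-search endpoint (`GET /search/commits`) and its `total_count` response field. Wrapping the phrase in quotes is an assumption about how to match the words together, and the function names `build_search_url` and `count_commits` are illustrative only. Unauthenticated requests are rate limited, and counts returned today will differ from the ones reported in this post.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/commits"


def build_search_url(phrase: str) -> str:
    """Build a commit-search URL for a phrase in commit messages."""
    # Assumption: quoting the phrase asks for the words together.
    # per_page=1 keeps the response small; only the total is needed.
    query = urllib.parse.urlencode({"q": f'"{phrase}"', "per_page": 1})
    return f"{SEARCH_URL}?{query}"


def count_commits(phrase: str) -> int:
    """Return the number of commits GitHub reports for the phrase (network call)."""
    request = urllib.request.Request(
        build_search_url(phrase),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["total_count"]
```

Calling `count_commits("extract method")` would then return the current number of matching commits, subject to the API's rate limits.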

We found around 1.1 million commits denoting the following refactoring operations (the search was performed on January 4th, 2017):

Rename variable: 600,776 commits (results)

Rename method: 157,815 commits (results)

Rename class: 99,264 commits (results)

Move method: 82,009 commits (results)

Move class: 65,364 commits (results)

Extract method: 50,401 commits (results)

Inline method: 39,309 commits (results)

Extract class: 9,009 commits (results)

Extract interface: 7,503 commits (results)

Extract superclass: 1,270 commits (results)

Pull up method: 1,388 commits (results)

Push down method: 100 commits (results)

Therefore, renames are the most popular refactorings (77%), followed by move method/class (13%), extract/inline method (8%), and extract class/superclass/interface (about 2%). Inheritance-related refactorings, i.e. pull up/push down method, are not common (about 0.1%).
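The group shares can be recomputed from the counts listed above. The short sketch below does that; the grouping into five categories follows the paragraph above, while the dictionary keys and the helper name `share` are illustrative.

```python
# Commit counts per refactoring, as listed in this post (January 4th, 2017).
counts = {
    "rename variable": 600_776,
    "rename method": 157_815,
    "rename class": 99_264,
    "move method": 82_009,
    "move class": 65_364,
    "extract method": 50_401,
    "inline method": 39_309,
    "extract class": 9_009,
    "extract interface": 7_503,
    "extract superclass": 1_270,
    "pull up method": 1_388,
    "push down method": 100,
}

# Groups used in the discussion of the results.
groups = {
    "rename": ["rename variable", "rename method", "rename class"],
    "move method/class": ["move method", "move class"],
    "extract/inline method": ["extract method", "inline method"],
    "extract class/superclass/interface": [
        "extract class", "extract interface", "extract superclass",
    ],
    "pull up/push down method": ["pull up method", "push down method"],
}

total = sum(counts.values())


def share(names):
    """Percentage of all listed commits that fall in the given refactorings."""
    return 100 * sum(counts[name] for name in names) / total


print(f"total: {total:,} commits")
for label, names in groups.items():
    print(f"{label}: {share(names):.1f}%")
```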

Clearly, not all refactoring operations are documented in commit messages. However, these results can at least be used as an approximation of the real refactoring activity on GitHub, since they come from more than 1.1 million commits.

Furthermore, they are not fundamentally different from the ones we found in our recent "Why We Refactor? Confessions of GitHub Contributors" paper (pdf available here). In that paper, we used a tool (RefactoringMiner) to detect refactorings in commits (see Table 2, page 5). The main difference seems to be that we detected more Extract Method than Move Method refactorings. Note that the previous study considered only Java systems and a single rename operation (rename package).