Git & refactors: there must be a better way

Wednesday, September 06, 2017

You just moved some method to a different file while doing a refactor. You know, cleaning up stuff and making it clearer. Chances are you modified the moved code.

What about diffing it now? Not trivial, is it?

And, what if someone modified the method at the original location while you were refactoring? A very ugly merge... at best.

What if your Git client understood the refactors well enough to help you in this example?

The example explained

The figure above shows how the method GetTimeBetween() has been moved from Period.cs to TimePeriodField.cs.

The left side shows the files before the changes, and the right side shows the relevant changes that occurred.

The file TimePeriodField.cs only shows up in "after" because it is the destination of the moved method, UnitsBetween(), and it doesn't contain any other changes.

During the move, GetTimeBetween() not only changed file (and class), but also:

It was renamed (from GetTimeBetween to UnitsBetween), that's why the "R" icon is displayed.

It was moved to a different class and file, as we just saw, and that's why the pink "M" icon is displayed. (Moved methods within the same file are marked with an "M" in a different color, like DateComponentsBetween() shows.)

The body of the method was modified, and the "C" icon shows that.

The graphic shown above that illustrated this scenario was not created manually. It was automatically generated by gmaster, the Git client we are working on. It can parse code, it can diff based on that, and it can even calculate methods (or functions) moved across files.
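To make the idea of "calculating moved methods" more concrete, here is a minimal sketch in Python. It is not how gmaster works internally. It assumes the files were already parsed into a map of method names to bodies, and it pairs each method that disappeared with the most similar method that appeared, using plain text similarity. The method bodies, the `detect_moves` function and the 0.6 threshold are all made up for illustration. It only looks at methods that left one place and showed up in another, so same-file reordering is out of scope here.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Move:
    source: tuple  # (file, method name) before the change
    target: tuple  # (file, method name) after the change
    flags: set     # subset of {"M", "R", "C"}


def similarity(a, b):
    """Rough textual similarity of two method bodies, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def detect_moves(before, after, threshold=0.6):
    """before/after map a file name to {method name: body}."""
    # Methods that vanished from their file, and methods that are new in a file.
    removed = [(f, n, body) for f, methods in before.items()
               for n, body in methods.items() if n not in after.get(f, {})]
    added = [(f, n, body) for f, methods in after.items()
             for n, body in methods.items() if n not in before.get(f, {})]

    moves = []
    for src_file, src_name, src_body in removed:
        if not added:
            break
        best = max(added, key=lambda cand: similarity(src_body, cand[2]))
        if similarity(src_body, best[2]) < threshold:
            continue  # nothing similar enough: treat it as a plain deletion
        dst_file, dst_name, dst_body = best
        flags = set()
        if dst_file != src_file:
            flags.add("M")  # moved to a different file
        if dst_name != src_name:
            flags.add("R")  # renamed
        if dst_body != src_body:
            flags.add("C")  # body changed
        moves.append(Move((src_file, src_name), (dst_file, dst_name), flags))
        added.remove(best)  # each new method is matched at most once
    return moves


# Hypothetical method bodies, only the names come from the example.
before = {
    "Period.cs": {
        "GetTimeBetween": "long ticks = end.Ticks - start.Ticks;\nreturn ticks / unit.Ticks;",
        "DateComponentsBetween": "return Compute(start, end);",
    },
    "TimePeriodField.cs": {},
}
after = {
    "Period.cs": {"DateComponentsBetween": "return Compute(start, end);"},
    "TimePeriodField.cs": {
        "UnitsBetween": "long ticks = end.Ticks - start.Ticks;\nreturn ticks / unitTicks;",
    },
}

moves = detect_moves(before, after)
for move in moves:
    print(move.source, "->", move.target, sorted(move.flags))
```

With this input, GetTimeBetween() is matched to UnitsBetween() and gets the three flags discussed above: moved, renamed and changed.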

In gmaster, you can click on the UnitsBetween() method and it will immediately show you something like this:

There must be a better way

Diff tools up to this point have been text-based, and the same holds true for merge tools. They don't care whether the text to merge is code or a love letter.

What if diff/merge tools parsed code to learn the structure of the source files and calculated differences based on that?

That's what we considered a few years ago when we developed SemanticMerge. It is a standalone diff/merge tool that you can plug into any version control software to calculate diffs.

But, to actually track moved code between files, a standalone tool wasn't enough. Deeper collaboration with the version control system was required. Things like being able to work on a commit basis, to get all modified files in a commit and track refactors based on that. Or, being able to access all the files involved in a merge, and not just the ones in conflict.
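As a rough illustration of "working on a commit basis", the sketch below asks Git for every file touched by a commit, which is the raw material a tool would then have to parse. It shells out to `git diff-tree` with the `--no-commit-id`, `--name-status`, `-r` and `-M` options. The helper names `parse_name_status` and `files_in_commit` are ours, not part of Git or gmaster. Note that `-M` only detects renamed files. Git knows nothing about methods moving between files, which is exactly the gap described here.

```python
import subprocess


def parse_name_status(output):
    """Parse `git diff-tree --name-status` output into (status, path, new_path)."""
    changes = []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0][0]  # "R087" -> "R": drop the similarity score
        if status in ("R", "C"):
            # Renames and copies list both the old and the new path.
            changes.append((status, fields[1], fields[2]))
        else:
            changes.append((status, fields[1], None))
    return changes


def files_in_commit(commit, repo="."):
    """List the files changed by a (non-merge, non-root) commit."""
    result = subprocess.run(
        ["git", "-C", repo, "diff-tree", "--no-commit-id",
         "--name-status", "-r", "-M", commit],
        check=True, capture_output=True, text=True,
    )
    return parse_name_status(result.stdout)


# Parsing a hand-written sample, so no repository is needed to try it.
sample = "M\tPeriod.cs\nA\tTimePeriodField.cs\nR087\tOld.cs\tNew.cs\n"
print(parse_name_status(sample))
```

Inside a real repository, `files_in_commit("HEAD")` would return the same kind of list for the last commit.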

gmaster reuses all the features in SemanticMerge and goes one step further: it adds cross-file diff and merge, plus other features such as visualization and a focus on ease of use.

Merging refactored code

The commit we diffed so far is the one marked with a "home" icon (the checked-out commit) in the following diagram (which is the Branch Explorer, also included as part of gmaster):

Meanwhile, someone else modified the GetTimeBetween() method in its original location (Period.cs) on the branch ChangeGetTimeBetween.

What if now we try to merge the two branches?

Git detects that Period.cs was modified concurrently, so there is a conflict to solve.

But, gmaster goes one step further. Do you see the big blue banner on top? "Code moved between files that needs merge detected!". Basically, it parsed the files involved in the merge (even if they were not in conflict) and found that something else happened.

If you click on Show, you'll see something like this:

This is really the "meat" of gmaster, and what puts it several steps ahead in merge technology. First, the tree at the top left now says "multifile conflict" and shows the list of files involved. Second, you can see TimePeriodField.cs, which was not considered before.

But more interestingly, check the diagram above; it shows how the file was (base, in the center of the graphic), and then the changes made by each contributor on the left and right. To make it clearer, we'll isolate the above diagram in the following screenshot:

You can quickly see how GetTimeBetween() was modified on one side, but was moved/renamed/changed on the other. That's why it is in conflict, and gmaster will help you solve this problem.

Check out the following video to see how the changes made in GetTimeBetween() are merged with the ones in UnitsBetween() and placed correctly in TimePeriodField.cs.
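One way to picture why this is a conflict, and why part of it can still be solved automatically, is to run a three-way merge on each property of the method separately. The sketch below is only an illustration under that assumption, not gmaster's actual algorithm. The `Method` class, the `merge_method` function and the method bodies are invented for the example.

```python
from dataclasses import dataclass

CONFLICT = object()  # marker: both sides changed the value in different ways


@dataclass(frozen=True)
class Method:
    file: str
    name: str
    body: str


def merge_value(base, ours, theirs):
    """Classic three-way rule applied to a single value."""
    if ours == theirs:
        return ours    # same on both sides (or untouched by both)
    if ours == base:
        return theirs  # only their side changed it
    if theirs == base:
        return ours    # only our side changed it
    return CONFLICT    # both changed it differently


def merge_method(base, ours, theirs):
    """Merge location, name and body independently."""
    return {
        field: merge_value(getattr(base, field),
                           getattr(ours, field),
                           getattr(theirs, field))
        for field in ("file", "name", "body")
    }


# Hypothetical bodies; file and method names come from the example.
base = Method("Period.cs", "GetTimeBetween", "return end - start;")
ours = Method("TimePeriodField.cs", "UnitsBetween", "return (end - start) / unit;")
theirs = Method("Period.cs", "GetTimeBetween", "return Abs(end - start);")

result = merge_method(base, ours, theirs)
print(result["file"], result["name"], result["body"] is CONFLICT)
```

The destination file and the new name are resolved without help, because only one side touched them. The body was edited on both sides, so it is the one piece that still needs a text merge, and its result belongs in TimePeriodField.cs.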

Not just for complex stuff

What I showed is the state of the art in merge, but don't get me wrong; gmaster will help in merge conflicts without cross-file refactors.

It can calculate semantic diffs and merges inside a single file. For example, it handles the case where you rearrange a file while your colleague changes a method in its original location.

And, it comes with a built-in (yet super powerful) text-based 3-way merge tool for the occasions where semantic doesn't apply: HTML files, for instance, or source files in a language that is not supported. (gmaster supports C, C++, C#, Java and a set of community-contributed parsers, including Delphi and some others.)
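The single-file case is easy to try at home. The sketch below uses Python's standard `ast` module on Python source (not C#, and not gmaster's parsers) to compare two versions of a file by their top-level functions instead of by their lines. The `semantic_diff` helper is a name we made up. Because `ast.dump` leaves out line numbers by default, a function that only changed position compares as equal.

```python
import ast


def functions(source):
    """Map each top-level function name to a position-independent dump."""
    tree = ast.parse(source)
    return {node.name: ast.dump(node)
            for node in tree.body if isinstance(node, ast.FunctionDef)}


def semantic_diff(before, after):
    """Compare two versions of a file by function, ignoring their order."""
    old, new = functions(before), functions(after)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(name for name in old.keys() & new.keys()
                          if old[name] != new[name]),
    }


original = "def a():\n    return 1\n\ndef b():\n    return 2\n"
rearranged = "def b():\n    return 2\n\ndef a():\n    return 1\n"
rearranged_and_edited = "def b():\n    return 3\n\ndef a():\n    return 1\n"

print(semantic_diff(original, rearranged))
print(semantic_diff(original, rearranged_and_edited))
```

A text diff would report the rearranged file as heavily modified. Here the first comparison finds nothing to report, and the second one points only at `b`, the function whose body really changed.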

Turning complex merges into automatic ones

We have been using all of this for months, and the best part is not even when the conflicts show up and you must merge them. The best part is that using semantic merges, even multi-file ones, automatically solves (if you want to) many conflicts that would otherwise require manual intervention.

It can be a great time saver.

Enter gmaster beta

We are developing gmaster around 3 key concepts: visualization (the Branch Explorer you saw above), built-in diff and merge, and semantic capabilities.

gmaster is a Git client for Windows. It is currently in beta and available to download right now from https://gmaster.io/installer.

The example above can be found here: https://github.com/gmasterscm/tour.

1-minute videos explaining how it works can be found here: https://gmaster.io/tour.