The 2016 Git Retrospective: Diffs
In 2016, we saw improvements made to one of the most important parts of any version control system: the ability to calculate and display diffs between revisions of files.
Welcome to the next installment of our Git in 2016 retrospective! In part one, we looked at upgrades made to git worktree throughout the year. In part two, we'll be investigating some of the improvements made to one of the most important parts of any version control system: the ability to calculate and display the differences between revisions of files.
In June, the Git 2.9 release tweaked the git diff command's default options to imply the -M flag, making diffs more pleasant for users without requiring them to customize their Git config. Git 2.9 also introduced the commit.verbose configuration option, which lets you quickly review your changes whilst writing your commit message. In November, Git 2.11 shipped some exciting experimental improvements to Git's core diffing algorithm, to render more human-readable diffs.
Detecting Renamed Files by Default
Unlike some other version control systems, Git doesn’t explicitly store the fact that files have been renamed. For example, if I edited a simple Node.js application and renamed index.js to app.js and then ran git diff, I’d get back what looks like a file deletion and an addition:
diff --git a/app.js b/app.js
new file mode 100644
index 0000000..144ec7f
--- /dev/null
+++ b/app.js
@@ -0,0 +1 @@
+module.exports = require('./lib/index');
diff --git a/index.js b/index.js
deleted file mode 100644
index 144ec7f..0000000
--- a/index.js
+++ /dev/null
@@ -1 +0,0 @@
-module.exports = require('./lib/index');
I guess moving or renaming a file is technically just a delete followed by an add, but this isn’t the most human-friendly way to show it. Instead, you can use the -M flag to instruct Git to attempt to detect renamed files on the fly when computing a diff. For the above example, git diff -M gives us:
diff --git a/index.js b/app.js
similarity index 100%
rename from index.js
rename to app.js
The similarity index on the second line tells us how similar the content of the compared files was. By default, -M treats any two files that are more than 50% similar (that is, you need to modify less than 50% of their lines to make them identical) as a renamed file. You can choose your own similarity threshold by appending a percentage, e.g., -M80%.
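To see the threshold in action, here is a minimal sketch you can run in a scratch directory. The temporary repository, the demo identity, and the file contents are all made up for illustration; only -M<n>% and --name-status come from Git itself. It renames a ten-line file while rewriting one of its lines, so the two versions should come out at roughly 90% similar:

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"

# Commit a ten-line file.
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "console.log('this is line number $i');"
done > index.js
git add index.js
git commit -q -m "Add index.js"

# Rename the file and rewrite one of its ten lines.
git mv index.js app.js
for i in 1 2 3 4 5 6 7 8 9 10; do
  if [ "$i" -eq 5 ]; then
    echo "console.log('a completely different fifth line');"
  else
    echo "console.log('this is line number $i');"
  fi
done > app.js
git add app.js

# Loose threshold: the pair should be reported as a rename (status R).
loose=$(git diff --cached -M80% --name-status)
# Strict threshold: the same change should fall back to an add plus a delete.
strict=$(git diff --cached -M99% --name-status)

echo "-M80%:"; echo "$loose"
echo "-M99%:"; echo "$strict"
```

The --name-status output keeps the comparison compact: a rename shows up as a single R line with its similarity score, while an undetected rename shows up as separate A and D lines.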
As of Git 2.9, the git diff and git log commands will both detect renames by default as if you'd passed the -M flag. If you dislike this behavior (or, more realistically, are parsing the diff output via a script), then you can disable it by explicitly passing the --no-renames flag.
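For scripts that parse diff output, a minimal sketch of pinning the behavior down might look like this. The scratch repository and demo identity are assumptions for illustration; --no-renames is the flag described above, and diff.renames is the matching Git configuration key:

```shell
#!/bin/sh
set -e

# Throwaway repository with a single committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"

echo "module.exports = require('./lib/index');" > index.js
git add index.js
git commit -q -m "Add index.js"
git mv index.js app.js

# Per invocation: always report the rename as an add plus a delete.
per_command=$(git diff --cached --no-renames --name-status)
echo "$per_command"

# Per repository: switch rename detection off for every diff in this clone.
git config diff.renames false
per_repo=$(git diff --cached --name-status)
echo "$per_repo"
```

Passing the flag explicitly is the more robust choice for a script, since it does not depend on whatever configuration the person running it happens to have.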
Verbose Commits
Do you ever invoke git commit and then stare blankly at your shell trying to remember all the changes you just made? The verbose flag is for you!
Instead of:
Ah crap, which dependency did I just rev?
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Your branch is up-to-date with 'origin/master'.
#
# Changes to be committed:
# modified: package.json
#
...you can invoke git commit --verbose to view an inline diff of your changes. Don’t worry, it won’t be included in your commit message:
Bump left-pad to 1.1.0
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Your branch is up-to-date with 'origin/master'.
#
# Changes to be committed:
# modified: package.json
#
# ------------------------ >8 ------------------------
# Do not touch the line above.
# Everything below will be removed.
diff --git a/package.json b/package.json
index 28bcfc7..825e958 100644
--- a/package.json
+++ b/package.json
@@ -31,7 +31,7 @@
 "@atlassian/gulp-atlassian-amd-name": "0.1.1",
 "@atlassian/gulp-atlassian-expose-soy": "^1.0.1",
 "@atlassian/gulp-atlassian-soy": "^2.0.6",
- "left-pad": "1.0.3",
+ "left-pad": "1.1.0",
 "babel-core": "^6.9.0",
 "babel-plugin-react-require": "^1.0.2",
 "babel-plugin-transform-es2015-modules-amd": "^6.8.0",
 "babel-plugin-transform-es2015-modules-amd-lazy": "^0.1.0",
~
The --verbose flag isn’t new, but as of Git 2.9 you can enable it permanently with git config --global commit.verbose true.
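If a global setting feels too broad, the same option can be scoped to a single repository instead. A minimal sketch, assuming a throwaway repository created just for the demonstration:

```shell
#!/bin/sh
set -e

# Throwaway repository; in practice you would run this inside your own clone.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the setting is written to this repository's .git/config only.
git config commit.verbose true

# Read the value back to confirm what git commit will see here.
setting=$(git config --get commit.verbose)
echo "commit.verbose = $setting"

# For a single commit without the inline diff, pass: git commit --no-verbose
```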
Experimental Diff Improvements
git diff can produce some slightly confusing output when the lines before and after a modified section are the same. This can happen when you have two or more similarly structured functions in a file. For a slightly contrived example, imagine we have a JS file that contains a single function:
/* @return {string} "Bitbucket" */
function productName() {
  return "Bitbucket";
}
Now imagine we’ve committed a change that prepends another function that does something similar:
/* @return {string} "Bitbucket" */
function productId() {
  return "Bitbucket";
}

/* @return {string} "Bitbucket" */
function productName() {
  return "Bitbucket";
}
You’d expect git diff to show the top five lines as added, but it actually incorrectly attributes the very first line to the original commit:
$ git diff HEAD~1
diff --git a/name.js b/name.js
index 2c1ed9c..0f5bd2a 100644
--- a/name.js
+++ b/name.js
@@ -1,4 +1,9 @@
 /* @return {string} "Bitbucket" */
+function productId() {
+  return "Bitbucket";
+}
+
+/* @return {string} "Bitbucket" */
 function productName() {
   return "Bitbucket";
 }
The wrong comment is included in the diff! Not the end of the world, but the couple of seconds of cognitive overhead from the "Whaaat?" every time this happens can add up.
In November, Git 2.11 introduced a new experimental diff option, --indent-heuristic, that attempts to produce more aesthetically pleasing diffs:
$ git diff --indent-heuristic HEAD~1
diff --git a/name.js b/name.js
index 2c1ed9c..0f5bd2a 100644
--- a/name.js
+++ b/name.js
@@ -1,4 +1,9 @@
+/* @return {string} "Bitbucket" */
+function productId() {
+  return "Bitbucket";
+}
+
 /* @return {string} "Bitbucket" */
 function productName() {
   return "Bitbucket";
 }
Under the hood, --indent-heuristic cycles through the possible diffs for each change and assigns each a “badness” score. This is based on heuristics like whether the diff block starts and ends with different levels of indentation (which is aesthetically bad) and whether the diff block has leading and trailing blank lines (which is aesthetically pleasing). Then, the block with the lowest badness score is output.
This feature is experimental, but you can test it out ad-hoc by applying the --indent-heuristic option to any git diff command. Or, if you like to live on the bleeding edge, you can enable it across your system with:
$ git config --global diff.indentHeuristic true
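If you'd like to compare the two behaviors side by side, the productName example can be reproduced in a scratch repository. This is a minimal sketch: the temporary repository, the demo identity, and the first_added helper are made up for illustration, and it assumes a Git version (2.11 or later) that understands both --indent-heuristic and its negation, --no-indent-heuristic:

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity
git config user.name "Demo"

# Original file: a single documented function.
cat > name.js <<'EOF'
/* @return {string} "Bitbucket" */
function productName() {
  return "Bitbucket";
}
EOF
git add name.js
git commit -q -m "Add productName"

# Prepend a similarly structured function (left uncommitted).
cat > name.js <<'EOF'
/* @return {string} "Bitbucket" */
function productId() {
  return "Bitbucket";
}

/* @return {string} "Bitbucket" */
function productName() {
  return "Bitbucket";
}
EOF

# Hypothetical helper: print the first added line of a diff read from stdin,
# skipping the "+++ b/name.js" header line.
first_added() {
  grep '^+' | grep -v '^+++' | head -n 1
}

without=$(git diff --no-indent-heuristic | first_added)
with=$(git diff --indent-heuristic | first_added)

echo "without heuristic: $without"
echo "with heuristic:    $with"
```

Without the heuristic, the added block should start at the productId line and borrow the second comment; with it, the block should start at the new comment, matching the two diffs shown above.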
Coming Up Next
git diff saw some powerful improvements in 2016! I imagine we'll see the new experimental diff options being tweaked and their scoring refined over the next couple of releases in 2017 before being enabled by default for all users. Stay tuned for our next Git in 2016 retrospective article, in which we look at improvements made to Git's extension system and Git LFS: a companion project for tracking large files with Git. If you have any questions or hot diff tips, please hit me up on Twitter: I'm @kannonboy.
If you stumbled on these articles out of order, you can check out the other topics covered in our Git in 2016 retrospective below:
Or, if you've read 'em all and still want more, check out Atlassian's Git tutorials (I'm a regular contributor there) for some tips and tricks to improve your workflow.