Why You Need a Git Pre-Commit Hook and Why Most Are Wrong

Steve Pulec

Mar. 03, 12 · Interview

A pre-commit hook is a piece of code that runs before every commit and determines whether or not the commit should be accepted. Think of it as the gatekeeper to your codebase.

Want to ensure you didn’t accidentally leave any pdbs in your code? Pre-commit hook. Want to make sure your JavaScript is JSHint approved? Pre-commit hook. Want to guarantee clean, readable PEP8-compliant code? Pre-commit hook. Want to pipe all of the comments in your codebase through Strunk & White? Please don’t.

The pre-commit hook is just an executable file that runs before every commit. If it exits with zero status, the commit is accepted. If it exits with a non-zero status, the commit is rejected. (Note: a pre-commit hook can be bypassed by passing the --no-verify argument to git commit.)
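As a minimal sketch of that exit-status contract (not the hook used later in this article), the following shows a hook skeleton that turns a list of checks into a zero or non-zero status. The names run_checks, no_op_check, and main are hypothetical and exist only for illustration.

```python
import sys


def run_checks(checks):
    """Run every check; return 0 if all pass and 1 if any fails."""
    status = 0
    for check in checks:
        if not check():
            status = 1  # keep going so every failure gets reported
    return status


def no_op_check():
    # Hypothetical placeholder: a real check would inspect the staged files.
    return True


def main():
    # Git accepts the commit on exit status 0 and rejects it otherwise.
    sys.exit(run_checks([no_op_check]))
```

A hook file built this way would end by calling main(), so the process exit status is what Git sees.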

Along with the pre-commit hook, numerous other Git hooks are available: post-commit, post-merge, pre-receive, and others listed in the Git hooks documentation.

Why Most Pre-Commit Hooks Are Wrong

Be wary of examples you find, as the majority of pre-commit hooks you’ll see on the web are wrong. Most test against whatever files are currently on disk, not what is in the staging area (the files actually being committed).

We avoid this in our hook by stashing all changes that are not part of the staging area before running our checks and then popping the changes afterwards. This is very important because a file could be fine on disk while the changes that are being committed are wrong.
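To make the disk-versus-staging distinction concrete, here is a small sketch of reading the staged version of a file directly. It is a different technique from the stash approach our hook takes, shown only to illustrate that the two copies can differ. It assumes the ":path" revision syntax of git show, which names the copy of a file in the index, and the helper names staged_show_command, read_staged, and read_on_disk are hypothetical.

```python
import subprocess


def staged_show_command(path):
    """Build the git command that prints the staged (index) version of path."""
    # ":<path>" refers to the file as it exists in the index,
    # with the path given relative to the top of the repository.
    return ['git', 'show', ':' + path]


def read_staged(path):
    """Return the staged contents of path as text."""
    return subprocess.check_output(staged_show_command(path),
                                   universal_newlines=True)


def read_on_disk(path):
    """Return the working-tree contents of path as text."""
    with open(path) as handle:
        return handle.read()
```

If read_staged and read_on_disk return different text for the same path, a check that only looks at the disk is validating something other than what will be committed.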

The code below is the pre-commit hook we use at Yipit. Our hook is simply a set of checks to be run against any files that have been modified in this commit. Each check can be configured to include/exclude particular types of files. It is designed for a Django environment, but should be adaptable to other environments with minor changes. Note that you need Git 1.7.7+.

#!/usr/bin/env python

import os
import re
import subprocess
import sys

# matches 'git status --porcelain' lines for staged modifications (M) or additions (A)
modified = re.compile(r'^(?:M|A)(\s+)(?P<name>.*)')

# each check runs 'command' on every file that matches 'match_files'
# (if given) and does not match 'ignore_files' (if given)
CHECKS = checks = [
    {
        'output': 'Checking for pdbs...',
        'command': 'grep -n "import pdb" %s',
        'ignore_files': ['.*pre-commit'],
        'print_filename': True,
    },
    {
        'output': 'Checking for ipdbs...',
        'command': 'grep -n "import ipdb" %s',
        'ignore_files': ['.*pre-commit'],
        'print_filename': True,
    },
    {
        'output': 'Checking for print statements...',
        'command': 'grep -n print %s',
        'match_files': [r'.*\.py$'],
        'ignore_files': ['.*migrations.*', '.*management/commands.*', '.*manage.py', '.*/scripts/.*'],
        'print_filename': True,
    },
    {
        'output': 'Checking for console.log()...',
        'command': 'grep -n console.log %s',
        'match_files': [r'.*yipit/.*\.js$'],
        'print_filename': True,
    },
    {
        'output': 'Checking for debugger...',
        'command': 'grep -n debugger %s',
        'match_files': [r'.*\.js$'],
        'print_filename': True,
    },
    {
        'output': 'Running JSHint...',
        # By default, jshint prints 'Lint Free!' upon success. We want to filter this out.
        'command': 'jshint %s | grep -v "Lint Free!"',
        'match_files': [r'.*yipit/.*\.js$'],
        'print_filename': False,
    },
    {
        'output': 'Running pyflakes...',
        'command': 'pyflakes %s',
        'match_files': [r'.*\.py$'],
        'ignore_files': ['.*settings/.*', '.*manage.py', '.*migrations.*', '.*/terrain/.*'],
        'print_filename': False,
    },
    {
        'output': 'Running pep8...',
        'command': 'pep8 -r --ignore=E501,W293 %s',
        'match_files': [r'.*\.py$'],
        'ignore_files': ['.*migrations.*'],
        'print_filename': False,
    },
    {
        'output': 'Checking for SASS changes...',
        'command': 'sass --quiet --update %s',
        'match_files': [r'.*\.scss$'],
        'print_filename': True,
    },
]


def matches_file(file_name, match_files):
    # true if file_name matches any of the given regular expressions
    return any(re.compile(match_file).match(file_name) for match_file in match_files)


def check_files(files, check):
    result = 0
    print(check['output'])
    for file_name in files:
        if 'match_files' not in check or matches_file(file_name, check['match_files']):
            if 'ignore_files' not in check or not matches_file(file_name, check['ignore_files']):
                process = subprocess.Popen(check['command'] % file_name, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, universal_newlines=True)
                out, err = process.communicate()
                # any output from the command counts as a failed check
                if out or err:
                    if check['print_filename']:
                        prefix = '\t%s:' % file_name
                    else:
                        prefix = '\t'
                    output_lines = ['%s%s' % (prefix, line) for line in out.splitlines()]
                    print('\n'.join(output_lines))
                    if err:
                        print(err)
                    result = 1
    return result


def main(all_files):
    # stash any changes to the working tree that are not going to be committed
    subprocess.call(['git', 'stash', '-u', '--keep-index'], stdout=subprocess.PIPE)

    files = []
    if all_files:
        for root, dirs, file_names in os.walk('.'):
            for file_name in file_names:
                files.append(os.path.join(root, file_name))
    else:
        p = subprocess.Popen(['git', 'status', '--porcelain'], stdout=subprocess.PIPE, universal_newlines=True)
        out, err = p.communicate()
        for line in out.splitlines():
            match = modified.match(line)
            if match:
                files.append(match.group('name'))

    result = 0

    print('Running Django Code Validator...')
    return_code = subprocess.call('$VIRTUAL_ENV/bin/python manage.py validate', shell=True)
    result = return_code or result

    for check in checks:
        result = check_files(files, check) or result

    # unstash changes to the working tree that we had stashed
    subprocess.call(['git', 'reset', '--hard'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    subprocess.call(['git', 'stash', 'pop', '-q'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    sys.exit(result)


if __name__ == '__main__':
    # pass --all-files to check every file under the current directory
    all_files = False
    if len(sys.argv) > 1 and sys.argv[1] == '--all-files':
        all_files = True
    main(all_files)
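
The hook finds its files by matching lines of git status --porcelain output, where the first column describes the staging area and the second describes the working tree. The sketch below illustrates that selection with plain string slicing instead of the hook's regular expression. The function name staged_paths and the sample output are made up for illustration, and the sketch does not handle renames or paths that Git quotes.

```python
def staged_paths(porcelain_output):
    """Return paths whose staged (first-column) status is M or A."""
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY path"
        index_status = line[0]  # X: status in the staging area
        path = line[3:]         # path starts after "XY "
        if index_status in ('M', 'A'):
            paths.append(path)
    return paths


# Hypothetical status output: one staged edit, one unstaged edit,
# one staged addition, and one untracked file.
sample = (
    "M  app/models.py\n"
    " M notes.txt\n"
    "A  static/site.js\n"
    "?? scratch.py\n"
)
selected = staged_paths(sample)
```

Unlike the hook's pattern, this version also selects a file that has both staged and unstaged edits (status MM). In the hook that case does not arise, because the stash has already removed the unstaged edits by the time the status is read.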

To use this hook or a hook that you create yourself, simply copy the file to .git/hooks/pre-commit inside of your project and make sure that it is executable, or add it to your Git repo and set up a symlink.
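
Both installation options can be sketched in a few lines of Python. This is an illustration under assumptions, not part of the hook: install_hook is a hypothetical helper, it assumes the repository already has a .git/hooks directory, and it replaces any pre-commit hook that is already installed.

```python
import os
import shutil
import stat


def install_hook(source, repo_root, symlink=False):
    """Install source as repo_root/.git/hooks/pre-commit; return the target path."""
    target = os.path.join(repo_root, '.git', 'hooks', 'pre-commit')
    if os.path.lexists(target):
        os.remove(target)  # replace any hook that is already installed
    if symlink:
        # Point at the tracked copy so edits to it take effect immediately.
        os.symlink(os.path.abspath(source), target)
    else:
        shutil.copy(source, target)
    # Git only runs hooks that are executable.
    mode = os.stat(target).st_mode
    os.chmod(target, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target
```

With symlink=True the executable bit ends up on the tracked file itself, since the chmod follows the link.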

Published at DZone with permission of Steve Pulec. See the original article here.

Opinions expressed by DZone contributors are their own.