Commit Message Linting with Magit

I have a confession to make. I’ve been writing bad commit messages for years. It takes time to write good commit messages, and often I’m in a hurry. Or so I tell myself. But that’s a false dichotomy. I can have my cake and eat it too! Recently I discovered how to use Magit (the Emacs package for interacting with Git) to enforce best practices for commit messages.

Here’s the punch line: when I try to finalize a commit after writing a commit message, various checks are performed. For each violation, I am prompted whether to fix it or ignore it and commit anyway. It’s an extra 2 seconds that improves every commit message.

I took inspiration from Chris Beams’s post “How to Write a Git Commit Message.” As a quick overview, here are the recommended best practices for commit messages:

  1. Separate subject from body with a blank line
  2. Limit the subject line to 50 characters
  3. Capitalize the subject line
  4. Do not end the subject line with a period
  5. Use the imperative mood in the subject line
  6. Wrap the body at 72 characters
  7. Use the body to explain the what and why vs. how

With a combination of magit settings, some custom elisp, and a git commit template, each of these suggestions is supported in some way.

Built-In Magit Checks

Out of the box, Magit has the ability to check the subject length, auto-wrap the body at 72 characters, and check that there is a blank line between the subject and the body. These capabilities are documented here.

The git-commit-summary-max-length user option governs the summary length, while the git-commit-fill-column user option governs the body width. Note that git-commit-fill-column is listed as deprecated, but I’m not sure why! It’s so useful! I think the suggestion is just to use fill-column, but then I would need to configure that for the commit message major mode…it seems like such a hassle.
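For readers who would rather avoid the deprecated option, setting fill-column locally in commit buffers may be less of a hassle than it sounds. The following is a minimal sketch rather than part of my own configuration. The helper name my-git-commit-set-fill-column is made up for illustration, and the sketch assumes that git-commit-setup-hook runs inside the commit message buffer.

```elisp
;; Hypothetical helper: wrap commit message bodies at 72 columns
;; without relying on the deprecated `git-commit-fill-column'.
(defun my-git-commit-set-fill-column ()
  "Set `fill-column' to 72 in the current commit message buffer."
  (setq-local fill-column 72))

;; Assumption: `git-commit-setup-hook' runs in the commit buffer.
(add-hook 'git-commit-setup-hook #'my-git-commit-set-fill-column)
```

Because the value is buffer-local, other buffers keep whatever fill-column they already use.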

The git-commit-finish-query-functions user option is the gateway to enforcing the remaining checks. The git-commit-style-convention-checks user option allows the user to specify which checks to apply. Out of the box, non-empty-second-line and overlong-summary-line are supported, which address items 1 and 2 above. I was able to write some elisp code to support items 3, 4, and 5.
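Since the default value of git-commit-style-convention-checks may not include every built-in check (this can vary with the Magit version), it seems safest to list the desired checks explicitly. A minimal sketch, assuming only the two check names mentioned above:

```elisp
;; Enable both built-in checks explicitly instead of relying on the
;; default value, which may contain only a subset of them.
(setq git-commit-style-convention-checks
      '(non-empty-second-line overlong-summary-line))
```

The same value could instead be given in the :custom section of a use-package declaration.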

Custom Checks

Looking at the source code for git-commit-check-style-conventions, it was fairly clear how to add additional checks. I’ll list the code and then discuss it.

;; Parallels `git-commit-style-convention-checks',
;; allowing the user to specify which checks they
;; wish to enforce.
(defcustom my-git-commit-style-convention-checks '(summary-starts-with-capital
                                                   summary-does-not-end-with-period
                                                   summary-uses-imperative)
  "List of checks performed by `my-git-commit-check-style-conventions'.
Valid members are `summary-starts-with-capital',
`summary-does-not-end-with-period', and
`summary-uses-imperative'. That function is a member of
`git-commit-finish-query-functions'."
  :options '(summary-starts-with-capital
             summary-does-not-end-with-period
             summary-uses-imperative)
  :type '(list :convert-widget custom-hook-convert-widget)
  :group 'git-commit)

;; Parallels `git-commit-check-style-conventions'
(defun my-git-commit-check-style-conventions (force)
  "Check for violations of certain basic style conventions.

For each violation ask the user if she wants to proceed anyway.
Option `my-git-commit-style-convention-checks' controls which
conventions are checked.  A non-nil FORCE skips all checks."
  (or force
      (save-excursion
        (goto-char (point-min))
        (re-search-forward (git-commit-summary-regexp) nil t)
        ;; Group 1 holds the summary up to the length limit and group 2
        ;; any overflow, so join them to get the whole summary line.
        (let ((summary (concat (match-string 1) (match-string 2)))
              (first-word))
          (and (or (not (memq 'summary-starts-with-capital
                              my-git-commit-style-convention-checks))
                   (let ((case-fold-search nil))
                     (string-match-p "^[[:upper:]]" summary))
                   (y-or-n-p "Summary line does not start with capital letter.  Commit anyway? "))
               (or (not (memq 'summary-does-not-end-with-period
                              my-git-commit-style-convention-checks))
                   ;; Backslashes are literal inside a bracket expression,
                   ;; so the punctuation characters are listed unescaped.
                   (not (string-match-p "[.!?;,:]$" summary))
                   (y-or-n-p "Summary line ends with punctuation.  Commit anyway? "))
               (or (not (memq 'summary-uses-imperative
                              my-git-commit-style-convention-checks))
                   (progn
                     (string-match "^\\([[:alpha:]]*\\)" summary)
                     (setq first-word (downcase (match-string 1 summary)))
                     (car (member first-word (get-imperative-verbs))))
                   (when (y-or-n-p "Summary line should use imperative.  Does it? ")
                     (when (y-or-n-p (format "Add `%s' to list of imperative verbs? " first-word))
                       (with-temp-buffer
                         (insert first-word)
                         (insert "\n")
                         (write-region (point-min) (point-max) imperative-verb-file t)))
                     t)))))))

The first form defines a customizable list of checks the user wishes to enforce. By default, all are enforced. The second form, which defines the function my-git-commit-check-style-conventions, is where the linting actually occurs. The code is intended to mirror the built-in git-commit-check-style-conventions as closely as possible, perhaps for future integration.

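I’d like to draw the reader’s attention to the three or forms. Each one passes if its check is disabled, if the summary satisfies the check, or if I confirm at the prompt that I want to commit anyway. The first verifies the summary starts with an uppercase letter. The second verifies the summary does not end with punctuation, which I have defined as any of the characters .!?;,:. Both of these are simple regular expression matches.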
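To make the two regular expression checks easier to experiment with outside of a commit buffer, they can be pulled out into small predicates. This is only a sketch: the function names below are hypothetical and are not part of Magit or of the code above.

```elisp
;; Hypothetical standalone versions of the two regexp checks.
(defun my-summary-starts-with-capital-p (summary)
  "Return non-nil if SUMMARY begins with an uppercase letter."
  ;; Bind `case-fold-search' to nil so that [[:upper:]] is case-sensitive.
  (let ((case-fold-search nil))
    (and (string-match-p "\\`[[:upper:]]" summary) t)))

(defun my-summary-ends-with-punctuation-p (summary)
  "Return non-nil if SUMMARY ends with one of the characters .!?;,:"
  (and (string-match-p "[.!?;,:]\\'" summary) t))

;; Example calls:
;; (my-summary-starts-with-capital-p "Fix typo")    ; => t
;; (my-summary-starts-with-capital-p "fix typo")    ; => nil
;; (my-summary-ends-with-punctuation-p "Fix typo.") ; => t
```

Evaluating these in the scratch buffer is a quick way to see how a given summary would fare before wiring anything into the commit workflow.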

I am most proud of the last one. It checks whether the first word of the summary is a verb in the imperative mood. I thought I would have to do some crazy NLP for this (I am in fact working on the NLP specialization on Coursera presently), but I took a page out of pydocstyle’s book. That tool simply uses a word list of verbs commonly used in Python docstring summaries. For each such verb, its imperative form is listed in a file. I grabbed that file and stored it as imperative_verbs.txt in my Emacs folder (removing the comments at the top). I check whether the first word of the summary is in that list. Then, if it isn’t and the user confirms that the summary really is imperative, the code asks the user if she would like to add the word to the list.
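The word list lookup itself can be sketched as a pure function that takes the list of verbs as an argument, which makes it easy to try out without the file. The names my-summary-first-word and my-summary-imperative-p are hypothetical helpers introduced only for this illustration.

```elisp
;; Hypothetical helpers illustrating the word-list lookup.
(defun my-summary-first-word (summary)
  "Return the leading run of letters in SUMMARY, downcased."
  (if (string-match "\\`[[:alpha:]]*" summary)
      (downcase (match-string 0 summary))
    ""))

(defun my-summary-imperative-p (summary verbs)
  "Return non-nil if the first word of SUMMARY is a member of VERBS.
VERBS is a list of lowercase imperative verbs."
  (and (member (my-summary-first-word summary) verbs) t))

;; Example calls:
;; (my-summary-imperative-p "Add tests" '("add" "fix"))   ; => t
;; (my-summary-imperative-p "Added tests" '("add" "fix")) ; => nil
```

The lookup is an exact match on the lowercased first word, so it is only as good as the word list. That is why the prompt to extend the list matters.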

The code below is how I load the word list.

(defvar imperative-verb-file "~/.emacs.d/imperative_verbs.txt"
  "File listing one imperative verb per line.")

(defun get-imperative-verbs ()
  "Return a list of imperative verbs read from `imperative-verb-file'."
  (with-temp-buffer
    (insert-file-contents imperative-verb-file)
    ;; Split on newlines, dropping empty strings.
    (split-string (buffer-string) "\n" t)))

Finally, I add these additional checks to the git-commit-finish-query-functions. Here is my use-package declaration:

(use-package magit
  :ensure t
  :bind (("C-x g" . magit-status))
  :custom
  (git-commit-summary-max-length 50)
  (git-commit-fill-column 72)
  :config
  (add-hook 'after-save-hook 'magit-after-save-refresh-status t)
  (add-to-list 'git-commit-finish-query-functions
               #'my-git-commit-check-style-conventions))
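For setups that do not use use-package, the same registration can probably be done with with-eval-after-load. This is a sketch under the assumption that the library defining git-commit-finish-query-functions provides the feature git-commit:

```elisp
;; Register the custom check once the git-commit library is loaded.
;; Assumption: the library provides the feature `git-commit'.
(with-eval-after-load 'git-commit
  (add-to-list 'git-commit-finish-query-functions
               #'my-git-commit-check-style-conventions))
```

Deferring the call this way avoids touching the variable before the library has defined it.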

Git Commit Template

That addresses items 1 through 6. What about the last guideline: “Use the body to explain the what and why vs. how”? For that, I use a git commit template.

In my home directory, I have a file .gitmessage with the following contents:


# Explain the *what* and *why* vs. *how*.
#
# Reference issue tracker numbers at bottom, e.g.
#      Resolves: #123
#      See also: #456, #789
#

Then, in my .gitconfig file, I have the following:

[commit]
        template = ~/.gitmessage

Every time I start a commit, the .gitmessage is shown in the commit window as a reminder to focus on the what and why. It also reminds me to reference any ticket numbers. (One area of future improvement is to use bug-reference-mode to automatically link associated tickets.)

Summary

Often we use a perceived conflict between speed and quality to justify lax standards. But technology allows us to have the best of both worlds! I’m hoping these checks encourage me to write better commit messages going forward.

Bob Wilson
Data Scientist

The views expressed on this blog are Bob’s alone and do not necessarily reflect the positions of current or previous employers.
