schacon · 2 years ago
For better or worse, my experience as a GitHub cofounder and author of several Git books (Pro Git, etc) is that the Git commit message is a unique vector for code documentation that is highly sub-optimal.

The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line. So in the case of this commit example would be the very simple message of a generic "US-ASCII error" problem. Everything they talk about in this article is what is great about the _rest_ of the commit message, which, given modern tools, is _almost never_ seen by anyone.

The main problem is that Git was built so that the commit message is the _email body_, meant to be read by everyone in the project. But for better or worse, that is not generally the role of this text today. Almost nobody ever sees it. Unless it's discussed in a bunch of patch series over a mailing list, nobody reads anything other than the first 50 chars of the headline. It's actively difficult to do, by nearly every tool built around the Git ecosystem.

Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages that are relevant to the code blocks you care about is not widely known and even if you find them, still only show the first line. Then you need to "git show" the identified commit SHA to get this long form message. There is just no good way to find this information, even if it's well written.
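
For reference, here is a minimal sketch of the two-step lookup described above, run in a throwaway repository. The file names, the function, and the commit messages are made up for illustration; the flags themselves (`-w`, `-C`, `--porcelain`, `git show -s --format=%B`) are standard Git options. As far as the documentation goes, each extra `-C` widens the search for moved or copied lines: once looks at files changed in the same commit, twice also covers the commit that created the file, and three times covers any commit.

```shell
set -e
repo=$(mktemp -d)            # scratch repository; delete it yourself afterwards
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Commit 1: the helper lives in util.py and gets a long-form message.
# It is deliberately a few lines long, because -C ignores very small moved blocks.
cat > util.py <<'EOF'
def format_greeting(name):
    greeting = "Hello, " + name.strip()
    return greeting + "!"

VERSION = "1.0"
EOF
git add util.py
git commit -q -m "Add greeting helper" \
    -m "Callers kept hand-building the greeting and forgetting to strip whitespace."

# Commit 2: an unrelated refactoring moves the helper into a new file.
cat > net.py <<'EOF'
def format_greeting(name):
    greeting = "Hello, " + name.strip()
    return greeting + "!"
EOF
echo 'VERSION = "1.0"' > util.py
git add util.py net.py
git commit -q -m "Move greeting helper into net.py"

# Plain blame attributes the lines to the commit that created net.py.
plain=$(git blame --porcelain net.py | head -n 1 | cut -d' ' -f1)

# -w ignores whitespace changes; -C -C -C follows the lines back across files.
sha=$(git blame -w -C -C -C --porcelain net.py | head -n 1 | cut -d' ' -f1)

# Second step: print the full message (subject and body) of the blamed commit.
message=$(git show -s --format=%B "$sha")
printf '%s\n' "$message"
```

In this sketch plain `git blame` stops at the refactoring commit, while the `-C` variant reaches the commit that carries the explanation, and `git show` then prints the whole message rather than only the subject.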

This is one of my biggest complaints with Git (or, indeed, any VCS before it), and I think why people just don't care much about good commit messages. It's just not easy to get this data back once it's written.

If you want an example of this, search through the Git project's history. Run a blame on any file. It's _so hard_ to figure out a story of any function implementation in any file, but the commit messages are _pristine_. Paragraphs and paragraphs of high quality explanation for almost every single commit. Look at any single commit that Jeff King has done for the last decade. Hundreds of hours of amazing documentation from a true genius that almost nobody will ever appreciate. It's horrifying.

I don't know exactly what the answer is, but the sad truth of Git is that writing amazing documentation via commit message, for most communities, is almost entirely a waste of time. It's just too difficult to find them.

js2 · 2 years ago
As someone who has contributed to Git since before GitHub existed and who maintains legacy code, I simply cannot disagree more. I use `git blame`, `git log`, and `git show` in the terminal all the time. It's trivial to follow the history of a file. It takes me seconds to use `git log -G` to find when something was added or removed.
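
As a concrete sketch of the `git log -G` lookup (the file name and commit messages are hypothetical, and everything runs in a scratch repository): `-G` takes a regular expression and lists every commit whose diff adds or removes a line matching it.

```shell
set -e
repo=$(mktemp -d)            # scratch repository; delete it yourself afterwards
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Commit 1 adds a setting.
echo 'timeout = 30' > settings.cfg
git add settings.cfg
git commit -q -m "Add request timeout" \
    -m "Upstream closes idle connections after 30 seconds."

# Commit 2 removes it again in favour of something else.
echo 'retries = 5' > settings.cfg
git commit -q -am "Replace timeout with retries" \
    -m "Timeouts are now handled by the proxy."

# List the commits whose diff added or removed a line matching the regex,
# newest first. Drop --format to see the full messages instead of subjects.
touched=$(git log -G'timeout' --format='%s')
printf '%s\n' "$touched"
```

Both commits show up here: the one that added the line and the one that removed it.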

Nothing pains me more than to track down the commit and then find a commit message that's of the form "bleh" or "add a thing" when the developer could have spent 60 seconds to write down why they did it.

Nothing gives me more joy than to find a commit message (often my own) that explains in detail why something was done. A single good commit message can save me hours or days of work.

Let me also just say, and this is a bit of a shot: GitHub contributes to the problem of bad commit messages. If I'm lucky, folks have put some amount of detail in the PR description, but sadly that's not close at hand to the commit log. It's another tool I have to open. Usually though, the PR is just a link to Jira, so that's another degree of indirection I need to follow. Then the Jira is a link to a Slack conversation. And the Slack conversation probably links to a Google doc.

As an industry, we're _terrible_ at documentation. But folks like Jeff King are fighting the good fight. At the end of the day, I don't think the problem is with the technology. I think it's a people problem. Folks perceive writing documentation as extra work, so they don't. There's no immediate value to it. The payoff comes days, weeks, or months later.

Please, write good commit messages. Just spend a minute saying why you did something so that every commit isn't a damn Chesterton's fence exercise. Put it in the commit message where I can easily find it. Your future self and I thank you.

Edit to add: I didn't address your argument, that commit messages are too hard to find.

First, I don't find this to be true. I rarely have trouble following the history of a line of code, a function, or a file.

Second, commit messages have value at the time they are written even if they are never seen again. I find that writing a good commit message helps ensure that I've written in code what I've intended to (I often view the diff while writing the commit message) and they have value to the people reviewing my code.

neilkk · 2 years ago
The thing is that writing a good commit message for future people doing `git blame` is only worth it if it's a line of code which someone in the future will look at and need to know why it was changed from its previous form to the current form.

If you simply want to comment the current state of the code, you should add a comment in the code.

No one will ever need to know in the future why that particular space character is an ascii space, so the whole commit message is just a blog entry in the wrong place.

It would have made sense to just put a comment at the top of the file saying "make sure encoding is whatever".

Izkata · 2 years ago
Definitely agree there: be it git or svn, I spent a huge amount of my bugfixing and refactoring time in the history figuring out why things are the way they are.

> Usually though, the PR is just a link to Jira, so that's another degree of indirection I need to follow. Then the Jira is a link to a Slack conversation. And the Slack conversation probably links to a Google doc.

Assuming all those links in the chain still exist. Before Jira we had FogBugz, almost all those old cases are gone (some were imported). And we used Flowdock for 10 years, that's completely gone.

Commit messages are the only thing we can rely on for this history. Use it. And try to avoid squashing commits, that erases this history - yes, even for a feature branch, changes from code review should be separate from the initial push, explain why it's being changed so we don't make the same mistake later.

munksbeer · 2 years ago
> As someone who has contributed to Git since before GitHub existed and who maintains legacy code, I simply cannot disagree more. I use `git blame`, `git log`, and `git show` in the terminal all the time. It's trivial to follow the history of a file. It takes me seconds to use `git log -G` to find when something was added or removed.

I 100% agree with this. I do this all the time. I also agree with the rest of the post. The sentiment raging against these longer git commit messages smells very much like elitism to me.

nox101 · 2 years ago
I'm mixed on this. My project has a bug tracker. A commit is required to have a bug id. The bug tracker has entire discussions of what led to the commit so it's not clear to me that a detailed commit message is a plus when the real detailed info is in the tracker. Yes it's indirect but there's no way I'm going to summarize the entire issue discussion.

Maybe this is a job for machine learning. Read the code, read the commits, read the bug tracker, add a git super-blame that asks the LLM to summarize why every line is the way it is and what it's doing

michaelcampbell · 2 years ago
> It's trivial to follow the history of a file.

If the committer uses the "show history of a file" process model. These days, it's mostly "squash commits until I get a good/flattering 'story' of what I did", which removes typos, failed experiments, mid-thought commits, and any other blemishes of _what actually happened_.

Commits CAN be used as great history, if history is allowed, but I've found that "modern" workflows tend to the rebase/squash side of things and also are mostly write-only.

elygre · 2 years ago
How does this work in the face of unrelated refactoring? Say you first fix a bug somewhere, with a great commit comment. Then some refactoring happens, and the affected function is moved to a new class in a new file. Are you still able to track the original git comment?
spacemankiller · 2 years ago
> Edit to add: I didn't address your argument, that commit messages are too hard to find. First, I don't find this to be true. I rarely have trouble following the history of a line of code, a function, or a file.

I don’t think this is a proper way of reasoning. What is hard and what is easy is subjective, and you discuss it as if it were objective. It’s word against word. It would be wise to run a poll and see the results.

Just because one geek writes and reads commit messages doesn’t mean they’re easily accessible to everyone. It’s hard to make something a widespread standard if tooling doesn’t make it super easy to access. Allow people to leave kudos and emoji on other people’s commit messages and people will start making them better :D And later show the heroic people with git --stats

corethree · 2 years ago
His credentials indicate that it may be possible that his arguments are based on data while your credentials and evidence indicate personal, anecdotal experience. Therefore I would trust his reasoning more. Additionally, I personally identify with it.

I mean a git developer finds git easy to use? That's biased data.

I love how both of you dropped your street cred before launching into your reasoning. It just shows how much more credentials convince people than the argument itself does. Normally that stuff logically doesn't matter and people are just doing it to grab some "authoritah" but in this case your backgrounds actually contributed to the arguments.

krobelus · 2 years ago
> The main problem is that Git was built so that the commit message is the _email body_, meant to be read by everyone in the project.

I find this very hard to believe. Isn't it "everyone who is interested in the commit subject/files touched should read the body". Why would anyone else read immutable historical documentation?

> Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages that are relevant to the code blocks you care about is not widely known and even if you find them, still only show the first line. Then you need to "git show" the identified commit SHA to get this long form message. There is just no good way to find this information, even if it's well written.

This sounds like you are joking. Any good IDE will be able to annotate each line with blame info, and show the diff at the press of a button. On such diffs, the IDE should allow recursive blaming on context/deleted lines. Tools like Tig allow exactly that.

GitHub certainly does make it hard to see commit messages, I give you that :)

> Hundreds of hours of amazing documentation from a true genius that almost nobody will ever appreciate. It's horrifying.

?? It's not like it was written for fun. This documentation attached to a commit exists to reduce the risk of accepting the patch from someone who might not be around in future, to fix any problems introduced. By disclosing all their relevant thoughts, the author shows their good intentions: they enable others to build on top of their work. If the author kept their thoughts to themselves they would gradually build up exclusive ownership of the code, which is often not a good idea. Also a commit message serves as proof of work, which can be important when there's too many patches. For commercial projects some of this is less important.

makeitdouble · 2 years ago
I might be in the minority, but parent's comment is probably about people like me: most of my coworkers have context free, or at best succinct commit messages. I never read more than the first line listed in the commit list, and don't even assume the description is always accurate.

Instead I'll spend my time stalking the related merge request, where the full description of the whole change resides, with probably a link to the ticket or reference documentation, and all the back and forth on why something is or isn't a good idea.

I think the world could be a better place if all of that was in git directly, but that's also putting much more burden on an already complex tool.

tiborsaas · 2 years ago
> I find this very hard to believe. Isn't it "everyone who is interested in the commit subject/files touched should read the body". Why would anyone else read immutable historical documentation?

If you think about who created Git (Linus), then it suddenly makes sense that the commit message is like an email body, since most of the Linux kernel collaboration is done via a mailing list.

brabel · 2 years ago
> Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages

I am terrible at git on the terminal, but with IntelliJ or emacs and magit, I can trivially find every commit ever to change a file, and easily navigate the commits to see every full commit message. It's not hard when you use a proper tool, and I have a feeling almost everyone has something like that?! Do you really try to stick with the git CLI and memorize hundreds of commands and flags?? Why?!

winwhiz · 2 years ago
Really simple answer: Repeatability. I am not saying it is the only one right blessed answer, but if you really want to know why people haven't moved to pure GUI interfaces, imagine describing to someone how to add a new directory to their path.

  fleet $HOME/.config/fish/config.fish
  # ADD this line somewhere
  set -x PATH /opt/git/bin $PATH
Or:

1. Either hit WINDOWS-E, right click on This PC and select Properties (it might be called something other than This PC if someone renamed it), or press the WINDOWS key or click Start or click the Windows icon (if you don't see them, try mousing into a corner of your screen, typically bottom left, until they and the rest of the bar un-autohide), look for and click a gear symbol (it should expand to say Settings if you hover), click System; on the left, at the bottom, you should see About.
2. Click the text Advanced system settings (on the right), look for a new window with a set of tabs; you want Advanced. Click the button Environment Variables.
3. In the top columnar box EITHER find a variable named Path, highlight it and click the button Edit, in a new window click the button New, type '/opt/git/bin' in the text field that has appeared at the bottom of the list items, click OK, OR click the button New, in a new window enter Path for Variable name and /opt/git/bin for Variable value, click OK (you shouldn't need to Browse Directory or Browse File).
4. Click the OK button, click the OK button, close the Settings window.

bradjohnson · 2 years ago
IME git abstractions make it easy to read and navigate standard workflows, but incredibly difficult to repair issues that arise due to divergence of some kind or another because they are so opinionated.

I use git 99% in the terminal, and 1% in some git tool for visualization, but I find that a lot of people use it in the opposite way and have problems working with others that use a very slightly different workflow. You don't need to memorize hundreds of commands and flags, honestly a dozen or two gets you to expert status in most respects.

mostlylurks · 2 years ago
I don't find the git CLI any more difficult to use, or its commands any harder to remember, than working out how to accomplish similar tasks in some GUI (especially if that GUI is emacs). And unlike most GUIs (emacs may be an exception), I can trust that my knowledge of the git CLI won't become out of date when my GUI tool inevitably undergoes a UI redesign of some sort.

But more importantly, the CLI allows my typical workflow where I chain together a bunch of git (and other) commands in a row, allowing me to just type in, for instance, several different commits, their messages, and what files should go into each in one go without having to break my concentration by having to move around in some GUI between commits. Sprinkle in some stash manipulation and interactive rebases, compilation, and unit testing, and you'll really start to see how the CLI allows you to offload some of your working memory to your invocation in a way that a GUI just can't.

mschuster91 · 2 years ago
> Do you really try to stick with the git CLI and memorize hundreds of commands and flags?? Why?!

Because IntelliJ is... less capable than it should be. Personally, I find `git add/commit -p`, `git diff` far easier to use than IntelliJ, and because Python is a fucking mess I had to install the codecommit git helper into a Python venv... but you can't tell IntelliJ to use that venv's $PATH for `git pull`/`git push`.

Oh, and you can't really macro complex stuff in IntelliJ, whereas I can do a single-command release and push-tag of a project with about 30 Git submodules in a (convoluted) Bash one-liner.

schacon · 2 years ago
I don't know IntelliJ well, but I would be surprised if they did the rather expensive rename following that the multiple -C invocations did. Maybe someone can inform us here? GitHub definitely does not, but that is 100% my personal fault I assume.
bigfatfrock · 2 years ago
I was mind blown reading this also - are we not programmers for the sake of laziness in the face of these kinds of "problems"? I have to hail Tim Pope for Fugitive.vim also. HAIL TIM POPE!
DarkNova6 · 2 years ago
100% this
globular-toast · 2 years ago
This is a failure of GitHub etc. GitHub tries to dumb things down for users because I guess it's judged they can't reason with commit histories and this is one of the consequences. The mess in especially private GitHub repos is beyond belief sometimes.

The thing is there's nowhere else for such documentation to go. It's not appropriate for a code comment. But we've got a whole generation of developers now who think git is GitHub and the only purpose of git is uploading changes to GitHub.

Git sucks, but it sucks a lot less than everything else. But we need to go back to basics and understand what version control is actually for.

codemac · 2 years ago
It's great for historical research though. It's one of the few pieces of documentation that will live with the code forever. github and other forms of centralization are not open data formats that folks trivially backup/convert/carry forward. They usually leave the data behind if they move the project somewhere else.

So no, I don't think it helps the current community much either. But it helps the debugger years later.

schacon · 2 years ago
Is it great for historical research? I feel like the format and tooling around it is uniquely _not great_ for historical research. I think it's optimized for discussions before integration, which is what PR descriptions and comments are largely used for now.

I feel like given great commit messages, determining a story and useful history around any block of code given the Git tooling is incredibly difficult even if there are _amazing_ commit messages.

Like say you are trying to determine why a 10 line function is the way that it is. You blame it. Not even with the stupid-simple GitHub UI that _I_ originally wrote, but with the more expensive CLI interface that follows renames and ignores whitespace changes, etc. Now you get a list of SHAs of commits and the first 50 chars of commit messages for each line for the last modifications, etc. How do you even stitch those messages into a useful story (in order) to tell you how that function evolved to what it is now and why?
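
One built-in that gets partway toward that stitching is `git log -L`, which follows a single function or line range back through history and prints each commit's full message next to the hunk that touched it. A minimal sketch, assuming a hypothetical `shapes.py` in a scratch repository; it lists commits newest first, and it is not a claim that the wider tooling gap is solved:

```shell
set -e
repo=$(mktemp -d)            # scratch repository; delete it yourself afterwards
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Commit 1 introduces the function.
printf 'def area(w, h):\n    return w * h\n' > shapes.py
git add shapes.py
git commit -q -m "Add area helper" \
    -m "Needed by the layout code to size panels."

# Commit 2 changes its body.
printf 'def area(w, h):\n    return abs(w * h)\n' > shapes.py
git commit -q -am "Make area non-negative" \
    -m "Mirrored panels pass negative widths."

# -L :<funcname>:<file> traces just that function; a numeric range such as
# -L 1,2:shapes.py works the same way.
story=$(git log -L :area:shapes.py)
printf '%s\n' "$story"
```

Read from the bottom up, the output gives the function's evolution in order, with every long-form message intact.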

lanstin · 2 years ago
Till the team you are handing off the code to just copies the files and commits into a fresh new repo without any of the history. I had this happen once to a server I wrote, and then like 2 years later the new team comes and asks me if I knew of the server, and I'm like "I wrote it" and then they are all confused.
mb7733 · 2 years ago
Well, `git` is still the primary way I interact with a git repository, and `git log` shows the entire commit message by default. So I don't run into this problem.

If some "modern" git frontend is only capable of displaying the first line of a commit message, then this is a problem with that tool, not git itself.

(I'm also not convinced this is a limitation of all modern tooling...)

schacon · 2 years ago
I can't tell if this is engaging with trolls or not, but I can't imagine that all of your interactions with your codebase are via `git log` with no other flags. Even with the normal Git CLI that most of us use daily, most of us use `--oneline` or whatever to simplify useful calculations and visualizations like `--graph`, etc. But we're talking here mostly about code archeology, learning about the history of a block of code, so this comment seems somewhat ridiculous in that context.
TheRealPomax · 2 years ago
periodic reminder that `gitk` exists, and has come with git since... pretty much forever? If you're reading `git log`, you really owe it to yourself to run `gitk` at least once to see what you've been missing for over a decade now.
sohamssd · 2 years ago
Your entire argument boils down to the fact that it's hard to view git blames. It's not.

As stated by other people, IDEs like VSCode and IntelliJ do an extremely good job of showing the blame. And they DO show the entire commit, body and everything at once.

tux1968 · 2 years ago
I never considered the idea that it was atypical, but I read full commit message text all the time. There are many different ways to drill down into a commit, and then read the entire commit once you know it's relevant. Even doing a simple git log, and then searching for some keyword through every full commit message, can be useful.
tcoff91 · 2 years ago
Many editors have great git blame integration that makes these messages quite accessible.

It's really easy in emacs with magit to view commit messages from git blame view.

I believe vim, vscode, and jetbrains IDEs all make this simple.

nijave · 2 years ago
Yeah, a lot of these also have Github and ticket tracker (Jira, etc) integration so they'll also pull in context from those, too

Most of the stuff I work on uses merge commits on Github so you can just click the PR # in the merge commit message and arrive at the PR, browse through commit messages, discussion, etc

cerved · 2 years ago
Using vim-fugitive it's

  :Git blame %

palata · 2 years ago
> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line.

Maybe I do it wrong, but the most basic interface I use to check the git history is `git log`, which shows the whole commit message.

GitHub takes me 18 clicks to find the commits, I don't see why I would even bother using it.

ParetoOptimal · 2 years ago
Many engineers primarily or even exclusively use git via GitHub's interface and have never made a commit with a body.
lisper · 2 years ago
> The main issue is that most of the tooling ... generally only shows the first line.

> I don't know exactly what the answer is

Isn't it obvious? Write better tools. There is no reason you have to be stuck with the deficiencies of what someone else has built. That's the whole point of open-source software.

It's more than a little concerning that a "GitHub cofounder and author of several Git books" has to have this pointed out to them.

ParetoOptimal · 2 years ago
>There is no reason you have to be stuck with the deficiencies of what someone else has built.

There is a concerning trend of "we only use vscode" and popular preference shifting to "adjust to popular tool" rather than "use best tool".

This means sadly things like GitHub start to define git even more for your coworkers.

goku12 · 2 years ago
I don't know how it's for everyone else, but I do value the body of the commits from others. It's true that I see only the subject line for most commits. But I eventually read the full body of commits I'm interested in. Honestly, it's frustrating when commit messages don't carry enough context. Sometimes that context fits in the subject line. For others, I expect an elaborate body.
madsbuch · 2 years ago
At work I make 1-15 commits a day. If I have to spend thought cycles on the commit message, that is time taken away from other productive endeavours.

I think, as the original commenter also wrote, this might be worth it in much slower-paced projects that run at another cadence / over mailing lists.

I particularly think that fast-paced application development does not benefit from git as documentation.

keybored · 2 years ago
I’m surprised that you (in particular) would say this. git-log is, to me, fine for displaying the whole message (not just the subject). And sure, I often fiddle with copy-pasting SHA1s like a caveman, but it’s fast enough for some quick history spelunking.

Finding the history of a particular code change is even more manual for me: maybe doing a chain of `git log -S'line'` where `line` is copy-pasted in at every step. But doable and not a time-sink for my off-hand what’s-this thoughts. (But: something more convenient that isn’t an unreadable Unix pipeline one-liner would be very nice.)
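
A small sketch of that chain, in a scratch repository with a hypothetical `limits.py` and made-up messages. `-S` lists the commits where the number of occurrences of the string changed, so each step finds where a line appeared or disappeared, and the previous version of the line feeds the next step.

```shell
set -e
repo=$(mktemp -d)            # scratch repository; delete it yourself afterwards
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo 'MAX_UPLOAD_MB = 10' > limits.py
git add limits.py
git commit -q -m "Add upload limit"

echo 'MAX_UPLOAD_MB = 50' > limits.py
git commit -q -am "Raise upload limit to 50 MB" \
    -m "Design files from the print team routinely exceed 10 MB."

echo 'MAX_DOWNLOAD_MB = 500' >> limits.py
git commit -q -am "Add download limit"

# Step 1: which commit introduced the line as it reads today?
step1=$(git log -S'MAX_UPLOAD_MB = 50' --format='%s')

# Step 2: paste in the line it replaced and repeat.
step2=$(git log -S'MAX_UPLOAD_MB = 10' --format='%s')

printf '%s\n---\n%s\n' "$step1" "$step2"

# Full message of the commit found in step 1.
git log -S'MAX_UPLOAD_MB = 50' --format='%B'
```

The unrelated third commit never appears, because it does not change how often either string occurs.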

My litmus test is simple and doesn’t involve hallucinating that other people are even reading my messages: am I reading my own past commit messages? Yes. I am curious why I did or didn’t do something on a daily basis(!)

heads · 2 years ago
To tack one additional problem onto your excellent list: the commit message is usually only the start of a conversation about why a change should be made. The rest of that discussion is whether it meets the bar and what needs to be adjusted before it can land on the collaborative trunk. Done well, that is valuable reading.

Git was designed with the distributed viewpoint. A commit message, as written by the author, is necessarily correct: I’ve decided this is right, and it’s on you to decide if you want to merge it into your history too.

In our current systems we usually have a URL in the commit message that links to the actual story behind the commit — the discussion on the pull request, merge request, or code review. I rarely see the results of these discussions being amended into the commit message. If the repo lives forever but the database behind the code review tool gets toasted then something just as important is lost forever.

(I come from a background of one idea equals one amended, fast forwarded commit to master. It’s possible other people rely on branch history to reflect the evolution of ideas and how they go from a request for review to approved code. In my experience branch histories tend to have very low quality commit messages and even then they only show one side of the conversation — the author’s responses to their reviewer’s and their own critiques.)

adityaathalye · 2 years ago
> I don't know exactly what the answer is, but the sad truth of Git

> is that writing amazing documentation via commit message,

> for most communities, is almost entirely a waste of time.

> It's just too difficult to find them.

I completely agree that well-written git log messages are goldmines of information.

I wish makers of popular git forges had made it easier to create and consume this information.

Almost all my wiki pages start with piping git log messages into a text file.

Git logs are the entry point to good project documentation.

(edit: fix formatting)

schacon · 2 years ago
To be clear from reading some of the other comments, I don't work at GitHub anymore so while I may have partially caused the issues I'm complaining about, I don't have the ability to fix them anymore.

Also, while most GUIs and editors have blame capability (as does GitHub actually), most of them don't ignore whitespace changes (-w), code movement or renames (the -C options) so they're often of limited use.

Finally, I _would_ like people to write good commit messages, I just would like to see a tool that actually uses that work in a way that helps document your code in an easy and valuable way, and the Git/Hub tooling makes that process at best "tedious" as someone in the thread says.

I am working on a new Git client called GitButler[1] and would like to address this at some point down the line, so maybe it ends up being me who helps fix this after all :)

1: https://gitbutler.com

jakub_g · 2 years ago
In my experience it all depends on what kind of codebase it is (product? library/framework? private company? opensource?), commit velocity, release cadence & how the codebase is used in general.

In low-velocity opensource libraries, good and clean commit messages can be really helpful when debugging arcane issues. I used to be maintainer of a frontend framework & widget library and we tried to have good commit messages as we'd often go back over old commits when fixing bugs.

I agree that using git from command line for blame is not easy, this is something I always do from GitHub UI instead.

When GitHub is the repo's choice for PRs, and the codebase is product codebase with high velocity, having a pristine git history and clean commits and commit messages is not practical; however, the expectation should be to at least have good PR descriptions. When blaming commits in GH UI, it's easy to go to the PR which introduced the commit (it's linked below commit title); and PR descriptions can be enforced via templates in .github folder.

PR descriptions have an advantage that they can use images, videos etc. to better explain what they change. This is especially useful for frontend codebases.

I work on a big frontend monorepo. We have tools in place to do visual bisect between pull requests (each PR gets its own preview env). We very much do read PR descriptions when doing bisect to confirm which of the recently merged dozens of PRs introduced a regression in production N hours ago.

But in general I agree that commit messages are not a good place to store general knowledge (they're good for "what and why is changing here"). For documenting gotchas etc. I prefer to have code comments in relevant places of code; or README.md in subfolders. (Sadly, I notice most programmers just don't document anything anywhere at all).

colelyman · 2 years ago
One tool that I think promotes commit messages like the OP is magit in Emacs. Before using magit, I always used `git commit -m '...'` and didn't realize that commit messages could be longer than a line.
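
For anyone in the same `-m` habit, a minimal sketch of getting a body without leaving the command line: when `-m` is given more than once, Git joins the values as separate paragraphs, so the first becomes the subject and the rest become the body. The file and messages below are hypothetical and the commit goes into a scratch repository.

```shell
set -e
repo=$(mktemp -d)            # scratch repository; delete it yourself afterwards
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo 'hello' > template.txt
git add template.txt

# First -m is the subject line; each further -m becomes its own paragraph.
git commit -q \
    -m "Reject non-ASCII bytes in templates" \
    -m "The renderer assumes US-ASCII and fails on anything else." \
    -m "Failing early gives a clearer error than the renderer does."

subject=$(git log -1 --format=%s)   # first line only
body=$(git log -1 --format=%b)      # everything after the subject
printf 'subject: %s\n\nbody:\n%s\n' "$subject" "$body"
```

Running `git commit` with no `-m` at all opens the configured editor instead, and `git commit -v` additionally shows the diff at the bottom of that editor buffer.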

I agree that this is a tooling problem, but magit is a breath of fresh air in many ways (including verbose commit messages).

nequo · 2 years ago
What I like about magit is that it shows me the diff of the would-be commit when I write the commit message. And also that I can pick which sections of the diff to a file I want to include in the commit.

I used Vim + git CLI before and this was much less convenient. (I never tried fugitive though. It might be similarly great on these two features.)

kelnos · 2 years ago
While this may be how most people interact with git, I couldn't disagree more when it comes to my personal use.

I use 'git blame' (I've never needed to pass any options to it) and 'git show' liberally if I'm trying to understand a change that was made, and if the committer took the time to write a commit message body, of course I'll see it and read it.

> ... I think why people just don't care much about good commit messages. It's just not easy to get this data back once it's written.

I think people don't care much about good commit messages because they are unprofessional and sloppy. They just want to get the commit in, push the PR/MR, get it reviewed and merged, close that Jira ticket, and get credit for those sweet sweet story points (ugh). And on top of that, they generally don't care to document their changes because they personally don't see the value of doing so. Surely they'll remember the change if they ever revisit it (no of course not, but many people think they will), and they don't really give much thought to the possibility that others might need more context.

And besides, all the discussion about the bug or feature or whatever was happening in the bug tracker, so providing a link to that issue in the commit message is enough, right? (No, it's not; I hate it when people do that and think that's all they need to do.)

> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line.

Then maybe this is GitHub's fault; fix your web UI, then. I avoid GUI interfaces to my dev tools as much as possible, and I think the git command line is perfectly fine for this. It absolutely does not only show the first line, generally. 'git log', 'git show', etc. give you the full message by default. In general I would say you have to go out of your way (by providing more command line options) to hide the message when using the command line tools.
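
A quick sketch of that default behavior, assuming a throwaway repository with one invented commit: plain `git log` prints the whole message, and it takes an extra option such as `--oneline` to cut it down to the subject.

```shell
# Throwaway repository with a single commit that has a subject and a body.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "x" > file.txt
git add file.txt
git commit -q -m "Fix parser crash" -m "Body paragraph explaining why the crash happened."

# Default output: the full message, subject and body.
full=$(git log -1)
# Only with an extra option does the body disappear.
short=$(git log -1 --oneline)

printf '%s\n\n%s\n' "$full" "$short"
```

`git show` behaves the same way as the first form, adding the diff underneath the message.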

> the Git commit message is a unique vector for code documentation that is highly sub-optimal.

Sure, because it's not a vector for code documentation, it's a vector for change documentation. And there's no better place to put the description of a change than in the record of the change happening.

While I agree that many people write very poor commit messages, I don't think the tooling and discoverability is why.

caskstrength · 2 years ago
> Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages that are relevant to the code blocks you care about is not widely known and even if you find them, still only show the first line. Then you need to "git show" the identified commit SHA to get this long form message. There is just no good way to find this information, even if it's well written.

A good way to browse git blame and read commit messages is to use Magit. It is also great at letting you seamlessly rebase/split/merge long patch series.

tibbar · 2 years ago
In practice, I think GitHub/GitLab/etc solve this UX problem pretty well. Inline git tools let you jump immediately to the PR that generated the code change, and the PR description + code reviews + snapshot of the commit help to understand what the point of the change was. You can search the PRs when you want to find some context. (It's unfortunate that PRs are not stored in the repository itself. I mean, Git is not a great database for a multi-user webpage, so this wouldn't quite work... but it would be nice if the archive was durable and easy to export/share.)
dkarl · 2 years ago
> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line. So in the case of this commit example would be the very simple message of a generic "US-ASCII error" problem.

This is a feature, and a crucial one. No one would include fifty lines of explanation if everyone had to see it. It would be better to throw the information away than to inflict it on everyone who was scanning through the commit history looking for a particular change.

Yet it is valuable information that only makes sense in the context of that change. There is nothing in the corrected version you can connect to the issue that was fixed. It's obnoxious to include comments about errors that have been removed, like this:

    # where civica QueryPayments calls are taking too long # use ASCII whitespace

(This is ridiculous, but not unrealistic. I've seen code comments that said things like "# removed syntax error in invocation of query generator." This is what you get from programmers trying to juice their LOC stats.)

The commit message is the right place for this kind of information, but most people reading the commit messages don't care. They're scanning through looking for something else, and all they need is a few words that tell them if this is the commit they're looking for. The person who needs to see the full story is the person who is interested in this change in particular. Maybe they found it by grepping the git log for "invalid byte sequence". Maybe they found it because they're looking at all the changes in that file, because some tooling that occasionally modifies that file keeps messing it up. What matters is that if they have a special interest in that change, they have a way to see whatever information the committer felt was worth preserving, and the committer has a place to put that information where only someone with a special interest will see it.
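
Grepping the log for an error string could look like the sketch below. It assumes a throwaway repository, and the commit messages are invented stand-ins rather than the ones from the article. `git log --grep` matches against the whole message, body included, so the short subject stays short while the detail remains findable.

```shell
# Throwaway repository with three commits; only one mentions the error text.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "a" > app.txt
git add app.txt
git commit -q -m "Initial import"

echo "b" >> app.txt
git commit -q -am "Fix US-ASCII error in test run" \
    -m "Tests failed with 'invalid byte sequence in US-ASCII' because of a stray non-breaking space."

echo "c" >> app.txt
git commit -q -am "Update docs"

# Search commit messages (not file contents) and print only the matching subjects.
matches=$(git log --grep="invalid byte sequence" --format=%s)
echo "$matches"
```

The search phrase appears only in the body of the second commit, yet that commit is the one reported.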

zaptheimpaler · 2 years ago
I feel half vindicated about my rant a few weeks ago[1] arguing that we should make commit messages as long as we like instead of the stupid 50 character or whatever limit. If enough people do that, maybe tools like GH will stop wrapping the message by default. Even if not, at least the first line is usually easy to see in most tools by hovering over it or something.

[1] https://news.ycombinator.com/item?id=38831282

cerved · 2 years ago
Presumably you're referring to commit subjects.

And no, they should absolutely not be as long as you like. It breaks things.
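
To see where a repository stands on subject length, something like the following sketch works. The 50-character threshold is the convention mentioned above, not something Git enforces, and the two commits are invented for illustration.

```shell
# Throwaway repository with one short and one overlong subject.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo 1 > f.txt
git add f.txt
git commit -q -m "Short subject"

echo 2 >> f.txt
git commit -q -am "This subject line rambles on well past the conventional fifty character mark"

# %s prints only the subject of each commit; awk keeps the ones longer than 50 characters.
long_subjects=$(git log --format=%s | awk 'length($0) > 50')
echo "$long_subjects"
```

Only the subject is measured here; the body can be as long as it needs to be.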

gorkish · 2 years ago
Making sense of code (or any system that changes over time) vis a vis its own history is one of those things where I really think AI/ML tools can really shine. Even with relatively low quality commit messages, I can look at something that happened 15 years ago in a codebase I am familiar with, and there will probably be enough information that I can assemble the full context, even if finding some of that information is challenging or time consuming. git log, git blame, look at the other code made in the commits, read the issue descriptions, read the code reviews. It just seems like a model could slurp that up and do a decent job of giving you a couple of paragraphs about why the line of code you are staring at is the way it is.

TBH putting such a detailed writeup in the git log doesn't really have any return -- for it to ever be useful to you again, you have to know the information is there; you then have to actively seek it out, with the hope that whatever you did to make it 'searchable' is going to work for you again. I can say with surety that if I were looking at a bug similar to the one linked from this article, I would not look to the git log for inspiring a fix; I'd just fix it. Any extra time I would take would be to understand how a UTF8 nbsp ended up where it shouldn't have been in the first place -- something that the author of this commit seemed to have no interest in doing, but which likely has greater relevance than the documentation of the fix.

I want to be clear that I support commit messages that say what they do though; I'm not advocating for -m 'fixed' shenanigans, however at the same time I believe that -m 'fixes #1234' is often enough

lucioperca · 2 years ago
Of course Emacs has a mode for it:

https://github.com/redguardtoo/vc-msg

jeremyw · 2 years ago
I take Scott's point with a different perspective.

Though commit messages are ephemeral and hard to utilize in the future, they're the stream of consciousness of the project.

They convey very important shifts in direction, discoveries in the making, code smells, limits of current architecture, and markers of tech debt. We don't know what this beast will be. And we figure it out commit by commit. Document it.

yencabulator · 2 years ago
Commit messages are the very opposite of ephemeral; they are the longest-lasting history a project is likely to have!
twosdai · 2 years ago
Completely agree, the value with the message is really just to link an external ticket Id, the user experience is much better in external ticketing systems for all of the story telling that the article loves.

Don't read "external ticket system" as closed either, plenty of systems are open to the public.

cerved · 2 years ago
Right. The massive commit with minimal description and a PR number which I can look up in Azure DevOps to find a review with no description, no discussion and a mention of a number I can go and look up in Jira, where some Scrum master wrote half a sentence of what needs to be done and asking to "reach out to Jeff" for explanation.

So much more valuable and great user experience

nightfly · 2 years ago
git log. git blame, grab hash, git log hash. You make it sound like some arcane magic...
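
Spelled out as a sketch, assuming a throwaway repository with an invented file and messages, the blame-then-log sequence is:

```shell
# Throwaway repository: the second commit rewrites line 2 and explains why.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'first line\nsecond line\n' > code.txt
git add code.txt
git commit -q -m "Add code file"

printf 'first line\nsecond line, fixed\n' > code.txt
git commit -q -am "Fix second line" -m "Longer explanation of why the second line had to change."

# Blame only line 2. With --porcelain, the first field of the first output line is the commit hash.
hash=$(git blame -L 2,2 --porcelain code.txt | head -n 1 | cut -d ' ' -f 1)

# Print the full message (subject and body) of that commit.
message=$(git log -1 --format=%B "$hash")
echo "$message"
```

`git show "$hash"` gives the same message plus the diff.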
lambda · 2 years ago
It's amazing, your experience with Git is so different than my own.

I routinely open a file in my editor, hit "Ctrl-c v B" for Git Blame mode, go to the line I'm interested in, and hit "Enter". Bam, there's the full commit message. From there I can continue to trace backwards, blaming lines and reading full commit messages.

But, you know, not everyone uses Emacs and Magit, fair. How about just using "git gui blame file"? Click on a blame line, see the full commit message. This is a tool included with Git (available in a separate package in some installations).

OK, rather use an IDE? Install GitLens in VSCode. Easily accessible blame in your editor, where you can hover or click in various places to see full commit messages.

I mean, I agree in part; there are some tools which make good commit messages hard to write or find. The tiny little commit message edit box in VScode is not ideal. Lots of people use a workflow of "commit lots of crappy commits with one liner commit messages, let GitHub/GitLab squash them on merge."

But as an expert Git user who has managed to convince some teams to have a good commit message culture, if you do get people used to writing good commit messages, they can be very easy to find and read later on, there are tons of tools that make them easy to browse.

kimixa · 2 years ago
I regularly use the git command line, and "git show (pasted SHA)" in my second terminal doesn't really feel like the road block to understanding the grandparent seems to make it out to be. It takes me many orders of magnitude more time understanding what is output rather than searching for it, and like you mentioned there are any number of UIs (third party, editor integration, or even shipped with git like gitk) that wire everything up into a nice UI.

And I also disagree with the GP's complaint that "most people only read the shortlog" is any kind of disadvantage. The commit message isn't for everyone, it's for the one time someone needs to figure out exactly what it did and why that commit was made, and why a change in X causes a behavior change in Y, and can save hours of work. It's like code comments, 99 times out of 100 you don't need them as you're just interacting with a documented API, but that 1 other time they are a godsend.

gloosx · 2 years ago
I use fugitive.vim, and blaming is very convenient there as well as every other git workflow. I can press a shortcut to see when every line in the current file was changed, and who changed it along with the commit hash. If I need more – I can expand every hash to see the full context, including full commit text and diff. Maybe cli git is not too easy to use given how complex it is, but there exists a git wrapper so awesome it should be illegal.
da_chicken · 2 years ago
> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line.

This has been an issue with version control tooling for quite a long time. I'm fairly certain both CVS and SVN did the same thing. But I agree that you're still right.

I'm also very amused by the number of replies to your comment along the lines of, "oh, it's actually very easy because I always use <third party tool>".

Which is, of course, rather proving the point.

dragonwriter · 2 years ago
No, because the point is about common tooling, and the common tooling does not, actually, make this difficult.
b33j0r · 2 years ago
Author of nit, here. I tried to move the landscape towards semantic reasoning. It’s on github but kind of abandonware. Life and incompetency happened ;)

No shilling. I commented here because I still think my framework was decently thought out, and mostly that calling someone a nit or a git is exactly what Linus was thinking. Make it easy enough for anyone to use.

Nit is something people could take as a thought experiment.

phaedrus · 2 years ago
So then I am not wrong that I do all my git commit messages via the "-m" commandline option with a short phrase like "frob the baz"?

(Initially I started using -m to avoid getting trapped in Vim. But even after I gained the option to use e.g. Notepad++ as the editor, I never saw the point in using anything more than "-m 'message'".)

kimixa · 2 years ago
Git respects the EDITOR environment variable and has done for decades (so likely before many here really used it) - you should probably be setting that (or equivalent on your platform) to the editor you want anyway.
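
A minimal sketch of how to check which editor Git will launch. The first lines isolate the example from any machine-level Git settings (an assumption made purely so the result is predictable), and `nano` and `vim` are placeholder editor names:

```shell
# Isolate from existing user/system Git configuration so the outcome is predictable.
repo=$(mktemp -d) && cd "$repo"
export HOME="$repo" XDG_CONFIG_HOME="$repo/.config" GIT_CONFIG_NOSYSTEM=1
unset GIT_EDITOR VISUAL
git init -q

# With nothing more specific configured, Git falls back to $EDITOR.
export EDITOR=nano
from_env=$(git var GIT_EDITOR)

# A core.editor setting takes precedence over $EDITOR.
git config core.editor "vim"
from_config=$(git var GIT_EDITOR)

echo "$from_env / $from_config"
```

`git var GIT_EDITOR` only reports the command Git would run; it does not start the editor, so it is a safe way to confirm the setting before the next commit.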

Weird workaround just to avoid basic configuration seems like more work in the long run.

ManuelKiessling · 2 years ago
I agree for the use case of scrolling through a git history, yes, but when I land at a certain commit, e.g. by hitting the blame label in IntelliJ on a line whose reason for changing I'm interested in, then I will totally read the whole commit message in the hope that it helps me understand the change (in addition to looking at and trying to understand the change itself).
FrederickZh · 2 years ago
> Everything they talk about in this article is what is great about the _rest_ of the commit message, which, given modern tools, is _almost never_ seen by anyone.

This was why I created gh-ph [1].

[1] https://github.com/Frederick888/gh-ph

zellyn · 2 years ago
If you follow a pull-request based workflow, and if you typically squash down to one commit, then finding these messages isn't too bad, since the commit description pre-populates into the pull request description. I often track changes down not to their commit, but to their pull request.

Granted, that's not exactly `git`, but rather `github`…

Groxx · 2 years ago
I feel like this explains a lot about why GitHub is so consistently hostile towards showing or writing decent commit messages.

Which has helped push people away from writing useful ones, on an unprecedented scale, which makes it a self-fulfilling prophecy.

Great.

Just great.

djha-skin · 2 years ago
This sounds like an excellent sales pitch to use good email-based workflows such as those advocated for by Drew DeVault[1].

1: https://git-send-email.io/

6510 · 2 years ago
Just do a threaded conversation in a comment at the top of each file. Add your name and the date.

lawtalkinghuman · 2 years ago
The reason people (myself included) rather like good Git commit messages is evident when one compares them to the alternative.

You're working in a commercial/closed source environment and want to find out why line 57 in src/blah/db/utils.py does that. Where do you look?

- inline code comments. Usually non-existent. Often out-of-date, sometimes misleading, frequently tells you no more than you can discern from just reading the code itself (especially now type annotations are trendy again). Rarely explains why the code exists. There's a reason people caution against too many comments, and that translates into people probably not putting enough comments in.

- calling code? Helpful, but thanks to microservices and increased levels of abstraction (APIs, DI frameworks, messaging buses, config parsing) you've got to go check 900 different repos out to work out what is going on.

- email? Give up. You'll find invitations to the company Christmas party and Q2 sales figures but actual tech explanations are in short supply.

- Slack etc - same problems as email, plus developers who hide away all the interesting stuff in private team channels

- Google Docs - you probably don't have access to the relevant doc, and there's no way to know that you don't

- wiki/docs? Half baked, wrong etc. Or it'll be autogenerated JavaDoc type stuff that'll tell you what you already know or can reasonably infer from the code. Also, findability sucks. Or the developers just avoid the whole thing because the software is nasty and corporate and barely usable.

- bug tracker/ticketing system? You ask around and someone says "oh yeah, Dave made that change two years ago" and then you search for tickets that match related keywords only to find out that those tickets weren't brought over from Trello into JIRA, and now you need to go ask IT to give you access to the legacy Trello board which they don't want to do because then it'll put them over the five users per month limit or whatever.

- Architecture Decision Records / decision logs / whatever you want to call them - nice if they exist, I guess.

- ask the person who wrote it? This assumes they still work there and can remember. Plus you gotta do the asking around routine which takes days and destroys all hope and joy in the world.

By a process of elimination, commit messages are the closest you're going to get. They're right there - on your computer, neatly integrated into your editor, hopefully. You can search them fast in a terminal window rather than in some slow web-based monstrosity. If you're lucky, they're actually useful. Even if they aren't, they're at least contextually useful in helping you narrow down your search strategy for the inevitable plunge through email/slack/JIRA/Trello/internal wiki etc.
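
As a sketch of that terminal search, assuming a throwaway repository where the file name, setting name and messages are all invented: `git log -S` lists the commits that changed how many times a string occurs, which is often the commit that introduced the line in question.

```shell
# Throwaway repository: the second commit introduces retry_limit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'timeout=10\n' > settings.conf
git add settings.conf
git commit -q -m "Add settings file"

printf 'timeout=10\nretry_limit=3\n' > settings.conf
git commit -q -am "Add retry limit" -m "Explains why retries were needed."

printf 'timeout=120\nretry_limit=3\n' > settings.conf
git commit -q -am "Raise timeout"

# Commits that added or removed an occurrence of the string in this file.
found=$(git log -S"retry_limit" --format=%s -- settings.conf)
echo "$found"
```

Dropping `--format=%s` prints the full message of each hit, which is where a well-written body pays off.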

Ideally what should happen is the really useful commit messages get copied into stable technical documentation like decision logs or a properly maintained wiki. If people did that, great, but it's pretty rare. A culture of sharing weird interesting tech things in a Slack-type system can help because future devs can at least search but you do that at the cost of more interruptions for colleagues now.

The broader issue is that, of all the bad options you can choose, each often tracks the wrong thing. In something like Trello/JIRA/whatever, if you're looking for the technical reasons, it'll have the business reasons without the technical stuff, or vice versa. You generally want both, and most systems only give you half the story.

ep103 · 2 years ago
I know the OP didn't mean it this way, but after reading HackerNews for the last decade or whatnot, it never ceases to surprise me how often developer complaints stem from developers just not doing their damn job.

"Almost nobody ever sees it.... nobody reads anything other than the first 50 chars of the headline."

On the one hand, I get it. If a tool makes something difficult, people are less likely to do it, and as engineers we want to make tools to cause people to fall into the pit of success. So, improving this part of git makes sense.

On the other hand, just do your damn job. If a coworker doesn't understand a code change, because they didn't bother to read the commit message, they're a bad developer. If they didn't write a git commit message because "no one is going to read it anyway", they're a lazy engineer. These things aren't excuses, they're incompetence, and not everything needs to cater to the least competent people in our profession.

eschneider · 2 years ago
When I document, or write commit messages, I don't really _care_ if other folks will ever look at them. Documentation is a gift for future me. If something wasn't obvious to figure out, or a potential source of future problems, I want it written down, so if _I_ go looking for info, it's there.

The fact that things are now documented for other folks is just a side benefit.

schacon · 2 years ago
I feel like it's not a question of "doing your damn job". It's a question of what value can you expect to get from a particular investment. If blame is your tool and every line happens to be changed from a different blame invocation (is it "-w", "-w -C", "-C -C -C", etc), how do you learn the story of this block of code best? Maybe you then need to read a story _per line_ of code. But that's not actually worst case. Maybe you need to drill down to the commit _before_ that because the last change isn't semantically important. Maybe the one before that, etc. How many commits that touch those lines significantly do you need to research and read amazingly well written commit messages before you totally understand the context of this particular block of code?
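
One option aimed at the "story of this block" question is `git log -L`, which follows a line range back through history and prints the message of every commit that touched it. A minimal sketch, assuming a throwaway repository with invented contents and messages:

```shell
# Throwaway repository: line 2 is touched by the first two commits, line 1 by the third.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'alpha\nbeta\n' > story.txt
git add story.txt
git commit -q -m "Add story file"

printf 'alpha\nbeta, revised\n' > story.txt
git commit -q -am "Revise beta" -m "Body text explaining the revision."

printf 'alpha, revised\nbeta, revised\n' > story.txt
git commit -q -am "Revise alpha"

# Follow line 2 only; commits that never touched it ("Revise alpha") are left out.
history=$(git log -L 2,2:story.txt --format='subject: %s%n%b')
echo "$history"
```

This does not remove the need to read several messages for a heavily edited block, but it does gather them in one command instead of one blame invocation per line.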
Sohcahtoa82 · 2 years ago
> it never ceases to surprise me how often developer complaints stem from developers just not doing their damn job.

One thing I learned is that any forum that appeals to software engineers will appeal to software engineers of all skill levels, from the guy that did a 6 week coding camp because he heard SWEs make a lot of money but didn't really learn anything but thinks he's an expert now, to geniuses with 10+ years experience.

For every comment from someone who really knows what they're doing, there's one from someone that really doesn't.

sethammons · 2 years ago
Any system where the proposed solution is "be better" without an outline of "and here is how" and some method of enforcement is doomed to fail.

Checklists, build checks, linters, tests, SLOs, post incident responses, follow up tickets, etc all serve to unload "be a better software developer" into actual systems and processes that can continuously enable the better behavior.

Simply stating "do a better job" won't work as organizations scale. Related, you can expect what you inspect.

qez2 · 2 years ago
> If they didn't write a git commit message because "no one is going to read it anyway", they're a lazy engineer.

If an engineer spends an hour writing a commit message that no one reads, that's an unproductive engineer, compared to where they should be.

I have to admit, I am lazy. I don't spread seeds by hand; I use a tractor. I don't swim across the ocean; I use an airplane. Likewise, I don't write documentation in commit messages; I write documentation in PR descriptions, READMEs, and official document sources. You got me, I'm incompetent.

My "job" is to write software, not follow some arbitrary "pure" practices.

> If a coworker doesn't understand a code change, because they didn't bother to read the commit message

And I would argue we shouldn't cater to developers who make documentation difficult to access for everyone else by hiding it where only crappy tools can reach it.

herrkanin · 2 years ago
If writing good commit messages isn't specifically defined as part of your job, why would you waste business hours writing commit messages that are beyond what is expected of you and frankly useless since nobody would ever read it anyway?
bastardoperator · 2 years ago
Why would I waste time reading this paragraphs long commit message when I can look at a diff and a 40 character headline and completely understand the issue? You think it's lazy, I think this is wasteful. Personally I don't need an epic story about making a one character change because your editor isn't configured to catch gremlins... it's just not that interesting.
Cthulhu_ · 2 years ago
This is the problem with any kind of documentation; while you can write the highest quality, meticulous, most obvious and clearest prose, it's moot if nobody reads it.

And nobody reads it because there's so much of it and there's no clear starting point. People just want the summary of what they're looking for.

I started to learn Java almost 20 years ago, we had a text book and everything. After the first two chapters, I learned how to google and instead of reading everything, just find what I need. I never went in-depth with reading because... it's mostly useless knowledge that quickly becomes outdated.

lazide · 2 years ago
Counter point - Engineering systems which require constant overriding of basic human nature therefore requiring making significant effort on the regular to avoid mistakes is bad engineering.
lanstin · 2 years ago
I'd hate to say that laziness makes a person an incompetent developer. Often my problems stem from an excess of sincere hard work rather than from laziness.
kriiuuu · 2 years ago
But if you don’t cater to the worst on your team you are often viewed as the problem
nonethewiser · 2 years ago
The "just do your damn job" retort presupposes that their job is to read the entire body of every git commit. That's the question - you can't just presuppose it.

rpsw · 2 years ago
Overall agree with the sentiment, but I would add a more specific Bottom Line Up Front (BLUF) such as: "Fix test issues caused by non-breaking space character \xa0".

Tells me exactly what the problem was straight away, but I'm still free to choose to read more if I want to know more.
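
For hunting down this kind of character, a sketch along these lines can help. It assumes the file is UTF-8, where the non-breaking space U+00A0 is stored as the two bytes 0xC2 0xA0, and the file name is invented:

```shell
# Work in a scratch directory.
dir=$(mktemp -d) && cd "$dir"

# Build a sample file whose second line hides a UTF-8 non-breaking space.
nbsp=$(printf '\302\240')
printf 'plain line\nline with%sa hidden nbsp\n' "$nbsp" > template.txt

# LC_ALL=C makes grep compare raw bytes; -n prefixes each match with its line number.
hits=$(LC_ALL=C grep -n "$nbsp" template.txt)
echo "$hits"
```

Run against a real source tree, the same `grep` with `-r` would list every file and line carrying the character.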

Anon1096 · 2 years ago
Yep this message is way better. And honestly, looking at the diff in github it is pretty obvious to me what has changed (and why really, since the only reason for a changeset to have a diff look identical is that non-visible characters have been added or removed).

So all I'd require is a good main message for history-search purposes. A short story about how you went to Narnia and came back to find the root cause of a bug isn't really relevant imo but I'm also not against writing it if you just want to vent in a PR description/commit's extended message.

ninkendo · 2 years ago
Stories about how you went to narnia and back may be useful to a future contributor who finds themselves in narnia. This is very likely not the last time that an invalid byte sequence will show up in one of the source files in this tree, and if it happens again, it may be good to see the symptoms in the git log.
master-lincoln · 2 years ago
This would be my ideal commit message as well. The rest of the commit body in the article is just how it was discovered. I don't think describing how one works belongs in a git commit message. Your message tells me why and which change was made, that's enough to me.
nextaccountic · 2 years ago
I love this concept. I always begin messages with the most actionable or important thing at the top, and the rest that follows is the context. Respect the time of others and don't bury the lede
SamuelAdams · 2 years ago
You see this all the time in business proposals. Executive summary at the top, typically 2-3 paragraphs max. Manager summary, 1-2 pages. Engineer detailed overview, 3-10 pages. Anything else is an addendum.
spenczar5 · 2 years ago
I have felt that pride in writing a great commit message, but I am less sure of the value to others. I don’t think most people search commit messages when they encounter an unusual error message, or when adding a new feature, or really almost ever.

It’s a bit sad, but I have a growing suspicion that beautiful commit messages are a bit of vanity by the programmer. The person primarily impressed is often the author; others will walk on by without noticing.

There is room sometimes for those aesthetic flourishes but I am not convinced they have much practical value, and I have stopped really being bothered by commit messages of “fix whitespace issue” from others. I think I am a better colleague for that.

Things might be different on a project like Git or Linux with huge distributed teams and tons of commits, versus the projects I am used to which have between 1 and 100 contributors, mostly from the same organization.

bombcar · 2 years ago
> I have felt that pride in writing a great commit message, but I am less sure of the value to others. I don’t think most people search commit messages when they encounter an unusual error message, or when adding a new feature, or really almost ever.

They have value even if the only person who will ever look at them is you - and I will say that when bisecting an issue, the commit message of the commit I finally find is really useful (or it could be, if it weren't just "fixed thing"). It also means that if you encounter a similar issue again, you know that there's a note on a commit you can find.
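
A sketch of that bisect-then-read flow, assuming a throwaway repository in which a line reading `BUG` stands in for the regression (all file contents and messages are invented):

```shell
# Throwaway repository with five commits; the third one introduces the "bug".
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "ok" > app.txt
git add app.txt
git commit -q -m "Initial version"
first=$(git rev-parse HEAD)

echo "feature A" >> app.txt
git commit -q -am "Add feature A"
echo "BUG" >> app.txt
git commit -q -am "Refactor loader" -m "Body describing what the refactor changed and why."
echo "feature B" >> app.txt
git commit -q -am "Add feature B"
echo "feature C" >> app.txt
git commit -q -am "Add feature C"

# HEAD is known bad, the first commit known good.
git bisect start HEAD "$first" >/dev/null
# The check exits non-zero (bad) whenever the bug marker is present.
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null

# refs/bisect/bad now points at the first bad commit; read its full message.
culprit=$(git log -1 --format=%B refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "$culprit"
```

Whatever the author wrote in that body is the first thing you see once the search ends, which is exactly when it matters.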

anthomtb · 2 years ago
I agree with this wholeheartedly. If writing a detailed, multiparagraph commit message, assume the target audience is future you.

Most likely, a time-pressed dev on the far side of the world will think your commit broke something and send a 2:00 AM message of "URGENT: code broke CRITICAL customer request" with a link to the commit, whatever JIRA issue they are working on, and zero additional context. They will NOT bother to read the message (likely explaining how they got into their pickle in the first place) but will see your email, send a message, and do whatever it is they do while waiting for someone else to figure out the problem. You, being that someone else, will now have an excellent starting point on the top priority for the day. Much better than if your message had just been "fixed it".

tux3 · 2 years ago
In some orgs, people never run a bisect. Not once a year.

They go as far as squashing out swaths of history into big un-reviewable blobs. Once code has been merged, they never look inside a past commit again.

In spite of isolated (desperate) demands for rigor, it works fine.

20after4 · 2 years ago
I might be weird but I try to at least skim all of the commits on any project I am actively involved with. If it's an open source project then those commit messages will live on forever. They will even be indexed in regular search engines, not just code search (this maybe not so much now that GitHub is locking out bots more and more)

When I'm trying to solve a problem and not finding results on google or stack overflow, sometimes I search GitHub just to see if a similar thing shows up in PRs or commit messages anywhere (including private repos I have access to search). It's helped me out on countless occasions. Good commit messages do have value beyond vanity, absolutely without a doubt. The fact that many developers aren't looking, that's their loss and hopefully they will see the light once they have enough experience. Maybe teach a junior dev how to search them! Maybe link them to TFA.

Lex-2008 · 2 years ago
I was bitten by too-short commit messages a few times already, when someone asks me "why is it done this way?" - I check git history to find my own 3-year-old commit with message "it should be done this way"... Since then I try to write my commit messages so at least future me would get a hint why a change was necessary.
ajuc · 2 years ago
If you change a line of code without doing git-blame on it first you're doing it wrong.

I've been bitten by this many times - I change an obvious bug, I'm about to commit the changes, I see the previous commit which introduced the "bug" on purpose and the attached JIRA task has a perfectly good explanation for why my obvious change would have reintroduced some bug from 2 years ago :)

kaashif · 2 years ago
> If you change a line of code without doing git-blame on it first you're doing it wrong.

Working on a project where this is necessary sounds like a hellish experience.

The place for comments explaining why the code is needed is right next to the code! On an adjacent line!

boolemancer · 2 years ago
Seems to me that if you're introducing something that seems like a bug on purpose, you should probably have the comment in the code explaining why it's there.
pletnes · 2 years ago
I often use the git blame feature in the IDE to understand what’s been going on. A good commit message will be appreciated, should I happen to find one.
j2kun · 2 years ago
I find them valuable, especially when trying to study a new codebase. In the current era where we get immediate feedback on everything we post online, it's harder to see the value that comes from writing good commits, and the value can be delayed by weeks, months, or even years.
cheald · 2 years ago
IMO, the primary target audience for good commit messages is the same target audience as good code comments: me six months from now. Being able to read why and how a particular thing was done has helped me in debugging and troubleshooting an issue on more than one occasion.
kelnos · 2 years ago
I both agree and disagree. I think you're right that most commit messages won't end up being seen. But when you do need to see one, having a good commit message can be critical to understanding a change, especially if the person who made the change is long gone by the time you need to look at it. Or, hell, if that person was you, but it was far enough in the past that you don't recall the details.
cybrox · 2 years ago
If anything, this just tells us that tooling should incorporate commit messages a lot more. While these kinds of messages are most valuable in large projects, there are some of them in a lot of projects and they could have saved a lot of time.

Especially now with AI IDE integrations, incorporating a software's whole history into supplemental tools would be more useful than ever before.

gotts · 2 years ago
I agree with you that searching across commit messages happens rather rarely, so the return on great commit messages might be questionable.

Where great commit messages like the one in the blog post make perfect sense is pull requests. If the commit message explains the whole thought process that the author had while working on it, it saves so much time on pull request review.

thefourthchime · 2 years ago
I agree. My view is that you shouldn't write comments because if you have to, then your code isn't clear or organized well enough. If you do need a comment, perhaps to document a "Chesterton's Fence", you should put a big nice comment block to explain why and what's going on.

The reality is people don't like to read; if they do, it'll be an overview of how the code is organized. They don't want to read git commits or even comments. The code is the only truth. GPT can already explain in English what the code is doing pretty well; imagine in 2-3 years.

agubelu · 2 years ago
I think the "code should be self-documenting" view is a bit simplistic.

Good comments shouldn't explain what the code is doing, I agree that should be evident from the code itself if it's clear enough. However, why the code is doing what it's doing, or why it's being done in a certain way and not in a different way, is meta-information that is very hard to express in the code itself, and that's where comments are most useful.

slily · 2 years ago
I agree this one goes into more detail than is useful for future reference, most of the explanation would be better off in a PR description. But in general I would rather people go into too much detail than the more common variant of not providing any contextual information anywhere (or only in a chatroom at best) and sticking to one-line commits. As long as the important information is near the top so I don't have to wade through the verbose "this is how I discovered this issue" thing, go crazy.
cerved · 2 years ago
Worse, PR tools like Azure DevOps (and GitHub?) don't do a good job of displaying the information.

Just a big diff.

I often get asked about the reason for a change in a review comment, even when there's a thorough description in the backing commit.

It's sad. I would prefer PRs over email, like Git does.

gumby · 2 years ago
That first line of the commit message is most important so that `git log` can address Chesterton's fence. And IMHO in this case the committer whiffed.

The key is not to put what you did in that first line, but why. Anyone interested in what can just look at the code, perhaps via a diff.

So something like "nginx .conf files must be in us-ascii"

Then "changed blahblah.erb to remove nonbreaking space character"

Then the rest of the commit message which is quite good.

Think of it as a news article: write in decreasing levels of importance and increasing levels of detail, assuming the reader could stop reading at any point.
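
A minimal sketch of that layout in Python, to show what a reader who stops after the first line still gets. The helper names `build_message` and `one_line_view` are hypothetical, and the details paragraph is a placeholder; only the first two lines come from the suggestion above.

```python
def build_message(why: str, what: str, details: str) -> str:
    """Assemble a commit message in decreasing order of importance."""
    # Subject line (the why), a blank line, the what, then the long story.
    return f"{why}\n\n{what}\n\n{details}\n"


def one_line_view(message: str) -> str:
    """Return what a first-line-only view of the message would show."""
    return message.splitlines()[0]


message = build_message(
    why="nginx .conf files must be in us-ascii",
    what="changed blahblah.erb to remove nonbreaking space character",
    details="(the rest of the original commit message goes here)",
)
print(one_line_view(message))
```

Whatever a tool truncates, the reason for the change survives in the part that is still shown.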

hatsix · 2 years ago
Nah, first line needs to be a summary of what you changed, so that you can find the offending commit in the first place.

A news article doesn't explain WHY in the headline, it explains what.

In this case, the OP's first line is spot on... if you're reading through git log, you can see that this commit likely didn't change anything functional about a test, and you should move on.

tux1968 · 2 years ago
Hard disagree.

There's little reason to search the text of commit messages to find out what changed. There are many git tools to find out which commits affected parts of the code you're interested in. Whereas, trying to find that in commit messages is really inefficient and relies on reading, rather than such automated tools.

The purpose of the commit message is to help our fellow humans get a higher level understanding than is available from quickly scanning the code.

nathan_phoenix · 2 years ago
> Think of it as a news article: write in decreasing levels of importance and increasing levels of detail, assuming the reader could stop reading at any point.

Great quote and life advice, will definitely steal this! Thanks!

joshuamorton · 2 years ago
But a commit message in an arbitrary project is not where you give someone a lesson about nginx rules.

"nginx .conf files must be in us-ascii" is maybe a good bug or pull request title, but it may correspond to multiple commits that do different things, and it doesn't tell me what's happening. Is this converting a file to us-ascii, is this writing a tool to convert files to ASCII, is this updating documentation, is it creating a test, some combination? Leading with what, not why, addresses that confusion.

palata · 2 years ago
> The key is not to put what you did in that first line, but why.

Can't we just say that the key is to put something that makes sense for the first line, given that sometimes only the first line is printed?

I don't really care if it says "The files must be in us-ascii" or "Changed the files to us-ascii"... both of them clearly tell me that the files were changed to us-ascii.

gumby · 2 years ago
The difference is Chesterton's fence: when you encounter something seemingly pointless you should learn why it was there before you consider removing or changing it.
adrianmsmith · 2 years ago
I think the disadvantage with this style of documentation is you can't really alter the commit message after it's written.

(I mean you could obviously with "rebase" but are you really going to alter something written one year ago, already merged to "main", and cause a bunch of pain with everyone's feature branch etc.?)

Compare that with documentation stored in a .md file, or even a Wiki or even Confluence. My colleague can write something and if I see a way to improve it I can go ahead and do that, and other colleagues can improve on what I've written.

In this particular case I suppose the bug is fixed and won't come up again. But I also myself find it tempting to describe the design of a particular component when I commit that component, and that's something I now avoid. What about when that component needs to be changed by a future commit e.g. due to the business requirements changing? Will the commit documentation just describe the differences? Then in order for a new team member to find out how the system works by reading the documentation they've got to read multiple commit messages and "merge" them in their head.

masklinn · 2 years ago
> I think the disadvantage with this style of documentation is you can't really alter the commit message after it's written.

That is not a disadvantage. The commit is a historical record, if I come back to that commit 3 years later I want to know its purpose in the context it was in, I don’t want a whitewashed history.

> Compare that with documentation stored in a .md file, or even a Wiki or even Confluence. My colleague can write something and if I see a way to improve it I can go ahead and do that, and other colleagues can improve on what I've written.

That’s like comparing a bicycle and a goose.

> But I also myself find it tempting to describe the design of a particular component when I commit that component, and that's something I now avoid.

That’s a shame. Knowing the considerations (or lack thereof) and tradeoffs at time of creation are often useful to understand defects, either in the original, or in evolutions, or in changes of use case.

> Will the commit documentation just describe the differences?

Yeees?

> Then in order for a new team member to find out how the system works by reading the documentation they've got to read multiple commit messages and "merge" them in their head.

No, for that you maintain a separate “current” documentation, which does not need to cover implementation tradeoffs, or that the original was written under time crunch, or whatever.

thrdbndndn · 2 years ago
> That is not a disadvantage. The commit is a historical record

OP's point is that, while a commit message is indeed a historical record, documentation isn't (or shouldn't be).

If you make commit messages double as documentation, it causes issues like wrong information confusing or misleading future readers, because the messages are non-editable.

20after4 · 2 years ago
I really love documentation that lives in the same repo with the code. My favorite is a .md file for every module, class or component. Some mixture of inline code docs and standalone docs is probably ideal. But docs as markdown that don't require some compile step to build the documentation, and don't require opening a browser to view them, are just so much better, IMO, compared to any sort of external docs like a wiki or html on a server somewhere that gets re-generated by a CI job.
spencerchubb · 2 years ago
If you put docs in a markdown file, you will still be able to see what the markdown said at that time because it will also be in the commit history.
twosdai · 2 years ago
Sucks when you mess up in your commit message though and don't type the right thing.
ryanisnan · 2 years ago
I think the non-editable nature of commit messages is precisely the benefit though. Yes, you can't really modify them post-hoc, but being able to step through a code base's history can be really illuminating.
kelnos · 2 years ago
A commit message isn't documentation that should be updated as things evolve. It's a historical record of a single change. Sure, if you later realize you forgot to put an important detail there, that's a shame. But overall I think it's actually important that they can never change.
josephg · 2 years ago
I really think git made a mistake in conflating the immutable log of what was changed with the (ideally mutable) story of what got merged in. So you see people arguing over squashing commits vs rebasing vs merging. Squashing commits makes the history of commits a better story of features being added. Merging preserves the immutable log of the actual changes made to the code, and rebasing sort of does a bit of both.

But, I don't see any reason we can't have our cake and eat it too. We're programming computers after all and we can make them do whatever we like.

If I wrote my own git, I think I'd split commits into those two parts. I'd leave the history of changes immutable - probably with some sort of Merkle DAG like Git does. And then have a separate associated data store which stores the commit messages, in a nice sensible, editable log describing the work that actually happened. Let people arrange and rearrange the commit descriptors however they like. If you want, group commits around feature tags, fix typos and make any changes to the messages that you want. But, the whole while the underlying log of diffs ("what actually changed in the code") can remain (gloriously) unaffected.
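
A minimal sketch of that split in Python, assuming a single-parent chain for brevity (a real DAG would allow several parents). The `Change` and `History` names and the sample diffs are hypothetical; this is an illustration of the idea, not how Git or any existing tool stores data. The point is that the hash covers only the parent and the diff, so rewording a message never changes an id.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Change:
    """An immutable record of what changed; the id covers parent and diff only."""
    parent: Optional[str]
    diff: str

    @property
    def id(self) -> str:
        payload = f"{self.parent or ''}\n{self.diff}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class History:
    def __init__(self) -> None:
        self.changes: Dict[str, Change] = {}  # append-only log of diffs
        self.messages: Dict[str, str] = {}    # editable text, keyed by change id
        self.head: Optional[str] = None

    def commit(self, diff: str, message: str) -> str:
        change = Change(parent=self.head, diff=diff)
        self.changes[change.id] = change
        self.messages[change.id] = message
        self.head = change.id
        return change.id

    def reword(self, change_id: str, message: str) -> None:
        """Edit a description without touching the hash chain."""
        if change_id not in self.changes:
            raise KeyError(change_id)
        self.messages[change_id] = message


history = History()
first = history.commit("+ add parser", "add parsr")
second = history.commit("+ handle empty input", "Handle empty input in parser")
history.reword(first, "Add parser")  # fix the typo after the fact

# The second change still points at the same parent id as before the reword.
print(history.changes[second].parent == first)
```

Grouping changes under feature tags would be one more mutable mapping on the side, again without touching the ids.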

passivegains · 2 years ago
> I really think git made a mistake in conflating the immutable log of what was changed with the (ideally mutable) story of what got merged in. So you see people arguing over squashing commits vs rebasing vs merging.

Every team I've been on struggled with this over and over and over. The tools are so hard to use it's tempting to make the version control process facilitate "git log" instead of the other way around, which is just absolutely insane. Obviously my co-workers should learn to use their damn tools like professionals, something something a poor craftsman, but honestly? This time the tools really are to blame.

zilti · 2 years ago
Fossil has something a bit like that.

dllthomas · 2 years ago
> you can't really alter the commit message after it's written

You can append with git notes, though on a message that long I expect they're unlikely to be noticed.

goku12 · 2 years ago
Commit messages aren't a replacement for source documentation. The latter contains information relevant to the tree. Commit messages are transient information (historical info as someone put it). For example, an update caused by outdated dependency. Or the tests done to diagnose a bug.
keybored · 2 years ago
I have seldom run into this being a problem.

The context of a commit message is that someone took some minutes to explain what the context of the change is. Using their current understanding. Explain the problem. Lay out the assumptions. Given three paragraphs or so it will help immensely to figure out how or why something you/they thought was the case was in fact wrong when the message was written.

That is documentation in itself.

And if you make straightforward mistakes like a typo in an issue key in the message and you really care: you can make a note of it on the commit with git notes.

> Compare that with documentation stored in a .md file, or even a Wiki or even Confluence.

I don’t want to access a remote wiki for every little code context (certainly not Conf.). The code is just right there. Comments/Doc comments/commit messages are mostly enough for that.

tehnub · 2 years ago
I know this isn't a great solution, but GitHub does let you write comments on individual commits. You could add whatever addendums you want there.
tyrust · 2 years ago
I think commit messages are mostly valuable for a future code reader asking "why is this bit like this?" and then looking at blame logs for the answer. As you point out, bigger picture stuff ought to be elsewhere (documentation, tracking bug).

Keeping docs in version control and including doc changes with the code changes is a nice way to address your concern.

macintux · 2 years ago
There's no reason this documentation can't be replicated in another context, and for all we know it was.
OJFord · 2 years ago
One thing I disagree with is:

> I wouldn’t expect all commits (especially ones of this size) to have this level of detail.

(emphasis added) - actually in my experience it's often the little ones, innocuous looking things that might really need a relatively longer explanation.

Yesterday I wrote three paragraphs on why I added `--limit=999` to a `gh pr list` because it's confusing: there's already a `limit(` in the `--jq` argument, and the higher it is (given say infinite PRs in total) the lower the end result will actually be. (Yes I wrote a comment too. And probably spent even longer thinking about and working it up than writing about it; hopefully I'll recall it as an example the next time someone implies the job is about churning out code!)

unregistereddev · 2 years ago
I agree with you that the little innocuous things often need a longer explanation, but the linked commit message is way too long IMO. It either wastes the readers' time, or it causes the readers' eyes to gloss over at the wall of text. You don't need to document your entire journey in order to document your findings and explain why.

> This was a non-ascii whitespace character that caused `ArgumentError: invalid byte sequence in US-ASCII` when running `bundle exec rake`

^ should be sufficient. It includes enough keywords to come up in a search if someone has a similar problem in the future, it contains the root cause of the problem, and it is short enough that people are unlikely to gloss over it.
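
For the "root cause" part, a small Python sketch that locates such characters so the message can name them precisely. The function name `find_non_ascii` and the sample template text are hypothetical; the sample assumes a no-break space (U+00A0), which is only one of the characters that could trigger this kind of error.

```python
import unicodedata
from typing import List, Tuple


def find_non_ascii(text: str) -> List[Tuple[int, int, str, str]]:
    """Return (line, column, code point, name) for each non-ASCII character, 1-based."""
    hits = []
    # Split on "\n" only, so unusual Unicode line separators are reported too.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col_no, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical template content with a no-break space after "listen".
sample = "server {\n    listen\u00a080;\n}\n"
for hit in find_non_ascii(sample):
    print(hit)
```

The code point and position from a scan like this are the kind of keywords that make the short message searchable later.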

OJFord · 2 years ago
Yes, it's not my preferred style either, but it's much better than 'fixes error' type thing, subject line only, that's so common.

I like the form:

    Fix ArgumentError 'invalid byte sequence'

    Non-ASCII whitespace characters cause [...]. This was apparent in [...] because [...].

    This commit fixes the issue by removing the offending character; so the file is now solely ASCII characters.

Or that sort of thing. Subject tells me why, body tells me what the problem was and how it was fixed. (Who, when, where are already in the commit metadata! The diff shows a very literal 'what' too, the what/how in the body should offer context and explanation as required.)

mbork_pl · 2 years ago
The article explains why all the rest is, maybe not needed, but good to have.
keybored · 2 years ago
For a commit that adds a language binding (and might be 100+ additions/deletions) I might just say “Add X function”. Because I’m just following established patterns. But for the linked kind of change? Yeah, several paragraphs of explanation is definitely useful.
pdpi · 2 years ago
I find myself commenting code in a similar pattern: A small kernel of "interesting" code that has a 1:1 ratio (or higher) of comments to code, which enables the rest of the codebase to be "boring" self-documenting boilerplate-y code that doesn’t really warrant much in the way of commenting.
macspoofing · 2 years ago
It's not a great git commit.

1) For all that text, the first line "Convert template to US-ASCII to fix error" - could be better. Maybe a couple of extra words to state what whitespace character caused the error, and what the error was. That comment plus the diff is all the context you need.

2) Honestly, everything else is kind of pointless. It doesn't hurt, but there's not a lot of value here. The author documented their journey in tracking this bug .. who cares?

GreenWatermelon · 2 years ago
People who like to learn and improve as programmers do care. In fact, the article explains the value of all that additional stuff which implies there are those who DO care.

The article even provides a link to a search result showing multiple commits from people who learned from the fix.

That commit message is a treasure trove of knowledge.

macspoofing · 2 years ago
Outside of the small caveat that his first line could be better (which is what all future engineers will read while scanning commit messages), like I said, at worst, it doesn't hurt.

I like this level of detail, whether it is at the commit, PR, or ticket level. If one of my guys did this same write-up for this same problem, especially one of my junior guys, I would have patted them on the back and told them they did a great job - because you wouldn't want to discourage them from doing more of this kind of write-up in the future.

But here, we can be a little bit more honest, and the truth is that the problem he solved was trivial, so this kind of detail is overkill for that problem. Once you find that the config file has an unprintable non-ASCII character, you immediately know most parsers would blow chunks on that, and there is only one fix: remove the problem character. So succinctly tell me the error you saw, tell me the character, and if you know, tell me HOW it got in there (which is probably the most important detail that isn't in the write-up, so this could be prevented in the future). That's enough, because if in the future another engineer does a ticket/commit search for this error in our bug tracker, hopefully these details will show up immediately.

bhasi · 2 years ago
For great commit messages, just browse the git history of the Linux kernel where this is the standard.

The first line always mentions the subsystem affected by the change, followed by a one-line imperative-mood summary of the change. Subsequently, three questions are answered in as much detail as possible:

1. What is the current behaviour?
2. What led to this change?
3. What is the new behaviour after applying this change?

Example:

"Currently, code does X. When running test case T, unexpected behaviour U was observed. This is because of reason R. Fix this by doing F."