Show HN: Generate commit messages using GPT-3

This creates precisely the kind of commit messages that I regularly scold junior developers for :)

In my opinion, commit messages should clarify the intent of WHY you changed things. I can already see WHAT you changed from the diffs.

But of course, any tool can only work with the what; it cannot know that these lines are related to a bug report filed in a technically unrelated system.

peepee1982 · 3 years ago

I disagree. I want the what. The change itself explains the how.

If the why isn't obvious and there's no link to a tracking system that explains it, it's fine if it's in the message body.

I do want the why in comments, though.

KronisLV · 3 years ago

Personally, I think that the following is a good approach:

  PROJ-2354 add/modify/remove/... WHAT to implement/fix/... WHY

with the code showing the HOW.
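As a rough illustration of that shape, here is a minimal Python sketch that splits such a subject line into its ticket, WHAT and WHY parts. The ticket-key pattern (uppercase project key, dash, number) and the name `split_subject` are assumptions made for the example, and splitting on the first " to " is deliberately naive.

```python
import re

# Assumed shape: "<TICKET-KEY> <what> to <why>".
# The key pattern is a guess at a Jira-style key such as PROJ-2354.
SUBJECT_RE = re.compile(
    r"^(?P<ticket>[A-Z][A-Z0-9]*-\d+) (?P<what>.+?) to (?P<why>.+)$"
)


def split_subject(subject: str):
    """Return (ticket, what, why), or None if the subject does not fit the shape."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        return None
    return match.group("ticket"), match.group("what"), match.group("why")
```

A subject such as "code fixes" has no ticket key and no WHY part, so the function returns `None` for it.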

Ideally, with the commit/merge request having a textual description and/or a list summary for the overall changes, alongside some diagrams/images/gifs/videos, as well as further discussion where applicable. Oh and an issue management system of some sort with the original (business) requirements, notes from requirements engineering, as well as information about testing. Something like architecture decision records (ADR), script snippets, Markdown Wiki documentation or install instructions can also live in the repo. Then, with a decent test coverage and CI setup, it can also be pretty safe to merge the changes, because most of the stuff concerning them will be known and understood.

But at the end of the day, there will be as many opinions as there are people.

For some, there is no need for longer commit messages (e.g. with multiple lines, like a separate subject/body with explanation) which is more or less my case because that information will be in the merge/pull request. Others will say that filling out merge/pull requests is unnecessary because the commits should have that information (I disagree, but I've heard that stance). Some other people won't even bother with commit messages because in their eyes working code at the end of the day is all that matters (once again disagreed, but we've all seen "code fixes" in the log before). And some will have way different workflows, like not using a web UI of some sort for discussion but instead relying on commit logs and mailing lists.

Use whatever workflow feels adequate for you and your colleagues.

Joeri · 3 years ago

I would go for the combination: first what, then why.

Links break, so a link to why is not good enough when it comes to long-lived code. A good commit message should however start by completing the sentence "when committed this will ...". This makes reading the one-line summaries of the git log the clearest to interpret what happened.

VadimPR · 3 years ago

In my way of working, the 'why' can go into the overall PR and the 'what' into the individual commits. Both are important - the reason for changes and a concise summary of what you've done.

jiggywiggy · 3 years ago

PRs are not easy to read in the commit history. A year later, a commit message is mostly flat without context.

lelandfe · 3 years ago

As I always say in these conversations, PR descriptions and comments are ephemeral. Your git history should be forever, but you’re not guaranteed to be in the same repo on the same host for eternity.

I have already worked on multiple projects that got handed to us as a .git/ folder. Commit messages referencing non-existent issues abound.

I now make my whole team ensure that nothing crucial is left to live in the PR alone.

OJFord · 3 years ago

I fully agree that this is an (even) easy(ier) way to write crap commit messages.

> But of course, any tool can only work with the what

Well, it could be a lot better at least - imagine passing a Jira ticket, telling it it's a bug fix or feature (the script could determine from the API); then you could probably get it not only to neatly summarise 'why' for the subject line but also have a go at relating it to the diff for the body.
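A hedged sketch of what that could look like: the function below only assembles the prompt text from ticket details and a diff. Everything here is hypothetical (the name `build_prompt`, its parameters, the prompt wording); fetching the ticket from an API and calling the model are left out on purpose.

```python
def build_prompt(ticket_key: str, ticket_type: str, ticket_summary: str, diff: str) -> str:
    """Combine ticket context (the why) with the diff (the what) into one prompt.

    All inputs are assumed to have been collected by the caller already.
    """
    return "\n".join([
        "Write a git commit message.",
        f"Subject line: summarise why the change was made ({ticket_type} {ticket_key}).",
        "Body: relate that reason to the diff below.",
        "",
        f"Ticket summary: {ticket_summary}",
        "",
        "Diff:",
        diff,
    ])
```

Keeping the prompt assembly separate from the network calls means the text the model sees can be inspected and tested without any credentials.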

huijzer · 3 years ago

Why should the commit message explain why? I thought the point was to give a summary of the changes so that you don’t have to read the full diff.

jkukul · 3 years ago

I think that one doesn't exclude the other.

You can still write a multi-message commit with two messages:

1. Short summary of what is being changed

2. Explain WHY

I think the point is that even if 1. is missing it can be worked-back by reading through the diff. But if 2. is missing then the future generations have no way of finding out reasons behind some decisions.
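For reference, `git commit` accepts `-m` more than once and joins the values as separate paragraphs, so the two messages above map onto two `-m` arguments. A minimal Python sketch, where the helper name `commit_command` is an assumption for illustration:

```python
def commit_command(summary: str, why: str) -> list[str]:
    """Build a git command whose message has a summary paragraph and a WHY paragraph.

    git concatenates repeated -m values as separate paragraphs,
    so the first becomes the subject and the second the body.
    """
    return ["git", "commit", "-m", summary, "-m", why]


# Example values only; pass the list to subprocess.run() to actually commit.
command = commit_command(
    "Add variable padding to ProductLogo component",
    "Fixes logo overflows reported in issue#78.",
)
```

Building the command as a list rather than a single string avoids shell quoting problems when the message contains quotes or newlines.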

kolinko · 3 years ago

The summary of changes can be inferred, or generated automatically with gpt ;)

"Why", on the other hand can be lost.

Especially during refactoring. Let's say you removed some assertion / safety check from a function, because you verified that it's not necessary there. Without explanation in a commit, someone may not get your reasoning.

Same thing with renaming variables, reordering the code etc.

Comments may be useful in some cases, but in many cases there won't be a right place to put them in.

0x008 · 3 years ago

Because it is very useful down the road to understand why you used implementation x and not implementation y.

OJFord · 3 years ago

The subject line should be 'why':

    Fix 500 due to syntax error accessing /users

the body can summarise and expand on (..if you know what I mean) the diff as well as explaining why:

    Due to <arcane language reason> in this case <syntax> was
    interpreted as a baz, when clearly the author in <blame commit>
    intended foo, which would return the response with bars here
    as expected.

    This commit fixes the issue by adding an explicit semicolon,
    thus forcing the foo interpretation.

That's probably overkill for a simple syntax error (unless it really is that arcane in which case it might be a bit of a teaching moment/object lesson).

Compare:

    Add semicolon

    [no body]

splix · 3 years ago

What's the point of the summary?

But "why" is very important for the future code owners. A year or two later, someone else adding a new fix may have a question about the existing parts to avoid breaking them. And the only thing he can rely on is `git blame` to figure out "why it's implemented in this particular way".

charcircuit · 3 years ago

A commit message is made of a title and a body. There is usually more context on why a change was made than can fit in the title.

stubish · 3 years ago

Generated commit message could explain the why, if the code changes had comments explaining this sort of thing. Someone still needs to document why changes are being made, but at least you only need to do it once. And maybe GPT-3 can do a good job of selecting the relevant info and summarizing the why of the change?

BadOakOx · 3 years ago

I fully agree, this is my favorite write-up on what a git commit message should look like: https://cbea.ms/git-commit/

taink · 3 years ago

Very interesting read, thanks!

Would you happen to know the justification behind "capitalize every commit subject line"[1]? I can understand finding it more appealing, but talking about it being as important as limiting the subject to 50 chars and not ending it with a period (which has a sensible justification), not as much.

[1] https://cbea.ms/git-commit/#capitalize
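The three subject-line rules mentioned here (50-character limit, capitalized, no trailing period) are easy to check mechanically. A minimal Python sketch, assuming the subject is the first line of the message; the function name `lint_subject` and the wording of the problems are made up for the example:

```python
def lint_subject(message: str) -> list[str]:
    """Return a list of problems found in the subject (first line) of a commit message."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    problems = []
    if len(subject) > 50:
        problems.append("subject longer than 50 characters")
    # Only flag a lowercase first letter; digits and symbols are left alone.
    if subject[:1].islower():
        problems.append("subject not capitalized")
    if subject.endswith("."):
        problems.append("subject ends with a period")
    return problems
```

This does not attempt the other rules from the article, such as imperative mood, which are harder to check without understanding the text.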

Ghoyome · 3 years ago

I just had a realization. Usually in my private repos I do “what;why” so I can go back to commits when I break stuff. But I should be using branches for what and commits for why…

rgrs · 3 years ago

Interesting. I wonder if ChatGPT can be fed data from JIRA or feature ticket implemented by the commit. This could give us the "why"

fatfox · 3 years ago

Can you give some examples of good commit messages?

naet · 3 years ago

A bad commit (that one of my coworkers always does) is "update file.ext". Says nothing other than the name of the file that was updated, which ends up with tons of repeat commit messages for common files and provides zero info that wasn't already included in the commit itself.

Another poor commit is a description like "adds padding". It's a little too vague and doesn't really tell you much that wasn't already apparent by looking at the change itself.

A better commit might be something more like "Add variable padding to ProductLogo component, fixing logo overflows for issue#78". It summarizes the change, the intended outcome of the change, the reason for the change and a reference to an issue all in one short sentence.

You don't have to go into overwhelming detail for every minor front end change, but if you're intelligently tracking and squashing your commits, writing them well can help a lot later on if you ever need to understand the context of an older commit or even a given line in the codebase.

TedDoesntTalk · 3 years ago

Subjective and also depends on the culture where you work. I’ve worked at places where the majority of the “why” is in a JIRA ticket, so the commit message better reference that ticket number. Not so at other places. See what I mean?

PUSH_AX · 3 years ago

Or, just squash your commits and focus on bigger things.

codingminds · 3 years ago

As already mentioned: It depends.

But this might be a good start: https://www.conventionalcommits.org/en/v1.0.0/#summary

We've used this as a starting point and adapted it to our needs (E.g. some simplification, defining the possible values for scope, etc.)
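For anyone who has not seen the format, a Conventional Commits header looks like `type(scope)!: description`, with the scope and the `!` breaking-change marker both optional. A minimal Python sketch of a header parser; the regex is a simplification rather than a full implementation of the spec, and `parse_header` is a name invented for the example:

```python
import re

# Simplified header shape: type, optional (scope), optional "!", then ": description".
HEADER_RE = re.compile(
    r"^(?P<type>[a-zA-Z]+)"
    r"(?:\((?P<scope>[^()]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>.+)$"
)


def parse_header(header: str):
    """Return the header parts as a dict, or None if the header does not match."""
    match = HEADER_RE.match(header)
    if match is None:
        return None
    return {
        "type": match.group("type"),
        "scope": match.group("scope"),  # None when no scope is given
        "breaking": match.group("breaking") is not None,
        "description": match.group("description"),
    }
```

Restricting scope to a fixed set of values, as described above, could be layered on top by checking the returned `scope` against an allowed list.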

scrollaway · 3 years ago

Take a look at Wine's commit log. It's really well curated. https://github.com/wine-mirror/wine/commits/master

kardos · 3 years ago

So this would be more appropriate as an "explain commit" tool, which would also evolve as gpt3 gets better

yummybear · 3 years ago

"Fixed some bugs I was told to"

anshumankmr · 3 years ago

Fixed the bug fix issue

I have seen these commit messages so many times that if I had a penny for each one, I would be rich by now.

onion2k · 3 years ago

For the sake of your cherry-picking colleagues please don't bundle multiple fixes in a single commit.

throw_m239339 · 3 years ago

> In my opinion, commit messages should clarify the intent of WHY you changed things. I can already see WHAT you changed from the diffs.

And I'd scold you for doing that if I were your superior. The WHY should be in the pull request, not in the commit message. A commit message should succinctly explain WHAT was changed from an architecture/organisation perspective.

Kiro · 3 years ago

Big disagree on that. I think the commit message should tell me what the change does, not why.

silisili · 3 years ago

Ideally, both.

'Change rounding to thousandths' isn't overly helpful, and probably apparent.

'Fix overspending bug' is vague.

'Fix overspending issue by rounding to thousandths instead of hundredths' is the ideal commit msg here, as it gives a brief what and why. Possibly even with a ticket number, though I see how after years and switching systems that becomes less useful. More useful is briefly describing the why as a code comment, using good judgement of course.

jkukul · 3 years ago

It's a bit ironic. You just stated your opinion without explaining why you think that way.

logronoide · 3 years ago

As a rule of thumb: the WHY in the PR, the WHAT in the commits.

reacweb · 3 years ago

I would say the opposite. The manager who receives the PR (merge request in GitLab) needs to know what has changed (if it is not obvious from the diff) to assess the change before accepting it. He has to know what has changed, for example to decide which non-regression tests to perform.

The final user of the software will receive a changelog (a list of commit messages) that shall identify the bugs that have been fixed and the new user requirements that have been added. He needs to know why the code has changed to know what he has to do.
