I've long been in the practice of "commit early, commit often". If one use case works, I commit; if the unit tests pass, I commit. The code may be a mess, the variables may have names like 'foo' and 'bar', but I commit to have a last known good state. If I start mass refactoring and break the unit tests, I can revert everything and start over.
I also push often because I'm forever aware disks can fail. I'm not leaving a day's worth of work on my local drive and hoping it's there the next morning.
I've become increasingly aware that my coworkers have nice clean commit histories. When I look at their PRs, there are 2-4 commits and each is a clean, completely functioning feature. No "fix misspellings and whitespace" comments.
What flow do you follow?
When I'm developing, but before I create a PR, I'll create a bunch of stream-of-consciousness commits. This is stuff like "Fix typo" or "Minor formatting changes" mixed in with actual functional changes.
Right before I create the PR, or push up a shared branch, I do an interactive rebase (git rebase -i).
This allows me to organize my commits. I can squash commits, amend commits, move commits around, rewrite the commit messages, etc.
Eventually I end up with the 2-4 clean commits that your coworkers have. Often I design my commits around "cherry-pick" suitability. The commit might not be able to stand on its own in a PR, but does it represent some reasonably contained portion of the work that could be cherry-picked onto another branch if needed?
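To make the cleanup step above concrete, here's a minimal, self-contained sketch of squashing messy WIP commits with `git rebase -i`. It scripts the normally interactive todo list via `GIT_SEQUENCE_EDITOR` so it can run non-interactively; the repo, file names, and commit messages are invented for the demo, and `sed -i`/`git init -b` assume GNU sed and git >= 2.28.

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q -b main
git config user.email dev@example.com
git config user.name Dev

# Simulate a stream-of-consciousness history.
echo base > app.txt; git add app.txt; git commit -qm "Add feature skeleton"
echo wip1 >> app.txt; git commit -qam "WIP"
echo wip2 >> app.txt; git commit -qam "Fix typo"
echo wip3 >> app.txt; git commit -qam "Minor formatting changes"

# Squash the last two commits into the "WIP" commit by rewriting the
# rebase todo list: change "pick" to "squash" on every line but the first.
GIT_SEQUENCE_EDITOR="sed -i '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i HEAD~3

git log --oneline   # two commits remain
```

In real use you'd run `git rebase -i` with your normal editor and reorder, squash, and reword by hand; the scripted version just shows what the todo-list edit does.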
Granted, all of the advice above requires you to adhere to a "prefer rebase over merge" workflow, and that has some potential pitfalls, e.g. you need to be aware of the Golden Rule of Rebasing:
https://www.atlassian.com/git/tutorials/merging-vs-rebasing#...
But I vastly prefer this workflow to both "merge only," where you can never get rid of those stream-of-consciousness commits, and "squash everything," where every PR ends up with a single commit, even if it would be more useful to have multiple commits that could be potentially cherry-picked.
(This works for me because auditing commit history is not important where I work, if it were I would organize commits better.)
*glances at $corp git repo and sees 'updates' 'fix' 'updates'* Sigh.
edit: googled, "Never rebase while on a public branch" i.e. a shared branch
I see some people whose projects (Furnace Tracker, PipeWire, previously FamiStudio) seem to make progress very quickly without getting noticeably slowed down by technical debt, despite sloppy programming and unorganized commit logs (push-to-head). Meanwhile I move slowly, dread reviewing hundreds of lines of my own code, and produce technical debt (regrets) anyway: not as many surface-level lintable errors, but plenty of entrenched mistakes. I wish I could move faster, but instead struggle to make progress.
Only the keybindings are a bit weird if you're not accustomed to Vim bindings:
- Open tig
- Change into the staging view with `s`
- Select your file using the arrow or `j` and `k` keys
- Press Return to show the diff
- Navigate to the line(s) in question with `j` and `k` (arrow keys will switch files)
- Stage parts with `1` (single line), `2` (chunk parts), `u` (chunks) or split chunks with `\`
- "Leave" the diff with `q`
- You can find the keybindings with `h` in the help screen, which also uses Vim keys -- like manpages usually do
1. I `git checkout PUBLIC -b CLEANUP` to a new branch.
2. Do a `git difftool CHANGES`, which opens each changed file in vimdiff one at a time.
3. For each file, I use :diffput/:diffget or just edit in changes I want.
4. Commit these changes on the CLEANUP branch.
5. Use `git difftool CHANGES` again to see the remaining diff.
6. Repeat until the diff comes back empty!
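The loop above can be sketched as a self-contained script. Since vimdiff is interactive, this demo stands in for steps 2-3 with `git checkout CHANGES -- <file>` (pulling over a whole file instead of hand-picking hunks); the branch and file names are invented.

```shell
set -e
dir=$(mktemp -d); cd "$dir"
git init -q -b PUBLIC
git config user.email dev@example.com; git config user.name Dev
echo v1 > lib.txt; git add lib.txt; git commit -qm "public state"

# Simulate a branch full of unstructured changes.
git checkout -qb CHANGES
echo v2 > lib.txt; git commit -qam "unstructured changes"

# 1. Start a cleanup branch from the public state.
git checkout -q PUBLIC -b CLEANUP
# 2-3. Pull over the changes you want (vimdiff :diffget in real life).
git checkout CHANGES -- lib.txt
# 4. Commit them on CLEANUP.
git commit -qm "refactor: pull in lib.txt changes"
# 5-6. Repeat until the diff against CHANGES is empty.
git diff --quiet CHANGES && echo "diff is empty"
```

The exit-code check in the last line is what tells you the loop is done: once `git diff CHANGES` is empty, CLEANUP contains everything, just reorganized.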
My unstructured changes tend to contain a handful of small typo fixes, whitespace fixes, localized refactors, and 1 or 2 larger refactors and a behavioral change. Once they're all broken out, it's usually easy enough to use `git rebase -i` and reorder the smaller changes first, put out PRs for just those first, etc.
I've tried most Git clients on Mac over the years and kept gravitating back to Sourcetree.
I only tend to use it for this particular workflow (picking out very granular changes on a line-by-line basis). Otherwise, 90% of my git stuff is via IDE integrations or command line.
[0] https://www.sourcetreeapp.com
Edit: looks like it's back from the dead again :) https://github.com/gitx/gitx
Additionally, git itself comes with a simple `git gui` command that allows you to do partial commits on a line by line basis. It also has a nice "amend last commit" mode.
Shameless plug, but here are other efficiency tips I wrote about, for working in a highly demanding environment: https://dimtion.fr/blog/average-engineer-tips/
I've seen someone post on HN, apparently seriously, that the history is more important than the source.
I know a (potentially) really good developer that spends his time pulling in the recent patches and reorganizing them to make an alternate history that is prettier somehow.
sure, every once in a while it becomes useful/necessary to bisect, and a 'clean' history might help with that.
but seriously - why do we fetishize this? this is a medium where the amount of writing vastly outweighs the amount of reading.
when people are looking for a bug do they seriously find value in seeing how the code evolved? or do they just figure out why it doesn't work? is there an implicit assumption that the code all worked at some point and the task is to find out when/how it was broken?
just really confused
Obviously it's possible to go too far; not every commit needs an attached essay. Many of my commits are just "fixed typo" or "added unit test for X", but then sometimes I'll write a short paragraph or two explaining my rationale, referencing the commits that came before.
https://docs.github.com/en/pull-requests/collaborating-with-...
- commit early/commit often. I usually push one commit when I think the feature is done. While others review my code, I commit to improve the code/fix the issues they found. The advantage here is that future readers looking at the history of file X line N can know what other files were introduced alongside file X (as a reader of big codebases, this is a nice side effect). I don't like hiding defects from the git history either (one could in theory squash all the commits of a given PR in order to keep the "history clean"... In my experience, having a trace of bugs fixed at PR time, or other subtle details, is also worth it and serves as documentation of what not to do).
In the cases where I need to work through many days on a single feature, and only if the feature is so complicated/critical that I could not reproduce it from scratch by myself, then yes, I push the progress upstream. This is usually not the case though: I stash progress. I tend to open small PRs and usually I remember what I've done (so I could write the entire code again easily). Plus, hard drives fail, sure, but they are also quite reliable. In 20 years of work I never experienced losing "non critical" work because of disk failure (for critical work, I of course have a different workflow).
My team (and myself) prefer this workflow:
- One commit per PR. This allows for easy reverts and cherry-picks.
- One developer per branch. You can do a few devs per branch, but rebases need to be coordinated and carefully handled, because:
- No merge commits. Only rebase onto latest main. Which means force-pushing PR branches and, thus, rewriting history (other devs working on same branch need to be aware that history has changed).
If you're constantly rebasing onto main, then all of your working commits sit on top of the latest code in main. Which means you do not have to deal with tricky merge conflicts, where your commits may weave in and out of the main branch at various points in time because you were doing "git merge" at random points. In addition, if you squash your commits before doing a rebase this will also make merge conflicts rather trivial, because you're only dealing with one set of merge conflicts on one commit.
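Here's a minimal local simulation of that "rebase onto latest main" step, with invented branch and file names. In a real team you'd `git fetch` first, rebase onto `origin/main`, and force-push with `--force-with-lease` since history was rewritten.

```shell
set -e
dir=$(mktemp -d); cd "$dir"
git init -q -b main
git config user.email dev@example.com; git config user.name Dev
echo a > main.txt; git add main.txt; git commit -qm "main: initial"

# Start a feature branch, then let main move on underneath it.
git checkout -qb feature
echo f > feature.txt; git add feature.txt; git commit -qm "feature: work"
git checkout -q main
echo b >> main.txt; git commit -qam "main: moves on"

# Replay the feature commits on top of the latest main.
git checkout -q feature
git rebase -q main
# git push --force-with-lease   # needed after rewriting history

git log --oneline   # feature now sits on top of "main: moves on"
```

Because the feature commits always sit on top of main's tip, any conflicts are resolved once, against current code, instead of weaving through old merge points.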
That's the big picture, team workflow. For my personal workflow, I rely on "git add -p" and stashes. The only time I do a commit and push up code is: a) when I have a large change and want to make sure I don't lose it or b) others have already reviewed my PR and I want to keep the new changes separate to make their life easier when reviewing a 2nd time. I use "git reset --soft HEAD~<number-of-commits>" to squash instead of "git rebase -i" because I find it easier and quicker.
I must emphasize this point: learn "git add -p". It's extremely useful in the case where you have some changes like debugging code or random unrelated changes that you do not want to commit. It's a filtering mechanism.
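The `git reset --soft` squash mentioned above can be sketched in a few lines: it rewinds HEAD by N commits while leaving all the changes staged, so one new commit replaces them. The repo and messages here are invented for the demo.

```shell
set -e
dir=$(mktemp -d); cd "$dir"
git init -q -b main
git config user.email dev@example.com; git config user.name Dev
echo 1 > f.txt; git add f.txt; git commit -qm "feature"
echo 2 >> f.txt; git commit -qam "review fix 1"
echo 3 >> f.txt; git commit -qam "review fix 2"

# Undo the last 2 commits but keep their changes staged...
git reset --soft HEAD~2
# ...then recommit everything as one commit.
git commit -qm "feature (squashed)"
git log --oneline
```

Unlike `git rebase -i`, there's nothing to reorder or edit: it's a single squash of the most recent commits, which is often all you need before merging.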
I had never even considered that some teams might have multiple developers active on the same branch.
How does that even scale? I would imagine that in a team of 10, you would be rebasing 90% of your day and only 10% doing actual work?
a big project of mine is about 2500 commits ahead. rebasing this beast is partially automated, but still I get about 2000 upstream changes through once a month. you need scripts to rebase and to rollback for a wrong choice.
it scales trivially.
1.a. a lot of dirty commits/wip commits
1.b. a few clean commits, when I spot changes that I know already form a commit by themselves
Before opening the PR:
2. `git log -p`: I inspect the commits I've done and I decide what should go together, what should be edited and what can stay as it is
3. `git rebase -i`: I apply the changes I've decided during 2
4. repeat 2 and 3 until I'm happy with the results
5. the last `git rebase -i`: reword almost every commit, as almost all the commits at this point have placeholder descriptions
I'm very happy with this strategy. It takes some time to get used to, but in the end my PRs are very clean and well thought out.
That way, you won't be afraid of losing your recent work by messing something up, because you have the stash, and you won't be afraid of losing your whole project/progress, because you have a recent backup of it.
For instance, I have a backup script that runs every time I shut down my work computer, so I won't have to worry if suddenly my hard drive gives up on everything.
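A shutdown backup script along those lines could be as simple as a timestamped tarball. The paths in a real setup (a project directory and a backup drive mount) are up to you; this sketch uses temporary directories so it is self-contained.

```shell
set -e
work=$(mktemp -d)    # stand-in for your project directory, e.g. ~/work
backup=$(mktemp -d)  # stand-in for a backup destination, e.g. /mnt/backup
echo "important" > "$work/notes.txt"

# Archive the working directory with a timestamp so old snapshots are kept.
tar -czf "$backup/work-$(date +%Y%m%d-%H%M%S).tar.gz" -C "$work" .
ls "$backup"
```

Hooking it into shutdown varies by system (a systemd unit, a logout hook, or just running it by hand); the key point is that it runs without you having to remember it.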