> The idea, particularly as realized in the GitHub pull request workflow, is that the real “unit of change” is a pull request, and the individual commits making up a PR are essentially irrelevant.
I loathe GitHub PRs because of this. Working at $dayjob the unit of change is the commit, and every commit is reviewed and signed off by at least 1 peer.
And you know what? I love it. Yes, there's some overhead. But I can understand each commit in its entirety. I and my coworkers have caught numerous issues in code with these single-purpose commits of digestible size.
Compare this to GitHub PRs, which tend to be beastly things that are poorly structured (not to mention GitHub's UI only adding to the review problems...) and multipurpose. Reviewing these big PRs with care is just so much harder. People don't care about the commit message, so looking at the git log it's just a mess that's hard to navigate.
My ideal workflow is commits are as small as possible and PRs "tell a story", meaning that they provide the context for a commit.
I will split up a PR into
- Individual steps a a refactor, especially making any moves their own commits
- I add tests *before* the feature (passing, showing the old behavior)
- The actual fix or feature commit is tiny, with the diff of the tests just demontstrating how behavior changed
This makes it really fast to review a PR because you can see the motivation while only looking at a small change. No jumping around trying to figure out how the pieces fit together.
The main reason I might split up a PR like that is if one piece is likely to generate a discussion before its merged. I then make that a separate PR and as small as possible so we can have a focused discussion.
As someone who has often had to dig into the history to figure out what happened, I always want to see at least this. And I wouldn't be opposed to seeing it broken down even more as it was worked on. Not one big squash merge that hides what really happened.
I'll also add one more to your list: Any improvements that came out of the review but stayed in that merge should each be individual commits. I've seen hard-to-trigger bugs get introduced from what should have been just a style improvement.
One of the problems is that GitHub's UI and workflow isn't very good for this in various ways (can't review commits, can't really diff with previous after amending commit).
So as a rule, I tend to stick with "1 PR == 1 commit", except when there's a compelling reason not to.
And to have a useful "git blame". My editor setup shows me a subtle git blame on each line of code, and I find it quite helpful to know who changed what last and why. Both when coding, and when debugging.
This is why, contra the linked article about commit messages, I strive to make minimal and cohesive commits with good messages. Commits are for future archaeology, not just a way to save my work every minute in case my hard drive dies.
In ~18 years of git use, I have never needed it, but I see it mentioned often as an important reason to handle commits in some certain way. I wonder what accounts for the difference.
>Working at $dayjob the unit of change is the commit, and every commit is reviewed and signed off by at least 1 peer.
Respectfully, that's the dumbest thing I've ever heard.
Work on your feature until it's done, on a branch. And then when you're ready, have the branch reviewed. Then squash the branch when merging it, so it becomes 1 commit.
Commit and push often, on branches, and squash when merging. Don't review commits, review the PR.
I've had people at various jobs accidentally delete a directory, effectively losing all their progress, sometimes weeks worth of work. I've experienced laptops being stolen.
If I used your system, over the years me and various colleagues would have lost work irretrievably a few times now, potentially sinking a startup due to not hitting a deadline.
I feel your approach shows a very "Nothing bad will ever happen" attitude.
Yes, of course you should have a backup. Most of those don't run every few minutes, though. Or even every few hours.
"Just trust the backup" feels like a really overkill solution for a system that has, as a core feature, pushing to a remote server. And frankly, a way to justify not using the feature.
I began using Jujutsu as my VCS about 2 months ago. Considering most of my work is on solo projects, I love the extra flexibility and speed of being able to safely fixup recent commits. I also love not having to wrangle the index, stashes, and merges.
`lazyjj` [1] makes it easier to navigate around the change log (aka commit history) with single keypresses. The only workflow it's currently missing for me is `split`.
For the times when I have had to push to a shared git repo, I used the same technique mentioned in the article to prevent making changes to other developer's commits [2].
It's been a seamless transition for me, and I intend to use Jujutsu for years.
Huh, reading the penultimate "“Units of change” and collaboration" section reinforces the feeling that Github PRs really are a poor way to do code submission/review, and have been holding back a lot of the industry from better ways of working for a long time.
GitHub-style PRs are the worst way of reviewing changes, except (in practice, if not in theory) for all the others :P.
When my then-employer first stated using git, I managed to convince them to set up Gerrit -- I had my own personal Gerrit instance running at home, it was great. I think it lasted about a year before the popularity factor kicked in and we switched to GitLab.
At least part of the difficulty is that approximately everyone who would need to learn how to use a new mechanism is already familiar with PRs. And folk new to the industry learn pretty quickly that it's normal and "easy". The cultural inertia is immense, and that means we're much more likely to evolve our way out of the hole (or switch to a completely different paradigm) than we are to en-mass switch to a different way of working that's still Git.
There are ways to make the PR experience more reasonable; GitHub has introduced stacked PRs which queue up. Even without that, I find disabling merge commits puts my team in a sweet spot where even if I'd prefer Gerrit, I'm not tearing my hair out and neither are the rest of my team who (mostly) don't care all that much.
I get to use both Gerrit (rebase-cherrypick workflow) and gitlab (PR/MR workflow) at work.
I think that MR is better for smaller projects i.e. ~10devs - it's lower overhead, just commit while you work then write up a description when you push.
I think rebase-CP is better for larger projects like ~100 devs committing to a repo - linear git history and every commit having a clear description+purpose+review is worth the overhead at that point.
So one-off tools and infra and stuff get chucked into gitlab while "the product" is in Gerrit.
It's maddening, considering their size, and that there are proven other examples out there of versioned change sets and divorcing the commit ID from the content ID.
JJ just surpassed a milestone for me personally where there were no hick-ups for more than 6 months and it feels genuinely superior to git and also to sapling for performance, stability and UX. If you ever considered switching i think it might be now. Colocated mode does not feel like a second class citizen and works really well too so there is always the option to use a sophisticated git client like fork for certain tasks. VisualJJ is also a great albeit not open source vscode extensio n that is slowly catching up to ISL. (If you are used to ISL and think visualJJ looks empty and lacks features: most things are hidden in the context menu which takes some getting used to.)
Jujutsu is “just” a piece of software that you install, and it is free and open source. Graphite.dev is a service, and it is not free, but as a service it gives you features that Jujutsu cannot like automatically merging stacked PRs.
For something as fundamental as source control in this day and age, I’d go with the former open source option (and have recently been learning Jujutsu)…
This article covers my own experience with JJ very accurately. I'll even go as far as to say that if had to write my own article about jj, I'd use exactly the same talking points and examples. Great writeup
I would really love a comparison between JJ and fossil. I use fossil for personal projects instead of git. So I'd like to know if I should consider JJ.
I think the biggest difference is one of philosophy: in Fossil (as I understand it) every change you make to the code is tracked, and there's no concept of rebasing or making the history prettier. If you make a commit and then realise you've left some debug logs in the code, you've got to make a new commit to delete those logs. This has the advantage that you never lose data, but the disadvantage that all of those commits end up in your long-term project history, making things like bisecting or blaming more difficult.
In JJ, however, changes are explicitly mutable, which means if you finish a commit and then realise you want to go can and change something, you've got multiple ways to squash your fix directly into the original commit. In addition to this, you also have a local history of every change you make to your project (including rewriting existing commits) so you still never lose any data locally. However, when you finish adjusting your commits and push them to someone else's machine (e.g. GitHub), they will only get the finished commits, and not the full history of how you developed them.
I know some Fossil people feel very strongly that any sort of commit rewriting should be banned and avoided at all costs, because it creates a false history - you can use Jujutsu like that, but it probably doesn't add much over Fossil at the point. But in practice, I think Jujutsu strikes the right balance between allowing you to craft the perfect patch/PR to get a clean history, and preventing you from losing any data while you do that.
> I think the biggest difference is one of philosophy: in Fossil (as I understand it) every change you make to the code is tracked, and there's no concept of rebasing or making the history prettier. If you make a commit and then realise you've left some debug logs in the code, you've got to make a new commit to delete those logs. This has the advantage that you never lose data, but the disadvantage that all of those commits end up in your long-term project history, making things like bisecting or blaming more difficult.
how you deal with accidental committing of text that was not supposed to be there (say, accidental pasting a private email)?
jj is just a CLI, analogous to git alone. It doesn’t come with any of the other stuff Fossil does. I don’t think large/small project makes any difference either.
Nice writeup -- had been wondering about how it compares to Git (and any killer features) from the perspective of someone who has used it for a while. Conflict markers seems like the biggest one to me -- rebase workflows between superficially divergent branches has always had sharp edges in Git. It's never impossible, but it's enough of a pain that I have wondered if there's a better way.
For me, it's not so much that jj has any specific killer feature, it's that it has a fewer number of more orthogonal primitives. This means I can do more, with less. Simpler, but also more powerful, somehow.
In some ways, this means jj has less features than git. For example, jj doesn't have an index. But that's not because you can't do what the index does for you, the index is just a commit, like any other. This means you can bring to bear your full set of tools for working on commits to the index, rather than it being its own special feature. It also means that commands don't need certain "and do this to the index" flags, making them simpler overall, while retaining the same power.
I loathe GitHub PRs because of this. Working at $dayjob the unit of change is the commit, and every commit is reviewed and signed off by at least 1 peer.
And you know what? I love it. Yes, there's some overhead. But I can understand each commit in its entirety. I and my coworkers have caught numerous issues in code with these single-purpose commits of digestible size.
Compare this to GitHub PRs, which tend to be beastly things that are poorly structured (not to mention GitHub's UI only adding to the review problems...) and multipurpose. Reviewing these big PRs with care is just so much harder. People don't care about the commit message, so looking at the git log it's just a mess that's hard to navigate.
I will split up a PR into
- Individual steps a a refactor, especially making any moves their own commits
- I add tests *before* the feature (passing, showing the old behavior)
- The actual fix or feature commit is tiny, with the diff of the tests just demontstrating how behavior changed
This makes it really fast to review a PR because you can see the motivation while only looking at a small change. No jumping around trying to figure out how the pieces fit together.
The main reason I might split up a PR like that is if one piece is likely to generate a discussion before its merged. I then make that a separate PR and as small as possible so we can have a focused discussion.
I hate squash merges.
I'll also add one more to your list: Any improvements that came out of the review but stayed in that merge should each be individual commits. I've seen hard-to-trigger bugs get introduced from what should have been just a style improvement.
So as a rule, I tend to stick with "1 PR == 1 commit", except when there's a compelling reason not to.
This is why, contra the linked article about commit messages, I strive to make minimal and cohesive commits with good messages. Commits are for future archaeology, not just a way to save my work every minute in case my hard drive dies.
In ~18 years of git use, I have never needed it, but I see it mentioned often as an important reason to handle commits in some certain way. I wonder what accounts for the difference.
- The real unit of change lives in Git
- The real unit of change lives on some forge
I want it to live in Git.
Respectfully, that's the dumbest thing I've ever heard.
Work on your feature until it's done, on a branch. And then when you're ready, have the branch reviewed. Then squash the branch when merging it, so it becomes 1 commit.
Commit and push often, on branches, and squash when merging. Don't review commits, review the PR.
I've had people at various jobs accidentally delete a directory, effectively losing all their progress, sometimes weeks worth of work. I've experienced laptops being stolen.
If I used your system, over the years me and various colleagues would have lost work irretrievably a few times now, potentially sinking a startup due to not hitting a deadline.
I feel your approach shows a very "Nothing bad will ever happen" attitude.
Yes, of course you should have a backup. Most of those don't run every few minutes, though. Or even every few hours.
"Just trust the backup" feels like a really overkill solution for a system that has, as a core feature, pushing to a remote server. And frankly, a way to justify not using the feature.
They made it pretty clear they're talking about not-large commits. And they're contrasting that with any-size PRs.
`lazyjj` [1] makes it easier to navigate around the change log (aka commit history) with single keypresses. The only workflow it's currently missing for me is `split`.
For the times when I have had to push to a shared git repo, I used the same technique mentioned in the article to prevent making changes to other developer's commits [2].
It's been a seamless transition for me, and I intend to use Jujutsu for years.
[1] https://github.com/Cretezy/lazyjj [2] https://jj-vcs.github.io/jj/latest/config/#set-of-immutable-...
https://github.com/idursun/jjui
When my then-employer first stated using git, I managed to convince them to set up Gerrit -- I had my own personal Gerrit instance running at home, it was great. I think it lasted about a year before the popularity factor kicked in and we switched to GitLab.
At least part of the difficulty is that approximately everyone who would need to learn how to use a new mechanism is already familiar with PRs. And folk new to the industry learn pretty quickly that it's normal and "easy". The cultural inertia is immense, and that means we're much more likely to evolve our way out of the hole (or switch to a completely different paradigm) than we are to en-mass switch to a different way of working that's still Git.
There are ways to make the PR experience more reasonable; GitHub has introduced stacked PRs which queue up. Even without that, I find disabling merge commits puts my team in a sweet spot where even if I'd prefer Gerrit, I'm not tearing my hair out and neither are the rest of my team who (mostly) don't care all that much.
I think that MR is better for smaller projects i.e. ~10devs - it's lower overhead, just commit while you work then write up a description when you push.
I think rebase-CP is better for larger projects like ~100 devs committing to a repo - linear git history and every commit having a clear description+purpose+review is worth the overhead at that point.
So one-off tools and infra and stuff get chucked into gitlab while "the product" is in Gerrit.
I missed that. How does it work?
Gross.
For something as fundamental as source control in this day and age, I’d go with the former open source option (and have recently been learning Jujutsu)…
In JJ, however, changes are explicitly mutable, which means if you finish a commit and then realise you want to go can and change something, you've got multiple ways to squash your fix directly into the original commit. In addition to this, you also have a local history of every change you make to your project (including rewriting existing commits) so you still never lose any data locally. However, when you finish adjusting your commits and push them to someone else's machine (e.g. GitHub), they will only get the finished commits, and not the full history of how you developed them.
I know some Fossil people feel very strongly that any sort of commit rewriting should be banned and avoided at all costs, because it creates a false history - you can use Jujutsu like that, but it probably doesn't add much over Fossil at the point. But in practice, I think Jujutsu strikes the right balance between allowing you to craft the perfect patch/PR to get a clean history, and preventing you from losing any data while you do that.
how you deal with accidental committing of text that was not supposed to be there (say, accidental pasting a private email)?
JJ itself uses Github, where as fossil is very easy to self-host including issue tracking, wiki etc.
In some ways, this means jj has less features than git. For example, jj doesn't have an index. But that's not because you can't do what the index does for you, the index is just a commit, like any other. This means you can bring to bear your full set of tools for working on commits to the index, rather than it being its own special feature. It also means that commands don't need certain "and do this to the index" flags, making them simpler overall, while retaining the same power.