I find the sort of opinions on this post quite common on a subset of engineers - namely mid levels with some time in the career, who start to consider themselves senior engineers and want everyone to follow the same set of strict rules they decided make sense. It’s the same mindset that makes people pedantically apply DRY to every situation or forcing others to TDD basic apps.
In practice:
- smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
- nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
- “every commit must compile” - again, unnecessary overzealousness. Every commit on the MAIN branch definitely should compile. Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
You want PRs because they help others absorb what you’re doing (they’ll have to read that same code sooner or later). You don’t want to create a performance theater.
> nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
I do. Especially if the author is competent.
That said, empirically, you're correct most people don't.
However, that said, I think changing the culture rather than throwing away the practice would be a better response.
Reading and reviewing clean history is really so much nicer. I'd also argue that actually making your history clean (as opposed to theatrically and thoughtlessly making small commits, say) forces you as the author to review it more carefully.
Replying to echo this, I also read every commit message, and review PRs commit by commit. This was common practice at my last job (a small, experienced team), and the expectation was that commits were atomic.
Yes, we griped that GitHub would not allow us to merge individual commits, but if it was ever urgent or helpful to do so, we cherry-picked a commit into a separate PR.
Everyone's workflow is a bit different, and it can be hard to redirect organizational inertia. But without a doubt, reading a clean commit history is a pleasure.
PR is the atomic level of work. I'd argue PR-level history (i.e. squash) is often enough and is way cleaner. Why would I care about "commit A", "change parts of A because I misunderstood a requirement", "improve A based on code review" etc.?
If I want that granularity, I'd go read the original PR and the discussion that took place.
I've met thousands of developers over my career and i could put them into two categories: those who don't give a shit about intermediate commit messages (majority) and those who browse every single intermediate commit message in a PR (very few). To be honest, the latter had some tendency to be difficult to work with. It was also a useful discriminator to avoid getting those into my teams.
> Reading and reviewing clean history is really so much nicer.
You can have both with git and it's not even hard. Unfortunately it seems many people pride themselves in what little they know of git. I'm not being sarcastic, I've read people say this almost word-for-word.
IMO there's no point having a clean history of commits within a PR. With rare exceptions, if you have a PR with a clean history of commits and each commit compiles and passes the tests... they should be separate PRs! If it isn't clean then it should be squashed.
A few exceptions:
1. When refactoring often your PR is "do an enormous search and replace, and then fix some stuff manually". In that case it's way easier to review if the mechanical stuff is in a separate commit.
2. Similarly when renaming and editing files, Git tracks it better if you do it in two commits.
3. Sometimes you genuinely have a big branch that's lasted months and has been worked on by many people and it's worth preserving history.
Also I really really wish GitHub had proper support for stacked PRs.
The "mid levels who consider themselves senior" are the exact type of people who I see saying what you're saying, i.e.
* Yes, TDD on production code is nice in theory, but it doesnt work in my case.
* Yes, short PRs are nice in theory, but it doesnt work in my case.
In every case, as far as I can see, it meant "It does work, I just dont know how to do it".
When I say "if you dont think it works in your case, come to me, Ill show you" they often demur and I end up with a huge PR anyway.
In practice I dont think ive ever seen a long PR that wouldnt have benefitted from being strategically broken up, but every other day I see another one that should have been.
> Yes, TDD on production code is nice in theory, but it doesnt work in my case...
Parent said something more along the lines of "they don't work in every case, and trying to force it in every case is misguided".
I agree that too big is more common than too small with respect to PR size, but you aren't putting forward much of an argument against parents "there are no absolutes" argument by straw manning them.
If I'm refactoring, truly refactoring, a 10k line PR where all the renaming happens at once is mandatory or else it won't compile. The only other option would be incremental refactors with an intermediate, parallel state that adds complexity, increases the likelihood of missing something and makes a 30 minute, 1 time PR review become 12x10 minute PR reviews.
Obviously it has to be a pure refactor, entirely isolated from functional changes but there are plenty of similar cases where doing it once is the least effort.
> “nobody reads intermediate commit messages one by one on a PR”
I clean my history so that intermediate commits make sense. Nobody reads these messages in a pull request, but when I run git blame on a bug six months later I want the commit message to tell me something other than "stopping for lunch".
> pedantically apply DRY to every situation or forcing others to TDD basic app
Sure, pedantically doing or forcing anything is bad, but in my experience, copy-paste coding with long methods and a lack of good testing is a far more common problem.
You may be 100% correct in your particular case, but in general if senior devs are complaining that your code is sloppy and under-tested, maybe they aren't just being pedantic.
I actually find the relevant PR/MR discussion a lot more useful than the commit messages themselves. So any git blame is just to get a commit hash and look that up in GitLab/GitHub to see the entire change set and any comments around it. It makes me wish those comments were bundled with the merge commit somehow and could easily be accessed in the terminal where I'm viewing the git history.
> Sure, pedantically doing or forcing anything is bad, but in my experience, copy-paste coding with long methods and a lack of good testing is a far more common problem.
This is a false dichotomy and an unproductive thing to focus at.
Experienced engineers know when to make an abstraction and to not. It is based in the knowledge about project.
Abstarct well and don't do compression. Easy said, and good engineers know how to do it.
PRs are emails to your team and to your future self.
Framed in that context it's easier to carry the correct tone and think about scoping / what's important.
---
> pedantically apply DRY to every situation
I swear DRY has done more damage to the software industry from the developer side than it has done good because it has manifested into this big stick with which to bludgeon people without taking context into account.
A great way to frame DRY that I heard from hackernews: "DRY things that are supposed to have the same behavior, not things that happen to have the same behavior"
So commit messages puts the information closer to the user. One hop doesn't seem much, but the time saved adds up as you go.
Also, as some other reader mentioned anecdotally, PRs may not be there forever. E.g. your team may migrate to a new platform PR text and reviews were left behind.
> PRs are emails to your team and to your future self.
Indeed! I've found many point on this discussion answered by the linux kernel idea of mailing lists where a change is discussed then approved, often with feedback acknowledged
> nobody reads intermediate commit messages one by one on a PR
I think it's fine to have a whole bunch of "WIP" commit messages on intermediate commits while the PR is in a draft stage, but then all of those garbage commits should really be squashed down into one commit and you should at least write a one liner that describes what the whole change is doing. I think it does materially make repo history harder to understand to merge in PR's with 10 garbage commits in them.
> nobody reads intermediate commit messages one by one on a PR, period.
I do! I find it the easiest way to review code when the author has taken the time to structure it in that way. I'm lucky to work with some great people.
>nobody reads intermediate commit messages one by one on a PR, period.
>Don’t waste your time writing stuff for no one.
I've thought about that as I continue to write them. I think I can justify it by saying they are mostly for me. Can I describe what I'm trying to do with a specific push into a few items. It let's me reflect if I'm waiting too long between commits or if my ideas are getting too spread apart and really should be in two different branches that each have their own PRs. Then there is the rare case on a slower project where an item gets deprioritized and I come back to it weeks or even months later. Having the messages help me catch back up to speed.
As such, I find the 20 seconds or so to type out 1 to 2 sentences to be worthwhile, even if the ones reviewing the eventual PR never check. I'm also not above throwing in a "ditto" or "fixed issue" when a single commit really is that small or insignificant.
>“every commit must compile”
I agree with your take this is overzealous, but to expand upon my previous point, if I know a commit on a branch won't compile (say just had something else come up and need to swap focus for a few days), then I'll try to make sure I call that out in my last message just in case anyone else happens to get put on the project.
If I were to summarize my approach, treat PR messages seriously, but treat branch commit messages like sticky notes that will likely end up in the trash by week's end.
It's clear you've never worked on a large open source project... There are good reasons for all the practices you're thoughtlessly dismissing.
I agree that for a common team of programmers working for a single company, the value isn't always there. But that's the easiest and least interesting case... in big distributed projects this stuff really matters.
> - nobody reads intermediate commit messages one by one on a PR, period [...]
>
> - “every commit must compile” - again, unnecessary overzealousness. [...]
In my part of the world both of these are true, and proudly so. We keep catching a myriad of errors, big and small. The history is easy to read, and helps anyone catching up with how a certain project evolved.
I understand it might not be true for everyone, every team, in every line of business; but this sort of discipline pays off in quality oboth of the code _and_ the team members' abilities.
Sometimes you're overhauling something which you can't do in chunks less than something 2,000 line long PR. There's no intermediate working system. The problem is, "take this very large bit of code and throw it entirely away and rebuild it completely differently". Trying to craft some evolutionary step between A and B is just going to take 10x longer and won't help any code reviewers.
When you have a large PR like this, here's how I like to get it reviewed.
1. Give reviewers sometime to become familiar with the PR. They might not understand all parts of it, but they should have at least a cursory understanding of the PR.
2. Have a meeting where the PR is explained in front of the group of reviewers. The reviewers will understand the PR better and they can ask questions in realtime.
3. Let folks review the PR after the meeting in case they spot anything else, or think of additional questions.
Most of the time PR review is done asynchronously, but doing most of the review in the meeting can also be a decent team building exercise.
While I agree about not having hard and fast rules, like LoC per PR, the principles of this are very relevant.
When reviewing a conglomerate commit in a PR, I have to reverse engineer how the different changes interact to figure out the intent. I then have to do this on each update they make. Contrast that to when someone breaks up their commits where I can zoom through variable renames, extracting functions, etc to see the one line that change that all of that unblocked that makes the difference. Then if updates are pushed, I only have to worry about the commits that were updated.
As for all commits compiling, that is helpful to review the individual commits.
Both of these (small commits, all compiling) are also great for bisecting. You get pointed to a very small change that you can more easily analyze vs dealing with breakages or having to analyze a large change to find what the problem is.
> “every commit must compile” - again, unnecessary overzealousness. Every commit on the MAIN branch definitely should compile. Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
(With few exceptions,) I generally follow this practice; BUT, I think enforcing this on other developers feels like micromanagement. That being said, with few exceptions, committing code that doesn't compile feels like an incomplete sentence.
(Sometimes on massive refactors I make commits that don't compile. It gives me a place to roll back to. If someone thinks this is poor practice, than I think they're putting principles in place of practicality.)
A _branch_ is a unit of work that should be merged when done.
As the owner of a branch, an engineer has the ability to move into intermediate states. The larger the codebase, the larger the possibility of something unexpected breaking or not compiling. Just like editing a large body of text - you will have "incomplete sentences" through the process. It's part of writing. Expecting others to write their drafts the same way you like is just silly - it's putting rigid principles ahead of anything else that matters.
> nobody reads intermediate commit messages one by one on a PR, period.
Very common practice at my old company, and one I continue in my current role.
> “every commit must compile”
sucks ass for anyone else trying to rebase your branch onto the update main/master when they don't. Once your PR is out of "working on the feature" and into the "getting it merged" phase, do a little `git rebase -i` and squash your really intermediate commits into ones that compile. Ignore this if you have real grown up CI where your PRs never stay open for more than a day.
> Ignore this if you have real grown up CI where your PRs never stay open for more than a day.
A vast majority of the drama that comes out of source control is associated with branches living for far too long.
I've got an internal alarm that starts to go off somewhere around 72 hours. If something takes longer than this, I've probably screwed up in my planning phase. There are some things that do need to sit, but they should be rebased every morning like clockwork. The moment things start to conflict, the PR gets closed and the branch is now a reference for how to do it again when whatever blocker is cleared.
Another way to think about all of this is to pretend like everything you are touching is taking a synchronous lock out (even if it's not), similar to how tools like Perforce behave. So, you generally want to move as quickly as possible to get out from under lock contention. Git allows you to pretend like you aren't conflicting for a really long time, but at some point you must answer for all of this debt (with interest).
> “every commit must compile” - again, unnecessary overzealousness.
my understanding is that you commit when you are at the "good place", where the part of the code you are working on works. That way when you keep going and find yourself going in a direction that is not right, you can go back to the last good place. If your code doesn't even compile, that doesn't seem like a good place.
>> You want PRs because they help others absorb what you’re doing
That isn't really where it came from though. The idea was, if I want an open source maintainer to accept my changes, I make a request to pull them from my branch. Once the open source maintainer has merged it in, they own it. If they don't like it (even one little bit), they can reject it because quality / ownership / maintenance is completely on them.
On a team environment where no one owns anything it is a little less clear what the value is. You want to incentivize the "betterness" of "something" and are using "broadened knowledge" as a proxy for that. Usually this just goes unexamined but really it would be good to establish how broad and deep you want this knowledge to be and work back from there - is the 5 minute PR review the best way to achieve it?
> - smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
This is why I gave up reading the article shortly after reaching the point about making a history with commit messages. The comments—even if it is on a Git forum—will just be full of people that either say that it’s a waste of time or that it is literally impossible for this to be practiced by anyone.[1]
Your best bet is to find projects where this is practiced (and you don’t have to look far). But making the case to a general audience? No, too many loud voices that treat version control like “I am committing now because I need to pick up the dry-cleaning” arbitrary/random snapshot-maker.
[1] No one, period? Sounds like a bit of a strict ontological rule to me.
I’m in the opposite camp. Following these two practices often doesn’t make any difference but the few times it did saved me a ton of time.
Dropping commits or rebasing is much easier when you have descriptive, atomic commits. It’s also helpful when performing git blame archeology to try and understand why this code looks so weird and has no context. It’s also useful when bisecting (not so much a problem with small PRs, quite handy as they grow bigger and bigger)
As with everything it’s about context and circumstances. As you gain expérience you can appreciate and gauge when it’s required. When you don’t have the expérience then you follow rules so that you gain said expérience. That’s how I see it.
I disagree with everything you wrote here. Prs must absolutely limit their scope both in terms of length as well as what they accomplish (e.g. don't do unrelated refactorings in a pr delivering something else), large features must absolutely be broken into individual commits if not individual PRs, I definitely read each one and each one definitely has to complete and pass tests 100%, else they are leaking details into the next or that would "fix" the problem. This is also absolutely nothing like "forcing TDD" in people and these are all practices that junior devs should absolutely be doing since it will help them to think about code, change and maintainability a ton.
> - nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
I often do, In a larger PR or in one where it's hard to tell what is being accomplished, like this article articulates the commits can tell a story of the engineers journey to solution. Even if I review a commit that is largely undone by future commits that piece of history is often key to my understanding.
This. Just last week I have split a coworker's single-commit-MR into multiple commits so that I could distinguish between unrelated changes and to check smaller chunks of code. It worked beautifully.
With commit messages you miss the point. It’s more like the final test of the commit. If you can’t formulate easily what you did and why, then you need to rethink your changes.
> and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar.
Meh, most people wont address it or ask that dollar. It does not mean I did not read it, I chuckled and moved on.
I do read every commit on PR chain and every line. I am not necessary super attentive reviewer or something, but I never accept it without at least formally looking at it.
If every commit on the main branch must compile then why wouldn't it also compile in the PR branch? It doesn't make sense to ask people to review, then after that rebase and merge imo.
I write intermediate commit messages as notes to self. You don't always work continuously on the same PR. The commit messages are a useful context refresher.
Why advocate against this anyway? If no one reads them, it harms no one. Just like personal blogs. However, the writing of the blog is the useful act, not the reading.
Ironic that you are accusing TFA article of being an expert novice. I don't disagree your take on him / the article, but you are committing the same sin.
You missed the point entirely. The point is forcing others to do something that has no inherent value to them or to your process, just because you like it, is junior behavior.
I agree with you that this shouldn't be 100% hard defaults, but it's a good standard to have, and imo it's valuable to be explain why one is deviating from it.
> - smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
Oh but they sure can be reviewed more easily, because they are shorter? Doing so feels like less effort, and you get a dopamine hit from hitting that "submit review" button faster/more often (improved morale, and PR turnaround time!). Plus, if there's a longer discussion about X, it's great if it's not tangled up with Y and Z at the same time - allowing you to dig into X.
> - nobody reads intermediate commit messages one by one on a PR, period.
Come on, that's intellectually dishonest. 1. VSCode displays commit messages inline as blame for me (and many of my colleagues), so even when we don't read the commit messages one by one _on a PR_, I often read them later in the IDE (we don't squash merge PRs). I spend significantly more time reading code than writing, and commit messages, PR descriptions and linked issues provide extra context that is useful to me especially for complex code. If those messages were entirely unreadable, I'd be annoyed. 2. When someone invests time into telling a good story commit by commit, in my team they write "Review commit-by-commit is encouraged" in the PR description, to tell the reviewers that yes, they should read the individual commits, as that'll make understanding the PR easier. Often as reviewer, I follow that suggestion.
> Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
It seams you're conflating "working on a feature" with "presenting it as PR to review". That's two very different things, and Edamagit in VSCode makes it so so easy to provide a reasonable commit history that hides some of your missteps, and to fill in commit messages.
you need to be careful with every single commit message, every commit must compile, etc, in your case. My comments apply if you squash-merge, in which case all that commit-level care is not necessary since intermediate commits go away on merge. You’re probably making your life harder for no reason for avoiding squash-merge, but that’s just my opinion
PR review is probably at least a little performative.
But I trust my colleagues to do good reviews when I ask them to, and to ignore my PRs when I don't. That's kind of the way we all want it.
I regularly ask for a review of specific changes by tagging them in a comment on the lines in question, with a description of the implications and a direct question that they can answer.
This, "throw the code at the wall for interpretation" style PR just seems like it's doomed to become lower priority for busy folks. You can add a fancy narrative to the comments if you like, but the root problem is that presenting a change block as-is and asking for a boolean blanket approval of the whole thing is an expensive request.
There's no substitute for shared understanding of the context and intent, and that should come out-of-band of the code in question.
For any PR above a few line change, if a developer has not done a self review, I don’t review it all.
Instead I request that it is self reviewed with context added, prior to requesting re-review.
I also tend to ask the question, “are any of these insights worth adding as comments directly to the code?”
9/10 the context they wrote down should be well thought out comments in the code itself. People are incredibly lazy sometimes, even unintentionally so. We need better lint tools to help the monkey brain perform better. I wish git platforms offered more UX friendly ways to force this kind of compliance. You can kind of fake it with CI/CD but that’s not good enough imo.
This is a huge pet peeve of mine. At work I'm an expert on part of the code base that sees a fair number of contributions.I get many private IMs from colleagues asking me to "Approve please" or something like that with a dump of 100s of lines, of which maybe 10 lines are relevant to me (touch files or behaviour that I'm an expert on, hence why they need my approval.)
Minimally, I would like context for the change, why it required a change to this part of the codebase, and the thought process behind the change. Sometimes but not often enough I send the review back and ask them for this info.
IMO many software developers are just not fast enough at writing or language so providing an explanation for their changes is a lot of work for them. Or they are checked out and they just followed the AI or IDE until things worked, so they don't have explanations to provide. People not taking the time is what makes reviews performative.
> Minimally, I would like context for the change, …
… the why is important
> IMO many software developers … don't have explanations to provide. People not taking the time is what makes reviews performative.
… a lot of developers only consider the how.
i’ve had a lot of experiences of “once my PR is submitted that’s my work/ticket finished” kind of attitude.
i spent a year mentoring some people in a gaming community to become dev team members. one of the first things i said about PRs was — a new PR is just your first draft, there is still more work to do.
it helped these folks were brand spanking new to development and weren’t sullied by some lazy teacher somewhere.
My team uses a github PR template with the following sections. Answers to each can be short yet it has been extraordinarily helpful to pass over important info to the reviewer that's not captured in code. It also borders on "checklist" that the code author has actually done the bare minimum to think things through.
# Goal
(why is this change needed at all)
# What I changed and why I did it this way
# What I'm not doing here and how I'll follow up
# How I know it works
(optional section, I include this only for teams with lots of very junior engs: "added a test" is often sufficient)
> IMO many software developers are just not fast enough at writing or language
I think this is the overwhelming factor, software engineering doesn't select for communication skills (and plenty of SEs will mock that as a field of study), or at least most SEs don't start out with them.
Commit descriptions are criminally under used for this purpose. You can add so much more context if you don't limit yourself to just the short commit message.
>that should come out-of-band of the code in question.
Ideally, yes. After a decade and something' under ZIRP, seems a lot of workers never had incentive to remain conscious of their intents in context long enough to conduct productive discourse about them. Half of the people I've worked with would feel slighted not by the bitterness the previous sentence, but by its length.
There's an impedance mismatch between what actually works, and what we're expected to expect to work. That's another immediate observation which people to painfully syntaxerror much more frequently than it causes them to actually clarify context and intent.
In that case the codebase remains the canonical representation of the context and intent of all contributors, even when they're not at their best, and honestly what's so bad about that? Maybe communicating them in-band instead of out-of-band might be a degraded experience. But when out-of-band can't be established, what else is there to do?
I'd be happy to see tools that facilitate this sort of communication through code. GitHub for example is in perfect position to do something like that and they don't. Git + PRs + Projects implement the exact opposite information flow, the one whose failure modes people these days do whole conference talks about.
Exactly, if you want people to think about your code/changes you should be able to give them the needed context.
If you don't know them, please realize your code isn't automatically a gift everybody waited for, you may see it that way, but from the other side this isn't clear until someone put in the work to figure out what you did.
In short: added code produces work. So the least you should do is try reducing that work by making it easy to figure out what your code is and isn't.
Sum up what changes you made (functionally), why you made them, which choices you made (if any) and why and what the state of the PR code is in your own opinion. Maybe a reasoning why it is needed, what future maintenance may look like (ideally low). In essence, ask yourself what you'd like to know if someone appeared at the door and gave you a thumb drive with a patch for your project and add that knowledge.
Also consider to add a draft PR for bigger features early on. This way you can avoid programming things that nobody wanted, or someone else was already working on. You also give maintainers a way to steer the direction and/or decline before you put in the work.
PRs should be optional, IMHO. Not all changes require peer review, and if we trust our colleagues then we should allow them to merge their branch without wasting time with performative PRs.
There is a difference between a code review and approval to merge/release.
Part of the difference is the idea you can catch all problems with piecemeal code review is nonsense, so you should have at least some sweeping QA somewhere.
Yes! I once read a great article I can no longer find that talked about 3 types of PRs.
Simple ones that you self approve.
Ones that you tag someone because you want to spread the knowledge of what has been done. And ones that need actual review. Everything being reviewed is simply unnecessary and exhausting.
It's a very common refrain but I don't really agree with it:
"How do you create a PR that can be reviewed in 5-10 minutes? By reducing the scope. A full feature should often be multiple PRs. A good rule of thumb is 300 lines of code changes - once you get above 500 lines, you're entering unreviewable territory."
The problem with doing this is if you're building something a lot bigger and more complex than 500 lines of code, splitting that up across multiple PR's will result in:
- A big queue of PR's for reviewers to review
- The of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
- You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
The right answer for the size of a PR is NOT in lines of code. Exercise judgement as to what is logically easier to review. Sometimes bigger is actually better, it depends. Learn from experience, communicate with each other, try to be kind when reviewing and don't block things up unnecessarily.
This is a feature. I would infinitely prefer 12 PRs that each take 5 minutes to review than 1 PR that takes an hour. Finding a few 5-15 minute chunks of time to make progress on the queue is much easier than finding an uninterrupted hour where it can be my primary focus.
> - The of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
It increases it a little bit, sure, but it also helps keep things focused. Reviewing, for example, a refactor plus a new feature enabled by that refactor in a single PR typically results in worse reviews of either part. And good tooling also helps. This style of code review needs PRs tied together in some way to keep track of the series. If I'm reading a PR and think "why are they doing it like this" I can always peek a couple PRs ahead and get an answer.
> - You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
This is a tooling problem. Git and Github are especially bad in this regard. Something like Graphite, Jujutsu, Sapling, git-branchless, or any VCS that supports stacks makes this essentially a non-issue.
I agree with this. One way to keep changes small but still compose them into a coherent PR is to make each commit in the final PR independently meaningful, rather than what actually transpired during local development. TFA touches on this somewhat, contradicting the bit you quoted.
A trivial example would be adding the core logic and associated tests in the first commit, and all the remaining scaffolding and ceremony in subsequent commits. I find this technique especially useful when an otherwise additive change requires refactoring of existing code, since the things I expect will be reviewed in each and the expertise it takes are often very different.
I don't mind squashing the branch before merging after the PR has been approved. The individual commits are only meaningful in the context of the review, but the PR is the unit that I care about preserving in git history.
The problem that I find myself in is that I almost always run into stuff I didn't expect. Some integration that I thought would be minor turns out to slowly get out of hand, and before I know it I've made way more changes than I meant to. And then it all gets tangled together.
Maybe it's just a me problem, maybe I need to be more disciplined. Not sure but it catches me quite often.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
This shouldn't matter unless you are squashing commits further back in the tree before the PR or other people are also merging to main.
If a lot of people are merging back to main so you're worried about those causing problems, you could create a long life branch off main, branch from that and do smaller PRs back to it as you go, and then merge the whole thing back to main when your done. That merge might 2k lines of code (or whatever) but it's been reviewed along the way.
I don't necessarily disagree with you. Just pointing out that there are ways to manage it.
I also feel like what gets lost in this is not everything you are building is a bite size feature in large existing project. Sometimes you are adding an entire subsystem that is large to something relatively greenfield. if you broke that down into features, you will need 20 PRs and if you wait for review, or even don't wait but have to circle back to integrate lots of requested changes, what might be a couple of weeks of work turns into 2 to 3 months of work. That just does not work unless you are in a massive enterprise that is ok with moving like molasses. Do you wind up with something not as high quality? Probably. But that is just the trade-off with shipping faster.
If you are the only developer who ever going to work on something, maybe. Even then, I will argue you are more likely to deliver successfully if you are cutting your work into smaller pieces instead of not delivering anything at all for weeks at a time.
But for the company, having two people capable of working on a system is better than one, and usually you want a team. Which means the code needs to be something your coworkers understand, can read and agree with. Those changes they ask for aren't frivolous: they are an important part of building software collaboratively. And it shouldn't be that much feedback forever: after you have the conversation and you understand and agree with their feedback, the next time you can take that consideration into account when you are first writing the code.
If you want to speed that process up, you can start by pair programming and hashing out disagreements in real time, until you get confident you are mostly on the same page.
Professional programming isn't about closing tickets as fast as possible. It is about delivering business value as a team.
Here's an alternative approach: Discuss the design with your team beforehand, and have active ongoing discussions, sanity checks, and even pair programming during the development process. That way the review is not an exhaustive end-to-end review with the reviewer coming in cold. It's instead the final approval step in a long chain of decisions that have already been discussed and agreed upon.
Of course that won't work for all projects/teams/organizations. But I've found that it works pretty well in the kinds of projects/teams/organizations I've personally been a part of and contributed to.
100%. I think the right answer is to break features into atomic commits, but keep PRs at the feature level. This reduces the PR friction, while letting reviewers easily view change sets for specific features, and if a specific feature needs a patch you don't need to do any rebase gymnastics, just add a patch commit.
AI agents like frequent checkpoints because the git diff is like a form of working memory for a task, and it makes it easy to revert bad approaches. Agents can do this automatically so there isn't much of an excuse not to do it, but it does require some organization of work before prompting.
It's not about splitting up the PR: it is about splitting up the _work_.
If you don't have feature flags, that is step one. Even if you don't have a framework, you can use a Strategy or a configuration parameter to enable/disable the new feature, and still have automated testing with and without your changes.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
+100 to this. My job should be thoughtfully building the solution, not playing around with git rebase for hours.
Keep merging each PR into master under a feature flag, that's how it's done. Huge PRs that implement a feature in one swoop are pretty much the worst case scenario for every stage: review, testing, deployment and monitoring.
I do agree with the common refrain, actually, and disagree with the idea that work can be so big and complex that it has to be in one pull request.
> A big queue of PR's for reviewers to review
Yes, yes please. When each one is small and understandable, reviewers better understand the changes, so quality goes up. Also, when priorities change and the team has to work on something else, they can stop in the middle, and at least some of the benefits from the changes have been merged.
The PR train doesn't need to be dumped out in one go. It can come one at a time, each one with context around why it's there and where it fits into the grander plan.
> The [totality] of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
A primary goal of code review is to build up the mental map of the feature in the reviewers' brains. I argue it's better for that to be constructed over time, piece by piece. The immediate cognitive load for each pull request is lower, and over time, the brain makes the connections to understand the bigger picture.
They'll rarely achieve the same understanding of the feature that you have, you who created and built it. This is whether they get the whole shebang at once or piecemeal. That's OK, though. Review is about reducing risk, not eliminating it.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
I've learned not to charge too far ahead with feature work, because it does get harder to manage the farther you venture from the trunk. You will get conflicts. Saving up all the changes into one big hunk doesn't fix that.
A big benefit of trunk-based development, though, is that you're frequently merging back into the mainline, so all these problems shrink down. The way to do that is with lots of small changes.
One last thing: It is definitely more work, for you as the author, to split up a large set of changes into reviewable pieces. It is absolutely worth it, though. You get better quality reviews; you buy the ability to deprioritize at any time and come back later; most importantly for me, you grasp more about what you made during the effort. If you struggle to break up a big set of changes into pieces that others can understand, there's a good chance it has deeper problems, and you'll want to work those out before presenting them to your coworkers.
PR's should generally be the size of a feature, or a meaningful subfeature for large features.
When you arbitrarily split up PR's into something "300 lines" or "5-10 minutes" you can miss the forest for the trees. The little thing looks fine in isolation but doesn't make any sense as part of a larger approach. Different people are reviewing it piecemeal but nobody is reviewing the approach as a whole or making sure the parts fit together right.
And then the idea of "telling a story with commits" feels like a waste of time to me. I have no interest in the order in which you wrote the code, or what you wrote and then rewrote. The code itself needs to be legible. Your final code with its comments should speak for itself. Code is the what and comments are the why.
Now, what I will say is that the more junior the developer, the smaller their commits should be. But that's because they should be assigned smaller features, and have more handholding along the way. And when people are making larger architectural changes, they should be getting signoff on their overall approach from the start -- if you're frequently rejecting the whole approach to a problem in code review, something's going wrong with your communications processes.
Thank you for saying this. This drove me up the wall everywhere I worked that had those arbitrary ~300 line rules as all that happened was code review devolved into "I don't like this code pattern, use another code pattern". While the larger architectural ideas that actually became problematic were rarely reviewed since they weren't obvious when split across 5 PRs often with different reviewers.
Honestly I think it would be far more effective to just review a paragraph and maybe a diagram that explains "here's how I think I'm going to tackle this problem" and forget about line-by-line code review entirely. Other than for training juniors, I don't think there's much long term value in "I think you should use an anonymous function here instead of a named function".
The kinds of things that are usually brought up in code review are not what contributes to real long term technical debt, because a function name or code formatting can be changed in an instant, while larger architectural decisions cannot.
The other thing I noticed is that even when an architectural issue is obvious, there's a tendency to not want to change it because so much of the work has already been done in preparing the PR for review. If you point out a flaw in an architectural decision, it's not unusual for the person to reasonably say "I've just put together a chain of 5 PRs and now you're asking me to rewrite everything?"
Yeah. While this narrative style tries to explain what things are done, it instead often leaves the question: Why are we doing this at all?
Commit #1 adds a helper function for whatever, looks innocent enough, implementation is correct. Believe it or not, it even has tests, lgtm. Then only by commit #8 do you realize this helper function is not needed at all and the entire approach is wrong. Happens every time.
I started reviewing these chains backwards and refuse starting a review until the whole chain is available. That’s however not always easy either, when commit #2-#5 has incrementally refactored everything into something unrecognizable, so that both the left and right side of the diff are wrong! No, I’m not interested in ”this will be fixed 2 commits down the chain”. I just want to review the final state that goes into production, nothing else matters.
Yes, commits should be made small whenever possible and not include unrelated fixes or refactors. Just please, keep them meaningful on their own.
I wish GitHub's PR UI was better at walking through a PR one commit at a time. As someone who does try to make good commits, and as someone who does try to read PRs sometimes a commit at a time, GitHub's UI gets in the way and keeps trying to drive you back to "whole PR" reviews.
It is very telling in the article itself there is a screenshot of the commits tab in the PR workflow that many don't realize even exists and/or never think to use.
In the Files tab the commit picker has gotten better in recent years, but it is still overly focused on selecting ranges of commits over individual ones, and there's no shortcuts to easily jump Next or Previous commit, you have to remember your place and interact with the full pulldown every time. Also, it's hard to read the full descriptions of commits in the Files view and I find I often have to interrupt my flow to open the commit in another browser tab or flip back and forth between the Commits tab and the Files tab in the PR. The Commits tab also defaults to hiding most of the commit descriptions, so it's still not a particularly great reading experience.
It feels like a bit of a bad feedback loop that GitHub's UI doesn't make commit-by-commit reviewing clean/easy because GitHub themselves don't expect most developers to write good commits, but a lot of developers don't write good commits today simply because GitHub's PR interface is bad at reviewing individual commits and developers don't see as much of a point in it if they aren't going to be reviewed in that way.
Graphite addressees this issue in a different way - it lets the author split a pr into several smaller prs, so one can review it piece by piece, but still see the whole super-pr in one place. It does a decent job rebasing stacked prs as well.
If faced with this issue, I would probably just pull the remote/branch locally and step through it, commit by commit, using my preferred Git manager (Lazygit).
My big complaints are with the default experience in how that affects everyone's expectaions, but I do this sometimes, yes. The VS Code "GitHub Pull Requests and Issues" extension has gotten really good and gives a lot of tools for this. If using github.com (and not GHE) you can even use the "." shortcut in any PR to open that PR in github.dev, which is a web version of VS Code (that you can Settings Sync) to review the PR there without even needing the local checkout. But also that's a bit of a "hidden shortcut" and very few developers know about it, which again gets back to what the GitHub interface provides by default and how it makes things discoverable being something of an underlying issue.
Good commits can tell a story. (The article here discusses this, too, and suggests that good commits should tell a story.) When a commit author takes the time to fill out the commit message, it will include things like the "how" and "why" of each step, what they were thinking about as they worked on that part of the whole PR. Often I find that will save you from asking questions like "Why did you take this approach?" which I see all the time in PRs (and make all the time in PRs with "bad" or just "save point" commits).
The PR should be the smallest unit of integration (this works and builds and is ready to merge to the next steps), but the commit is the smallest unit of any progress at all. Ideas don't come fully formed and ready to compile. Progress sometimes includes back tracks and experiments. Good commits say things "hey, I learned from this thing that wasn't working and that's what pushed me into this next direction". They document the journey, why a specific path was taken, what obstacles were in the way, what other paths were explored and dismissed.
Some PR authors can capture a lot of that in a PR description as well, but commits tie that to specific context of the code at a point in time in ways that a PR description often can't, without linking to commits (in which case the commits again speak for themselves).
But yes, not everyone can or will write good commits. I see that partly as a tooling failure because our tools themselves like GitHub PRs don't encourage it, often fail to reward it. I've seen plenty PRs full of commits named only "WIP" and "Fix" and "oops", but the best PRs tell a story in the commits, have meaningful descriptions on each commit. I would love for our tools to encourage more of those because I think they are better PRs; often easier to review PRs and enough good PRs like that form a string of documentation that you can search through (through git blame and git log if nothing else, but that's still a lot of useful research data). (If you keep that data, I know a lot of people like squash merges because they distrust the git DAG and how many tools show ugly or hard to read "subway diagrams" for it instead of simpler collapsible views. But that's another long conversation.)
> [...] it should take the average reviewer 5-10 minutes [...]
Jane Street code review system kinda solves this problem by
- making each commit a branch,
- stacking branches on top of each other (gracefully handling rebases and everything that comes with it), and
- reviewing one iteration at a time
So one reviews single commits independently (takes probably around 5mins), and "forces" the reviewer to re-live the story that led to the bigger diff.
I do not work at Jane Street but I frequently find myself pondering on how broken the common code review system/culture is. I've heard of tools like graphite.dev that build on top of git to provide a code review system similar to the Jane Street one, but I'm not an active user yet (I just manually stack PRs, keep them small and ask for review to one at a time to my colleagues, and handle the rebasing etc. manually myself for now).
> handle the rebasing etc. manually myself for now
jj[1] is helpful for this. If you have a chain of commits like A -> B -> C, and you make a change to A, the rest of the chain is automatically rebased.
This is also the system we used with gerrit. Googles PR system. It uses cherry picks instead of branches but it's fantastic and gracefully splits the PR work between reviewer and committer
A lot depends on your goals for your code reviews. And your goals might even be different for different parts of the code base.
- Are you trying to make sure that more than one human has seen the code? Then simply reading through a PR in 10 minutes and replying with either a LGTM or a polite version of WTF can be fine. This works if you have a team with good taste and a lot of cleanly isolated modules implementing clear APIs. The worst damage is that one module might occasionally be a bit marginal, but that can be an acceptable tradeoff in large projects.
- Does every single change need to be thoroughly discussed? Then you may want up-front design discussions, pairing, illustrated design docs, and extremely close reviews (not just of the diffs, but also re-reviewing the entire module with the changes in context). You may even want the PR author to present their code and walk throuh it with one or more people. This can be appropriate for the key system "core" that shapes everything else in the system.
- Is half your code written by an AI that doesn't understand the big picture, that doesn't really understand large-scale maintainability, and that cuts corners and _knowingly_ violates your written policy and best practices? Then honestly you're probably headed for tech debt hell on the express train unless your team is willing to watch the AI like hawks. Even one clueless person allowing the AI to spew subtlety broken code could create a mess that no number of reviewers could easily undo. In which case, uh, maybe keep everything under 5,000 lines and burn it all down regularly, or something?
I think the other thing that often muddies the waters in discussions of code review is that open source projects and internal codebases are generally in rather different situations. An internal codebase is usually worked on by a fairly small group of experienced people, who are both creating and also reviewing PRs for it. So:
- the baseline "can I assume this person knows what they're doing?" level is higher
- making the "create PR" process take longer in order to make the review process faster is only a tradeoff of the time within the team
- if something is wrong with the committed code, the person who wrote it is going to be around to fix it
But in open source projects, there are much more often contributions by people outside the "core" long-term development team, where:
- you can't make the same assumptions that the contributor is familiar with the codebase, so you need to give things extra scrutiny
- there are often many fewer people doing the code review than there are submitting changes, so a process that requires more effort from the submitter in order to make the reviewer's job easier makes sense
- if there's a problem with the code, there's no guarantee that the submitter will be available or interested in fixing it once it's got upstream, so it's more important to catch subtle problems up front
and these tend to mean that the code-review process is tilted more towards "make it easy for reviewers, even if that requires more work from the submitter".
> - if there's a problem with the code, there's no guarantee that the submitter will be available or interested in fixing it once it's got upstream, so it's more important to catch subtle problems up front
It's also more important to have good tools to analyze subtle problems down the line, thus increasing the importance of bisection and good commit messages.
An underrated benefit of "make it easy for reviewers" is that when a bug is found, everybody becomes a potential reviewer. Thus the benefit does not finish when the PR is merged.
If your goal is to lower the velocity of your organization, e.g. because in practice code churn or poor quality are major problems, then by all means do this.
If you still need to move fast, then don't.
This is the "don't run in the hallways" version of software culture, but I would contend that you should choose your pace based on your situation. It's just like gradient descent really. The be efficient sometimes you need to make big hops, sometimes small ones.
Your comment includes capital letters and punctuation! These are skills we all learn to ensure our writing is as legible as possible to the reader. Quotes, periods, commas, clauses, paragraphs etc. are all stylistic decisions that support the underlying text and message.
I contend that learning the art of story telling through a stack of patches is just as important and, once learned, comes just as naturally as utilizing vocabulary, grammar, syntax and style with the written word.
Yeah, cowboy coding as a strategy makes most sense if it's very likely that the next person to look at the code will be you.
I'm not advocating either way as superior, but cowboy coding shouldn't mean that you don't pay your tech debt. It just means that it's much more common to roll a bug fix or small factoring improvement in with a feature, probably because you were already touching that code and testing it.
If prod bugs are so critical that there will be a rollback and a forensic retrospective on each one, then yeah you should bite the bullet and use all the most defensive PR tactics. If prod bugs have small costs and you can quickly "roll forwards" (ship fixes) then it's better to get some free QA from your users, who probably won't mind the occasional rough edge if they're confident that overall quality is OK and bugs they do find won't stay unfixed for years.
In practice:
- smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
- nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
- “every commit must compile” - again, unnecessary overzealousness. Every commit on the MAIN branch definitely should compile. Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
You want PRs because they help others absorb what you’re doing (they’ll have to read that same code sooner or later). You don’t want to create a performance theater.
I do. Especially if the author is competent.
That said, empirically, you're correct most people don't.
However, that said, I think changing the culture rather than throwing away the practice would be a better response.
Reading and reviewing clean history is really so much nicer. I'd also argue that actually making your history clean (as opposed to theatrically and thoughtlessly making small commits, say) forces you as the author to review it more carefully.
Yes, we griped that GitHub would not allow us to merge individual commits, but if it was ever urgent or helpful to do so, we cherry-picked a commit into a separate PR.
Everyone's workflow is a bit different, and it can be hard to redirect organizational inertia. But without a doubt, reading a clean commit history is a pleasure.
If I want that granularity, I'd go read the original PR and the discussion that took place.
You can have both with git and it's not even hard. Unfortunately it seems many people pride themselves in what little they know of git. I'm not being sarcastic, I've read people say this almost word-for-word.
A few exceptions:
1. When refactoring often your PR is "do an enormous search and replace, and then fix some stuff manually". In that case it's way easier to review if the mechanical stuff is in a separate commit.
2. Similarly when renaming and editing files, Git tracks it better if you do it in two commits.
3. Sometimes you genuinely have a big branch that's lasted months and has been worked on by many people and it's worth preserving history.
Also I really really wish GitHub had proper support for stacked PRs.
* Yes, TDD on production code is nice in theory, but it doesnt work in my case.
* Yes, short PRs are nice in theory, but it doesnt work in my case.
In every case, as far as I can see, it meant "It does work, I just dont know how to do it".
When I say "if you dont think it works in your case, come to me, Ill show you" they often demur and I end up with a huge PR anyway.
In practice I dont think ive ever seen a long PR that wouldnt have benefitted from being strategically broken up, but every other day I see another one that should have been.
Parent said something more along the lines of "they don't work in every case, and trying to force it in every case is misguided".
I agree that too big is more common than too small with respect to PR size, but you aren't putting forward much of an argument against parents "there are no absolutes" argument by straw manning them.
Obviously it has to be a pure refactor, entirely isolated from functional changes but there are plenty of similar cases where doing it once is the least effort.
I clean my history so that intermediate commits make sense. Nobody reads these messages in a pull request, but when I run git blame on a bug six months later I want the commit message to tell me something other than "stopping for lunch".
> pedantically apply DRY to every situation or forcing others to TDD basic app
Sure, pedantically doing or forcing anything is bad, but in my experience, copy-paste coding with long methods and a lack of good testing is a far more common problem.
You may be 100% correct in your particular case, but in general if senior devs are complaining that your code is sloppy and under-tested, maybe they aren't just being pedantic.
This is a false dichotomy and an unproductive thing to focus at.
Experienced engineers know when to make an abstraction and to not. It is based in the knowledge about project.
Abstarct well and don't do compression. Easy said, and good engineers know how to do it.
PRs are emails to your team and to your future self.
Framed in that context it's easier to carry the correct tone and think about scoping / what's important.
---
> pedantically apply DRY to every situation
I swear DRY has done more damage to the software industry from the developer side than it has done good because it has manifested into this big stick with which to bludgeon people without taking context into account.
This should be commits though. Typically, developers would look for clues in this order:
code -> code comment -> commit message -> PR text -> external document
So commit messages puts the information closer to the user. One hop doesn't seem much, but the time saved adds up as you go.
Also, as some other reader mentioned anecdotally, PRs may not be there forever. E.g. your team may migrate to a new platform PR text and reviews were left behind.
Indeed! I've found many point on this discussion answered by the linux kernel idea of mailing lists where a change is discussed then approved, often with feedback acknowledged
I think it's fine to have a whole bunch of "WIP" commit messages on intermediate commits while the PR is in a draft stage, but then all of those garbage commits should really be squashed down into one commit and you should at least write a one liner that describes what the whole change is doing. I think it does materially make repo history harder to understand to merge in PR's with 10 garbage commits in them.
I do! I find it the easiest way to review code when the author has taken the time to structure it in that way. I'm lucky to work with some great people.
>Don’t waste your time writing stuff for no one.
I've thought about that as I continue to write them. I think I can justify it by saying they are mostly for me. Can I describe what I'm trying to do with a specific push into a few items. It let's me reflect if I'm waiting too long between commits or if my ideas are getting too spread apart and really should be in two different branches that each have their own PRs. Then there is the rare case on a slower project where an item gets deprioritized and I come back to it weeks or even months later. Having the messages help me catch back up to speed.
As such, I find the 20 seconds or so to type out 1 to 2 sentences to be worthwhile, even if the ones reviewing the eventual PR never check. I'm also not above throwing in a "ditto" or "fixed issue" when a single commit really is that small or insignificant.
>“every commit must compile”
I agree with your take this is overzealous, but to expand upon my previous point, if I know a commit on a branch won't compile (say just had something else come up and need to swap focus for a few days), then I'll try to make sure I call that out in my last message just in case anyone else happens to get put on the project.
If I were to summarize my approach, treat PR messages seriously, but treat branch commit messages like sticky notes that will likely end up in the trash by week's end.
I agree that for a common team of programmers working for a single company, the value isn't always there. But that's the easiest and least interesting case... in big distributed projects this stuff really matters.
In my part of the world both of these are true, and proudly so. We keep catching a myriad of errors, big and small. The history is easy to read, and helps anyone catching up with how a certain project evolved.
I understand it might not be true for everyone, every team, in every line of business; but this sort of discipline pays off in quality oboth of the code _and_ the team members' abilities.
When you have a large PR like this, here's how I like to get it reviewed.
1. Give reviewers sometime to become familiar with the PR. They might not understand all parts of it, but they should have at least a cursory understanding of the PR.
2. Have a meeting where the PR is explained in front of the group of reviewers. The reviewers will understand the PR better and they can ask questions in realtime.
3. Let folks review the PR after the meeting in case they spot anything else, or think of additional questions.
Most of the time PR review is done asynchronously, but doing most of the review in the meeting can also be a decent team building exercise.
When reviewing a conglomerate commit in a PR, I have to reverse engineer how the different changes interact to figure out the intent. I then have to do this on each update they make. Contrast that to when someone breaks up their commits where I can zoom through variable renames, extracting functions, etc to see the one line that change that all of that unblocked that makes the difference. Then if updates are pushed, I only have to worry about the commits that were updated.
As for all commits compiling, that is helpful to review the individual commits.
Both of these (small commits, all compiling) are also great for bisecting. You get pointed to a very small change that you can more easily analyze vs dealing with breakages or having to analyze a large change to find what the problem is.
Focus on customer outcomes, and keep main clean.
(With few exceptions,) I generally follow this practice; BUT, I think enforcing this on other developers feels like micromanagement. That being said, with few exceptions, committing code that doesn't compile feels like an incomplete sentence.
(Sometimes on massive refactors I make commits that don't compile. It gives me a place to roll back to. If someone thinks this is poor practice, than I think they're putting principles in place of practicality.)
A _branch_ is a unit of work that should be merged when done.
As the owner of a branch, an engineer has the ability to move into intermediate states. The larger the codebase, the larger the possibility of something unexpected breaking or not compiling. Just like editing a large body of text - you will have "incomplete sentences" through the process. It's part of writing. Expecting others to write their drafts the same way you like is just silly - it's putting rigid principles ahead of anything else that matters.
Very common practice at my old company, and one I continue in my current role.
> “every commit must compile”
sucks ass for anyone else trying to rebase your branch onto the update main/master when they don't. Once your PR is out of "working on the feature" and into the "getting it merged" phase, do a little `git rebase -i` and squash your really intermediate commits into ones that compile. Ignore this if you have real grown up CI where your PRs never stay open for more than a day.
A vast majority of the drama that comes out of source control is associated with branches living for far too long.
I've got an internal alarm that starts to go off somewhere around 72 hours. If something takes longer than this, I've probably screwed up in my planning phase. There are some things that do need to sit, but they should be rebased every morning like clockwork. The moment things start to conflict, the PR gets closed and the branch is now a reference for how to do it again when whatever blocker is cleared.
Another way to think about all of this is to pretend like everything you are touching is taking a synchronous lock out (even if it's not), similar to how tools like Perforce behave. So, you generally want to move as quickly as possible to get out from under lock contention. Git allows you to pretend like you aren't conflicting for a really long time, but at some point you must answer for all of this debt (with interest).
why would anyone else rebase your branches? YOU should rebase your branches.
my understanding is that you commit when you are at the "good place", where the part of the code you are working on works. That way when you keep going and find yourself going in a direction that is not right, you can go back to the last good place. If your code doesn't even compile, that doesn't seem like a good place.
That isn't really where it came from though. The idea was, if I want an open source maintainer to accept my changes, I make a request to pull them from my branch. Once the open source maintainer has merged it in, they own it. If they don't like it (even one little bit), they can reject it because quality / ownership / maintenance is completely on them.
On a team environment where no one owns anything it is a little less clear what the value is. You want to incentivize the "betterness" of "something" and are using "broadened knowledge" as a proxy for that. Usually this just goes unexamined but really it would be good to establish how broad and deep you want this knowledge to be and work back from there - is the 5 minute PR review the best way to achieve it?
This is why I gave up reading the article shortly after reaching the point about making a history with commit messages. The comments—even if it is on a Git forum—will just be full of people that either say that it’s a waste of time or that it is literally impossible for this to be practiced by anyone.[1]
Your best bet is to find projects where this is practiced (and you don’t have to look far). But making the case to a general audience? No, too many loud voices that treat version control like “I am committing now because I need to pick up the dry-cleaning” arbitrary/random snapshot-maker.
[1] No one, period? Sounds like a bit of a strict ontological rule to me.
> every commit must compile
I’m in the opposite camp. Following these two practices often doesn’t make any difference but the few times it did saved me a ton of time.
Dropping commits or rebasing is much easier when you have descriptive, atomic commits. It’s also helpful when performing git blame archeology to try and understand why this code looks so weird and has no context. It’s also useful when bisecting (not so much a problem with small PRs, quite handy as they grow bigger and bigger)
As with everything it’s about context and circumstances. As you gain expérience you can appreciate and gauge when it’s required. When you don’t have the expérience then you follow rules so that you gain said expérience. That’s how I see it.
I often do, In a larger PR or in one where it's hard to tell what is being accomplished, like this article articulates the commits can tell a story of the engineers journey to solution. Even if I review a commit that is largely undone by future commits that piece of history is often key to my understanding.
Until you need to `git bisect`. Then you'll require that every commit compile, pass tests, etc.; even if that means rebase/squashing to do it.
Main has clean history and every commit is good.
So you're the one breaking git bisect all the time. Grrrr.
Use stgit and make decent commits instead of rolling in the dirt like an animal.
Meh, most people wont address it or ask that dollar. It does not mean I did not read it, I chuckled and moved on.
I do read every commit on PR chain and every line. I am not necessary super attentive reviewer or something, but I never accept it without at least formally looking at it.
If every commit on the main branch must compile then why wouldn't it also compile in the PR branch? It doesn't make sense to ask people to review, then after that rebase and merge imo.
Why advocate against this anyway? If no one reads them, it harms no one. Just like personal blogs. However, the writing of the blog is the useful act, not the reading.
Ironic that you are accusing TFA article of being an expert novice. I don't disagree your take on him / the article, but you are committing the same sin.
> - smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
Oh but they sure can be reviewed more easily, because they are shorter? Doing so feels like less effort, and you get a dopamine hit from hitting that "submit review" button faster/more often (improved morale, and PR turnaround time!). Plus, if there's a longer discussion about X, it's great if it's not tangled up with Y and Z at the same time - allowing you to dig into X.
> - nobody reads intermediate commit messages one by one on a PR, period.
Come on, that's intellectually dishonest. 1. VSCode displays commit messages inline as blame for me (and many of my colleagues), so even when we don't read the commit messages one by one _on a PR_, I often read them later in the IDE (we don't squash merge PRs). I spend significantly more time reading code than writing, and commit messages, PR descriptions and linked issues provide extra context that is useful to me especially for complex code. If those messages were entirely unreadable, I'd be annoyed. 2. When someone invests time into telling a good story commit by commit, in my team they write "Review commit-by-commit is encouraged" in the PR description, to tell the reviewers that yes, they should read the individual commits, as that'll make understanding the PR easier. Often as reviewer, I follow that suggestion.
> Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
It seams you're conflating "working on a feature" with "presenting it as PR to review". That's two very different things, and Edamagit in VSCode makes it so so easy to provide a reasonable commit history that hides some of your missteps, and to fill in commit messages.
you need to be careful with every single commit message, every commit must compile, etc, in your case. My comments apply if you squash-merge, in which case all that commit-level care is not necessary since intermediate commits go away on merge. You’re probably making your life harder for no reason for avoiding squash-merge, but that’s just my opinion
But I trust my colleagues to do good reviews when I ask them to, and to ignore my PRs when I don't. That's kind of the way we all want it.
I regularly ask for a review of specific changes by tagging them in a comment on the lines in question, with a description of the implications and a direct question that they can answer.
This, "throw the code at the wall for interpretation" style PR just seems like it's doomed to become lower priority for busy folks. You can add a fancy narrative to the comments if you like, but the root problem is that presenting a change block as-is and asking for a boolean blanket approval of the whole thing is an expensive request.
There's no substitute for shared understanding of the context and intent, and that should come out-of-band of the code in question.
Instead I request that it is self reviewed with context added, prior to requesting re-review.
I also tend to ask the question, “are any of these insights worth adding as comments directly to the code?”
9/10 the context they wrote down should be well thought out comments in the code itself. People are incredibly lazy sometimes, even unintentionally so. We need better lint tools to help the monkey brain perform better. I wish git platforms offered more UX friendly ways to force this kind of compliance. You can kind of fake it with CI/CD but that’s not good enough imo.
Minimally, I would like context for the change, why it required a change to this part of the codebase, and the thought process behind the change. Sometimes but not often enough I send the review back and ask them for this info.
IMO many software developers are just not fast enough at writing or language so providing an explanation for their changes is a lot of work for them. Or they are checked out and they just followed the AI or IDE until things worked, so they don't have explanations to provide. People not taking the time is what makes reviews performative.
… the why is important
> IMO many software developers … don't have explanations to provide. People not taking the time is what makes reviews performative.
… a lot of developers only consider the how.
i’ve had a lot of experiences of “once my PR is submitted that’s my work/ticket finished” kind of attitude.
i spent a year mentoring some people in a gaming community to become dev team members. one of the first things i said about PRs was — a new PR is just your first draft, there is still more work to do.
it helped these folks were brand spanking new to development and weren’t sullied by some lazy teacher somewhere.
# Goal (why is this change needed at all)
# What I changed and why I did it this way
# What I'm not doing here and how I'll follow up
# How I know it works (optional section, I include this only for teams with lots of very junior engs: "added a test" is often sufficient)
I think this is the overwhelming factor, software engineering doesn't select for communication skills (and plenty of SEs will mock that as a field of study), or at least most SEs don't start out with them.
Ideally, yes. After a decade and something' under ZIRP, seems a lot of workers never had incentive to remain conscious of their intents in context long enough to conduct productive discourse about them. Half of the people I've worked with would feel slighted not by the bitterness the previous sentence, but by its length.
There's an impedance mismatch between what actually works, and what we're expected to expect to work. That's another immediate observation which people to painfully syntaxerror much more frequently than it causes them to actually clarify context and intent.
In that case the codebase remains the canonical representation of the context and intent of all contributors, even when they're not at their best, and honestly what's so bad about that? Maybe communicating them in-band instead of out-of-band might be a degraded experience. But when out-of-band can't be established, what else is there to do?
I'd be happy to see tools that facilitate this sort of communication through code. GitHub for example is in perfect position to do something like that and they don't. Git + PRs + Projects implement the exact opposite information flow, the one whose failure modes people these days do whole conference talks about.
If you don't know them, please realize your code isn't automatically a gift everybody waited for, you may see it that way, but from the other side this isn't clear until someone put in the work to figure out what you did.
In short: added code produces work. So the least you should do is try reducing that work by making it easy to figure out what your code is and isn't.
Sum up what changes you made (functionally), why you made them, which choices you made (if any) and why and what the state of the PR code is in your own opinion. Maybe a reasoning why it is needed, what future maintenance may look like (ideally low). In essence, ask yourself what you'd like to know if someone appeared at the door and gave you a thumb drive with a patch for your project and add that knowledge.
Also consider to add a draft PR for bigger features early on. This way you can avoid programming things that nobody wanted, or someone else was already working on. You also give maintainers a way to steer the direction and/or decline before you put in the work.
Dead Comment
Dead Comment
PRs should be optional, IMHO. Not all changes require peer review, and if we trust our colleagues then we should allow them to merge their branch without wasting time with performative PRs.
Part of the difference is the idea you can catch all problems with piecemeal code review is nonsense, so you should have at least some sweeping QA somewhere.
At $DAY_JOB we need approvals from peers due to industry regulation.
No single person being able to make changes to a system is a tenant of that.
"How do you create a PR that can be reviewed in 5-10 minutes? By reducing the scope. A full feature should often be multiple PRs. A good rule of thumb is 300 lines of code changes - once you get above 500 lines, you're entering unreviewable territory."
The problem with doing this is if you're building something a lot bigger and more complex than 500 lines of code, splitting that up across multiple PR's will result in:
- A big queue of PR's for reviewers to review
- The of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
- You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
The right answer for the size of a PR is NOT in lines of code. Exercise judgement as to what is logically easier to review. Sometimes bigger is actually better, it depends. Learn from experience, communicate with each other, try to be kind when reviewing and don't block things up unnecessarily.
This is a feature. I would infinitely prefer 12 PRs that each take 5 minutes to review than 1 PR that takes an hour. Finding a few 5-15 minute chunks of time to make progress on the queue is much easier than finding an uninterrupted hour where it can be my primary focus.
> - The of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
It increases it a little bit, sure, but it also helps keep things focused. Reviewing, for example, a refactor plus a new feature enabled by that refactor in a single PR typically results in worse reviews of either part. And good tooling also helps. This style of code review needs PRs tied together in some way to keep track of the series. If I'm reading a PR and think "why are they doing it like this" I can always peek a couple PRs ahead and get an answer.
> - You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
This is a tooling problem. Git and Github are especially bad in this regard. Something like Graphite, Jujutsu, Sapling, git-branchless, or any VCS that supports stacks makes this essentially a non-issue.
the point is not queue progression, it is about dissemination of knowledge
one holistic change to a project = one PR
simple stuff really
A trivial example would be adding the core logic and associated tests in the first commit, and all the remaining scaffolding and ceremony in subsequent commits. I find this technique especially useful when an otherwise additive change requires refactoring of existing code, since the things I expect will be reviewed in each and the expertise it takes are often very different.
I don't mind squashing the branch before merging after the PR has been approved. The individual commits are only meaningful in the context of the review, but the PR is the unit that I care about preserving in git history.
Maybe it's just a me problem, maybe I need to be more disciplined. Not sure but it catches me quite often.
This shouldn't matter unless you are squashing commits further back in the tree before the PR or other people are also merging to main.
If a lot of people are merging back to main so you're worried about those causing problems, you could create a long life branch off main, branch from that and do smaller PRs back to it as you go, and then merge the whole thing back to main when your done. That merge might 2k lines of code (or whatever) but it's been reviewed along the way.
I don't necessarily disagree with you. Just pointing out that there are ways to manage it.
But for the company, having two people capable of working on a system is better than one, and usually you want a team. Which means the code needs to be something your coworkers understand, can read and agree with. Those changes they ask for aren't frivolous: they are an important part of building software collaboratively. And it shouldn't be that much feedback forever: after you have the conversation and you understand and agree with their feedback, the next time you can take that consideration into account when you are first writing the code.
If you want to speed that process up, you can start by pair programming and hashing out disagreements in real time, until you get confident you are mostly on the same page.
Professional programming isn't about closing tickets as fast as possible. It is about delivering business value as a team.
Of course that won't work for all projects/teams/organizations. But I've found that it works pretty well in the kinds of projects/teams/organizations I've personally been a part of and contributed to.
Use jujutsu and then stacking branches is a breeze
AI agents like frequent checkpoints because the git diff is like a form of working memory for a task, and it makes it easy to revert bad approaches. Agents can do this automatically so there isn't much of an excuse not to do it, but it does require some organization of work before prompting.
If you don't have feature flags, that is step one. Even if you don't have a framework, you can use a Strategy or a configuration parameter to enable/disable the new feature, and still have automated testing with and without your changes.
+100 to this. My job should be thoughtfully building the solution, not playing around with git rebase for hours.
Suddenly rebasing a stack of branches becomes 1 command.
> A big queue of PR's for reviewers to review
Yes, yes please. When each one is small and understandable, reviewers better understand the changes, so quality goes up. Also, when priorities change and the team has to work on something else, they can stop in the middle, and at least some of the benefits from the changes have been merged.
The PR train doesn't need to be dumped out in one go. It can come one at a time, each one with context around why it's there and where it fits into the grander plan.
> The [totality] of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
A primary goal of code review is to build up the mental map of the feature in the reviewers' brains. I argue it's better for that to be constructed over time, piece by piece. The immediate cognitive load for each pull request is lower, and over time, the brain makes the connections to understand the bigger picture.
They'll rarely achieve the same understanding of the feature that you have, you who created and built it. This is whether they get the whole shebang at once or piecemeal. That's OK, though. Review is about reducing risk, not eliminating it.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
I've learned not to charge too far ahead with feature work, because it does get harder to manage the farther you venture from the trunk. You will get conflicts. Saving up all the changes into one big hunk doesn't fix that.
A big benefit of trunk-based development, though, is that you're frequently merging back into the mainline, so all these problems shrink down. The way to do that is with lots of small changes.
One last thing: It is definitely more work, for you as the author, to split up a large set of changes into reviewable pieces. It is absolutely worth it, though. You get better quality reviews; you buy the ability to deprioritize at any time and come back later; most importantly for me, you grasp more about what you made during the effort. If you struggle to break up a big set of changes into pieces that others can understand, there's a good chance it has deeper problems, and you'll want to work those out before presenting them to your coworkers.
PR's should generally be the size of a feature, or a meaningful subfeature for large features.
When you arbitrarily split up PR's into something "300 lines" or "5-10 minutes" you can miss the forest for the trees. The little thing looks fine in isolation but doesn't make any sense as part of a larger approach. Different people are reviewing it piecemeal but nobody is reviewing the approach as a whole or making sure the parts fit together right.
And then the idea of "telling a story with commits" feels like a waste of time to me. I have no interest in the order in which you wrote the code, or what you wrote and then rewrote. The code itself needs to be legible. Your final code with its comments should speak for itself. Code is the what and comments are the why.
Now, what I will say is that the more junior the developer, the smaller their commits should be. But that's because they should be assigned smaller features, and have more handholding along the way. And when people are making larger architectural changes, they should be getting signoff on their overall approach from the start -- if you're frequently rejecting the whole approach to a problem in code review, something's going wrong with your communications processes.
Honestly I think it would be far more effective to just review a paragraph and maybe a diagram that explains "here's how I think I'm going to tackle this problem" and forget about line-by-line code review entirely. Other than for training juniors, I don't think there's much long term value in "I think you should use an anonymous function here instead of a named function".
The kinds of things that are usually brought up in code review are not what contributes to real long term technical debt, because a function name or code formatting can be changed in an instant, while larger architectural decisions cannot.
The other thing I noticed is that even when an architectural issue is obvious, there's a tendency to not want to change it because so much of the work has already been done in preparing the PR for review. If you point out a flaw in an architectural decision, it's not unusual for the person to reasonably say "I've just put together a chain of 5 PRs and now you're asking me to rewrite everything?"
Commit #1 adds a helper function for whatever, looks innocent enough, implementation is correct. Believe it or not, it even has tests, lgtm. Then only by commit #8 do you realize this helper function is not needed at all and the entire approach is wrong. Happens every time.
I started reviewing these chains backwards and refuse starting a review until the whole chain is available. That’s however not always easy either, when commit #2-#5 has incrementally refactored everything into something unrecognizable, so that both the left and right side of the diff are wrong! No, I’m not interested in ”this will be fixed 2 commits down the chain”. I just want to review the final state that goes into production, nothing else matters.
Yes, commits should be made small whenever possible and not include unrelated fixes or refactors. Just please, keep them meaningful on their own.
It is very telling in the article itself there is a screenshot of the commits tab in the PR workflow that many don't realize even exists and/or never think to use.
In the Files tab the commit picker has gotten better in recent years, but it is still overly focused on selecting ranges of commits over individual ones, and there's no shortcuts to easily jump Next or Previous commit, you have to remember your place and interact with the full pulldown every time. Also, it's hard to read the full descriptions of commits in the Files view and I find I often have to interrupt my flow to open the commit in another browser tab or flip back and forth between the Commits tab and the Files tab in the PR. The Commits tab also defaults to hiding most of the commit descriptions, so it's still not a particularly great reading experience.
It feels like a bit of a bad feedback loop that GitHub's UI doesn't make commit-by-commit reviewing clean/easy because GitHub themselves don't expect most developers to write good commits, but a lot of developers don't write good commits today simply because GitHub's PR interface is bad at reviewing individual commits and developers don't see as much of a point in it if they aren't going to be reviewed in that way.
The PR should be the smallest unit of integration (this works and builds and is ready to merge to the next steps), but the commit is the smallest unit of any progress at all. Ideas don't come fully formed and ready to compile. Progress sometimes includes back tracks and experiments. Good commits say things "hey, I learned from this thing that wasn't working and that's what pushed me into this next direction". They document the journey, why a specific path was taken, what obstacles were in the way, what other paths were explored and dismissed.
Some PR authors can capture a lot of that in a PR description as well, but commits tie that to specific context of the code at a point in time in ways that a PR description often can't, without linking to commits (in which case the commits again speak for themselves).
But yes, not everyone can or will write good commits. I see that partly as a tooling failure because our tools themselves like GitHub PRs don't encourage it, often fail to reward it. I've seen plenty PRs full of commits named only "WIP" and "Fix" and "oops", but the best PRs tell a story in the commits, have meaningful descriptions on each commit. I would love for our tools to encourage more of those because I think they are better PRs; often easier to review PRs and enough good PRs like that form a string of documentation that you can search through (through git blame and git log if nothing else, but that's still a lot of useful research data). (If you keep that data, I know a lot of people like squash merges because they distrust the git DAG and how many tools show ugly or hard to read "subway diagrams" for it instead of simpler collapsible views. But that's another long conversation.)
> [...] Telling a Story with Commits [...]
> [...] it should take the average reviewer 5-10 minutes [...]
Jane Street code review system kinda solves this problem by
- making each commit a branch,
- stacking branches on top of each other (gracefully handling rebases and everything that comes with it), and
- reviewing one iteration at a time
So one reviews single commits independently (takes probably around 5mins), and "forces" the reviewer to re-live the story that led to the bigger diff.
I do not work at Jane Street but I frequently find myself pondering on how broken the common code review system/culture is. I've heard of tools like graphite.dev that build on top of git to provide a code review system similar to the Jane Street one, but I'm not an active user yet (I just manually stack PRs, keep them small and ask for review to one at a time to my colleagues, and handle the rebasing etc. manually myself for now).
jj[1] is helpful for this. If you have a chain of commits like A -> B -> C, and you make a change to A, the rest of the chain is automatically rebased.
[1] https://jj-vcs.github.io/
Deleted Comment
Dead Comment
- Are you trying to make sure that more than one human has seen the code? Then simply reading through a PR in 10 minutes and replying with either a LGTM or a polite version of WTF can be fine. This works if you have a team with good taste and a lot of cleanly isolated modules implementing clear APIs. The worst damage is that one module might occasionally be a bit marginal, but that can be an acceptable tradeoff in large projects.
- Does every single change need to be thoroughly discussed? Then you may want up-front design discussions, pairing, illustrated design docs, and extremely close reviews (not just of the diffs, but also re-reviewing the entire module with the changes in context). You may even want the PR author to present their code and walk throuh it with one or more people. This can be appropriate for the key system "core" that shapes everything else in the system.
- Is half your code written by an AI that doesn't understand the big picture, that doesn't really understand large-scale maintainability, and that cuts corners and _knowingly_ violates your written policy and best practices? Then honestly you're probably headed for tech debt hell on the express train unless your team is willing to watch the AI like hawks. Even one clueless person allowing the AI to spew subtlety broken code could create a mess that no number of reviewers could easily undo. In which case, uh, maybe keep everything under 5,000 lines and burn it all down regularly, or something?
- the baseline "can I assume this person knows what they're doing?" level is higher
- making the "create PR" process take longer in order to make the review process faster is only a tradeoff of the time within the team
- if something is wrong with the committed code, the person who wrote it is going to be around to fix it
But in open source projects, there are much more often contributions by people outside the "core" long-term development team, where:
- you can't make the same assumptions that the contributor is familiar with the codebase, so you need to give things extra scrutiny
- there are often many fewer people doing the code review than there are submitting changes, so a process that requires more effort from the submitter in order to make the reviewer's job easier makes sense
- if there's a problem with the code, there's no guarantee that the submitter will be available or interested in fixing it once it's got upstream, so it's more important to catch subtle problems up front
and these tend to mean that the code-review process is tilted more towards "make it easy for reviewers, even if that requires more work from the submitter".
It's also more important to have good tools to analyze subtle problems down the line, thus increasing the importance of bisection and good commit messages.
An underrated benefit of "make it easy for reviewers" is that when a bug is found, everybody becomes a potential reviewer. Thus the benefit does not finish when the PR is merged.
If you still need to move fast, then don't.
This is the "don't run in the hallways" version of software culture, but I would contend that you should choose your pace based on your situation. It's just like gradient descent really. The be efficient sometimes you need to make big hops, sometimes small ones.
I contend that learning the art of story telling through a stack of patches is just as important and, once learned, comes just as naturally as utilizing vocabulary, grammar, syntax and style with the written word.
Deleted Comment
If you need to move fast for the next two weeks, sure. If you need to move fast for the next year, you are better off collaborating.
I'm not advocating either way as superior, but cowboy coding shouldn't mean that you don't pay your tech debt. It just means that it's much more common to roll a bug fix or small factoring improvement in with a feature, probably because you were already touching that code and testing it.
If prod bugs are so critical that there will be a rollback and a forensic retrospective on each one, then yeah you should bite the bullet and use all the most defensive PR tactics. If prod bugs have small costs and you can quickly "roll forwards" (ship fixes) then it's better to get some free QA from your users, who probably won't mind the occasional rough edge if they're confident that overall quality is OK and bugs they do find won't stay unfixed for years.