Claim: the ideal PR is 50 lines long

appplication · 2 years ago

I have found that teams who allow larger PR sizes without complaining about it can get a lot done a lot faster. Which is to say having 50 line PRs are not ubiquitously ideal, but are somewhere on the spectrum of acceptable but come with definitive tradeoffs. Reviewing and merging 10x60 line PRs is, in my experience, more time intensive than reviewing one 600 line PR.

Most of the tests for and new functionality alone in any of our PRs well exceed 50 lines. Indeed we have 2-3x more test code than actual code. So should we be writing less tests? Merging the code first and later the tests? Merging the broken, codeless tests first and then the code?

It’s all ridiculous. Just make PRs that represent unit change, whatever the means to you. The unit functionality is independent of lines of code. Sometimes that is 12 lines, sometimes it’s 800. Yeah large PRs are harder to understand but that’s why you have tests. Also XXL PRs aren’t usually happening every day. If they are, you have a very productive team and maybe you should count yourself lucky.

bryanlarsen · 2 years ago

But why does it take longer for 10x60 line PR's?

If it's because those PR's are getting reviewed more thoroughly, that's likely a great use of time.

If it's because the small PR's are getting bikeshedded, it's not.

If it's because of tooling, then the OP at graphite.dev has something to sell you. I'd be buying but we're a GitLab shop.

> Indeed we have 2-3x more test code than actual code. So should we be writing less tests? Merging the code first and later the tests? Merging the broken, codeless tests first and then the code?

I find that test code is rarely reviewed properly. If you want it reviewed properly, then create MR's that are just tests. You can create an MR with the change and happy path code, and then subsequent MR's with each of the rest of the tests.

If you're fine with tests not getting the same level of attention (and that's likely OK), then tests don't count as part of the 50 lines, IMO.

johnfn · 2 years ago

It's because every time I switch from coding to reviewing, there's a context switch, and every time I switch back from reviewing to coding, there's another context switch. I can easily spend 30 minutes getting back into the flow of coding after doing something else.

One PR has 1 context switch. 10 PRs have 10.

To say nothing of the cascading effects of CR on early PRs. Oh, you want me to rename a variable that I used in the first PR? Looks like I'm spending the next 10 minutes combing through my other 9 diffs updating names. That would have been a 5 second refactor if I only had a single monolithic PR.

ysavir · 2 years ago

Not the OP, but in my experience, it's because those 60 line PRs lack a lot of context, and it's not until you see the cumulative changes across those 10 60-line PRs that you really understand how they come together to form a feature--and are able to review them within that context. So instead of having one 600 line PR in which the whole picture is clear, you have to go back and forth between 10 different PRs and comparing notes between them to make sure you really understand any of them.

I still think 600 is a lot, but there's a happy medium between "minimal change that excluded broader context" and "monster PR that has all context, all changes, and eats left-pad packages for breakfast". Line count shouldn't be the goal. The goal should be making as small a change possible that is self-contained (ie all context is either within the PR or already in the codebase), and (IMO) that can be delployed to production as-is.

danielovichdk · 2 years ago

"I have found that teams who allow larger PR sizes without complaining about it can get a lot done a lot faster".

Sure.

How many times is the code touched after the PR has been merged is a better question.

Faster is not better. Better is not faster.

Draiken · 2 years ago

Weirdly, what the article claims is that it's the ideal size because of speed. So they are trying to say it's faster, therefore it's better.

Which is in my view a very silly claim devoid of any context.

herpdyderp · 2 years ago

I couldn't disagree more. My current team has a history of large PRs and everything was a dumpster fire until we started really hammering down on PR size. Prod broke all the time, reverting code was a nightmare because of how big the "units" were, code review was not thorough, problems problems problems.

> large PRs are harder to understand but that’s why you have tests

I'd rather have comprehensible, well-reviewed code than test coverage (I'd rather have both, of course). We've got a mission critical module that is a crusty piece of junk that everyone is scared to change because of how much of a mess it is, even with its amazing test coverage (frankly the code is so bad that I assume the tests are just as bad anyway).

> XXL PRs aren’t usually happening every day. If they are, you have a very productive team and maybe you should count yourself lucky

I've never seen an XXL PR that improved productivity. They pile up tech debt, bugs, and down time. I don't care how fast the devs are moving if they bring prod down every week. That is absolutely unacceptable.

On a team of superstar, perfect developers that all understand the code base perfectly, I'd probably agree with you. If that's your situation then I envy you :) But I've never been on such a team.

bvrmn · 2 years ago

> everyone is scared to change because of how much of a mess it is

As a rule messy code a way more easier to cleanup in one big chunk. Teams with CR limit prefer not to touch such nests.

christophilus · 2 years ago

My team has a history of large (and small) PRs. If it's a huge feature / vertical, it's often done in a few large PRs. Very few reverts in our 5 year history.

I guess, what I'm saying is, it depends on the team, on their seniority and capabilities, on the product, the culture, etc.

wrs · 2 years ago

Measuring the total size of the PR seems to me to be missing a dimension, namely the commit size. What if a 600-line PR consist of 10 60-line reviewable commits? Which, if not for GitHub’s broken review UI, is what I would hope is the goal for such a thing. Then you have a reasonable amount of code to look at per commit, and all in context of the larger overall change.

bvrmn · 2 years ago

Refactorings and redesings could easily span 1k lines even for small projects. Yes, almost always you could make it incremental, but total lines counts would increase and it requires hard planing beforehand. Too much fuzz.

Deleted Comment

wk_end · 2 years ago

In re: to

> Reviewing and merging 10x60 line PRs is, in my experience, more time intensive than reviewing one 600 line PR.

The article disagrees, and the article has data:

> Some folks might wonder if writing small PRs leads to less code output in total. We all want to be high velocity, but there are times when you need to be high volume. Consistently writing sub-20 line PRs will have a significant impact on your net coding ability - but interestingly, so will writing PRs greater than 100+ lines. The highest volume coders and repositories have a median change size of only 40-80 lines. I suspect this is because the volume of code is change-size * speed of changes. Too small of changes, and the faster merge time doesn't make up for it. Too large, and you start getting weighted down by slower review and merge cycles.

I agree that you shouldn't be dogmatic about PR size - there's an art to engineering as much as it's a science. But part of that art might also be recognizing that "unit change, whatever that means to you" is - as you suggest - a flexible concept. The takeaway for me here is, given what the data shows, when you end up with a 600-line unit it might be worthwhile to try to, if possible, find a way to break that unit down.

ffgjgf1 · 2 years ago

> and the article has data

How does that data prove anything? The ideal PR size is 50 lines because that’s the median size on Github. Seem like a pretty worthless claim.

hugocbp · 2 years ago

I'm probably in the minority here, but personally I'd much rather review a 300 line PR instead of 6 50-line ones if the change is a single context.

I briefly worked with a hard line-count-limit for PRs and I thought it made everything much worse. It is fine for changes that are actually small, but when you need to go back and re-open 4, 5 merged PRs in different tabs to get the full context again, the time to review increases tenfold with tiny PRs that don't really make complete sense by themselves.

I have worked with co-workers that have the complete opposite preference, though, and anything over a set amount of lines wouldn't even be reviewed.

Interesting to see the numbers on the article, however. My anecdotal experience would make me guess the opposite. I feel like work slows to a crawl once the PRs are artificially limited and broekn up like that, specially in fast moving companies and startups.

jjav · 2 years ago

Line count (or count of any units in the PR) is completely meaningless.

I feel that in the last decade somehow there has been a loss of context of what is the purpose of version control.

Why do we bother with version control? Sure, there are many reasons.

But a primary reason is that if we discover behavior X turns out to cause regressions (or is otherwise a bad idea), we can easily revert it by reverting its commit.

That's a why a single commit should contain a single behavior change, and contain the entirety of that single behavior change.

If the change is split among multiple commits, it'll be a pain to revert it. If the change is contained within a commit that changes other things, it'll be an ever larger pain to revert it.

So that's my rule. A commit is single behavior change, all of it, and nothing else.

You can't express that in lines. Might be 1 line might be lots.

wubrr · 2 years ago

100% agree, if you have multiple commits to complete the change during development, just squash them into one for the PR.

That being said many companies (including some FAANG) use commit counts as a performance metric, thereby directly disincentivizing clean, readable commits.

mandelbrotwurst · 2 years ago

Sure, and at the same time a PR can contain multiple commits. You don't have to squash and merge.

jvans · 2 years ago

> but when you need to go back and re-open 4, 5 merged PRs in different tabs to get the full context again, the time to review increases tenfold with tiny PRs

You can't really say much about the first 4 PRs because if you don't know how that code is going to be used, it's kind of hard to giving meaningful feedback. By PR 5, it's too late to give feedback on PRs 1-4.

I also have worked with people that prefer that but I never understood it. Code without context isn't reviewable imo

klodolph · 2 years ago

A skill that you see in some of the more experienced devs is the ability to front-load the most important stuff in a series of changes, or make changes that can at least be understood in isolation.

It’s not always reasonable to do that, but you can often deliver incremental changes where each incremental change either brings a whole feature visibly closer to completion, or provides some self-contained utility, or makes some refactoring changes with the promise of “this will make it possible to do X, later”.

giancarlostoro · 2 years ago

My bar is if its more than 20 files, or breaks the version control UI, which I've seen in bitbucket some years ago, then your PR is a little too much, and if its a major automated refactor of some sort, it should not contain any other code changes outside of what's necessary to make that specific change work as intended. Other changes should be their own discrete PRs as follow ups.

I just try to be reasonable about PRs. Everyone will have their preference, I just think as long as you can go back and figure out what happened, why the change was made, and do it quickly enough, then you're golden.

degenerate · 2 years ago

Lines with actual code changes aren't a problem for me, it's the automated IDE indent/spacing/bracketing that really drives me up a wall and fatigues the hell out of me.

But this might only be a problem with those of us working on legacy codebases. The kind of PRs I see in OSS projects I could review 1000 lines at a time - it's so clean!

joshka · 2 years ago

Split the reformatting and real change. Make the reformatting change first and have a policy of merging those change fast.

JohnFen · 2 years ago

I tend to agree.

For my part, ever since I was a wee lad learning to program for the first time, I've been of the opinion that line counts themselves are a largely worthless metric for anything.

How many lines code takes up is too variable between languages, too variable within the same language, too dependent on irrelevancies like formatting, etc.

What actually matters is logical and conceptual complexity, not line counts.

ysavir · 2 years ago

How reliable is reversion as a metric? In my experience, the problem with 50 line PRs (that aren't just making a quick adjustment to something, but actually trying to build new functionality in sets of 50 line PRs) is that each change is so isolated, that when everything has to be combined into an actual working feature, the small changes aren't coordinated with each other. The PR that adds the DB migration doesn't add the appropriate columns needed to actually store the information sent from the frontend, etc. All the small things that can't be planned for in advance and are only discovered during implementation.

Those small PRs might not be reverted, but there are probably a fair number of 50 line PR follow ups that end up modifying the original implementation, and using up more total time and effort than if it was just a larger PR implementing the feature in its entirety.

I wonder how many of the large PRs they say take longer to merge and get less comments are actually feature branches into which the smaller PRs are merged, and which we expect to hang open for long periods of time with little interaction.

This analysis seems to take a set of data without much investigation into what the data actually represents and makes an immense effort to find some conclusions about the context-less data.

bombcar · 2 years ago

Reversion is an unreliable metric, because some PRs will be large enough that they NEVER get reverted, even if they should because it's easier to just fix the broken PR with another PR (about 50 lines should do it).

Zealotux · 2 years ago

I’m reading this article on a break from reviewing a PR with more than 300 lines changed over 20+ files and I wonder if some people simply live in some sort of alternative wonderland universe.

The “ideals” are almost never a thing I see in my daily job, and I’ve worked for countless companies as a contractor.

xbar · 2 years ago

Good luck.

For me, I feel like the difficulty score of performing an effective review is slightly greater than nlogn but less than n*root(n).

300 is a bit beyond my favorite limit of 200.

waffletower · 2 years ago

I file this under the category of "needless rules created without sufficient context". The author's experience is not sufficient to generalize to the diversity of development environments where pull requests are utilized. Hopefully, OP is a good listener and can compromise when their strong opinions meet resistance from other developers at their organization.

martpie · 2 years ago

Graphite is a code review tool (to organize PRs in stacked diffs), their data probably comes from their userbase, not just their internal team.

wk_end · 2 years ago

Yes, it says as much.

> Our sample set

> All of the data-based statements in this piece are made using private and public PRs and repos that have been synced with Graphite.

avgcorrection · 2 years ago

Also file under “product advertisement”.

tcmart14 · 2 years ago

For me personally, it shouldn't be about line count. A PR should have scope of a single feature/piece of functionality. Or if it is a PR for a bug fix, 1 bug fix per PR. If its 1 line, 2 lines, 50 lines, 500 lines, I don't care. Each PR should address one single problem.

jitl · 2 years ago

If I need make a new thing - for example, introduce an end-to-end benchmark suite for a service, it may take something like 2000 lines. If I split that up into 50 line PRs, it’s 40 PRs. I’ll also need to write a pre-read document explaining the overall design because PR descriptions for each change individually won’t be enough context for a reviewer to understand.

If CI takes 15 minutes and I create PRs one by one, I’ll need to wait a total of 10 hours just for CI, which is impossible, I’ll need to use stacked PRs. If I’m using stacked PRs, that means I do all the work up front, and then try to painstakingly split up my final code into little 50 line chunks that build incrementally. I’m sure someone will say “oh just write incremental small commits to begin with”, but that never seems viable to me. Code and design evolve as I build in ways that would make the review of the first 50 lines I commit on the project pointless since none were retained in the final product. So not only do I have to build this feature, and write 40 PR descriptions, and write an overall meta-PR-description for the change in aggregate, I also need to cut my finished code up into puzzle pieces for reviewers to mentally re-assemble.

Anyways, maybe for maintaining mostly-done features 50 lines is nice, but I’d much rather a new feature or larger change be a handful (1-3) of larger (500-2000 line) PRs possibly accompanied by pair-review/walkthrough discussion.

GravityLab · 2 years ago

The correct PR is the one that solves the problem the best. The correct PR's size has nothing to do with the # of lines it is. Tracking time to merge is not that meaningful because merging code does not in itself say anything about the value of that code in production. That is the type of metric that may become a target for a scrum master or project manager despite it being fairly divorced from the reality of moving the right code into the production environment.