slaymaker1907 · 2 years ago
Lol, the commit message is just great https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

It's good that Linus is really exercising those 3rd party tools! They should send some money his way for helping them test their code.

sampo · 2 years ago
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

This web view renders tabs as spaces, so it's not possible to see what was changed.

infogulch · 2 years ago
It's a bit easier to see what's going on in these:

https://github.com/torvalds/linux/commit/d5cf50dafc9dd5faa1e...

https://github.com/torvalds/linux/blob/d5cf50dafc9dd5faa1e61...

Unfortunately GitHub doesn't have a way to render symbols for whitespace, but you can tell by selecting the spaces that the previous version had leading tabs. Linus changed it so that the token `default` and the number, e.g. `12`, are also separated by a tab. This is tricky: because the token `default` is seven characters long and starts at a tab stop, the added tab always has a width of 1 character, so the line lays out exactly as if it contained a space, whether you use a tab width of 1, 2, 4, or 8.
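
To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the line has the shape "leading tab, `default`, tab, number"; the helper names `tab_advance` and `second_tab_width` are made up for illustration and are not part of any kernel tooling.

```python
def tab_advance(column: int, tab_width: int) -> int:
    """Number of columns a tab occupies when the cursor is at `column`."""
    return tab_width - (column % tab_width)


def second_tab_width(tab_width: int) -> int:
    """Rendered width of the tab after `default` in "<TAB>default<TAB>12"."""
    column = 0
    column += tab_advance(column, tab_width)  # the leading tab
    column += len("default")                  # seven characters
    return tab_advance(column, tab_width)     # the tab under discussion


for width in (1, 2, 4, 8):
    print(f"tab width {width}: second tab renders as {second_tab_width(width)} column(s)")
```

For each of the four tab widths listed, the second tab comes out one column wide, which is why it is indistinguishable from a space on screen.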

everybodyknows · 2 years ago
This gets to a dimension of the problem that is often overlooked: Git web viewers, like every other code viewer we use, carry their own notion of the position of the tab stops.

Notably, this includes CLI shells, connected to a "terminal emulator", where what is being emulated is an ancient piece of hardware:

https://en.wikipedia.org/wiki/Teletype_Model_33

Semantics of the ASCII tab byte code:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1...

A far-downstream consequence of this is that source code formatted on the assumption of tab stops at other than 8-column intervals, as is not uncommon in JavaScript, produces unreadable CLI output from diff, git-diff, ...
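
A small sketch of that effect, using Python's built-in `str.expandtabs` to stand in for the viewer. The two sample lines are invented for illustration: they are aligned with tabs by someone whose editor uses 4-column tab stops, then rendered with 8-column stops as a terminal typically would.

```python
# Two assignments aligned with tabs, written with 4-column tab stops in mind.
lines = ["x\t\t= 1", "longer\t= 2"]


def equals_columns(tab_width: int) -> list[int]:
    """Column of the '=' sign on each line after expanding tabs."""
    return [line.expandtabs(tab_width).index("=") for line in lines]


for width in (4, 8):
    print(f"--- tab stops every {width} columns ---")
    for line in lines:
        print(line.expandtabs(width))
    print("columns of '=':", equals_columns(width))
```

At 4-column stops both `=` signs land in the same column; at 8-column stops they drift apart, which is the kind of misalignment you would see in diff output on a terminal.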

rickdeckard · 2 years ago
Parser fails to parse data --> Fixed by modifying ingested data

Just that the ingested data is a part of the Linux kernel codebase. Quite some hubris to proceed and apply such a "fix" by making a commit to the Linux kernel...

bjackman · 2 years ago
Arguably it would be fine if the community benefited from the parser. After all it's just a custom, undocumented format that lives specifically in this one repo.

But I totally sympathise with Linus' annoyance when the issue is with an external tool, and the author didn't explain which tool or give any reason why it's hard to fix that tool.

karma_pharmer · 2 years ago
The nom_kconfig crate has tabs-vs-spaces bugs too.

Unfortunately until somebody spends the time to create a Kconfig test suite, the kernel itself needs to be the test case for this oddity.

https://docs.rs/nom-kconfig/latest/nom_kconfig/

IncreasePosts · 2 years ago
The author and committer are two separate people, so it isn't like one person unilaterally submitting this.
bigiain · 2 years ago
And next we discover the dev who committed the change was Jia Tan...
HPsquared · 2 years ago
Even worse!
cqqxo4zV46cp · 2 years ago
You can't use “hubris” as a pejorative and go on to claim that Linus is the good guy. There’s clearly hubris on both sides, and only one of the sides made a big deal about it.

Developers change things for parsers all the time. That’s, like, coding.

Brian_K_White · 2 years ago
Incorrect. It's only hubris on one side.

It's both Linus's plain arbitrary right, and his plain job, his defined role and office, to make exactly such decisions for this project. That's not hubris. It's just a role that affects a lot of people.

What makes it hubris on one side is "Who do you think you are making such a change to the Linux kernel that everyone else will have to accept?"

The reason things like that are phrased as questions is to allow for the possibility that there might be an answer.

For one of these parties, the question is rhetorical.

For one of these parties, the question is not rhetorical.

rickdeckard · 2 years ago
It's hubris to think that a parser's failure to identify whitespace properly warrants changing code in the KERNEL of arguably the most widespread operating system in the world.

And THEN not even providing more justification for this in the description.

I'm not claiming anywhere that Linus is the good guy. Or bad guy.

In this case I agree with his stance that whatever this parser is, it had better fail harder so that it gets fixed.

And if you know how he reacts when he's making a big deal of something, you know that this one isn't one of those times...

andrewmcwatters · 2 years ago
There are only two ways to write code where the lines of text stay legible regardless of tab size configuration:

1. Use all spaces

2. Use tabs for indentation and spaces for alignment

Unfortunately, only individual developers seem competent on their own to do #2, so everyone who cares about readability inevitably practices #1 by default.

You can never use only tabs.
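
A minimal sketch of the difference for a continuation line that has to line up with the line above it. The sample lines and the helper `is_aligned` are hypothetical, and `str.expandtabs` stands in for whatever editor or viewer renders the file.

```python
# Option 2: one tab for indentation, then spaces to reach the alignment column.
tabs_then_spaces = ["\tcall(arg1,", "\t     arg2)"]

# Tabs for both indentation and alignment.
tabs_only = ["\tcall(arg1,", "\t\targ2)"]


def is_aligned(lines: list[str], tab_width: int) -> bool:
    """True if arg2 starts in the same column as arg1 at this tab width."""
    first, second = (line.expandtabs(tab_width) for line in lines)
    return first.index("arg1") == second.index("arg2")


for width in (2, 4, 8):
    print(width, is_aligned(tabs_then_spaces, width), is_aligned(tabs_only, width))
```

With tabs for indentation and spaces for alignment, both lines shift by the same amount when the tab width changes, so the arguments stay lined up. With tabs only, the alignment column depends on the tab width and on the length of `call(`.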

tczMUFlmoNk · 2 years ago
A special case of (2) that is easy to do is to use tabs* for indentation and not do column alignment at all. To be clear, by "column alignment" I think we are both referring to patterns like this:

    myVars = {
        first:     123,
        second:    234,
        afterward: 345,
        also:      456,
    }   //~~~~~~~~ <- alignment

    // or:
    def my_long_function(arg1, arg2,
                         more_args,
                         and_some_more)
    //~~~~~~~~~~~~~~~~~~ <- alignment

This is, e.g., what Go uses for struct fields, and what some Python style guides use for hanging function definitions. Regardless of tabs/spaces preference, both of these are independently bad because they churn diffs unnecessarily: if you change `afterward` to `afterward2` then you need to change all the nearby lines, and likewise if you change `my_long_function` to `my_longer_function`. Some formatters, like Black, Prettier, and (mostly) Rustfmt, avoid this pattern entirely, and they are better for it.

* You can do this and still use spaces if you prefer, too.
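
The diff-churn argument can be checked with a rough sketch. The two toy formatters below (`format_aligned`, `format_plain`) are invented for illustration and are not how any real formatter works; they only reproduce the two layouts for the `myVars` example.

```python
def format_aligned(pairs):
    """Pad every key so the values start in the same column."""
    width = max(len(key) for key, _ in pairs) + 1  # key plus its colon
    lines = []
    for key, value in pairs:
        label = key + ":"
        lines.append(f"{label:<{width}} {value},")
    return lines


def format_plain(pairs):
    """No column alignment: one space after the colon."""
    return [f"{key}: {value}," for key, value in pairs]


def changed_lines(before, after):
    """Count the lines a diff would report as modified."""
    return sum(1 for old, new in zip(before, after) if old != new)


old = [("first", 123), ("second", 234), ("afterward", 345), ("also", 456)]
new = [("first", 123), ("second", 234), ("afterward2", 345), ("also", 456)]

print("aligned:", changed_lines(format_aligned(old), format_aligned(new)))
print("plain:  ", changed_lines(format_plain(old), format_plain(new)))
```

Renaming one key touches every line in the aligned layout but only the renamed line in the plain one.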

junon · 2 years ago
Or just configure your code formatter to do this and forget about ever getting it wrong again.
saghm · 2 years ago
> Use tabs for indentation and spaces for alignment

I'm not surprised that this isn't something that projects have been able to adopt successfully very often because I've never found it very intuitive that those are separate things. In what way is "indentation" not also a form of "alignment"?

zarzavat · 2 years ago
The other problem with it is that it assumes that people have visible whitespace on, and that their tools even have that option to show whitespace, otherwise it’s like navigating the Fuchsia Gym.

I don’t mind if people use tabs but mixing the two is not great.

arghwhat · 2 years ago
Aligning a character on one line with an arbitrary character on another line is purely a choice of style, not a requirement.

It is perfectly doable to do only tabs, but many end up mixing in spaces.

The curse of space-only files is people who manage to commit indentation errors, which breaks auto-detection in some editors, which in turn propagates into even more indentation errors... All it takes is an inattentive reviewer or a review-less merge.

deathanatos · 2 years ago
Readability of the code is not mere style, and can directly translate into errors being more visible. Compare:

    a_variable = (
       'lorum ipsum dolor sit amet ' +
       'my poor memory has left me quite upset ' +
       'for i cannot remember what word comes next '
       'in this long descriptive text' +
       'surely this is bound for the incinerator ' +
       'but remember any haiku can end, refrigerator.'
    )

But now if I choose to align certain characters:

    a_variable = (
       'lorum ipsum dolor sit amet'
       + ' my poor memory has left me quite upset'
       + ' for i cannot remember what word comes next'
       ' in this long descriptive text'
       + 'surely this is bound for the incinerator '
       + ' but remember any haiku can end, refrigerator.'
    )

… the errors in the first version are now plainly obvious. (Both the missed space, as well as the missed +.)

(This is an example. Yes, there are languages for which you don't need the +. There are some for which you do, however. There are also some that resist having the + moved about: for example, in Javascript, the parens become required, or you'll trigger the horrid auto-semicolon "feature".)
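
As a concrete case of a language where the `+` is optional: in Python, adjacent string literals are concatenated implicitly, so a forgotten `+` raises no error and a forgotten space silently glues two words together. A shortened, hypothetical version of the example:

```python
message = (
    'what word comes next '
    'in this long descriptive text' +   # no '+' on the line above: still valid
    'surely this is bound'              # no space before 'surely': words run together
)

print(message)
```

Nothing here fails at runtime, so the only thing that exposes the mistakes is the layout of the source itself.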

thecopy · 2 years ago
>purely a choice of style

Sure, but could I not say the same of using any indentation at all?

apelapan · 2 years ago
With a variable width font you can't use spaces for alignment...
MiddleEndian · 2 years ago
If you program in a variable-width font, you're on your own!
mmis1000 · 2 years ago
> Use tabs for indentation and spaces for alignment

The pain is that most websites and editors have never handled that well enough. You end up with mixed tabs and spaces at unexpected positions and never know about it.

Just banning the tab is probably not the most 'correct' way to fix it, but it is the most feasible way to get the job done, because fixing all the tools, editors and websites is nearly impossible for an average man.
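
Short of fixing every tool, the stray mixes can at least be detected mechanically. A rough sketch, assuming the rule is "tabs first, then spaces" so that a space followed by a tab in the leading whitespace is always a mistake; the function names are made up for illustration.

```python
def leading_whitespace(line: str) -> str:
    """Return the run of spaces and tabs at the start of the line."""
    return line[: len(line) - len(line.lstrip(" \t"))]


def space_before_tab_lines(text: str) -> list[int]:
    """1-based numbers of lines whose indentation has a space before a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        if " \t" in leading_whitespace(line):
            flagged.append(number)
    return flagged


sample = "\tif x:\n\t    y()\n  \tz()\n"
print(space_before_tab_lines(sample))
```

Only the third line of the sample is flagged: the first two follow the "tabs, then spaces" order, while the third hides a tab behind two spaces.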

edave64 · 2 years ago
Simple solution: Auto formatters

- If you have project wide automatic code formatting: Tabs

- Otherwise: Spaces

Nowadays, most of my projects use option 1.

GoblinSlayer · 2 years ago
3. Use tabs for indentation
starspangled · 2 years ago
> Unfortunately, only individual developers seem competent on their own to do #2,

What do you mean?

> You can never use only tabs.

You can.

quectophoton · 2 years ago
Thankfully there's a trend of new programming languages including their own formatters, thus solving the problem once and for all.
karma_pharmer · 2 years ago
Thankfully there's a trend of excessive merge conflicts caused by reformatters.

Reformat-on-every-commit really only works for highly-centralized, tightly-coupled, monorepo-using monolithic organizations. Basically the exact opposite of kerneldev. For those folks reformat-on-every-commit works great.

ghnws · 2 years ago
What do you mean by reformat? Any decent code formatter keeps a consistent style. Conflicts only happen if you misconfigure your editor or don't have checks to catch invalid formatting before merging to remote.
GoblinSlayer · 2 years ago
It's much worse for monolithic organizations, because they develop complex software and reformatting scrambles code history and it becomes difficult to untangle business logic.
em-bee · 2 years ago
actually, what we want is code management tools that work with tokenized code and do not depend on formatting. i want a diff tool that shows me exactly which tokens have changed, and which haven't, regardless of how they are laid out. when we get that, then we should get even less merge conflicts.
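
A very rough sketch of the idea, using Python's standard `tokenize` and `difflib` modules on Python source. The function names are invented, and a real tool would need far more than this (other languages, moved code, comments, merges); it only shows that a token-level comparison ignores layout.

```python
import difflib
import io
import tokenize

# Token types whose text is pure layout; their text is ignored when comparing.
LAYOUT = {tokenize.INDENT, tokenize.DEDENT, tokenize.NEWLINE}


def significant_tokens(source: str) -> list[tuple[int, str]]:
    """Tokenize Python source, dropping line breaks inside expressions."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NL:  # non-logical newline, layout only
            continue
        text = "" if tok.type in LAYOUT else tok.string
        tokens.append((tok.type, text))
    return tokens


def changed_tokens(old: str, new: str) -> list[tuple[list[str], list[str]]]:
    """Return (removed, added) token texts for every non-matching region."""
    a, b = significant_tokens(old), significant_tokens(new)
    matcher = difflib.SequenceMatcher(a=a, b=b)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(([t for _, t in a[i1:i2]], [t for _, t in b[j1:j2]]))
    return changes


original = "total = price*qty\n"
reformatted = "total   =   price * qty\n"
edited = "total = price + qty\n"

print(changed_tokens(original, reformatted))  # layout-only change
print(changed_tokens(original, edited))       # one operator replaced
```

The reformatted line produces no differences at all, while the edited one reports only the swapped operator.
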
quectophoton · 2 years ago
My bad, I should have added "/s" (because Cunningham's Law). It was a reference to Futurama, where a problem was not solved at all.

But on a more serious note, in my experience I've not had any issues with Go or Rust codebases (for example). Not using their formatters is heavily frowned upon, so I haven't really seen any reformat happen at all; not in my bubble at least[1].

Other languages, on the other hand? Yeah, good luck with trying to have consistent formatting. Even if a project has formatting rules "enforced", there's always (always) going to be an exception, bikeshedding, etc.

[1]: Unless it's someone obviously very junior. The few times I've noticed badly formatted code in Go, has been in random repos from someone who clearly didn't have that much programming experience in general (looking at how code was written).

cherryteastain · 2 years ago
Linux, as a project, could also decide to require every patch to be formatted via clang-format. They just don't.
jraph · 2 years ago
Discussion here from two days ago: https://news.ycombinator.com/item?id=40043110
Brian_K_White · 2 years ago
Awww did your 3rd party tool bump its widdle nose on a tab? There there, mommy will make it all better...FIREHOSE OF TABS
