> Developers spend less time staring at hex dumps of data, preferring more structured output, tracepoints, and interactive debuggers as ways of tracking down problems. Debugging features in the kernel's memory allocators mean that many sorts of memory-corruption issues will be caught directly. Magic numbers are just not as helpful as they once were.
In most, maybe almost all, of the places in the kernel, probably. If you get down to the very low-level nitty-gritty: bootstrapping code, low-level memory management, exception handling, maybe context switching... expect to look at hex dumps a lot, possibly even directly through a hardware debugging interface that does not even know what debugging symbols are, let alone what "C code" looks like, and experience the joy that recognizing a magic number brings. Did Santa bring you saved state from an exception handler early this year?
But the article, and the patch, are not wrong, far from it. You are most likely not in that situation while debugging terminal code, which is the example the original magic-number file gives.
For the uninitiated who might find the idea particularly daunting: looking at hex dumps is not really about reading hex (I mean, it kind of is after a while, but that's not the main idea).
You have this raw binary dump of the contents of something - generally memory or a network flow - and tools that can help you turn the binary dump into a structured view. The issue is mapping the structures onto the dump.
That’s where magic values and known values are useful. You can find them in the middle of the apparent garbage and use them to align your decoder (or to start reading the dump - if you do it often enough it becomes very Matrix like, “blonde, brunette” but with ip packets and C structs).
...But it's also familiar even to little old me, a neophyte, essentially the opposite of a class-of-1978 greybeard, from good old Cheat Engine save-file manipulation.
Just funny how times change.
Even some of the zoomers will know exactly what you're talking about because of Cheat Engine.
I used to work on a medium-model kernel back in the 8086 days. When somebody brought me a crash dump to look at, one process I'd use was to just page through the kernel data segment, watching the hex go by and relaxing my mind.
As often as not I'd see the corrupt data go by. A string copied where no strings should be. A zero in a dense table that should not have any zeroes. A block duplicated.
Whatever it was, your brain can pattern match way more than we imagine it can.
> if you do it often enough it becomes very Matrix like, “blonde, brunette” but with ip packets and C structs
At one job I used to spend a lot of time looking at crash dumps from a game, and at some point I was able to pretty reliably identify various objects in the engine (models, textures, physics volumes..etc) just from how they looked in a hex view of the crash dumps.
If it's a system you're familiar enough with, it really is Matrix-like, as you say.
No, I did actually mean looking at the hex values themselves, with not much more than eyes, mind, and an internal concept of how things work.
In things like exception handling and bootstrapping code, there's often not much a tool that decodes structures would give you that you don't see with your own eyes and thoughts anyway. That is very different from debugging, say, networking code.
Also, the NT kernel's pool allocation code takes a 4-character string called a "tag" to identify each allocated block. Almost all kernel-allocated memory uses that feature.
> There is no clear point where the development community made a collective decision to move away from the magic-number practice; it just sort of faded away. There are probably a few reasons behind this change. The kernel community has, for many years now, tried to use type-safe interfaces rather than passing void pointers around, making it less likely that the wrong structure type will be passed into a function. Developers spend less time staring at hex dumps of data, preferring more structured output, tracepoints, and interactive debuggers as ways of tracking down problems. Debugging features in the kernel's memory allocators mean that many sorts of memory-corruption issues will be caught directly. Magic numbers are just not as helpful as they once were.
I'd be interested in specific examples of some ways that type safety has been improved. Or does the author say that there is less of a need for the kind of "type safety" that magic numbers provide because problems will be caught in other ways - by more generic kernel code? If so I'd still be interested in some examples.
I think they’re just having fun with the title - I don’t think lwn tries to be “click baity” in the news media sense. You pretty much know what you’re going to get on lwn.
“Click bait” is very slowly taking on a more general meaning: any title or link that doesn't precisely and concisely describe the article, including where humour is being applied (like here).
It is literally like the word literally often being used to mean figuratively these days, as in this example…
I don't know that it was intentional, but I clicked the link thinking it translated to something like "Someone highly involved in Linux kernel development is about to step down". Maybe not click bait in the typical sense, but I was baited into clicking it. You get a pass for cheeky titles though when the article is actually interesting and well written like this one.
This reminds me of a technique I learned from this StackOverflow answer [1] that is about creating structs/records with TCL lists and dicts. It apparently comes from LISP philosophy:
    proc mkFooBarRecord {foo bar} {
        # Keep index #0 for a "type" for easier debugging
        return [list "fooBarRecord" $foo $bar]
    }

    proc getFoo {fooBarRecord} {
        if {[lindex $fooBarRecord 0] ne "fooBarRecord"} {
            error "not fooBarRecord"
        }
        return [lindex $fooBarRecord 1]
    }
The magic number is sort of like having a "type" field at index 0, as you would with dicts.
Magic numbers are vital for filesystem identification in kernel code, to distinguish the load order of partitions on a disk, and are probably also used for memory allocation and permissions. Having magic values scattered through the code itself, though, seems a little redundant alongside type checking and garbage collection or proper inheritance.
Example: https://lwn.net/Articles/750306/
E.g. something like "Safety Feature Quietly Being Removed from Linux Kernel"
There are two major products that come out of Berkeley: LSD and UNIX. We don't believe this to be a coincidence. - Jeremy S. Anderson.
... quote via https://github.com/globalcitizen/taoup
[1] https://stackoverflow.com/a/5532898