In my opinion, the fact that procfs is the only API for so many things is one of the biggest problems with Linux. BSDs have sysctl(), macOS has mach_* functions and, of course, Windows has a real API too.
Plain-text interfaces lead to complicated, potentially insecure code (especially in C!); they're prone to race conditions, and they're slow.
I wish it were possible to retrieve that information using real syscalls. I think it's a better approach than, for example, inventing a faster way to read procfs: https://lwn.net/Articles/813827/
Even if they insist on a file based interface (it is a UNIX, so fair enough), in modern times it would be nice if they used a "real" data format. Yeah, it's not like JSON parsers have never had bugs, but on average they'll be MUCH better than everyone and their mother hand rolling a C based bespoke parser. Obviously you'd need a new name to not break backwards compatibility.
> Yeah, it's not like JSON parsers have never had bugs, but on average they'll be MUCH better than everyone and their mother hand rolling a C based bespoke parser.
Currently, yes, but if this idea had started when Linux was becoming popular, the "real" data format would have been XML. It might have seemed nice at the time, but today we would probably laugh at it and say how outdated and silly it looks.
Indeed. Parsing free-form files is less robust than calling an API, or at least parsing files with a schema (e.g. JSON or XML). For example, uptime(1) on Linux:
Many of the APIs on Windows are pretty trivial to interact with using PowerShell commandlets. Similarly, many SaaS based tools have CLIs to interact with their arbitrarily complex APIs.
You can still have easy abstractions while providing a way around them (acquiring structured data) for the times they don't work well.
macOS's KERN_PROCARGS2 sysctl is an exception to this: it is very unintuitive to parse, and every single piece of code I've found on the internet that tries to parse its results has been wrong, including code from Apple, Google, and Microsoft. I wound up making a library to do it (https://getargv.narzt.cam/) because apparently people need help.
Linux does have libproc, which is meant (IIUC) to mirror the BSD-style libproc. It wouldn't surprise me if it's just parsing the same files under the hood, however, and correspondingly has the same bugs. But then again, bugs in one place is potentially a better state of affairs than bugs in many places (?).
Having C functions isn't all that much better. You have replaced a crufty text format with crufty data structures full of padding, unions, bitfields, VLAs, unaligned nested structs, and other crazy stuff. Look at ioctls or cmsg. With C structs + 3rd-party kernel drivers you can even get UB, because the driver returns data that is invalid under the struct definition (e.g. incorrect alignment, invalid bools).
eBPF recently added the ability to walk internal data structures via iterators [0], so instead of parsing text we can run a program that traverses all the task_structs and pushes exactly the information we want to userspace, in whatever form the developer wants.
So, alongside other tradeoffs, it's more flexible than syscalls.
I wish /proc and /sys would just agree on a serialization format and serialize the data into some defined shape, instead of having a bunch of files that each need their own parser.
While procfs has a lot of historical baggage, sysfs is rather strict about its layout: only a single value per file, as plain ASCII, rather than anything complex that has to be parsed. Structure is expressed via the filesystem hierarchy.
In return, the kernel-side API for sysfs is also a lot cleaner and allows a driver to more or less expose individual variables as tuning knobs.
Of course there are edge cases, and there are e.g. some binary interfaces as well (e.g. for providing direct register access, or implementing a firmware upload interface for a device).
ABI compat issues aside, I think that implementing "a standardized [structured] record format" as suggested in the comments here is a rather bad idea, going into exactly the wrong direction by adding complexity rather than reducing it, which would definitely cause even more parsing related issues in the long run.
>While procfs has a lot of historical baggage, sysfs is rather specific about the layout and providing only a single value per file, as plain ASCII, rather than using anything complex that has to be parsed. Structure is implemented via the filesystem.
I'd rather have a structured file than have to open 30k files (for, say, conntrack).
Hell, just take the example from the article: /proc/<PID>/stat has 52 fields. That would be 52 opens and reads with a single value per file.
> ABI compat issues aside, I think that implementing "a standardized [structured] record format" as suggested in the comments here is a rather bad idea, going into exactly the wrong direction by adding complexity rather than reducing it, which would definitely cause even more parsing related issues in the long run.
It's literally the opposite. You have to implement it once on the kernel side and once in userspace, versus every bespoke format that currently needs its own parser on both sides.
I've been working on a library[1] that aims to have fairly complete support for the procfs filesystem, so that you can hide away these annoying parsing quirks. But for some casual usage of /proc/ where you only need one tiny bit of information, it's often better to just roll your own parser instead of bringing in a 3rd party library. It's these small one-off cases that would really benefit from a standardized serialization format like you propose.
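To make the "small one-off case" concrete: /proc/uptime is just two space-separated floats (seconds since boot, and cumulative idle time across all CPUs), so a hand-rolled parse is genuinely defensible there. A sketch:

```python
def parse_uptime(text: str) -> tuple[float, float]:
    # /proc/uptime contains two space-separated floats:
    # seconds since boot, and cumulative idle seconds across all CPUs.
    up, idle = text.split()
    return float(up), float(idle)

# Typical usage: parse_uptime(open("/proc/uptime").read())
```

It's the files like /proc/<pid>/stat, with dozens of positionally-defined fields and embedded free text, where the one-off approach goes wrong.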
It would be great if the kernel itself provided a header-only definition of such a format, so you could focus on the data and not the parsing. It could also be integrated into the kernel's extensive testing infrastructure.
FWIW: sysfs tried to do this already. In general each node corresponds to one "thing", with a reasonably standard set of stringification schemes, and with a path that acts as a self-describing schema. Obviously, in practice every driver or subsystem ends up doing funny nonsense (e.g. uevent nodes have their own sub-schema with shell-style variables, etc.).
You can't really prevent that. People do funny nonsense in other self-describing data formats like JSON and XML all the time too. There's only so much you can do with a framework.
But /proc is... extremely old, and very heavily used by userspace. In practice it's never going to change.
> You can't really prevent that. People do funny nonsense in other self-describing data formats like JSON and XML all the time too. There's only so much you can do with a framework.
Sure, but you will get more of that if the convention is too simplistic. "One file per value" breaks really fast; just cat /proc/net/nf_conntrack or even /proc/<pid>/stat and see how many values a single entry (file/connection) has.
It doesn't need to be some ASN.1 monstrosity; it could be simple conventions like "this is how a key/value /proc/sys file should look, this is how a tabular file should look," etc.
Make all escaping use the same syntax, make every table separator be \t, and so on.
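A sketch of what such a convention might cost to implement, once, in userspace. The format here is hypothetical (not anything the kernel actually ships): tab-separated records, one per line, with `\t`, `\n`, and `\\` as the only escapes inside a field:

```python
# Hypothetical standardized tabular format: one record per line, fields
# separated by a literal tab, and only three escapes inside a field.
_ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}

def unescape(field: str) -> str:
    out, i = [], 0
    while i < len(field):
        if field[i] == "\\" and i + 1 < len(field):
            out.append(_ESCAPES.get(field[i + 1], field[i + 1]))
            i += 2
        else:
            out.append(field[i])
            i += 1
    return "".join(out)

def parse_table(text: str) -> list[list[str]]:
    # One generic parser covers every file that follows the convention.
    return [[unescape(f) for f in line.split("\t")]
            for line in text.splitlines() if line]
```

One escaping rule plus one separator rule is enough to make a single parser work for every conforming file.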
> But /proc is... extremely old, and very heavily used by userspace. In practice it's never going to change.
While we're wishing in one hand: how is it that our programs still take as input an array of strings that get escaped, unescaped, and split randomly by our shell scripts?
> This allows any sudoers user to obtain full root privileges
The way most sudoers files are set up, if you're in the wheel or sudo group, you're only a "sudo -i" from a root command prompt, so I'm not sure I see why this is a vulnerability. Can anyone elaborate?
The /proc/<pid>/* hierarchy has always been a bit of a mess to parse.
/proc/<pid>/maps is similarly frustrating: there's no clear distinction between "special" maps (like the stack) and a file that might just happen to be named `[stack]`. Likewise, the handling for a mapped region whose backing file was deleted is simply to append " (deleted)"[1].
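To show the ambiguity concretely, here are two hypothetical (made-up) maps lines: the first is the real stack pseudo-map, the second is an ordinary mapped file whose name literally is `[stack]`. A parser that only looks at the pathname column cannot tell them apart:

```python
# Hypothetical /proc/<pid>/maps lines: address perms offset dev inode pathname.
real_stack = "7ffd7000-7ffe8000 rw-p 00000000 00:00 0          [stack]"
fake_stack = "7f120000-7f121000 r--p 00000000 08:01 123456     [stack]"

def naive_is_stack(maps_line: str) -> bool:
    # The check most parsers use: look only at the pathname column.
    return maps_line.split()[-1] == "[stack]"

def looks_special(maps_line: str) -> bool:
    # A partial disambiguation heuristic: anonymous pseudo-maps like the
    # stack have device 00:00 and inode 0, while file-backed maps don't.
    fields = maps_line.split()
    return fields[3] == "00:00" and fields[4] == "0"
```

The device/inode heuristic helps, but it's still inference from side channels rather than an unambiguous field, which is exactly the complaint.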
The system-level fix is to create a structured record format. That could mean quoting all the fields, or maybe Linux should finally adopt a standardized format like JSON.
Fortunately `jc`[0] does parse `/proc/<pid>/stat` correctly. I, of course, originally implemented it the naive/incorrect way until a contributor fixed it. :)
$ cat /proc/2001/stat | jc --proc
{"pid":2001,"comm":"my program with\nsp","state":"S","ppid":1888,"pgrp":2001,"session":1888,"tty_nr":34816,"tpg_id":2001,"flags":4202496,"minflt":428,"cminflt":0,"majflt":0,"cmajflt":0,"utime":0,"stime":0,"cutime":0,"cstime":0,"priority":20,"nice":0,"num_threads":1,"itrealvalue":0,"starttime":75513,"vsize":115900416,"rss":297,"rsslim":18446744073709551615,"startcode":4194304,"endcode":5100612,"startstack":140737020052256,"kstkeep":140737020050904,"kstkeip":140096699233308,"signal":0,"blocked":65536,"sigignore":4,"sigcatch":65538,"wchan":18446744072034584486,"nswap":0,"cnswap":0,"exit_signal":17,"processor":0,"rt_priority":0,"policy":0,"delayacct_blkio_ticks":0,"guest_time":0,"cguest_time":0,"start_data":7200240,"end_data":7236240,"start_brk":35389440,"arg_start":140737020057179,"arg_end":140737020057223,"env_start":140737020057223,"env_end":140737020059606,"exit_code":0,"state_pretty":"Sleeping in an interruptible wait"}
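For reference, the trap jc's contributor fixed: the comm field can contain spaces, newlines, and even parentheses, so splitting the whole line on whitespace is wrong. The usual correct approach is to split on the *last* `)` in the line. A minimal Python sketch with a made-up adversarial line (only the first few fields shown):

```python
def parse_stat(line: str) -> dict:
    # pid is everything before the first '('; comm is everything between
    # the first '(' and the LAST ')', because comm itself may contain
    # spaces, parentheses, and newlines. Fields after that are safe to
    # split on whitespace.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen].strip())
    comm = line[lparen + 1:rparen]
    rest = line[rparen + 1:].split()
    return {"pid": pid, "comm": comm, "state": rest[0], "ppid": int(rest[1])}

# A process named 'evil) S 1 (' defeats any parser that splits naively:
sample = "2001 (evil) S 1 () S 1888 2001 1888"
```

Splitting on the first `)` instead of the last one is the naive/incorrect version mentioned above.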
[0] https://kellyjonbrazil.github.io/jc/docs/parsers/proc
% strace uptime 2> /tmp/strace && grep proc /tmp/strace
17:35:24 up 3 days, 7:47, 1 user, load average: 2.29, 1.85, 1.56
openat(AT_FDCWD, "/usr/lib/libprocps.so.8", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/self/auxv", O_RDONLY) = 3
openat(AT_FDCWD, "/proc/sys/kernel/osrelease", O_RDONLY) = 3
openat(AT_FDCWD, "/proc/self/auxv", O_RDONLY) = 3
openat(AT_FDCWD, "/proc/uptime", O_RDONLY) = 3
openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 4
It sounds fishy, but just because sysctl is a mess doesn't necessarily imply that structured kernel interfaces are a bad idea.
[0] https://developers.facebook.com/blog/post/2022/03/31/bpf-ite...
[1] https://github.com/eminence/procfs
eh, just mount it in /proc2
All sensible ones allow you to just pass an array of parameters to command execution and not worry about spaces in them
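Right; e.g. in Python, passing argv as a list means no shell is ever involved, so nothing gets re-split or needs escaping. A sketch (the filename is an arbitrary example):

```python
import subprocess
import sys

# Run a child process with argv passed as a list: no shell is involved,
# so an argument containing spaces (or quotes, or newlines) arrives in the
# child's sys.argv completely unchanged.
arg = 'a file with "spaces".txt'
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", arg],
    capture_output=True, text=True, check=True,
)
```

The failure mode being complained about only appears when the argv array is flattened into a single string and handed to a shell to re-split.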
> https://www.openwall.com/lists/oss-security/2017/05/30/16
https://www.openwall.com/lists/oss-security/2022/12/22/5
[1]: https://github.com/woodruffw/procmaps.rs/blob/79bd474104e9b3...
Time to let go of "everything is a stream of unorganized characters".
/s