stabbles commented on What Is Ruliology?   writings.stephenwolfram.c... · Posted by u/helloplanets
meindnoch · 3 days ago
Wolfram's eulogy will be titled: "A life wasted on cellular automata"
stabbles · 3 days ago
Whenever Wolfram brings up cellular automata again, I think of John Conway who got tired of being known for Conway's Game of Life.
stabbles commented on Why do RSS readers look like email clients?   terrygodier.com/phantom-o... · Posted by u/jasonpeacock
tecoholic · 12 days ago
What’s happening on this site? The page loads, a number starts counting up from 47, and then it says “You fell behind reading this”. When I start scrolling, paragraphs of text start floating up. I am really confused.
stabbles commented on I built a 2x faster lexer, then discovered I/O was the real bottleneck   modulovalue.com/blog/sysc... · Posted by u/modulovalue
direwolf20 · 15 days ago
This is a (3) man page which means it's not a syscall. Have you checked it doesn't call lstat on each file?
stabbles · 15 days ago
Fair, https://www.man7.org/linux/man-pages/man2/getdents64.2.html is a better link. You'd have to call lstat when d_type is DT_UNKNOWN
stabbles commented on I built a 2x faster lexer, then discovered I/O was the real bottleneck   modulovalue.com/blog/sysc... · Posted by u/modulovalue
zokier · 15 days ago
in what way does scandir avoid stat syscalls?
stabbles · 15 days ago
Because you get an iterator over `struct dirent`, which includes `d_type` for popular filesystems.

Notice that this avoids `lstat` calls; for symlinks you may still need to do a stat call if you want to stat the target.
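
A minimal sketch of the point above, in Python rather than C: `os.scandir` exposes the same idea, and the `DirEntry` type checks reuse the type reported by the directory read when the filesystem provides it, falling back to an `lstat`-style call only when it does not (the `DT_UNKNOWN` case). The helper names `classify` and `demo` are made up for illustration.

```python
import os
import tempfile


def classify(path):
    """Map each entry name in `path` to 'file', 'dir', 'symlink' or 'other'."""
    kinds = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # With follow_symlinks=False these checks can usually be answered
            # from the type info the directory read already returned; a stat
            # call is only needed when the filesystem reports no type.
            if entry.is_symlink():
                kinds[entry.name] = "symlink"
            elif entry.is_dir(follow_symlinks=False):
                kinds[entry.name] = "dir"
            elif entry.is_file(follow_symlinks=False):
                kinds[entry.name] = "file"
            else:
                kinds[entry.name] = "other"
    return kinds


def demo():
    """Build a small throwaway tree and classify it (assumes a POSIX system)."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "a.txt"), "w") as f:
            f.write("hi")
        os.symlink("a.txt", os.path.join(root, "link"))
        return classify(root)
```

To learn anything about the target of `link`, you would still need `entry.stat()` or `entry.is_file()` with the default `follow_symlinks=True`, which is the extra stat call mentioned above.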

stabbles commented on I built a 2x faster lexer, then discovered I/O was the real bottleneck   modulovalue.com/blog/sysc... · Posted by u/modulovalue
ciupicri · 16 days ago
Wouldn't you still have a lot of syscalls?
stabbles · 16 days ago
Yes, but with much lower latency. The squashfs file ensures the files are close together and you benefit from fs cache a lot.
stabbles commented on I built a 2x faster lexer, then discovered I/O was the real bottleneck   modulovalue.com/blog/sysc... · Posted by u/modulovalue
solatic · 16 days ago
Headline is wrong. I/O wasn't the bottleneck, syscalls were the bottleneck.

Stupid question: why can't we get a syscall to load an entire directory into an array of file descriptors (minus an array of paths to ignore), instead of calling open() on every individual file in that directory? Seems like the simplest solution, no?

stabbles · 16 days ago
What comes closest is scandir [1], which gives you an iterator of direntries, and can be used to avoid lstat syscalls for each file.

Otherwise you can open a dir and pass its fd to openat together with a relative path to a file, to reduce the kernel overhead of resolving absolute paths for each file.

[1] https://man7.org/linux/man-pages/man3/scandir.3.html
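
A rough sketch of the second suggestion, assuming Linux and Python: `os.open(..., dir_fd=...)` is Python's wrapper around `openat`, so each file is opened relative to one already-open directory descriptor instead of by absolute path. `read_all` is a hypothetical helper name, and this only shows the calling pattern, not a measurement of how much it saves.

```python
import os


def read_all(dirpath):
    """Read every regular file in `dirpath`, opening each relative to one dir fd."""
    contents = {}
    # Open the directory once; later opens resolve names relative to this fd.
    dir_fd = os.open(dirpath, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # os.scandir accepts a directory file descriptor on Unix (Python 3.7+).
        with os.scandir(dir_fd) as entries:
            names = sorted(
                e.name for e in entries if e.is_file(follow_symlinks=False)
            )
        for name in names:
            # Equivalent to openat(dir_fd, name, O_RDONLY).
            fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
            with os.fdopen(fd, "rb") as f:
                contents[name] = f.read()
    finally:
        os.close(dir_fd)
    return contents
```

This is still one open, one read and one close per file, so it trims path-resolution work rather than the number of syscalls.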

stabbles commented on I built a 2x faster lexer, then discovered I/O was the real bottleneck   modulovalue.com/blog/sysc... · Posted by u/modulovalue
marginalia_nu · 16 days ago
Zip with no compression is a nice contender for a container format that shouldn't be slept on. It effectively reduces the I/O while, unlike TAR, allowing direct random access to the files without "extracting" them or seeking through the entire file. This is possible even via mmap, over HTTP range queries, etc.

You can still get the compression benefits by serving files with Content-Encoding: gzip or whatever. Though it has builtin compression, you can just not use that and use external compression instead, especially over the wire.

It's pretty widely used, though often dressed up as something else. JAR files or APK files or whatever.

I think the article's complaints about lacking unix access rights and metadata are a bit strange. That seems like a feature more than a bug, as I wouldn't expect this to be something that transfers between machines. I don't want to unpack an archive and have to scrutinize it for files with o+rxst permissions, or have their creation date be anything other than when I unpacked them.
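
To illustrate the random-access claim, here is a small sketch using Python's `zipfile` with `ZIP_STORED`. It looks up where one member's raw bytes sit inside the archive, which is the span you could then mmap or request with an HTTP range query. The helper names are invented for the example, and it assumes an unencrypted, uncompressed member.

```python
import io
import struct
import zipfile


def build_stored_zip(files):
    """Create an in-memory zip whose members are stored without compression."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def member_span(archive, name):
    """Return (offset, size) of a stored member's raw bytes inside `archive`."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)
    if info.compress_type != zipfile.ZIP_STORED:
        raise ValueError("member is compressed; its bytes are not usable as-is")
    # A local file header is 30 fixed bytes followed by the file name and an
    # extra field; their lengths are two little-endian uint16 at offset 26.
    header = archive[info.header_offset:info.header_offset + 30]
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    data_offset = info.header_offset + 30 + name_len + extra_len
    return data_offset, info.file_size
```

With the span in hand, `archive[offset:offset + size]` is the file content itself, with no extraction step and no scan over earlier members.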

stabbles · 16 days ago
> Zip with no compression is a nice contender for a container format that shouldn't be slept on

SquashFS with zstd compression is used by various container runtimes, and is popular in HPC where filesystems often have high latency. It can be mounted natively or with FUSE, and the decompression overhead is not really felt.

stabbles commented on I built a 2x faster lexer, then discovered I/O was the real bottleneck   modulovalue.com/blog/sysc... · Posted by u/modulovalue
stabbles · 16 days ago
"I/O is the bottleneck" is only true in the loose sense that "reading files" is slow.

Strictly speaking, the bottleneck was latency, not bandwidth.

stabbles commented on I Like GitLab   whileforloop.com/en/blog/... · Posted by u/lukas346
stabbles · 16 days ago
Another interesting choice of GitLab's CI is to effectively display `head -c N` of the logs instead of `tail -c N`.

Some builds produce a lot of output, and GitLab simply truncates it. Your job failed? Good luck figuring out what went wrong :)

Showing the last N bytes makes so much more sense as a solution to the artificial problem of CI output being too large.
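
A toy sketch of the difference, not GitLab's actual implementation: both functions cap a streamed log at `limit` bytes, one keeping the start (like `head -c`) and one keeping the end (like `tail -c`). The function names and the sample log are made up.

```python
def head_bytes(chunks, limit):
    """Keep only the first `limit` bytes of a stream of byte chunks."""
    out = bytearray()
    for chunk in chunks:
        room = limit - len(out)
        if room <= 0:
            break
        out += chunk[:room]
    return bytes(out)


def tail_bytes(chunks, limit):
    """Keep only the last `limit` bytes, using a bounded rolling buffer."""
    out = bytearray()
    for chunk in chunks:
        out += chunk
        if len(out) > limit:
            # Drop the oldest bytes so the buffer never exceeds `limit`.
            del out[:len(out) - limit]
    return bytes(out)


log = [b"step 1 ok\n", b"step 2 ok\n", b"error: link failed\n"]
```

For this sample, `head_bytes(log, 12)` cuts off before the error line, while `tail_bytes(log, 19)` keeps it, at the same memory cost.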

u/stabbles

Karma: 2899 · Cake day: April 19, 2016