Readit News
amitprasad commented on Notes on Sorted Data   amit.prasad.me/blog/sorte... · Posted by u/surprisetalk
kristianp · 2 days ago
I find this approach strange. Surely if you're sorting data, you shouldn't do it naively by raw bytes. Use your knowledge of the data type and sort using its appropriate comparison operations. E.g. when sorting 64-bit little-endian ints, do the comparison as ints. If using packed numbers, they must be unpacked first.
amitprasad · 2 days ago
Generic data stores often don't have this luxury: if you're designing a system in which the data is relatively opaque, you're often forced to work with raw bytes (e.g. RocksDB).
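For illustration, a minimal C sketch (the function name is mine) of the usual workaround: encode integer keys big-endian, with the sign bit flipped, so that a plain memcmp over the raw bytes agrees with numeric order.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Encode a signed 64-bit key so that memcmp() on the 8 output bytes
       orders keys numerically: flipping the sign bit maps
       INT64_MIN..INT64_MAX onto 0..UINT64_MAX, and storing the result
       big-endian puts the most significant byte first. */
    static void encode_key(int64_t v, uint8_t out[8]) {
        uint64_t u = (uint64_t)v ^ 0x8000000000000000ULL;
        for (int i = 0; i < 8; i++)
            out[i] = (uint8_t)(u >> (56 - 8 * i));
    }

    int main(void) {
        uint8_t a[8], b[8];
        encode_key(-5, a);
        encode_key(3, b);
        printf("%d\n", memcmp(a, b, 8) < 0); /* prints 1: -5 sorts before 3 */
        return 0;
    }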
amitprasad commented on Notes on Sorted Data   amit.prasad.me/blog/sorte... · Posted by u/surprisetalk
Sesse__ · 2 days ago
In contrast, I found it rather lacking. No discussion of the most common way to sort floats as bytes (shift the sign bit down and XOR the other 31 bits with the resulting mask), nor NaNs and +/-0 for that matter. Varint sorting introduces its own homegrown serialization but doesn't discuss the issue of overlong encodings. Nothing about string collation or Unicode issues in general. Composite data suggests adding NULs, but what if there are NULs in the actual data? (It is briefly mentioned, but only as in “you can't”.)
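For reference, a C sketch of the float trick Sesse__ describes (my code, with NaNs and signed zeros deliberately not handled): arithmetic-shifting the sign bit down yields an all-ones mask for negative floats and an all-zeros one for positive floats; XORing with that mask (plus the sign bit) produces a 32-bit pattern whose unsigned order matches the float's numeric order.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Map a float's bit pattern to a uint32_t that sorts correctly
       as an unsigned integer (NaNs and -0.0 aside):
       - negative floats: all 32 bits are flipped, reversing their order
       - positive floats: only the sign bit is flipped, moving them above */
    static uint32_t float_to_sortable(float f) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        uint32_t mask = (uint32_t)(-(int32_t)(bits >> 31)) | 0x80000000u;
        return bits ^ mask;
    }

    int main(void) {
        printf("%d\n", float_to_sortable(-1.5f) < float_to_sortable(-0.5f)); /* 1 */
        printf("%d\n", float_to_sortable(-0.5f) < float_to_sortable(2.0f));  /* 1 */
        return 0;
    }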
amitprasad · 2 days ago
Author here -- agreed on all fronts. As mentioned in the other comment, I approached the topic from a relatively narrow perspective (I was working on a specific project at the time).

I think it's worth including these things in a future update to the post, but I didn't have the time / need to explore it back then.

In the meantime, I'd point to the following post on Unicode that remains very nice to read >20 years later: https://www.joelonsoftware.com/2003/10/08/the-absolute-minim...

amitprasad commented on Notes on Sorted Data   amit.prasad.me/blog/sorte... · Posted by u/surprisetalk
amitprasad · 2 days ago
Unexpected seeing this posted here.

I wrote this post mostly out of interest for a personal project and thus it's not actually a very holistic exploration of the topic. May revisit and update it in the future :)

amitprasad commented on Minisforum Stuffs Entire Arm Homelab in the MS-R1   jeffgeerling.com/blog/202... · Posted by u/kencausey
jauntywundrkind · a month ago
I still bought my 795s7 (a cheap 16-core MoDT (mobile-on-desktop) Zen 4 box) anyway, but accepting that I would probably never get BIOS updates was a hard pill to swallow. I haven't had to deal with support for anything defective, but support in the sense of available BIOS updates is so spotty that it's really unfortunate.

AMD is talking about replacing the closed AGESA BIOS with an open OpenSIL BIOS some day, and perhaps that means end users eventually get some chance to maintain and upgrade the BIOS themselves. Given how uninterested some vendors seem in doing the work themselves, this sliver of hope would be nice to see happen.

amitprasad · a month ago
How is the 795s7? What do you put it to work on?

Just sprung for one at a good price.

amitprasad commented on We built a cloud GPU notebook that boots in seconds   modal.com/blog/notebooks-... · Posted by u/birdculture
hhthrowaway1230 · a month ago
Also curious! I was wondering if CRIU-frozen containers would help here, i.e. load the notebooks, snapshot them, and then restore them.
amitprasad · a month ago
This is notoriously hard once you start to involve GPUs.
amitprasad commented on The Journey Before main()   amit.prasad.me/blog/befor... · Posted by u/amitprasad
Animats · 2 months ago
From the title, I thought this was going to be about the parts of a program that run before the main function is entered. Static objects have to be constructed. Quite a bit of code can run. Order of initialization can be a problem. What happens if you try to do I/O from a static constructor? Does that even work?
amitprasad · 2 months ago
This is heavily language-runtime dependent: there's nothing that fundamentally stops you from doing anything during the phase between jumping to the entry point and reaching main().
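For a concrete example (a sketch using GCC/Clang's constructor attribute, not something from the post): on a typical Linux toolchain, I/O from an initializer that runs before main() works fine, since the C runtime has set up stdio by the time the init-array functions are called.

    #include <stdio.h>

    /* GCC/Clang extension: run this function during the C runtime's
       init-array phase, before control reaches main(). */
    __attribute__((constructor))
    static void before_main(void) {
        printf("hello from before main()\n");
    }

    int main(void) {
        printf("hello from main()\n");
        return 0;
    }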
amitprasad commented on The Journey Before main()   amit.prasad.me/blog/befor... · Posted by u/amitprasad
bignerd_95 · 2 months ago
If you're referring to little-endianness, it means the CPU stores multi-byte values in memory with the least significant byte first (at the lowest address).

This convention started on early Intel chips and was kept for backward compatibility. It also has a practical benefit: it makes basic arithmetic and type widening cheaper in hardware. The "low" part of the value is always at the base address, so the CPU can load 8 bits, then 16 bits, then 32 bits, etc. starting from the same address without extra offset math.

So when you say an address like 0xABCD shows up in memory as [0xCD, 0xAB] byte-by-byte, that's not the address being "reversed". That's just the little-endian in-memory layout of that numeric value.

There are also big-endian architectures, where the most significant byte is stored at the lowest address. That matches how humans usually write numbers (0xABCD in memory as [0xAB, 0xCD]). But most mainstream desktop/server CPUs today are little-endian, so you mostly see the little-endian view.
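A quick C sketch of that (my example, not the commenter's): store a 16-bit value and inspect its bytes through an unsigned char pointer; on a little-endian machine the low byte comes first.

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint16_t v = 0xABCD;
        unsigned char *p = (unsigned char *)&v;
        /* On a little-endian CPU this prints "CD AB": the least
           significant byte sits at the lowest address. */
        printf("%02X %02X\n", p[0], p[1]);
        return 0;
    }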

amitprasad · 2 months ago
Not so much confusion about what little-endian is, but about how we tend to represent it in notation. Of course, this confusion dates back to when I was first learning things in high school, but I imagine I'm not alone in it.
amitprasad commented on The Journey Before main()   amit.prasad.me/blog/befor... · Posted by u/amitprasad
bignerd_95 · 2 months ago
As someone who teaches this stuff at university, I see students getting confused every single year by how textbooks draw memory. The problem is mostly visual, not conceptual.

Most diagrams in books and slides use an old hardware-centric convention: they draw higher addresses at the top of the page and lower addresses at the bottom. People sometimes justify this with an analogy like “floors in a building go up,” so address 0x7fffffffe000 is drawn “higher” than 0x400000.

But this is backwards from how humans read almost everything today. When you look at code in VS Code or any other IDE, line 1 is at the top, then line 2 is below it, then 3, 4, etc. Numbers go up as you go down. Your brain learns: “down = bigger index.”

Memory in a real Linux process actually matches the VS Code model much more closely than the textbook diagrams suggest.

You can see it yourself with:

    cat /proc/$$/maps

(pick any PID instead of $$).

    ...
    [0x00000000]          lower addresses
    ...
    [0x00620000]          HEAP start
    [0x00643000]          HEAP extended ↓ (more allocations => higher addresses)
    ...
    [0x7ffd8c3f7000]      STACK top (<- stack pointer)
                          ↑ the stack pointer starts at STACK start and moves
                            upward (toward lower addresses) when you push
    [0x7ffd8c418000]      STACK start
    ...
    [0xffffffffff600000]  higher addresses
    ...

The output is printed from low addresses to high addresses. At the top of the output you'll usually see the binary, shared libs, heap, etc. Those all live at lower virtual addresses. Farther down in the output you'll eventually see the stack, which lives at a higher virtual address. In other words: as you scroll down, the addresses get bigger. Exactly like scrolling down in an editor gives you bigger line numbers.

The phrases “the heap grows up” and “the stack grows down” aren't wrong. They're just describing what happens to the numeric addresses: the heap expands toward higher addresses, and the stack moves into lower addresses.

The real problem is how we draw it. We label “up” on the page as “higher address,” which is the opposite of how people read code or even how /proc/<pid>/maps is printed. So students have to mentally flip the diagram before they can even think about what the stack and heap are doing.

If we just drew memory like an editor (low addresses at the top, high addresses further down) it would click instantly. Scroll down, addresses go up, and the stack sits at the bottom. At that point it’s no longer “the stack grows down”: it’s just the stack pointer being decremented, moving to lower addresses (which, in the diagram, means moving upward).
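A tiny C program makes the same point (my sketch, not the commenter's): print the address of a heap allocation and of a stack variable, and in a typical Linux process the heap address comes out numerically far below the stack address.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        int stack_var = 0;
        int *heap_var = malloc(sizeof *heap_var);
        /* Typical Linux layout: the heap lives at numerically lower
           addresses than the stack. */
        printf("heap:  %p\n", (void *)heap_var);
        printf("stack: %p\n", (void *)&stack_var);
        free(heap_var);
        return 0;
    }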

amitprasad · 2 months ago
I think that, while writing that diagram, I got stuck in the same rut I originally learned address space in. I would tend to agree with you that your model makes much more sense to the student.

Related: in notation, one thing I used to struggle with is how addresses (e.g. 0xAB_CD) actually have the in-memory byte representation [0xCD, 0xAB]. I wonder if there's a common way to address that?

amitprasad commented on The Journey Before main()   amit.prasad.me/blog/befor... · Posted by u/amitprasad
fweimer · 2 months ago
> The ELF file contains a dynamic section which tells the kernel which shared libraries to load, and another section which tells the kernel to dynamically “relocate” pointers to those functions, so everything checks out.

This is not how dynamic linking works on GNU/Linux. The kernel processes the program headers for the main program (mapping the PT_LOAD segments, without relocating them) and notices the PT_INTERP program interpreter (the path to the dynamic linker) among the program headers. The kernel then loads the dynamic linker in much the same way as the main program (again without relocation) and transfers control to its entry point. It's up to the dynamic linker to self-relocate, load the referenced shared objects (this time using plain mmap and mprotect; the kernel ELF loader is not used for that), relocate them and the main program, and then transfer control to the main program.

The scheme is not that dissimilar to the #! shebang lines, with the dynamic linker taking the role of the script interpreter, except that ELF is a binary format.
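For the curious, a minimal C sketch (my illustration, not fweimer's) that walks a 64-bit ELF file's program headers and prints the PT_INTERP path the kernel hands control to:

    #include <elf.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        if (argc != 2) { fprintf(stderr, "usage: %s <elf-file>\n", argv[0]); return 1; }
        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 1; }

        Elf64_Ehdr eh;
        if (fread(&eh, sizeof eh, 1, f) != 1) return 1;

        /* Scan the program headers for the PT_INTERP entry. */
        for (int i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            fseek(f, (long)(eh.e_phoff + (uint64_t)i * eh.e_phentsize), SEEK_SET);
            if (fread(&ph, sizeof ph, 1, f) != 1) break;
            if (ph.p_type == PT_INTERP) {
                char path[256] = {0};
                fseek(f, (long)ph.p_offset, SEEK_SET);
                fread(path, 1, ph.p_filesz < sizeof path ? ph.p_filesz
                                                         : sizeof path - 1, f);
                printf("program interpreter: %s\n", path);
            }
        }
        fclose(f);
        return 0;
    }

Run against /bin/ls on an x86-64 glibc system, this typically prints /lib64/ld-linux-x86-64.so.2.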

amitprasad · 2 months ago
You’re right, and I knew this back in February when I wrote most of this post. I must have revised it down incorrectly before posting; will correct. Bit of a facepalm from my side.
amitprasad commented on The Journey Before main()   amit.prasad.me/blog/befor... · Posted by u/amitprasad
hagbard_c · 2 months ago
On the subject of symbols:

> Yeah, that’s it. Now, 2308 may be slightly bloated because we link against musl instead of glibc, but the point still stands: There’s a lot of stuff going on behind the scenes here.

Slightly bloated is a slight understatement. The same program linked against glibc tops out at 36 symbols in .symtab:

    $ readelf -a hello|grep "'.symtab'"
    Symbol table '.symtab' contains 36 entries:

amitprasad · 2 months ago
Ah, I should have taken the time to verify; it might also have something to do with the way I was compiling / cross-compiling for RISC-V!

More generally, I'm not surprised at the symtab bloat from static linking, given the absolute size increase of the binary.

u/amitprasad

Karma: 362 · Cake day: August 8, 2022
About
https://amit.prasad.me

Systems @ modal.com
