Readit News
rowanG077 · 2 years ago
I'm getting pretty tired of this false advertising. This is not a C compiler. It lacks many crucial features required to compile most larger C programs. I do have to say it's impressive what they have squeezed into 512 bytes.
distcs · 2 years ago
> This is not a C compiler.

This is technically true! What is posted here is not a C compiler. It is an implementation of a subset of C.

I'd prefer that the title say so honestly, for example: A compiler for a subset of C that fits in the 512 byte boot sector.

It is still a remarkable feat. But honestly, when I read the original title I was in complete disbelief that someone could implement a whole C compiler in 512 bytes.

But with the new context that it is a subset of C (not the whole C), the initial great surprise is gone. It is still very impressive though.

Brian_K_White · 2 years ago
Or even an interpreter. It compiles and executes on the fly, in RAM, function by function. It doesn't even compile the whole input; it compiles just a bit, immediately executes that bit before moving on to the next, and doesn't save the compilation result anywhere. To me, that's an interpreter.

So it's a C subset interpreter.

And a very cool thing. This is not a denigration or critique at all, just terminology.

I think it's perfectly fine for a bootstrapper to be a drastic subset. They are all already drastically limited in countless other ways anyway, such as not knowing how to use any of the crazy hardware, networking, etc. A Forth bootloader is a full Turing-complete language that can eventually do anything, but initially it can do almost nothing besides use BIOS-provided features and start interpreting code, which then provides more functionality.

jcul · 2 years ago
To be fair, the first sentence states that it supports a subset of C.

>SectorC is a C compiler written in x86-16 assembly that fits within the 512 byte boot sector of an x86 machine. It supports a subset of C that is large enough to write real and interesting programs.

The post title could include this, but perhaps it's a little verbose.

In any case, agreed it's impressive to fit it in 512 bytes!

tomjakubowski · 2 years ago
i'm sorry to nitpick but it's the second sentence which mentions it only supports a subset, not the first sentence. and the first sentence calls it a "C compiler" without qualification

humanrebar · 2 years ago
Agreed. It's a cool project, but it's a compiler for a DSL that is a subset of C.
userbinator · 2 years ago
It's closer to B than C.
jjtheblunt · 2 years ago
I wonder how long until some LLM filters fraudulent titles wrt the article contents
userbinator · 2 years ago
dang · 2 years ago
Yes, it's a great topic but since it had significant attention in the last year, it counts as a dupe for now. This is in the FAQ: https://news.ycombinator.com/newsfaq.html.
baq · 2 years ago
How large would a compiler be that this can build and that can itself build tcc?

tccboot https://bellard.org/tcc/tccboot.html clocks in at a ginormous 138kB by comparison.

'Can we boot to linux from source in 512b' is the wrong question to ask ;)

mati365 · 2 years ago
Slightly different idea, but I'm currently working on a C + assembler compiler to prototype and run such 512-byte games and apps for retro CPUs.

https://github.com/Mati365/ts-c-compiler

andai · 2 years ago
Blog post with more details: https://xorvoid.com/sectorc.html
distcs · 2 years ago
When we talk about these sector$LANG implementations, I am guessing we are talking about the boot sector that BIOS recognizes, right?

Does the 512 byte limit for a boot sector exist in UEFI too? I don't know much about UEFI, so if someone could educate me about how the boot sector and its size limit differ in UEFI, I'd love to know.

sspiff · 2 years ago
No, UEFI loads PE executables from a special partition called the EFI System Partition or ESP. There's no real size restriction there as far as I know.

Before the ESP is accessed, there is no standardized way to customize the boot process. You could put these kinds of sectorX toys into the firmware directly, which would come with more constraints, but it would be vendor-specific.

There is a platform-independent VM running a special EFI byte code that is part of the EFI specification, which allows you to extend the UEFI system with things like additional drivers, but those are also loaded from the ESP.
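
For readers wondering what "PE executable" means at the byte level, here is a minimal Python sketch that checks whether a blob looks like a PE image. The function name `looks_like_pe` and the hand-built test header are made up for illustration. The offsets used (the `MZ` magic at offset 0, a 32-bit header offset stored at 0x3C, and the `PE\0\0` signature it points to) come from the PE/COFF format. The check is deliberately shallow and says nothing about whether a given firmware would accept the file.

```python
import struct

def looks_like_pe(image: bytes) -> bool:
    """Shallow check: DOS 'MZ' magic followed by a 'PE\\0\\0' signature."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    return image[pe_offset:pe_offset + 4] == b"PE\x00\x00"

# Hypothetical, hand-built header used only to exercise the function.
fake = bytearray(0x80)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x40)
fake[0x40:0x44] = b"PE\x00\x00"

print(looks_like_pe(bytes(fake)))     # True
print(looks_like_pe(b"\x00" * 512))   # False
```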

distcs · 2 years ago
Thanks for the answer! I've got some more questions now. Sorry, but if anyone is willing to take a stab at these questions, it'd be helpful to me.

1. IIUC PE executables are Windows executables. So a Linux system that targets UEFI ends up writing a PE executable to the EFI System Partition?

2. I know that some UEFI implementations (or is it all of them?) support the BIOS boot sector as a backward-compatibility feature. How does that work? If I write a "hello world" program in pure machine code in the 1st sector of the boot disk, would UEFI read that and execute it? How would it even know whether what's in the first sector is valid code or garbage? By checking the magic 0x55 0xaa at the end of the boot sector?
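
On the last part of question 2: the magic bytes mentioned are the classic BIOS boot signature, stored in the last two bytes of a 512-byte sector. A minimal Python sketch of that check follows. The name `has_boot_signature` is hypothetical, and how any particular UEFI firmware's compatibility mode decides what to boot is firmware-specific and not modeled here.

```python
SECTOR_SIZE = 512

def has_boot_signature(sector: bytes) -> bool:
    """True if a 512-byte sector ends with the bytes 0x55, 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == b"\x55\xaa"

# Hypothetical sector: all zeros except for the signature.
sector = bytearray(SECTOR_SIZE)
sector[510] = 0x55
sector[511] = 0xAA

print(has_boot_signature(bytes(sector)))        # True
print(has_boot_signature(bytes(SECTOR_SIZE)))   # False
```

Note that the signature only marks the sector as bootable; it says nothing about whether the bytes before it are meaningful code.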

oynqr · 2 years ago
Since FAT32 is the only FS that must be supported, I'd guess one potential limit is 4 gigs.
danbruc · 2 years ago
A classical PC master boot record does not actually have 512 bytes for code, because it also contains the partition table and a signature; you have 446 bytes for code. I'm not sure what exactly the BIOS validates, so you might be able to get away with an invalid partition table. In general there is no real limit unless you want to be compatible with something existing; you can define whatever disk layout you like. At worst you will have to load additional sectors yourself, because the BIOS has no clue where you put them. I no longer remember what a floppy boot sector looks like or how much room you have there.
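
The 446-byte figure falls out of the classical MBR layout: 446 bytes of code, four 16-byte partition entries (64 bytes), and a 2-byte signature, for 512 in total. The Python sketch below just slices a sector along those boundaries. `split_mbr` and the constant names are made up for illustration, and the partition entries are left as raw bytes rather than interpreted.

```python
CODE_SIZE = 446    # bootstrap code area in a classical MBR
ENTRY_SIZE = 16    # one partition table entry
ENTRY_COUNT = 4    # 4 * 16 = 64 bytes of partition table

def split_mbr(sector: bytes):
    """Split a 512-byte MBR into (code, partition_entries, signature)."""
    if len(sector) != 512:
        raise ValueError("an MBR is exactly 512 bytes")
    code = sector[:CODE_SIZE]
    table = sector[CODE_SIZE:CODE_SIZE + ENTRY_SIZE * ENTRY_COUNT]
    entries = [table[i:i + ENTRY_SIZE] for i in range(0, len(table), ENTRY_SIZE)]
    signature = sector[510:512]
    return code, entries, signature

# Hypothetical sector: zero-filled code and table, valid signature.
code, entries, signature = split_mbr(bytes(510) + b"\x55\xaa")
print(len(code), len(entries), signature.hex())   # 446 4 55aa
```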
distcs · 2 years ago
It's been a long time since I've done ASM but do I understand it right that this implementation compiles each function and then executes it immediately? Or does it really compile the whole source code and then execute the binary generated?

And where is the compiled binary saved? Is it kept temporarily in memory itself for immediate execution? Or is the compiled binary saved back to the disk?

If someone could point me to the right sections of the code that answer these questions, it'd be of great help! Thanks!

bluetomcat · 2 years ago
Looks like a recursive-descent parser that emits instructions in memory as it parses. Then it executes them immediately (sectorc.s):

    ;; done compiling, execute the binary
    execute:
    push es                       ; push the codegen segment
    push word [bx]                ; push the offset to "_start()"
    push 0x4000                   ; load new segment for variable data
    pop ds
    retf                          ; jump into it via "retf"

userbinator · 2 years ago
Doesn't even have room for recursive descent or any sort of operator precedence.
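
To make the idea in the two comments above concrete, here is a toy Python model of "emit as you read, then jump into the result", with strictly left-to-right evaluation and no operator precedence. It is not a description of SectorC's actual code: the statement grammar, the op names, and the functions `compile_statement` and `execute` are all invented for this sketch, and a Python list stands in for the in-memory code buffer.

```python
def compile_statement(tokens, code):
    """Emit ops for `name = operand (op operand)* ;` in one left-to-right pass."""
    target, equals, first, *rest = tokens
    assert equals == "=" and rest[-1] == ";"
    code.append(("load", first))
    # Pair each operator with the operand after it; no precedence is applied.
    for op, operand in zip(rest[0:-1:2], rest[1:-1:2]):
        code.append((op, operand))
    code.append(("store", target))

def execute(code):
    """Run the emitted ops on a single accumulator and return the variables."""
    variables, acc = {}, 0

    def value(token):
        return int(token) if token.isdigit() else variables[token]

    for op, arg in code:
        if op == "load":
            acc = value(arg)
        elif op == "+":
            acc += value(arg)
        elif op == "*":
            acc *= value(arg)
        elif op == "store":
            variables[arg] = acc
    return variables

code = []
for line in ["x = 1 + 2 * 3 ;", "y = x + 1 ;"]:
    compile_statement(line.split(), code)   # emit as each statement is read
print(execute(code))   # {'x': 9, 'y': 10}
```

Because nothing reorders the operators, `1 + 2 * 3` evaluates to 9 here rather than 7, which is the kind of behavior you get when there is no room for precedence handling.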
doener · 2 years ago
I found this in a Golem article (German): https://www.golem.de/news/milliforth-eine-programmiersprache...