Readit News
HarHarVeryFunny · 2 months ago
According to this page, LLVM-MOS seems to be pretty soundly beaten in performance of generated code by Oscar64.

https://thred.github.io/c-bench-64/

I think the ideal compiler for 6502, and maybe any of the memory-poor 8-bit systems would be one that supported both native code generation where speed is needed as well as virtual machine code for compactness. Ideally it would also support inline assembler.

The LLVM-MOS approach of reserving some of zero page as registers is a good start, but given how valuable zero page is, it would also be useful to be able to designate static/global variables as zero page or not.

sehugg · 2 months ago
I've implemented Atari 2600 library support for both LLVM-MOS and CC65, but there are too many compromises to make it suitable for writing a game.

The lack of RAM is a major factor; stack usage must be kept to a minimum and you can forget any kind of heap. RAM can be extended with a special mapper, but due to the lack of a R/W pin on the cartridge, reads and writes use different address ranges, and C does not handle this without a hacky macro solution.
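
To make the split-port problem concrete, here is a toy Python model (not real mapper code). The window addresses, the 128-byte size, and the names SplitPortRam, ram_set and ram_get are all assumptions for illustration. It shows why a plain C variable, which assumes one address for both loads and stores, does not map onto this kind of hardware without wrapper macros.

```python
# Toy model of cartridge RAM with separate write and read windows.
# The size and addresses are illustrative assumptions, not a mapper spec.
RAM_SIZE = 128
WRITE_BASE = 0x1000   # assumed: a store to WRITE_BASE + i writes cell i
READ_BASE = 0x1080    # assumed: a load from READ_BASE + i reads cell i


class SplitPortRam:
    def __init__(self):
        self.cells = [0] * RAM_SIZE

    def bus_write(self, addr, value):
        # Only the write window accepts stores; other stores are ignored.
        if WRITE_BASE <= addr < WRITE_BASE + RAM_SIZE:
            self.cells[addr - WRITE_BASE] = value & 0xFF

    def bus_read(self, addr):
        # Only the read window returns data; this model returns 0 elsewhere.
        if READ_BASE <= addr < READ_BASE + RAM_SIZE:
            return self.cells[addr - READ_BASE]
        return 0


def ram_set(ram, index, value):
    """What a 'write' macro would do: store through the write window."""
    ram.bus_write(WRITE_BASE + index, value)


def ram_get(ram, index):
    """What a 'read' macro would do: load through the read window."""
    return ram.bus_read(READ_BASE + index)
```

In this model an increment has to be spelled as a read through one window followed by a write through the other, for example ram_set(ram, i, ram_get(ram, i) + 1), which is the kind of thing the macro workaround hides.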

Not to mention the timing constraints with 2600 display kernels and page-crossing limitations, bank switching, inefficient pointer chasing, etc. etc. My intuition is you'd need an SMT solver to write a language that compiles for this system without needing inline assembly.

ddingus · 2 months ago
A very simple BASIC compiled pretty well! It did feature inline assembly, and I agree with you on this necessary point, especially concerning the 2600!

See Batari Basic

zozbot234 · 2 months ago
AIUI, Oscar64 does not aim to implement a standard C/C++ compiler as LLVM does, so the LLVM-MOS approach is still very much worthwhile. You can help by figuring out which relevant optimizations LLVM-MOS seems to be missing compared to SOTA (compiled or human-written) 6502 code, and filing issues.
asiekierka · 2 months ago
We already know what the main remaining issue is - LLVM-MOS's register allocator is far from optimal for the 6502 architecture. mysterymath is slowly working on what may become a more suitable allocator.

djmips · 2 months ago
I feel like no amount of optimizations will close the gap - it's an intractable problem.
kwertyoowiyop · 2 months ago
Aztec C had both native and interpreted code generation, back in the day.
bbbbbr · 2 months ago
With regard to code size in this comparison, someone associated with llvm-mos remarked that some factors are: their libc is written in C and tries to be multi-platform friendly, stdio takes up space, the division functions are large, and their float support is not asm optimized.
HarHarVeryFunny · 2 months ago
I wasn't really thinking of the binary sizes presented in the benchmarks, but more in general. 6502 assembler is compact enough if you are manipulating bytes, but not if you are manipulating 16 bit pointers or doing things like array indexing, which is where a 16-bit virtual machine (with zero page registers?) would help. Obviously there is a trade-off between speed and memory size, but on a 6502 target both are an issue and it'd be useful to be able to choose - perhaps VM by default and native code for "fast" procedures or code sections.
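
As a rough sketch of the idea, here is a minimal 16-bit register VM in Python. The instruction set, the encoding (high nibble is the operation, low nibble is the register), and all names are invented for illustration; on a real 6502 the interpreter would be native code and the registers would sit in zero page.

```python
# Minimal sketch of a 16-bit register VM with one-byte opcodes.
# The instruction set is invented for illustration only.
HALT, SET, ADD, INC = 0x0, 0x1, 0x2, 0x3


def run(code):
    regs = [0] * 16          # sixteen 16-bit registers
    pc = 0
    while True:
        op, r = code[pc] >> 4, code[pc] & 0x0F
        pc += 1
        if op == HALT:
            return regs
        if op == SET:        # SET r, imm16 (little-endian), 3 bytes total
            regs[r] = code[pc] | (code[pc + 1] << 8)
            pc += 2
        elif op == ADD:      # R0 = R0 + Rr, 1 byte
            regs[0] = (regs[0] + regs[r]) & 0xFFFF
        elif op == INC:      # Rr = Rr + 1, 1 byte
            regs[r] = (regs[r] + 1) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode {op:#x}")


program = bytes([
    0x10, 0x00, 0x20,   # SET R0, 0x2000
    0x11, 0xFF, 0x00,   # SET R1, 0x00FF
    0x21,               # ADD R1
    0x30,               # INC R0
    0x00,               # HALT
])
```

For comparison, a native 16-bit add with zero-page operands is CLC plus two LDA/ADC/STA triples, 13 bytes in total, against one byte here. The price is the decode-and-dispatch overhead on every instruction.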

A lot of the C library outside of math isn't going to be speed critical - things like IO and heap for example, and there could also be dual versions to choose from if needed. Especially for retrocomputing, IO devices themselves were so slow that software overhead is less important.

zozbot234 · 2 months ago
> I think the ideal compiler for 6502, and maybe any of the memory-poor 8-bit systems would be one that supported both native code generation where speed is needed as well as virtual machine code for compactness.

Threaded code might be a worthwhile middle-of-the-way approach that spans freely across the "native" and "pure VM interpreter" extremes.
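
A small Python sketch of what that could look like, assuming a stack-based design. The primitives stand in for native 6502 routines and every name here is hypothetical. A threaded "word" is just a list of references to other routines, and because it has the same call shape as a primitive, native and threaded pieces can be mixed freely.

```python
# Sketch of threaded code: a "program" is a list of routine references.
# Primitives stand in for native routines; all names are hypothetical.

def lit(value):
    # Build a primitive that pushes a 16-bit constant.
    def push(stack):
        stack.append(value & 0xFFFF)
    return push


def add16(stack):
    b, a = stack.pop(), stack.pop()
    stack.append((a + b) & 0xFFFF)


def dup(stack):
    stack.append(stack[-1])


def threaded(*words):
    # A threaded word runs each referenced routine in order. It is called
    # exactly like a primitive, so words can nest.
    def run(stack):
        for word in words:
            word(stack)
    return run


double = threaded(dup, add16)            # two references, no inline code
quadruple = threaded(double, double)     # built from other threaded words
entry = threaded(lit(0x1234), quadruple)
```

Anything speed-critical can stay a primitive, while everything else costs one reference per operation.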

anthk · 2 months ago
If it runs fast on an Apple I, it will run fine on the rest.
self_awareness · 2 months ago
There is a Rust fork that works on this LLVM fork for the 6502, generating code that can be executed on a Commodore 64: https://github.com/mrk-its/rust-mos
mtklein · 2 months ago
This was a nice surprise when learning to code for NES, that I could write pretty much normal C and have it work on the 6502. A lot of tutorials warn you, "prepare for weird code" and this pretty much moots that.
cmrdporcupine · 2 months ago
It's been amazing to see the progress on this project over the last 5 years. As someone who poked around looking at the feasibility of this myself, and gave up thinking it'd never be practical, I'm super happy to see how far they've gotten.

Maybe someday the 65816 target will get out there, a challenge in itself.

jacquesm · 2 months ago
Instead of the 65816 we got the ARM, which I think was the better thing to happen in the longer term.
gregsadetsky · 2 months ago
I don't know this world well (I know what llvm is) but - does anyone know why this was made as a fork vs. contributing to llvm? I suppose it's harder to contribute code to the real llvm..?

Thanks

mysterymath · 2 months ago
Hey, llvm-mos maintainer here. I actually work on LLVM in my day job too, and I don't particularly want llvm-mos upstream. It stretches LLVM's assumptions a lot, which is a good thing in the name of generality, but the way it stretches those assumptions isn't particularly relevant anymore. That is, it's difficult to find modern platforms that break the same assumptions.

Also, maintaining a fork is difficult, but doable. I work on LLVM a ton, so it's pretty easy to fold it into my work week-to-week. And quite surprisingly, I used AI to help last time, and it actually helped quite a lot!

nineteen999 · 2 months ago
What's your take on sdcc 6502 support at the moment, if you have one? I'm just happy to finally have an 8-bit C compiler that supports both targets, even if the codegen for 6502 needs a lot of work right now.

I'd happily take a llvm-z80 and llvm-6502 over sdcc if both were available

Edit: oh wow, look at that https://github.com/grapereader/llvm-z80. Aw but not touched for 12 years.

zozbot234 · 2 months ago
Even if y'all don't particularly care about having the full backend upstream just yet, it still seems worthwhile to comprehensively document these assumptions within the project, and perhaps to upstream a few of the simpler custom passes where not too much "stretching" of assumptions is involved, if only to ease future forward-porting work.
weinzierl · 2 months ago
These processors were very very different from what we have today.

They usually only had a single general purpose register (plus some helpers). Registers were 8-bit but addresses (pointers) were 16-bit. Memory was highly non-uniform, with (fast) SRAM, DRAM and (slow) ROM all in one single address space. Instructions often involved RAM directly and there were a plethora of complicated addressing modes.

Partly this was because there was no big gap between processing speed and memory access, but this makes it very unlikely that similar architectures will ever come back.

As interesting as experiments like LLVM-MOS are, they would not be a good fit for upstream LLVM.

zozbot234 · 2 months ago
> ... there was no big gap between processing speed and memory access, but this makes it very unlikely that similar architectures will ever come back. ...

Don't think "memory access" (i.e. RAM), think "accessing generic (addressable) scratchpad storage" as a viable alternative to both low-level cache and a conventional register file. This is not too different from how GPU low-level architectures might be said to work these days.

jjmarr · 2 months ago
LLVM has very high quality standards in my experience. Much higher than I've ever had even at work. It might be a challenge to get this upstreamed.

LLVM is also very modular which makes it easy to maintain forks for a specific backend that don't touch core functionality.

codebje · 2 months ago
My experience is that while LLVM is very modular, it also has a pretty high amount of change in the boundaries, both in where they're drawn and in the interfaces between them. Maintaining a fork of LLVM with a new back-end is very hard.
gregsadetsky · 2 months ago
Super interesting, thanks. I specifically thought that its modular aspect made it possible to just "load" architectures or parsers as ... "plugins"

But I'm sure it's more complicated than that. :-)

Thanks again

Sharlin · 2 months ago
Pretty sure that the prospects of successfully pitching the LLVM upstream to include a 6502 (or any 8/16-bit arch) backend are only slightly better than a snowball’s chances in hell.
alexrp · 2 months ago
Worth noting that LLVM has AVR and MSP430 backends, so there's no particular resistance to 8-bit/16-bit targets.
bbbbbr · 2 months ago
There is a similar project for the Game Boy (sm83 cpu) with a fork of LLVM.

https://github.com/DaveDuck321/gb-llvm

https://github.com/DaveDuck321/libgbxx

It seems to be the first reasonably successful attempt (it can actually be used) among a handful of previous, abandoned LLVM Game Boy attempts.

retrac · 2 months ago
Presumably it would be straightforward to port the GB code generation to the Intel 8080 / Z80. There have been a few attempts for LLVM for those CPUs over the years. But none which panned out, I think?
zozbot234 · 2 months ago
Most attempts at developing new LLVM downstream architectures simply fail at keeping up with upstream LLVM, especially across major releases. Perhaps these projects should focus a bit more on getting at least some of their simpler self-contained changes to be adopted upstream, such as custom optimization passes. Once that is done successfully, it might be easier to make an argument for also including support for a newly added ISA, especially a well-known ISA that can act as convenient reference code for the project as a whole.
codebje · 2 months ago
The CE-dev community's LLVM back-end for the (e)Z80 'panned out' in that it produced pretty decent Z80 assembly code, but like most hobby-level back-ends the effort to keep up to date with LLVM's changes overwhelmed the few contributors active and it's now three years since the last release. It still works, so long as you're OK using the older LLVM (and clang).

This is why these back-ends aren't accepted by the LLVM project: without a significant commitment to supporting them, they're a liability for LLVM.

avadodin · 2 months ago
A lot of people try to write backends for LLVM to support obscure architectures but these guys are the only ones I know that have ever been successful to any degree.

Portable assembly has a nice ring to it but reality is a harsh mistress and she only speaks C++.

Even the hobby underdog qbe seems ill-suited to 6502 repurposing.

iberator · 2 months ago
Slightly off-topic. If you want to learn low level assembly programming in the XXI century, 6502 is still an EXCELLENT choice!

Simple architecture and really really joyful to use even for casual programmers born a decade or two later :)

1000100_1000101 · 2 months ago
I'd argue that 68K is simpler to learn and use. You get a similar instruction set, but 32-bit registers, many of them. It's even got a relocatable stack so it can handle threading when you get to that point.
chihuahua · 2 months ago
I agree, I feel like the 68k architecture was a dream for assembly programming. Each register is large enough to store useful values, there are lots of them, and there are instructions for multiply and divide. This allows you to focus on the essence of what you want to accomplish, and not have to get side-tracked into how to represent the X-coordinate of some object because it's just over 8 bits wide, or how to multiply two integers. Both of these seemingly trivial things already require thought on the 6502.
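
For a sense of what "requires thought" means: the 6502 has no multiply instruction, so a product is usually built in software from shifts and adds. Below is a Python model of that general shift-and-add technique (the function name mul8 is made up, and this is a sketch of the idea rather than a transcription of any particular 6502 routine).

```python
def mul8(a, b):
    """Multiply two unsigned 8-bit values using only shifts and adds.

    Returns the 16-bit product, as a software routine on an 8-bit CPU would.
    """
    result = 0
    for _ in range(8):              # one step per bit of the multiplier
        if b & 1:                   # low bit set: add the shifted multiplicand
            result = (result + a) & 0xFFFF
        a = (a << 1) & 0xFFFF       # multiplicand moves up one bit position
        b >>= 1                     # next multiplier bit
    return result
```

On a 68k the same thing is a single multiply instruction, which is the point being made above.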
monocasa · 2 months ago
And registers are actually pointer width, so you don't have to go through memory just to do arbitrary pointer arithmetic.
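
A small Python model of the 6502 side of this, assuming the pointer is kept as separate low and high bytes in memory (the function name is hypothetical). The low bytes are added first and the carry is then folded into the high byte, one 8-bit step at a time.

```python
def add16_via_8bit(ptr_lo, ptr_hi, offset):
    """Add an 8-bit offset to a 16-bit pointer held as two bytes.

    Models the byte-at-a-time approach: add the low bytes, then
    propagate the carry into the high byte.
    """
    total = ptr_lo + offset
    new_lo = total & 0xFF
    carry = total >> 8                  # 0 or 1, like the carry flag
    new_hi = (ptr_hi + carry) & 0xFF    # wraps at 16 bits overall
    return new_lo, new_hi
```

With pointer-width registers the whole update is one add on one register, with no round trip through memory.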
jacquesm · 2 months ago
If 8 bit: 6809. If 32 bit: 68K. Those are miles ahead of the 6502. Otoh if you want to see a fun quirky chip the 6502 is definitely great, and I'd recommend you use a (virtual) BBC Micro to start you off with.
bsder · 2 months ago
Yeah, the 6809 is just ridiculously good to learn assembly language on. Motorola cleaned up all the idiocies from the 6800 on the 6809.

The attention the 6502 gets is just because of history. The advantage the 6502 had was that it was cheap--on every other axis the 6502 sucked.

retrac · 2 months ago
LLVM includes an equivalent to binutils, with a macro assembler, linker, objdump with disassembler, library tools, handling formats like ELF and raw binary etc.

LLVM-MOS includes all of that. It is the same process as using LLVM to cross-assemble targeting an embedded ARM board. Same GNU assembler syntax just with 6502 opcodes. This incidentally makes LLVM-MOS one of the best available 6502 assembly development environments, if you like Unix-style cross-assemblers.

BoredomIsFun · 2 months ago
I'd argue Atmel AVR is a better choice - it is a very much alive platform that has not changed much in the last 30 years.