M/o/Vfuscator: A single instruction C compiler

bombcar · 2 years ago

https://github.com/xoreaxeaxeax/movfuscator/blob/6342ae76b58...

Related.

> The mov-only DOOM renders approximately one frame every 7 hours, so playing this version requires somewhat increased patience.

colatkinson · 2 years ago

For anyone else who immediately thought, "I've gotta try that!" and hit compilation errors: there appears to be a more maintained fork at [0].

And if you're on a 64-bit system, you'll want to make sure it finds the 32-bit libc and libm binaries (see [1]). On Arch, the following worked for me:

    ./build/movcc -L/usr/lib32 test.c

[0]: https://github.com/xoreaxeaxeax/movfuscator

[1]: https://github.com/xoreaxeaxeax/movfuscator/issues/39

azalemeth · 2 years ago

If you haven't seen it, the original conference talk on the Movfuscator is excellent – https://www.youtube.com/watch?v=R7EEoWg6Ekk – it goes through it in detail and describes how version 0.0 had Brainfuck as a totally legitimately valid intermediate language ;-)

thegeekpirate · 2 years ago

He's my favorite speaker (watch all his talks... they're worth it!), but he seems to have gone MIA (possibly after joining Intel? I don't know the timeline), much to my dismay.

Mr_Twinkles · 2 years ago

Yes, you got the timeline right, he went MIA after joining Intel. Turns out a great way to stop vulnerability disclosures is to protect them behind an NDA, plus for any that are found you can give first dibs to all of the 3-letter agencies that are shoveling money your way.

nerpderp82 · 2 years ago

I want Christopher Domas for an older brother, in the 90s.

nielsbot · 2 years ago

Branching with only MOV? How does that work?

Is there no actual flow control but instead conditional manipulation (MOVs) of values?

LegionMammal978 · 2 years ago

Others have described how branching works. But I'm not convinced it's possible to loop around to the beginning using only MOV instructions. movfuscator as currently implemented performs a CALL at the beginning to set up a signal handler, under the justification that it's not part of the real program and could be done with only MOVs, if the program were running in ring 0. (Even if you got rid of libc, it would still take an INT 0x80, SYSENTER, or SYSCALL to pass control to the kernel.)

However, it seems to me like if you booted the processor directly into the program instead of having a bootloader and kernel to help you, then you'd run into a brick wall trying to move from 16-bit real mode to 32-bit protected mode, since doing so requires setting up a global descriptor table (GDT), and the only way I know of to do that is with the LGDT instruction. Unless I'm forgetting something?

themoonisachees · 2 years ago

What would be the roadblocks to simply running the program in 16-bit real mode? Like yes, obviously it isn't best practices but we're way past that.

acegopher · 2 years ago

There can be control flow. MOV an address into the right spot in the interrupt vector table then do a MOV that causes a fault that calls the right interrupt (such as a page fault).

anyfoo · 2 years ago

But now you're just employing another machine, which is not just made out of MOVs.

The video describes in detail what actually happens. I encourage watching it, it's pretty genius, but the very short version is that you turn execution "off" for any code that is not in the "current" branch by switching over all MOVs to target a scratch space instead of their real targets, which means they still execute, but don't have an actual effect. Before each possible branch target, you check whether to turn execution "on", if it's the current one, i.e. making the instructions which are part of that branch affect "actual" memory again.

userbinator · 2 years ago

Incidentally, the fault handling (presumably done in microcode) is itself Turing-complete:

https://news.ycombinator.com/item?id=5261598

okanat · 2 years ago

"All problems in computer science can be solved by another level of indirection." - David Wheeler

Basically the branching points are pointers where one writes a junk memory address or the actual one to operate on. Where to get the correct addresses? A lookup table, the addresses moved into registers.

I think this could be a hack that's repeatable to a pretty good degree in many architectures.

If you'd like to see how exactly, watch the presentation. The presentation links were already shared in discussion and searching movfuscator in Youtube also gets you the result.

charcircuit · 2 years ago

That trick doesn't work for writing an infinite loop.

vlovich123 · 2 years ago

It depends on x86 addressing modes, but tldr lookup tables:

> mov eax, [base + eax*4]

You load 1 address if eax is 0 and a different on if it’s 1. There’s also a jump instruction so you can implement a conditional jump through mov.

This is based on the Stephen Dolan paper: https://harrisonwl.github.io/assets/courses/malware/spring20...

meindnoch · 2 years ago

X86 MOV is quite a beast.

anyfoo · 2 years ago

I'm not entirely through with the video yet, but the core of the idea doesn't need that much from the MOV instruction. No control flow happens in the traditional sense from the CPUs point of view, it's all just linear execution. Instead, there's an abstraction on top that implements control flow by making all code that is not part of the current branch ineffective (but still executed).

ordu · 2 years ago

Hmm... you can try to move branches around. If you .code is writable of course. Though without 'rep movs' it could take a lot of code to move a branch.

folmar · 2 years ago

Previous discussion: May 19, 2021 - https://news.ycombinator.com/item?id=27202801 (40 comments)

musicale · 2 years ago

"It is well-known that the x86 instruction set is baroque, overcomplicated, and redundantly redundant. We show just how much fluff it has by demonstrating that it remains Turing-complete when reduced to just one instruction." – MOV is Turing-complete, Stephen Dolan, 2013

discussion (2013) https://news.ycombinator.com/item?id=6309631

[While I concur that x86 MOV is impressively powerful, it is also well-known that simple NAND gates can be combined to compute arbitrary logic functions. There are also many simple single instruction/operation CPU designs.]

loxias · 2 years ago

> [While I concur that x86 MOV is impressively powerful, it is also well-known that simple NAND gates can be combined to compute arbitrary logic functions. There are also many simple single instruction/operation CPU designs.]

Strong agree. My fav example in the same category as "all circuits can be made with just nand" and "all you need is x86 MOV" is the iota formal language.

If the lambda calculus is too luxurious, and SKI combinators are too bloated, try using iota for all your language theory needs! No unnecessary concepts, or the two redundant operators of SKI! just one "simple" primitive:

i := λf.((fλa.λb.λc.((ac)(bc)))λd.λe.d)

There, now you're turning complete.

;-)

edgyquant · 2 years ago

As with the last time, the side by side comparison with GCC, especially the control flow, is hilarious.