Fascinating work. I know that people did the same for Super Mario 64 [1]. It is still unbelievable to me that they can generate a bit-for-bit identical copy of the original ROM by simply running some old gcc on actual C source code files. IIRC the main insight was that Nintendo did not use any optimization flags, which made it possible to create a 100% matching binary. I've never really looked into it, but I guess they did some trial and error to figure out the exact gcc version that Nintendo used, to make sure that they get 100% identical binary code?
SM64 uses SGI's IRIS Development Option (IDO) compiler. And yes, it's unoptimised.
Paper Mario, however, /does/ use GCC, and it's optimised. Figuring out the compiler version was fairly easy as there's a limited number of options - we know when the game began development, so we looked for releases around that time. The harder parts were figuring out compiler flags (consider all the -f flags affecting code generation; papermario used -fforce-addr) and coming to the terrifying conclusion that the compiler was modified!
The majority of papermario was built with a modified build of GCC 2.8.1 [1] at -O2. The SDK code (libultra, nusystem) was built with GCC 2.7.2 at -O3. The iQue Player version, i.e. the Chinese release, was built with EGCS.
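To make the split concrete, here is a minimal sketch of how a build script might pick a toolchain per source file. Only the compiler versions, the optimisation levels and -fforce-addr come from the comment above; the `Toolchain` type, the `toolchain_for` helper, the compiler name strings and the directory names are hypothetical, and I'm assuming -fforce-addr applies to the game code only since nothing is said about the SDK.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Toolchain:
    compiler: str   # illustrative label, not a real executable name
    flags: tuple    # code-generation flags passed to that compiler


# Game code: modified GCC 2.8.1 at -O2, plus -fforce-addr (from the comment above).
GAME = Toolchain("gcc-2.8.1-modified", ("-O2", "-fforce-addr"))

# SDK code: GCC 2.7.2 at -O3. Whether -fforce-addr was also used here is not stated.
SDK = Toolchain("gcc-2.7.2", ("-O3",))

# Hypothetical directory names for the SDK libraries.
SDK_DIRS = {"libultra", "nusystem"}


def toolchain_for(source_path: str) -> Toolchain:
    """Pick the SDK toolchain for files under an SDK directory, else the game one."""
    parts = PurePosixPath(source_path).parts
    return SDK if any(part in SDK_DIRS for part in parts) else GAME
```

Getting this mapping wrong for even one file is enough to break a match, since the two compilers emit different code for the same C source.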
The changes were much more extensive than just removing the debug flags.
They were also able to get local coop and a 4 player split screen working which were features originally intended for the game but cut for performance reasons.
But seriously, it's nice to see the number of decomps and disassemblies for old video games we're getting recently. First Super Mario 64 and The Legend of Zelda: Ocarina of Time, now just about every other NES to N64 era Nintendo game you can think of. This will certainly be useful for mods in future.
The part of decompilation that most amazes me, is getting binaries to match 1-to-1 (up to hashing it seems) – the level of knowledge about executable formats, layout, and how to control the linker is beyond me.
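For what "matching" means in practice, a minimal sketch: hash the rebuilt ROM and the original, and if they differ, report the first byte that is off. The function names are made up for illustration and this is not any project's actual checker.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Hex SHA-1 digest of a ROM image held in memory."""
    return hashlib.sha1(data).hexdigest()


def is_matching(built: bytes, target: bytes) -> bool:
    """A build 'matches' when its hash equals the hash of the original ROM."""
    return sha1_of(built) == sha1_of(target)


def first_mismatch(built: bytes, target: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(built, target)):
        if a != b:
            return offset
    if len(built) != len(target):
        # One image is a prefix of the other; they diverge where the shorter ends.
        return min(len(built), len(target))
    return None
```

The hash gives a yes/no answer; the offset is what you would then map back to a function through the linker map.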
Check out http://decomp.me - it’s a community-built tool used by a lot of video game decompilation projects. You put in the original assembly, it will attempt a decomp, and then you fiddle with the source (using the same toolchain & flags known or best guessed to have been used by the original devs) until it matches perfectly. It’s super cool.
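The fiddle-until-it-matches loop boils down to diffing two instruction listings for one function. A rough sketch using Python's difflib; the `match_report` name and the ratio-based score are my own assumptions, not how decomp.me actually scores a scratch.

```python
import difflib


def match_report(target_asm, candidate_asm):
    """Compare two lists of disassembled instructions for a single function.

    Returns (ratio, diff): ratio is 1.0 when the listings are identical,
    and diff is a unified diff of the instructions that still differ.
    """
    matcher = difflib.SequenceMatcher(a=target_asm, b=candidate_asm, autojunk=False)
    diff = list(
        difflib.unified_diff(
            target_asm, candidate_asm, "target", "candidate", lineterm=""
        )
    )
    return matcher.ratio(), diff
```

You would recompile the candidate C after each tweak, disassemble it, and rerun the comparison until the diff comes back empty.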
Building decomp.me was really worth it. It even helped us match the very last function - someone from the Metal Gear Solid decompilation team finished it off.
Thank you for that link! It's good to be learning about decompilation/RE tools that aren't just Ghidra, IDA, Hopper. It looks like it works by function, which I'm sure will help future decompilation efforts.
This is another decompilation project that hosts copyright-infringing code in its GitHub repo. Considering how commonly these projects do this, it would be wise for someone to create a project that helps them avoid containing the decompiled code itself. What I would expect is that instead of code there is a decompiler, a symbol map, a way to add comments, and a way to fix up messy code. Then there is a build step to create the code from the ROM. The project already handles extracting assets from the ROM instead of just including them in the GitHub repo like the code.
In practice this makes little difference: Nintendo's gonna C&D and sue you anyway, and you're not going to have the money to fight it. So might as well make it more convenient.
It's actually the inverse. In practice Nintendo has allowed all of these projects to exist; some have speculated that it's because Nintendo cares more about the assets, or that Nintendo focuses more on taking away more direct routes of piracy. It's true that it makes little difference right now, but I don't think publishing decompiled code is good practice or a good habit for the game decompilation scene to get into.
To my knowledge: not really. Porting to other platforms is usually an explicit non-goal by the decompilation team. For some reason, port projects tend to attract the wrong kinds of attention, both from the public at large and from dodgy people in the romhacking community.
Do all decompilation projects for 90s video games out there aim for perfect matching?
I get why perfect decompilation is a big deal, I'm just wondering if there are other approaches out there besides perfect decompilation in the community at large.
As far as I know, yes. Besides simple differences like register allocation, it's difficult to prove that your code behaves the same as the target if it's nonmatching. It's also just really satisfying when you get a match.
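As an illustration of what a "register allocation only" difference looks like, here is a small sketch that renames registers in order of first appearance and then compares the two listings. It assumes MIPS-style `$name` registers in plain text and treats every register (including `$sp`, `$ra`, `$zero`) as renameable, which is a simplification; the function names are hypothetical and this is not how any existing tool does it.

```python
import re

REGISTER = re.compile(r"\$\w+")


def canonicalise(instructions):
    """Rename registers to $r0, $r1, ... in order of first appearance."""
    names = {}

    def rename(match):
        # setdefault keeps the first placeholder assigned to each register.
        return names.setdefault(match.group(0), f"$r{len(names)}")

    return [REGISTER.sub(rename, ins) for ins in instructions]


def differs_only_by_registers(a, b):
    """True when the listings differ, but only by a consistent register renaming."""
    return a != b and canonicalise(a) == canonicalise(b)
```

Anything this check rejects (a different opcode, a different offset, a reordered instruction) is the kind of difference where equivalence is much harder to argue.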
When doing standard reverse engineering, you might use something like Ghidra or Hex-Rays. This is what the developer of noclip.website [1] did to reimplement a lot of Mario Galaxy code, such as enemy AI.
I first played Paper Mario on a PAL N64 when I was around 6 or 7 years old, and I recall not being able to get past certain sections because I could barely understand the dialogue. The cartridge had a previous owner who left a completed savefile, so after getting stuck I loaded it and eventually defeated the final boss. I remember the day vividly where I left the console on overnight displaying the "The End" screen because I thought there might be an easter egg - I'd just learned about Totaka's song in Luigi's Mansion (GCN). There is no postgame in Paper Mario, but I always dreamed of one.
In my time in the modding community I've found that I prefer to create documentation and tools for others to express their creativity than create my own, which is why tools like [1] and [2] exist. The decompilation lets users make mods even more flexibly than previous tools, so I hope to see some people build some cool stuff.
I also just love learning about how this game from my childhood works. It feels kinda like archaeology: discovering parts of the engine where hacks were thrown in at the last minute, finding code that was linked against an earlier version of the engine, etc.
Paper Mario is one of those games that I have yet to finish. Love the story, the controller, the graphics... I've tried to play it many times using emulators with no success; there's always something glitching in some area of the game, making progress very hard. So I ended up subscribing to Nintendo Switch Online on my Switch just to play it.
This is great! I wonder how long until we see GPT-assisted decompilation.
Taking a peek at the source, it's so interesting to see a piece of history. For example, this was released in Japan in 2000, then internationally months later. As I recall, there was awareness building around the idea that vibrating controllers (here, the N64 Rumble Pak accessory) cause RSI or carpal tunnel. Since the developers shortened the rumble length outside of Japan, it looks like they were aware as well: https://github.com/nanaian/papermario-dx/blob/main/src/rumbl...
I wonder what led to this decision being made at the exclusion of the JP release.
If current AI can barely do maths, decompilation is not something I'd expect it to do well. It will of course try and come up with something plausible, but often subtly wrong.
I've actually been working on a mod for Paper Mario that aims to make it blind-accessible! However, I'm not sure what the best practices or prior art are for representing certain features. Do you have any good resources?
[1] https://github.com/n64decomp/sm64
[1] https://github.com/pmret/gcc-papermario
https://youtu.be/t_rzYnXEQlE?si=ZWpp7-74cMdbsdba
The official SDKs have been leaked for decades, so they probably just looked there.
Hopefully they shipped the GPL source with it!
https://decomp.me/scratch/GImYC
https://github.com/pmret/papermario/pull/1019
[1] https://noclip.website/#smg/AstroGalaxy
great work btw
[1] https://mamar.nanaian.town/ [2] https://github.com/nanaian/papermario-dx
It was a fantastic game. If you wanna play you’re in for a treat.
Now to test emitting.