Decompiling the Game

Good news!

Avast has released the source code for their Retargetable Decompiler (RetDec for short).

Why am I excited about this?

Because it can decompile PowerPC binaries (such as ELF files) into a higher-level programming language (C or a Python-like language).

The main executable in any GameCube/Wii game is a Start.dol file.  DOL is a stripped-down executable format that can be converted to and from ELF, and both the GC and the Wii are based on the PowerPC architecture.

Most reverse-engineering tools will only break a game down into what’s called assembly.  This is a human-readable form of machine code that is difficult at best to reverse-engineer.  And those that support decompiling to higher-level programming languages typically only support the x86 architecture.

Previously, one could try RetDec online, but it had a very short limit on decompilation time (not enough for a file as large as the game’s main program code).  But now I can run it on my own machine for as long as the decompilation needs.

I’ve been able to successfully install RetDec and its dependencies.  I was then able to convert AWL’s Start.dol file into a Start.elf file using DolTool 0.3 and begin the decompilation process.
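To give an idea of what the DOL-to-ELF conversion has to work with, here is a minimal sketch that reads the DOL header.  It assumes the header layout commonly documented by the homebrew community (a 0x100-byte header of big-endian 32-bit fields, with 7 text and 11 data section slots).  The function name parse_dol_header is made up for illustration, and this is not DolTool’s actual code.

```python
import struct

DOL_HEADER_SIZE = 0x100
NUM_TEXT = 7    # text section slots (assumed layout)
NUM_DATA = 11   # data section slots (assumed layout)


def parse_dol_header(header: bytes) -> dict:
    """Parse a DOL header; every field is a big-endian 32-bit integer."""
    if len(header) < DOL_HEADER_SIZE:
        raise ValueError("DOL header must be at least 0x100 bytes")

    n = NUM_TEXT + NUM_DATA  # 18 slots: 7 text followed by 11 data
    offsets = struct.unpack_from(f">{n}I", header, 0x00)    # file offsets
    addresses = struct.unpack_from(f">{n}I", header, 0x48)  # load addresses
    sizes = struct.unpack_from(f">{n}I", header, 0x90)      # section sizes
    bss_addr, bss_size, entry = struct.unpack_from(">3I", header, 0xD8)

    sections = []
    for i in range(n):
        if sizes[i] == 0:
            continue  # unused slot
        sections.append({
            "kind": "text" if i < NUM_TEXT else "data",
            "offset": offsets[i],
            "address": addresses[i],
            "size": sizes[i],
        })

    return {
        "sections": sections,
        "bss_address": bss_addr,
        "bss_size": bss_size,
        "entry_point": entry,
    }
```

To try it, read the first 0x100 bytes of Start.dol in binary mode and pass them to parse_dol_header.  Each section it returns maps a range of the file to a memory address, which is the information an ELF file needs as well.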

Unfortunately the decompilation failed part-way through due to not having enough memory (my laptop only has 4GB RAM).  My roommate offered to let me try on his laptop (which has 8GB), so once I’m able to try on that, I’ll post an update.

From the partial decompilation, I can tell that I’m on the right track though.  I’ve been able to find a few lines in the LLVM code that reference CLZ files and could be the key to their decompression algorithm.

@global_var_80298180.828 = constant [22 x i8] c"mainchapter%d.arc.clz\00"

%v4_800131f8 = call i32 @function_80238268(i32 %v2_800131f0, i32 ptrtoint ([22 x i8]* @global_var_80298180.828 to i32), i32 %v0_800131e8)

It appears that the CLZ file name is stored in a global constant, which is then passed to some sort of function (possibly to be decompressed).
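Finding lines like these by hand is tedious in a file this size, so here is a rough sketch of how the search could be automated.  It assumes the LLVM output is a plain text file with string constants written the way they appear above.  The function name find_string_references is made up for illustration, and the regular expression is only a guess at the general shape of those lines.

```python
import re

# Matches global string constants such as:
#   @global_var_80298180.828 = constant [22 x i8] c"mainchapter%d.arc.clz\00"
GLOBAL_STRING = re.compile(
    r'^(@[\w.]+)\s*=.*\bconstant\s+\[\d+ x i8\]\s+c"([^"]*)"'
)


def find_string_references(ir_lines, needle):
    """Return (globals, call_lines) for string constants containing needle."""
    # Pass 1: collect every global string constant that mentions the needle.
    found = {}
    for line in ir_lines:
        match = GLOBAL_STRING.match(line.strip())
        if match and needle in match.group(2):
            found[match.group(1)] = match.group(2)

    # Pass 2: collect call instructions that mention one of those globals.
    # The lookahead keeps "@x.82" from also matching "@x.828".
    patterns = [re.compile(re.escape(name) + r"(?![\w.])") for name in found]
    calls = []
    for line in ir_lines:
        if "call " not in line:
            continue
        if any(p.search(line) for p in patterns):
            calls.append(line.strip())

    return found, calls
```

Passing it the lines of the LLVM file along with ".clz" would list each matching constant and every call that receives it, which narrows down the functions worth reading first.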

I’ll be analyzing this code as much as I can to try and figure out how the files work.

On a separate note, decompiling the game’s main executable could create a better understanding of the game overall and could open the door for additional future mods from other authors.


Update

I tried running the decompilation on my roommate’s 8GB RAM machine, but it still failed after running out of memory (though at a later point).

But on the bright side, I was eventually able to get the decompilation to work by turning off code optimizations with the --backend-no-opts flag in RetDec.
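For anyone scripting this, here is a small sketch of how the command line could be put together.  The only option it uses is the flag mentioned above.  The script name retdec-decompiler.sh is an assumption (it differs between RetDec versions and installs), and build_retdec_command is just a made-up helper name.

```python
import shlex


def build_retdec_command(decompiler, input_path, disable_backend_opts=True):
    """Build the argument list for a RetDec run.

    decompiler is the path to the RetDec decompiler script; its name
    depends on the installed version, so it is passed in, not hard-coded.
    """
    cmd = [decompiler]
    if disable_backend_opts:
        # Skip the back-end optimizations that ran out of memory.
        cmd.append("--backend-no-opts")
    cmd.append(input_path)
    return cmd


# Assumed script name; adjust it to match your own RetDec install.
command = build_retdec_command("retdec-decompiler.sh", "Start.elf")
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) to actually start the decompilation.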

The main downside to this is that the code is much larger and more complicated than it would be if I had been able to run the optimizations.

function_80238268(v105, (int32_t)"mainchapter%d.arc.clz", v35);

At least it’s somewhat easier to follow now than the partially-decompiled LLVM code obtained previously.


If anyone has a machine with more memory (e.g. 32GB), you could try running RetDec to decompile the binary yourself.  I’ve made a downloadable [folder] with all of the needed files, including an installation guide to get everything set up.

[Please PM me if you’d be able to help out with this]


For anyone else who feels like helping out, I’ve uploaded the decompiled source code [link].

The code is unfortunately fairly large and unoptimized.

I have versions written in both C and Python.

Please feel free to take a look and see if you can find any useful information.