CLZ-compressed files remain one of the biggest hurdles for the Proud Life mod.
For this reason, I’ve decided to document everything currently known about these files.
Archives vs. Compression
The easiest way to understand how the game handles compression is to understand the difference between “archiving” and “compression”.
Most compression tools on Windows do both in a single step (e.g. when creating a .zip file), but they are separate operations.
Archiving takes multiple files and bundles them into one file.
Compression then shrinks that archive to save space.
Both A Wonderful Life and Another Wonderful Life keep these two steps separate: they first create archives (U8 archives, stored as .arc files), then compress them with CLZ compression.
Essentially, we need to figure out how to undo the compression step, do whatever we want to the uncompressed data, then redo the compression step.
All CLZ files will follow a similar format.
- File header (“CLZ”)
- The size of the uncompressed data as a hex value, stored twice. This is useful because it lets us check whether an attempt at decompressing a CLZ file succeeded.
- Compressed data
I haven’t been able to match the compression to any known algorithm, but the file extension suggests a Custom Lempel-Ziv (CLZ) algorithm.
The files use big-endian byte order.
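As a rough illustration of the layout described above, here is a minimal Python sketch that reads the header and uses the stored size as a sanity check. The field widths are assumptions, not confirmed facts: it assumes a 4-byte magic field ("CLZ" plus one padding byte) followed by two 32-bit big-endian sizes. The names parse_clz_header and looks_decompressed are made up for this example.

```python
import struct
from typing import NamedTuple

# ASSUMED layout (field widths are a guess, not verified):
#   bytes 0-3   magic: b"CLZ" followed by one padding byte
#   bytes 4-7   uncompressed size, big-endian unsigned 32-bit
#   bytes 8-11  the same uncompressed size again
#   bytes 12+   compressed data
HEADER = struct.Struct(">4sII")


class ClzHeader(NamedTuple):
    size_a: int      # first copy of the uncompressed size
    size_b: int      # second copy of the uncompressed size
    payload: bytes   # everything after the header


def parse_clz_header(blob: bytes) -> ClzHeader:
    """Split a CLZ file into its header fields and compressed payload."""
    if len(blob) < HEADER.size:
        raise ValueError("too short to hold a CLZ header")
    magic, size_a, size_b = HEADER.unpack_from(blob, 0)
    if not magic.startswith(b"CLZ"):
        raise ValueError(f"bad magic: {magic!r}")
    return ClzHeader(size_a, size_b, blob[HEADER.size:])


def looks_decompressed(header: ClzHeader, output: bytes) -> bool:
    """True if a decompression attempt produced the size the header promises."""
    return header.size_a == header.size_b == len(output)
```

If the real field widths turn out to be different, only the HEADER format string should need to change.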
They seem to use some sort of dynamic dictionary generation. This is easiest to see in the chapter arc.clz files, where the first occurrence of a pattern appears in the compressed data but subsequent occurrences do not. For example, “_0.arc” is visible as part of “boy0_0.arc” but not in the filenames that follow it.
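To show why a Lempel-Ziv style scheme would produce this effect, here is a toy LZ77-like compressor in Python. This is not the CLZ algorithm (which is still unknown). It is a generic textbook-style sketch, and the token format and function names are invented for illustration. Repeated text is replaced by a (distance, length) back-reference, so only the first occurrence stays readable.

```python
from typing import List, Tuple, Union

# A token is either a literal byte (int) or a (distance, length) back-reference.
Token = Union[int, Tuple[int, int]]


def toy_compress(data: bytes, min_match: int = 3) -> List[Token]:
    """Greedy LZ77-style tokenizer (illustrative only, not the CLZ format)."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        # Look through everything already seen for the longest match.
        for start in range(pos):
            length = 0
            while (pos + length < len(data)
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, pos - start
        if best_len >= min_match:
            tokens.append((best_dist, best_len))
            pos += best_len
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def toy_decompress(tokens: List[Token]) -> bytes:
    """Rebuild the data by replaying literals and back-references."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, int):
            out.append(token)
        else:
            distance, length = token
            for _ in range(length):
                out.append(out[-distance])  # copy one byte at a time
    return bytes(out)


def render(tokens: List[Token]) -> str:
    """Show literals as characters and back-references as <distance,length>."""
    return "".join(chr(t) if isinstance(t, int) else f"<{t[0]},{t[1]}>"
                   for t in tokens)


names = b"boy0_0.arc boy1_0.arc boy2_0.arc"
tokens = toy_compress(names)
print(render(tokens))  # boy0_0.arc <11,3>1<11,10>2<22,6>
```

In the rendered output, "_0.arc" appears once, inside "boy0_0.arc". The later filenames shrink to references plus the single digit that changed, which is similar to what shows up in the chapter arc.clz files.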
Comparing Compressed / Decompressed Files
Certain data on the disc can be used to see both compressed and decompressed versions of some files.
The disc:\test\Scripts folder contains numerous script (.sb) files.
Notably, this folder also contains a U8 archive, test.arc.
While this U8 archive is formatted in such a way that you can’t directly see the filenames, they can be viewed in a hex editor.
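As an alternative to scrolling through a hex editor, a short Python sketch can pull readable text out of a binary file, much like the Unix strings tool. It does not parse the U8 structure at all. It only scans for runs of printable ASCII, and printable_runs is a name invented for this example.

```python
import re
from typing import List


def printable_runs(blob: bytes, min_len: int = 4) -> List[str]:
    """Return every run of at least min_len printable ASCII bytes in blob."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


# Example with made-up bytes standing in for part of an archive:
sample = b"\x00\x01boy0_0.arc\x00\x00ab\x00test.sb\x00"
print(printable_runs(sample))  # ['boy0_0.arc', 'test.sb']
```

Short runs such as "ab" are skipped, which filters out most of the noise from binary data that happens to look like text.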
By comparing the uncompressed size stored in each CLZ header with the actual sizes of the .sb files, we can match the CLZ files to their uncompressed counterparts.
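The matching step can be sketched in Python as follows. This assumes the sizes have already been collected: one mapping from each CLZ file to the uncompressed size in its header, and one from each .sb file to its size on disc. The function name and the filenames in the example are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def match_by_size(clz_sizes: Dict[str, int],
                  sb_sizes: Dict[str, int]) -> Dict[str, List[str]]:
    """Map each CLZ file to the .sb files whose size equals its header size."""
    by_size = defaultdict(list)
    for name, size in sb_sizes.items():
        by_size[size].append(name)
    return {clz: sorted(by_size.get(size, []))
            for clz, size in clz_sizes.items()}


# Hypothetical names and sizes, for illustration only:
clz_sizes = {"a.clz": 0x1A40, "b.clz": 0x0200, "c.clz": 0x0999}
sb_sizes = {"intro.sb": 0x1A40, "shop.sb": 0x0200, "farm.sb": 0x0200}
for clz, candidates in match_by_size(clz_sizes, sb_sizes).items():
    print(clz, "->", candidates)
```

A CLZ file with exactly one candidate is a likely match. More than one candidate means several scripts share the same size and need to be told apart some other way, and an empty list means no uncompressed copy with that size was found.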
I’ve managed to match up the compressed/uncompressed files (and have renamed them accordingly). They can be found [here] for comparison/research purposes.