[ Beneath the Waves ]

TDR: Soul Reaver

article and software by Ben Lincoln

 

This is a basic walkthrough of decompiling a debug build of Soul Reaver using Ghidra and This Dust Remembers What It Once Was (TDR).

TDR was developed and tested with debug builds of Soul Reaver, so this walkthrough will assume you have one of those. I used the one with a build date of 1999-06-01, but any prototype version should work as long as it includes the .SYM file. Important: some of the European retail versions of Soul Reaver include .SYM files, but I don't think they actually match up with the game binary, as they're much older. Those probably won't work.

You'll probably want to copy all of the TDR binaries and DLLs to C:\Windows for convenience, or at least add their location to your PATH environment variable.
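Before starting, it can help to confirm that the TDR executables are actually reachable from your prompt. The following is a minimal Python sketch (not part of TDR) that looks up the four executable names used in this walkthrough on the current PATH. The function name missing_tools is only an illustrative choice.

```python
import shutil

# Executable names taken from the commands used in this walkthrough.
TDR_TOOLS = [
    "PlayStationELFConverter.exe",
    "SymDumpTE.exe",
    "CreateSkeleton.exe",
    "PopulateSkeleton.exe",
]


def missing_tools(tools=TDR_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All TDR tools were found on PATH.")
```

If anything is reported as missing, either copy the binaries as described above or fix your PATH before continuing.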

  1. In NTSC builds of Soul Reaver, the game binary is named SLUS_007.08, so the first step is to copy that to a working directory and rename it to KAIN2.EXE.
  2. Debug builds of Soul Reaver have a DEBUG directory on the disc. For NTSC debug builds, the symbols will be in the NTSC subdirectory of that. You really only need the KAIN2.SYM file, but I usually copy all of them into my working directory.
  3. Open a command prompt or PowerShell prompt and change directory to your working directory.
  4. Convert KAIN2.EXE to ELF format by running the following command:

    PlayStationELFConverter.exe --exe2elf KAIN2.EXE KAIN2.ELF > PlayStationELFConverter_Log.txt 2>&1

  5. Generate the JSON version of the debug symbols by running the following command:

    SymDumpTE.exe --rename-for-compatibility --json KAIN2.SYM KAIN2.json > SymDumpTE_Log.txt 2>&1

  6. Generate the monolithic header and Ghidra Java scripts by running the following command:

    CreateSkeleton.exe --create-playstation-memory --assume-sn-gp-base --name KAIN2 --externs-to-labels --output Output KAIN2.json > CreateSkeleton_Log.txt 2>&1

  7. Examine the contents of README-KAIN2-CreateSkeleton-Manual_Changes_Required.txt in the Output directory. It will most likely suggest a lot of manual cleanup. For the purposes of this walkthrough, only one change is strictly necessary, and it's detailed in the section at the end of this article. Skip down to that section, then come back here when you've made the change. That said, the more you correct the JSON file, the better Ghidra will do with the data. Once you've finished making any changes you want, re-run the previous CreateSkeleton.exe command to regenerate the output files.
  8. Launch Ghidra, and create a new project. For the base directory, use the Output directory created by TDR. For the project name, use KAIN2.
  9. Import the ELF file you generated earlier. Ghidra will default to 64-bit MIPS, which is wrong. Click the ... button next to the Language field. Scroll up in the list and choose MIPS/default/32/little/default processor architecture, which will show up as MIPS:LE:32:default:default in the import file window. Click OK to begin the import.
  10. Close the import summary dialogue.
  11. Double-click on KAIN2.ELF in the project list.
  12. An Analyze prompt will appear. Click No, because you don't want that to happen until the debug symbols have been imported.
  13. From the Edit menu, choose Tool Options.
  14. Expand Decompiler, and select Analysis. Uncheck Eliminate unreachable code. Click OK.
  15. From the File menu, choose the Parse C Source option. Click the green plus sign button. Open the KAIN2.H file in the Output/source-stubs directory.
  16. Click Parse to Program. Click Continue. Click Continue?.
  17. After a moment, you should receive a message indicating that the header has been parsed successfully. If you don't, make sure you resolved any naming conflicts in the JSON file, re-run the CreateSkeleton.exe command above, and then re-import the KAIN2.H file. Otherwise, click OK, then click Dismiss.
  18. Copy the KAIN2TDRDefineFunctions.java and KAIN2TDRDecompile.java scripts from the Output/ghidra_files/ directory into your own Ghidra scripts directory (probably something like C:\Users\yourname\ghidra_scripts). Note: these files are dynamically generated, so you will need to re-copy them (overwriting the existing copies if necessary) every time they change, or when working on multiple projects.
  19. In Ghidra, from the Window menu, choose the Script Manager option.
  20. In the Script Manager window, click on the KAIN2TDRDefineFunctions.java entry, then click the green-and-white play button in the upper-right corner of the window.
  21. After a noticeable delay, you should see a KAIN2TDRDefineFunctions.java> Finished! message in the console at the bottom of the main Ghidra window.
  22. From the Analysis menu, choose Auto Analyze 'KAIN2.ELF'. Check the Decompiler Parameter ID box if it's not already checked. Switch to the MIPS Constant Reference Analyzer section. Uncheck Recover global GP register writes if it's checked. Optionally, check Attempt to recover switch tables. Click Analyze.
  23. Wait for the analysis to complete (progress is shown in the lower-right corner of the main Ghidra window).
  24. This should be enough for a basic demonstration of the toolchain. However, if you're really trying to fully reverse-engineer the game, this is the point at which you would do all of that work in Ghidra. That will take a while, and is outside the scope of this walkthrough.
  25. When you're ready to proceed with generating source code, go back to the Script Manager window, and run the KAIN2TDRDecompile.java script.
  26. Wait for the decompilation to finish. This will usually take a while. You should see a KAIN2TDRDecompile.java> Finished! message in the console at the bottom of the main Ghidra window when it's complete.
  27. Back in the command prompt, create another set of C source code files which contain the decompiled functions output by Ghidra by running the following command:

    PopulateSkeleton.exe --name KAIN2 --input-json KAIN2.json --input-source Output\KAIN2.C --output Output > PopulateSkeleton_Log.txt 2>&1

  28. Examine the contents of Output/source-decompiled, which should contain TDR's best attempt at reconstructing the original source code in all of the separate files that were originally used. Anything not matched to one of those files will be placed in Unmatched_Decompiled_Functions.C instead.
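The three command-line steps that come before Ghidra (steps 4 through 6) can also be driven from a script. The following Python sketch is not part of TDR; it only rebuilds the same command lines shown above, using the same flags and log file names, and redirects stdout and stderr to each log as the `> file 2>&1` redirections do. The function names tdr_commands and run_all are hypothetical, and whether the TDR tools signal failure through their exit code is an assumption, so the logs should still be read.

```python
import subprocess


def tdr_commands(name="KAIN2", output_dir="Output"):
    """Build the command lines from steps 4 to 6 as (argv, log file) pairs."""
    return [
        (["PlayStationELFConverter.exe", "--exe2elf",
          f"{name}.EXE", f"{name}.ELF"],
         "PlayStationELFConverter_Log.txt"),
        (["SymDumpTE.exe", "--rename-for-compatibility", "--json",
          f"{name}.SYM", f"{name}.json"],
         "SymDumpTE_Log.txt"),
        (["CreateSkeleton.exe", "--create-playstation-memory",
          "--assume-sn-gp-base", "--name", name, "--externs-to-labels",
          "--output", output_dir, f"{name}.json"],
         "CreateSkeleton_Log.txt"),
    ]


def run_all(commands):
    """Run each command in order, sending stdout and stderr to its log file.

    Returns a list of (executable, exit code) pairs. Treating a non-zero
    exit code as a failure is an assumption about the TDR tools.
    """
    results = []
    for argv, log_name in commands:
        with open(log_name, "w") as log:
            done = subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT)
        results.append((argv[0], done.returncode))
    return results
```

Calling run_all(tdr_commands()) from the working directory would run all three steps. After you have edited KAIN2.json by hand, pass only the last entry, for example run_all(tdr_commands()[2:]), because re-running SymDumpTE would presumably regenerate the JSON file and discard your changes.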

The one required change for the Soul Reaver files (mentioned above):

As noted in step 7, there are lots of changes that would be good to make, but one is absolutely required for Soul Reaver: without it, Ghidra won't be able to parse the C header file.

Open KAIN2.json and search for "name": "_walbossAttributes". You should find a section that looks like this:

                "UsedByFunctions": [],
                "struct_member_signature": "struct _253fake attackDeltas[0]; // size=0, offset=24",
                "class_type": "struct_member",
                "c_type": "struct _253fake[0]",
                "type_name": "struct _253fake[0]",
                "size": 0,
                "offset": 24,
                "parent_name": "struct _walbossAttributes",
                "parent_hashcode": 0,
                "name": "attackDeltas",
                "source_file": null,
                "hashcode": 0
            }
        ],
        "name": "_walbossAttributes",
        "source_file": null,
        "hashcode": -1378913783

There should be three instances of the text struct _253fake, or a similar name depending on which build of Soul Reaver you're looking at (struct .255fake, etc.). Change all three instances of the struct name so that they read struct _wba253fake. There's nothing special about this replacement name; it just needs to be unique, because there's another struct or union with the same name elsewhere in the debug symbols.
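Since the exact name differs between builds, it may be useful to list which struct or union names collide in your copy of the JSON file. The following Python sketch assumes that type definitions are the JSON objects that carry both a "name" and a "members" key, as in the excerpts in this article; the overall layout of the file is not documented here, so the code walks the whole document instead of assuming a particular top-level structure. The function names are hypothetical.

```python
import json
from collections import Counter


def iter_dicts(node):
    """Yield every JSON object (dict) found anywhere inside node."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_dicts(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_dicts(item)


def duplicate_type_names(doc):
    """Return the names shared by more than one object with a "members" list.

    Assumption: struct and union definitions are the objects that have both
    a "name" string and a "members" list.
    """
    counts = Counter(
        d["name"]
        for d in iter_dicts(doc)
        if isinstance(d.get("members"), list) and isinstance(d.get("name"), str)
    )
    return sorted(name for name, count in counts.items() if count > 1)


def duplicate_type_names_in_file(path):
    """Load a JSON file such as KAIN2.json and report colliding type names."""
    with open(path, "rb") as handle:
        return duplicate_type_names(json.load(handle))
```

For example, print(duplicate_type_names_in_file("KAIN2.json")) would show the candidates. Not every repeated name is necessarily a problem for Ghidra, so treat the result as a list of places to look.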

When you're done, the section should look something like this:

                "UsedByFunctions": [],
                "struct_member_signature": "struct _wba253fake attackDeltas[0]; // size=0, offset=24",
                "class_type": "struct_member",
                "c_type": "struct _wba253fake[0]",
                "type_name": "struct _wba253fake[0]",
                "size": 0,
                "offset": 24,
                "parent_name": "struct _walbossAttributes",
                "parent_hashcode": 0,
                "name": "attackDeltas",
                "source_file": null,
                "hashcode": 0
            }
        ],
        "name": "_walbossAttributes",
        "source_file": null,
        "hashcode": -1378913783

Scroll up just past the beginning of the _walbossAttributes struct definition, and you should find the definition of the _253fake struct that it references. It should look something like this:

    {
        "UsedByFunctions": [],
        "members": [
            {
                "UsedByFunctions": [],
                "struct_member_signature": "short plusDelta; // size=0, offset=0",
                "class_type": "struct_member",
                "c_type": "short",
                "type_name": "short",
                "size": 2,
                "offset": 0,
                "parent_name": "struct _253fake",
                "parent_hashcode": 0,
                "name": "plusDelta",
                "source_file": null,
                "hashcode": 0
            },
            {
                "UsedByFunctions": [],
                "struct_member_signature": "short minusDelta; // size=0, offset=2",
                "class_type": "struct_member",
                "c_type": "short",
                "type_name": "short",
                "size": 2,
                "offset": 2,
                "parent_name": "struct _253fake",
                "parent_hashcode": 0,
                "name": "minusDelta",
                "source_file": null,
                "hashcode": 0
            },
            {
                "UsedByFunctions": [],
                "struct_member_signature": "short validAtHitPoint; // size=0, offset=4",
                "class_type": "struct_member",
                "c_type": "short",
                "type_name": "short",
                "size": 2,
                "offset": 4,
                "parent_name": "struct _253fake",
                "parent_hashcode": 0,
                "name": "validAtHitPoint",
                "source_file": null,
                "hashcode": 0
            }
        ],
        "name": "_253fake",
        "source_file": null,
        "hashcode": -1080400537
    },

Replace all the occurrences of _253fake (or whatever it's called in the build you're looking at) with your replacement name, so that it looks like this:

    {
        "UsedByFunctions": [],
        "members": [
            {
                "UsedByFunctions": [],
                "struct_member_signature": "short plusDelta; // size=0, offset=0",
                "class_type": "struct_member",
                "c_type": "short",
                "type_name": "short",
                "size": 2,
                "offset": 0,
                "parent_name": "struct _wba253fake",
                "parent_hashcode": 0,
                "name": "plusDelta",
                "source_file": null,
                "hashcode": 0
            },
            {
                "UsedByFunctions": [],
                "struct_member_signature": "short minusDelta; // size=0, offset=2",
                "class_type": "struct_member",
                "c_type": "short",
                "type_name": "short",
                "size": 2,
                "offset": 2,
                "parent_name": "struct _wba253fake",
                "parent_hashcode": 0,
                "name": "minusDelta",
                "source_file": null,
                "hashcode": 0
            },
            {
                "UsedByFunctions": [],
                "struct_member_signature": "short validAtHitPoint; // size=0, offset=4",
                "class_type": "struct_member",
                "c_type": "short",
                "type_name": "short",
                "size": 2,
                "offset": 4,
                "parent_name": "struct _wba253fake",
                "parent_hashcode": 0,
                "name": "validAtHitPoint",
                "source_file": null,
                "hashcode": 0
            }
        ],
        "name": "_wba253fake",
        "source_file": null,
        "hashcode": -1080400537
    },

Again, once you've made these changes, return to step 7 and re-run the CreateSkeleton.exe command from step 6 to regenerate the output files.
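The manual rename described in this section can also be sketched in Python. A blind search-and-replace over the whole file would rename the other, unrelated _253fake as well, which would defeat the purpose, so this sketch only touches the definition that sits immediately before _walbossAttributes and the references inside _walbossAttributes itself. It assumes that type definitions are sibling objects in a JSON list, in the order the article describes; that layout and the function names are assumptions, not something TDR documents here.

```python
import re


def _replace_strings(node, pattern, new):
    """Rewrite every string value under node in place; return how many changed."""
    if isinstance(node, dict):
        entries = list(node.items())
    elif isinstance(node, list):
        entries = list(enumerate(node))
    else:
        return 0
    changed = 0
    for key, value in entries:
        if isinstance(value, str):
            updated = pattern.sub(lambda match: new, value)
            if updated != value:
                node[key] = updated
                changed += 1
        else:
            changed += _replace_strings(value, pattern, new)
    return changed


def rename_fake_struct(node, parent="_walbossAttributes",
                       old="_253fake", new="_wba253fake"):
    """Rename the `old` type defined just before `parent`, and its uses in `parent`.

    Assumption: definitions are sibling objects in a list, with the anonymous
    struct immediately preceding the struct that references it.
    Returns the number of string values that were changed.
    """
    # Match the old name only as a whole identifier, not as part of a longer one.
    pattern = re.compile(r"(?<![A-Za-z0-9_])" + re.escape(old) + r"(?![A-Za-z0-9_])")
    changed = 0
    if isinstance(node, list):
        for index, item in enumerate(node):
            previous = node[index - 1] if index > 0 else None
            if (isinstance(item, dict) and item.get("name") == parent
                    and isinstance(previous, dict) and previous.get("name") == old):
                changed += _replace_strings(previous, pattern, new)
                changed += _replace_strings(item, pattern, new)
            else:
                changed += rename_fake_struct(item, parent, old, new)
    elif isinstance(node, dict):
        for value in node.values():
            changed += rename_fake_struct(value, parent, old, new)
    return changed
```

To use it, load KAIN2.json with json.load, call rename_fake_struct on the result, and write the document back with json.dump. Keep a backup of the original file, and adjust the old name if your build uses a different one. The hashcode values are left untouched, as in the manual edit shown above.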

 