[ Beneath the Waves ]

TDR: Soul Reaver

article and software by Ben Lincoln

 

This is a basic walkthrough of decompiling a debug build of Soul Reaver using Ghidra and This Dust Remembers What It Once Was.

TDR was developed and tested with debug builds of Soul Reaver, so this walkthrough will assume you have one of those. I used the one with a build date of 1999-06-01, but any prototype version should work as long as it includes the .SYM file. Important: some of the European retail versions of Soul Reaver include .SYM files, but I don't think they actually match up with the game binary, as they're much older. Those probably won't work.

You'll probably want to copy all of the TDR binaries and DLLs to C:\Windows for convenience, or at least add their location to your PATH environment variable.
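
As a quick sanity check before starting, the following is a minimal Python sketch (not part of TDR) that reports which of the executables used later in this walkthrough cannot be found on your PATH. The helper name missing_tools is made up for illustration; the executable names are the ones that appear in the commands below.

```python
import shutil

# Executable names taken from the commands used later in this walkthrough.
TDR_TOOLS = [
    "PlayStationELFConverter.exe",
    "SymDumpTE.exe",
    "CreateSkeleton.exe",
    "PopulateSkeleton.exe",
]


def missing_tools(tools=TDR_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the current PATH.

    The lookup function is a parameter so it can be swapped out for testing.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All TDR tools were found on PATH.")
```

If anything is reported as missing, fix your PATH (or copy the binaries) before continuing.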

  1. In NTSC builds of Soul Reaver, the game binary is named SLUS_007.08, so the first step is to copy that to a working directory and rename it to KAIN2.EXE.
  2. Debug builds of Soul Reaver have a DEBUG directory on the disc. For NTSC debug builds, the symbols will be in the NTSC subdirectory of that. You really only need the KAIN2.SYM file, but I usually copy all of them into my working directory.
  3. Open a command prompt or PowerShell prompt and change directory to your working directory.
  4. Convert KAIN2.EXE to ELF format by running the following command:

    PlayStationELFConverter.exe --exe2elf KAIN2.EXE KAIN2.ELF > PlayStationELFConverter_Log.txt 2>&1

  5. Generate the JSON version of the debug symbols by running the following command:

    SymDumpTE.exe --debug --ignore-duplicate-definitions --auto-rename-fakes --inline-fakes --rename-for-compatibility --json KAIN2.SYM KAIN2.json > SymDumpTE_Log.txt 2>&1

  6. Generate the header/stub files, Ghidra scripts, and an updated/extended/mapped version of the JSON debug symbol file by running the following command:

    CreateSkeleton.exe --create-playstation-memory --assume-sn-gp-base --map-sld-functions --name KAIN2 --externs-to-labels --output-updated-json KAIN2-Mapped.json --output Output KAIN2.json > CreateSkeleton_Log.txt 2>&1

  7. Examine the contents of README-KAIN2-CreateSkeleton-Manual_Changes_Required.txt in the Output directory. There will most likely be a lot of manual cleanup suggested. For the purposes of this walkthrough, you can skip that and come back to it later if you want to.
  8. Launch Ghidra, and create a new project. For the base directory, use the Output directory created by TDR. For the project name, use KAIN2.
  9. Import the ELF file you generated earlier. Ghidra will default to 64-bit MIPS, which is wrong. Click the ... button next to the Language field. Scroll up in the list and choose MIPS/default/32/little/default processor architecture, which will show up as MIPS:LE:32:default:default in the import file window. Click OK to begin the import.
  10. Close the import summary dialogue.
  11. Double-click on KAIN2.ELF in the project list.
  12. An Analyze prompt will appear. Click No, because you don't want that to happen until the debug symbols have been imported.
  13. From the Edit menu, choose Tool Options.
  14. Expand Decompiler, and select Analysis. Uncheck Eliminate unreachable code. Click OK.
  15. From the File menu, choose the Parse C Source option. Click the green plus sign button. Open the KAIN2.H file in the Output directory.
  16. Click Parse to Program. Click Continue. Click Continue?.
  17. After a moment, you should receive a message indicating that the header has been parsed successfully. If you don't, make sure you resolved any naming conflicts in the JSON file, re-run the CreateSkeleton.exe command above, and then re-import the KAIN2.H file. Otherwise, click OK, then click Dismiss.
  18. Copy the KAIN2TDRAggressiveArrayIdentification.java, KAIN2TDRDecompile.java, KAIN2TDRDefineFunctions.java, KAIN2TDRExportData.java, and KAIN2TDRMapMemoryAndCreateLabels.java scripts from the Output/ghidra_files/ directory into your own Ghidra scripts directory (probably something like C:\Users\yourname\ghidra_scripts). Note: these files are dynamically generated, so you will need to re-copy them (overwriting the existing copies if necessary) every time they change, or when working on multiple projects.
  19. In Ghidra, from the Window menu, choose the Script Manager option.
  20. In the Script Manager window, click on the KAIN2TDRMapMemoryAndCreateLabels.java entry, then click the green-and-white play button in the upper-right corner of the window. This script creates any necessary PlayStation memory segments and applies labels found in the debug symbols.
  21. After a noticeable delay, you should see a KAIN2TDRMapMemoryAndCreateLabels.java> Finished! message in the console at the bottom of the main Ghidra window.
  22. Use the Script Manager to execute the KAIN2TDRDefineFunctions.java entry. This script imports function definitions and a few other things from the debug symbols.
  23. From the Analysis menu, choose Auto Analyze 'KAIN2.ELF'. Check the Decompiler Parameter ID box if it's not already checked. Switch to the MIPS Constant Reference Analyzer section. Uncheck Recover global GP register writes if it's checked. Optionally, check Attempt to recover switch tables. Click Analyze.
  24. Wait for the analysis to complete (progress is shown in the lower-right corner of the main Ghidra window).
  25. Optional, but highly recommended: use the Script Manager to execute the KAIN2TDRAggressiveArrayIdentification.java entry. The options in the script popup are preset by TDR - you shouldn't need to change them in most cases. This script attempts to detect cases where a global variable exists with embedded data in the PlayStation binary, but Ghidra has only identified the first element of the entire array. It will generally do a very good job, but some manual cleanup work may be necessary later.
  26. Do any manual reverse-engineering work you want in Ghidra. This may be a very lengthy step if you're trying to rebuild a working binary. If you're just trying things out, you can skip this for now and come back to it later.
  27. This should be enough for a basic demonstration of the toolchain. A full reverse-engineering effort would involve much more work in Ghidra at this point, which will take a while and is outside the scope of this walkthrough.
  28. When you're ready to proceed with generating source code, go back to the Script Manager window, and run the KAIN2TDRDecompile.java script. Click OK in the popup - the location of the output file is preset by TDR, and you shouldn't change it in normal use.
  29. Wait for the decompilation to happen. This will usually take a while. You should see a KAIN2TDRDecompile.java> Finished! message in the console at the bottom of the main Ghidra window when it's complete.
  30. In the Script Manager window, run the KAIN2TDRExportData.java script and wait for it to finish. The options in the script popup are preset by TDR - you shouldn't need to change them in most cases. This script will create a file named XPRTDATA.C in your output directory which contains C code that should create any embedded data from the game binary which is referenced by the decompiled code (global variables, etc.).
  31. Back in the command prompt, create another set of C source code files which contain the decompiled functions and global variable data output by Ghidra by running the following command:

    PopulateSkeleton.exe --name KAIN2 --input-json KAIN2-Mapped.json --input-source Output\KAIN2.C --input-data Output\XPRTDATA.C --output Output > PopulateSkeleton_Log.txt 2>&1

  32. Examine the contents of Output/PRIMARY/source-decompiled, which should contain TDR's best attempt at reconstructing the original source code in all of the separate files that were originally used. Anything not matched to one of those files will be placed in THISDUST.C or THISDUST.H instead.
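
If you expect to repeat this process, the command-line parts (steps 4 through 6 before Ghidra, and step 31 after it) can be wrapped in a small script. The following is a minimal Python sketch, not something that ships with TDR. The command lines and log file names are copied from the steps above; the names PRE_GHIDRA, POST_GHIDRA, and run_stage are made up for illustration, and the sketch assumes that each tool returns a non-zero exit code on failure, which I haven't verified.

```python
import subprocess

# Command lines copied from steps 4, 5, and 6, each paired with the log file
# name used in the walkthrough.
PRE_GHIDRA = [
    (["PlayStationELFConverter.exe", "--exe2elf", "KAIN2.EXE", "KAIN2.ELF"],
     "PlayStationELFConverter_Log.txt"),
    (["SymDumpTE.exe", "--debug", "--ignore-duplicate-definitions",
      "--auto-rename-fakes", "--inline-fakes", "--rename-for-compatibility",
      "--json", "KAIN2.SYM", "KAIN2.json"],
     "SymDumpTE_Log.txt"),
    (["CreateSkeleton.exe", "--create-playstation-memory",
      "--assume-sn-gp-base", "--map-sld-functions", "--name", "KAIN2",
      "--externs-to-labels", "--output-updated-json", "KAIN2-Mapped.json",
      "--output", "Output", "KAIN2.json"],
     "CreateSkeleton_Log.txt"),
]

# Command line copied from step 31, to be run after the Ghidra work is done.
POST_GHIDRA = [
    (["PopulateSkeleton.exe", "--name", "KAIN2",
      "--input-json", "KAIN2-Mapped.json",
      "--input-source", r"Output\KAIN2.C",
      "--input-data", r"Output\XPRTDATA.C",
      "--output", "Output"],
     "PopulateSkeleton_Log.txt"),
]


def run_stage(commands, runner=subprocess.run):
    """Run each command in order, writing stdout and stderr to its log file.

    Stops at the first command that exits with a non-zero status (assumed to
    mean failure) and returns the name of that executable. Returns None if
    every command succeeded.
    """
    for args, log_name in commands:
        with open(log_name, "w") as log:
            result = runner(args, stdout=log, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            return args[0]
    return None
```

Run it from the working directory: call run_stage(PRE_GHIDRA) before launching Ghidra, and run_stage(POST_GHIDRA) once the KAIN2TDRExportData.java script has finished. If either call returns a tool name, check the corresponding log file.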
 