[ Beneath the Waves ]

TDR: Biohazard 2

article and software by Ben Lincoln

 

This is a basic walkthrough of decompiling the 1997-10-30 beta build of Biohazard 2 using Ghidra and This Dust Remembers What It Once Was.

Note: this build of Biohazard 2 uses PsyQ memory overlays. That aspect of decompiling this game is not currently covered by this walkthrough, but will be added in a future version.

  1. From the Biohazard 2 prototype ISO, copy PSX.EXE and MAIN.SYM to your working directory.
  2. In your working directory, rename PSX.EXE to MAIN.EXE.
  3. Open a command prompt or PowerShell prompt and change directory to your working directory.
  4. Convert MAIN.EXE to ELF format by running the following command:

    PlayStationELFConverter.exe --exe2elf MAIN.EXE MAIN.ELF > PlayStationELFConverter_Log.txt 2>&1

  5. Generate the JSON version of the debug symbols by running the following command:

    SymDumpTE.exe --debug --ignore-duplicate-definitions --rename-for-compatibility --auto-rename-fakes --inline-fakes --json MAIN.SYM MAIN.json > SymDumpTE_Log.txt 2>&1

  6. Generate the header/stub files, Ghidra scripts, and an updated/extended/mapped version of the JSON debug symbol file by running the following command:

    CreateSkeleton.exe --replace-non-ascii-labels --create-playstation-memory --map-sld-functions --name MAIN --externs-to-labels --output-updated-json MAIN-Mapped.json --output Output MAIN.json > CreateSkeleton_Log.txt 2>&1

    The --replace-non-ascii-labels (or --ignore-non-ascii-labels) flag is necessary with this game because of the enormous number of non-printable (maybe Japanese, using some strange encoding?) labels embedded in the debug symbol data.
    Note that unlike most of the other walkthroughs, this one omits the --assume-sn-gp-base flag from the CreateSkeleton.exe command. This is because Biohazard 2 does not have an __SN_GP_BASE label in the debug symbols that points to the global pointer value. A future version of this walkthrough will include instructions for manually determining that value, and using it with the --use-gp-base flag instead.
  7. Examine the contents of README-MAIN-CreateSkeleton-Manual_Changes_Required.txt in the Output directory. If you want to follow up on any of the recommendations in it, do so now, then re-run the previous CreateSkeleton.exe command.
  8. Launch Ghidra, and create a new project. For the base directory, use the Output directory created by TDR.
  9. Import the ELF file you generated earlier. Ghidra will default to 64-bit MIPS, which is wrong. Click the ... button next to the Language field. Scroll up in the list and choose the MIPS/default/32/little/default processor architecture, which will show up as MIPS:LE:32:default:default in the import file window. Click OK to begin the import.
  10. Close the import summary dialogue.
  11. Double-click on the ELF in the project list.
  12. An Analyze prompt will appear. Click No, because you don't want that to happen until the debug symbols have been imported.
  13. From the Edit menu, choose Tool Options.
  14. Expand Decompiler, and select Analysis. Uncheck Eliminate unreachable code. Click OK.
  15. From the File menu, choose the Parse C Source option. Click the green plus sign button. Open the MAIN.H file in the Output directory.
  16. Click Parse to Program. Click Continue. Click Continue?.
  17. After a moment, you should receive a message indicating that the header has been parsed successfully. If you don't, make sure you resolved any naming conflicts in the JSON file, re-run the CreateSkeleton.exe command above, and then re-import the MAIN.H file. Otherwise, click OK, then click Dismiss.
  18. Copy the MAINTDRAggressiveArrayIdentification.java, MAINTDRDecompile.java, MAINTDRDefineFunctions.java, MAINTDRExportData.java, and MAINTDRMapMemoryAndCreateLabels.java scripts from the Output/ghidra_files/ directory into your own Ghidra scripts directory (probably something like C:\Users\yourname\ghidra_scripts). Note: these files are dynamically generated, so you will need to re-copy them (overwriting the existing copies if necessary) every time they change, or when working on multiple projects.
  19. In Ghidra, from the Window menu, choose the Script Manager option.
  20. In the Script Manager window, click on the MAINTDRMapMemoryAndCreateLabels.java entry, then click the green-and-white play button in the upper-right corner of the window. This script creates any necessary PlayStation memory segments and applies labels found in the debug symbols.
  21. After a noticeable delay, you should see a MAINTDRMapMemoryAndCreateLabels.java> Finished! message in the console at the bottom of the main Ghidra window.
  22. Use the Script Manager to execute the MAINTDRDefineFunctions.java entry. This script imports function definitions and a few other things from the debug symbols.
  23. From the Analysis menu, choose Auto Analyze 'MAIN.ELF'. Check the Decompiler Parameter ID box if it's not already checked. Switch to the MIPS Constant Reference Analyzer section. Uncheck Recover global GP register writes if it's checked. Optionally, check Attempt to recover switch tables. Click Analyze.
  24. Wait for the analysis to complete (progress is shown in the lower-right corner of the main Ghidra window).
  25. Optional, but highly recommended: use the Script Manager to execute the MAINTDRAggressiveArrayIdentification.java entry. The options in the script popup are preset by TDR - you shouldn't need to change them in most cases. This script attempts to detect cases where a global variable exists with embedded data in the PlayStation binary, but Ghidra has only identified the first element of the entire array. It will generally do a very good job, but some manual cleanup work may be necessary later.
  26. Do any manual reverse-engineering work you want in Ghidra. This may be a very lengthy step if you're trying to rebuild a working binary. If you're just trying things out, you can skip this for now and come back to it later.
  27. When you're ready to proceed with generating source code, go back to the Script Manager window, and run the MAINTDRDecompile.java script. Click OK in the popup - the location of the output file is preset by TDR, and you shouldn't change it in normal use.
  28. Wait for the decompilation to finish. This will usually take a while. You should see a MAINTDRDecompile.java> Finished! message in the console at the bottom of the main Ghidra window when it's complete.
  29. In the Script Manager window, run the MAINTDRExportData.java script and wait for it to finish. The options in the script popup are preset by TDR - you shouldn't need to change them in most cases. This script will create a file named XPRTDATA.C in your output directory which contains C code that should create any embedded data from the game binary which is referenced by the decompiled code (global variables, etc.).
  30. Back in the command prompt, create another set of C source code files which contain the decompiled functions and global variable data output by Ghidra by running the following command:

    PopulateSkeleton.exe --name MAIN --input-json MAIN-Mapped.json --input-source Output\MAIN.C --input-data Output\XPRTDATA.C --output Output > PopulateSkeleton_Log.txt 2>&1

  31. Examine the contents of Output/PRIMARY/source-decompiled, which should contain TDR's best attempt at reconstructing the original source code in all of the separate files that were originally used. Anything not matched to one of those files will be placed in THISDUST.C or THISDUST.H instead.
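
Before converting MAIN.EXE in step 4, it can be worth confirming that the file copied from the ISO really is a PlayStation executable. The following is a minimal Python sketch of such a check, not part of TDR. The function name and the returned dictionary keys are made up for illustration, and the field offsets are taken from commonly circulated community documentation of the PS-X EXE header rather than from anything TDR provides. The header also contains an initial global pointer field, but it is frequently zero, so treat it as a hint at most and not as a substitute for the --use-gp-base instructions mentioned in step 6.

```python
import struct

PSX_EXE_MAGIC = b"PS-X EXE"  # ASCII identifier at offset 0 of the file
HEADER_SIZE = 0x800          # the header occupies the first 2048 bytes


def read_psx_exe_header(data: bytes) -> dict:
    """Return a few header fields, or raise ValueError if this is not a PS-X EXE.

    Hypothetical helper: offsets are assumed from community documentation.
    """
    if len(data) < HEADER_SIZE or data[:8] != PSX_EXE_MAGIC:
        raise ValueError("not a PS-X EXE (bad magic or truncated header)")
    # Four little-endian 32-bit words starting at offset 0x10:
    # initial program counter, initial global pointer, load address, code size.
    initial_pc, initial_gp, load_address, text_size = struct.unpack_from(
        "<4I", data, 0x10
    )
    return {
        "initial_pc": initial_pc,
        "initial_gp": initial_gp,      # often 0, so not necessarily useful
        "load_address": load_address,
        "text_size": text_size,        # size of the code/data after the header
    }
```

To use it, read the file into memory, for example with open("MAIN.EXE", "rb").read(), pass the bytes to read_psx_exe_header, and print the values in hexadecimal.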
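
Steps 7 and 17 may send you back to re-run the command-line stages, so it can be convenient to script steps 4 through 6. The sketch below is one way to do that in Python, using the exact arguments from this walkthrough and writing the same log files that the > ... 2>&1 redirections produce. The function names are made up for illustration. It assumes the TDR executables can be found from your working directory or PATH, and that they report failure with a non-zero exit code, which this walkthrough does not confirm - check the logs either way.

```python
import subprocess
from pathlib import Path

# (log file name, command line) for steps 4, 5, and 6 of the walkthrough.
STAGES = [
    ("PlayStationELFConverter_Log.txt",
     ["PlayStationELFConverter.exe", "--exe2elf", "MAIN.EXE", "MAIN.ELF"]),
    ("SymDumpTE_Log.txt",
     ["SymDumpTE.exe", "--debug", "--ignore-duplicate-definitions",
      "--rename-for-compatibility", "--auto-rename-fakes", "--inline-fakes",
      "--json", "MAIN.SYM", "MAIN.json"]),
    ("CreateSkeleton_Log.txt",
     ["CreateSkeleton.exe", "--replace-non-ascii-labels",
      "--create-playstation-memory", "--map-sld-functions", "--name", "MAIN",
      "--externs-to-labels", "--output-updated-json", "MAIN-Mapped.json",
      "--output", "Output", "MAIN.json"]),
]


def run_stage(command, log_name, work_dir="."):
    """Run one command in work_dir, sending stdout and stderr to the log file."""
    log_path = Path(work_dir) / log_name
    with open(log_path, "wb") as log:
        result = subprocess.run(command, cwd=work_dir,
                                stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


def run_stages(stages=STAGES, work_dir="."):
    """Run the stages in order; return (program, exit code) for the first
    stage with a non-zero exit code, or None if every stage returned zero."""
    for log_name, command in stages:
        code = run_stage(command, log_name, work_dir)
        if code != 0:
            return (command[0], code)
    return None
```

Calling run_stages(work_dir="path/to/working/directory") runs all three stages. A return value of None only means that no stage returned a non-zero exit code, so README-MAIN-CreateSkeleton-Manual_Changes_Required.txt still needs to be read as described in step 7.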
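
Because the Ghidra scripts from step 18 are regenerated every time CreateSkeleton.exe runs, re-copying them by hand is easy to forget. A small helper along the following lines can do it. This is only a sketch: the function name is made up, and it assumes that every generated script is a .java file in Output/ghidra_files whose name begins with the value passed to --name (MAIN in this walkthrough), which matches the file names listed in step 18.

```python
import shutil
from pathlib import Path


def copy_generated_scripts(output_dir, ghidra_scripts_dir, name="MAIN"):
    """Copy <name>*.java from <output_dir>/ghidra_files into the Ghidra
    scripts directory, overwriting older copies. Returns the copied names."""
    source_dir = Path(output_dir) / "ghidra_files"
    dest_dir = Path(ghidra_scripts_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for script in sorted(source_dir.glob(f"{name}*.java")):
        # copy2 overwrites an existing file of the same name.
        shutil.copy2(script, dest_dir / script.name)
        copied.append(script.name)
    return copied
```

For example, copy_generated_scripts("Output", r"C:\Users\yourname\ghidra_scripts") would refresh the scripts for this project. The destination path is the same guess used in step 18, so adjust it to wherever your Ghidra scripts directory actually lives.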
 