fonic/wcdatool
Tool to aid disassembling DOS applications created with the Watcom Toolchain.
I'm striving to become a full-time developer of Free and open-source software (FOSS). Donations help me achieve that goal and are highly appreciated.
- The Watcom Toolchain
- Yet Another Disassembly Tool?
- Current State / Future Development
- Output Sample
- Getting Started
- Wcdatool Usage Information
- Contact Information
Many DOS applications of the 90s, especially games, were developed using the Watcom Toolchain. Notable examples are DOOM, Warcraft, Syndicate and Mortal Kombat, just to name a few.
Most end-users have probably never heard of Watcom, but might remember applications displaying a startup banner reading something like this: `DOS/4G(W) Protected Mode Run-time [...]`.
DOS/4G(W) was a popular DOS extender bundled with the Watcom Toolchain, allowing DOS applications to run in 32-bit protected mode and thus reach well beyond the limits of 16-bit (MS-)DOS.
Nowadays, the Watcom Toolchain is open source and lives on as Open Watcom / Open Watcom v2 Fork.
The idea for this tool emerged when I discovered that one of my all-time favorite games, Mortal Kombat (CD version), was mainly written in Assembler (almost a line-by-line port of the arcade version) and was released unstripped (i.e. the executable contains debug symbols). I tried using various decompilation/disassembly tools on it, only to realize that none seemed to be capable of dealing with the specifics of Watcom-based applications.
Hence, I began writing my own tool. What initially started out as mkdecomptool specifically for Mortal Kombat gradually became the now general-purpose Watcom Disassembly Tool (wcdatool).
Note that while wcdatool performs the tasks it is designed for quite well, it is not intended to compete with or replace high-end tools like IDA Pro or Ghidra.
Wcdatool works quite well in its current state - you'll get a well-readable, reasonably structured disassembly output (objdump format, Intel syntax). Check out issues #9 and #11 for games other than Mortal Kombat that wcdatool has worked nicely for thus far. Please note that wcdatool works best when used on executables that contain debug symbols. If you come across other unstripped Watcom-based DOS applications that may be used for further testing and development, please let me know.
The next major goal is to cleanly rewrite the disassembler module and transition from static code disassembly to execution flow tracing. Also, instead of treating an executable's objects separately, a linear unified address space containing all object data will be the basis for future processing. This will make it possible to apply fixups at the binary level, which should simplify dealing with references that cross object boundaries, such as placeholders/stubs (which are patched via fixups at run time). Mortal Kombat 2's executable will be the baseline for the new approach, as it contains code regions within its data object (which are currently neither discovered nor processed) and extensively uses placeholders/stubs for jump/call targets that cross object boundaries (which are currently not handled properly).
Last but not least, wcdatool in its current state is relatively slow, as performance has not been the main focus during development. Cython might be utilized in the future to increase performance.
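To illustrate the idea of a linear unified address space with fixups applied at the binary level, here is a minimal sketch. It is not wcdatool's implementation: the `ExeObject` and `Fixup` types, their fields, and the assumption that every fixup is a plain 32-bit little-endian absolute address are hypothetical simplifications made for this example only.

```python
import struct
from dataclasses import dataclass


@dataclass
class ExeObject:
    """Hypothetical stand-in for one object of an executable."""
    base: int    # assumed load address of the object
    data: bytes  # raw object contents


@dataclass
class Fixup:
    """Hypothetical, simplified fixup: a 4-byte placeholder to be patched."""
    src_obj: int  # index of the object containing the placeholder
    src_off: int  # offset of the placeholder within that object
    dst_obj: int  # index of the object the reference points into
    dst_off: int  # offset of the target within that object


def build_linear_image(objects, fixups):
    """Merge all objects into one flat image, then patch fixups in place."""
    start = min(obj.base for obj in objects)
    end = max(obj.base + len(obj.data) for obj in objects)
    image = bytearray(end - start)  # gaps between objects stay zero-filled

    # Copy every object to its position within the unified address space.
    for obj in objects:
        pos = obj.base - start
        image[pos:pos + len(obj.data)] = obj.data

    # Patch each placeholder with the linear address of its target. Since all
    # objects share one image, cross-object references need no special casing.
    for fix in fixups:
        target = objects[fix.dst_obj].base + fix.dst_off
        pos = objects[fix.src_obj].base + fix.src_off - start
        struct.pack_into("<I", image, pos, target)

    return start, image
```

With such an image, a reference from a code object into a data object resolves to an ordinary address inside the same buffer, which is the property that should make code regions hidden in data objects reachable for later analysis.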
Output sample for Fatal Racing (`FATAL.EXE`) - the left side shows the reconstructed source files, the right side shows a portion of formatted disassembly:
There are multiple ways to use wcdatool, but the following instructions should get you started. Don't let the amount of information provided below discourage you; the tool is easier to use than it might seem. The instructions assume that you are using Linux. For Windows users, the easiest way to go is to use Windows Subsystem for Linux (WSL):
1. Check the following requirements:

   - Wcdatool: Python (>=3.6.0), wdump (part of Open Watcom v2), objdump (part of binutils); both wdump and objdump need to be accessible via `PATH`
   - Open Watcom v2: gcc -or- clang (for 64-bit builds), DOSEMU -or- DOSBox (for the wgml utility); only relevant if Open Watcom v2 is built from sources, as the project also provides pre-compiled binaries

2. Clone wcdatool's repository (-or- download and extract a release):

       # git clone https://github.com/fonic/wcdatool.git
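The wcdatool requirements listed above (Python >= 3.6.0, plus wdump and objdump reachable via `PATH`) can be verified up front with a short script. This is a convenience sketch, not part of wcdatool; the function name `check_requirements` is made up for this example.

```python
import shutil
import sys

MIN_PYTHON = (3, 6, 0)               # minimum version stated in the requirements
REQUIRED_TOOLS = ("wdump", "objdump")  # must be reachable via PATH


def check_requirements(min_python=MIN_PYTHON, tools=REQUIRED_TOOLS):
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if sys.version_info[:3] < min_python:
        problems.append("Python {}.{}.{} or newer is required".format(*min_python))
    for tool in tools:
        # shutil.which() performs the same PATH lookup a shell would.
        if shutil.which(tool) is None:
            problems.append("{} not found in PATH".format(tool))
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("MISSING:", problem)
```

An empty result means the tools were found; it does not check their versions or whether the located `wdump` actually belongs to Open Watcom v2.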
3. Download, build and install Open Watcom v2 (-or- download and install pre-compiled binaries):

       # cd wcdatool/OpenWatcom
       # ./1_download.sh
       # ./2_build.sh
       # ./3_install_linux.sh /opt/openwatcom /opt/bin/openwatcom

   NOTE: these scripts are provided for convenience; they are not part of the Open Watcom v2 project itself
4. Copy the executables to be disassembled to `wcdatool/Executables`, e.g. for Mortal Kombat:

       # cp <source-dir>/MK1.EXE wcdatool/Executables
       # cp <source-dir>/MK2.EXE wcdatool/Executables
       # cp <source-dir>/MK3.EXE wcdatool/Executables

   NOTE: file names of executables are used to locate corresponding object hint files (see step 5)
5. Create/update object hint files in `wcdatool/Hints` (optional; skip when just getting started):

   Object hints may be used to manually affect the disassembly process (e.g. force decoding of certain regions as code/data, specify data decoding mode, define data structs, add comments). Please refer to the included object hint files for Mortal Kombat, Fatal Racing and Pac-Man VR for details regarding capabilities and syntax.

   NOTE: hint files must be stored as `wcdatool/Hints/<name-of-executable>.txt` (case-sensitive, e.g. `wcdatool/Executables/MK1.EXE` -> `wcdatool/Hints/MK1.EXE.txt`) to be picked up automatically by the included scripts

6. Let wcdatool process all provided executables (for the example executables listed in step 4, this will take ~3 min. and generate ~1.5 GB worth of data):

       # wcdatool/Scripts/process-all-executables.sh
   -or- Let wcdatool process a single executable:

       # wcdatool/Scripts/process-single.executable.sh <name-of-executable>

   -or- Run wcdatool manually (use `--help` to display detailed usage information or see below):

       # python wcdatool/Wcdatool/wcdatool.py -od wcdatool/Output -wao wcdatool/Hints/<name-of-executable>.txt wcdatool/Executables/<name-of-executable>

   NOTE: it is completely normal and expected for wcdatool to produce LOTS of warnings; ignore those when just getting started (see step 8 for details)
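The manual invocation above, together with the hint file naming rule (`Hints/<name-of-executable>.txt`, case-sensitive), can be sketched as a small helper that assembles the command line. The helper name `build_wcdatool_command` is hypothetical, and leaving out `-wao` when no hint file exists is an assumption based on hints being described as optional; only options shown in this README are used.

```python
from pathlib import Path


def build_wcdatool_command(repo_dir, exe_name, python="python"):
    """Assemble the argument list for a manual wcdatool run.

    repo_dir is the cloned wcdatool directory; exe_name is the file name of
    an executable inside <repo_dir>/Executables (e.g. "MK1.EXE").
    """
    repo = Path(repo_dir)
    executable = repo / "Executables" / exe_name
    # Hint files are named after the executable, including its extension.
    hints = repo / "Hints" / (exe_name + ".txt")

    command = [python, str(repo / "Wcdatool" / "wcdatool.py"),
               "-od", str(repo / "Output")]
    if hints.is_file():
        # Assumption: pass -wao only when a matching hint file is present.
        command += ["-wao", str(hints)]
    command.append(str(executable))
    return command
```

The returned list can be handed to `subprocess.run(command, check=True)`. On a case-sensitive file system, `MK1.EXE.txt` and `mk1.exe.txt` are different files, so the name must match the executable exactly.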
7. Have a look at the results in `wcdatool/Output`:

   - File `<name-of-executable>_zzz_log.txt` contains log messages (same as console output, but without coloring/formatting)
   - Files `<name-of-executable>_disasm_object_x_disassembly_plain.asm` contain plain disassembly (unmodified objdump output, useful for reference)
   - Files `<name-of-executable>_disasm_object_x_disassembly_formatted.asm` contain formatted disassembly (this is arguably the most interesting/useful output)
   - Files `<name-of-executable>_disasm_object_x_disassembly_formatted_deduplicated.asm` contain formatted deduplicated disassembly (same as above, but with data portions being compressed for increased readability where applicable)
   - Folder `<name-of-executable>_modules` contains formatted disassembly split into separate files (same as above, attempts to reconstruct an application's original source file structure if corresponding debug information is available)

   NOTE: if you are new to assembler/assembly language, check out this x86 Assembly Guide
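To get a quick overview of what a run produced, the output files can be grouped by the name suffixes described above. This is a sketch only: the function name `classify_outputs` is made up, and it assumes the `x` in `object_x` is replaced by some per-object identifier in real file names.

```python
from pathlib import Path

# File name endings taken from the list of outputs described above.
SUFFIXES = {
    "log": "_zzz_log.txt",
    "plain": "_disassembly_plain.asm",
    "formatted": "_disassembly_formatted.asm",
    "deduplicated": "_disassembly_formatted_deduplicated.asm",
}


def classify_outputs(output_dir, exe_name):
    """Group the output files of one executable by kind."""
    result = {kind: [] for kind in SUFFIXES}
    for path in sorted(Path(output_dir).glob(exe_name + "_*")):
        if not path.is_file():
            continue  # skips the <name-of-executable>_modules folder
        for kind, suffix in SUFFIXES.items():
            if path.name.endswith(suffix):
                result[kind].append(path.name)
                break
    return result
```

For example, `classify_outputs("wcdatool/Output", "MK1.EXE")["formatted"]` would list the formatted disassembly files, which are usually the ones to open first.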
8. Refine the output by analyzing the disassembly, updating the object hints and re-running wcdatool (i.e. loop steps 5-8):

   - Identify and add hints for regions in code objects that are actually data (look for `; misplaced item` comments, `(bad)` assembly instructions and labels with trailing `; access size` comments)
   - Identify and add hints for regions in data objects that are actually code (look for `call`/`jmp` instructions in code objects with fixup targets pointing to data objects)
   - Check section `Possible object hints` of wcdatool's console output / log file for suggestions (not guaranteed to be correct, but likely a good starting point)
   - The ultimate goal is to eliminate all (or at least most) warnings issued by wcdatool. Each warning points out a region of the disassembly that currently seems flawed and therefore requires further attention/investigation. Note that there is a cascading effect at work (e.g. a region of data that is falsely interpreted as code may produce bogus branches, leading to further warnings), thus warnings should be tackled one (or a few) at a time from first to last, with wcdatool re-runs in between

   NOTE: this is by far the most time-consuming part, but crucial to achieve good and clean results (!)
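Locating the markers mentioned above (`; misplaced item`, `(bad)`, `; access size`) in a large formatted disassembly can be done with a simple line scan. The sketch below is a plain substring search under the assumption that the markers appear verbatim in the output; the function name `find_suspect_lines` is hypothetical and nothing here parses the disassembly itself.

```python
# Marker strings as named in the refinement instructions above.
MARKERS = ("; misplaced item", "(bad)", "; access size")


def find_suspect_lines(lines, markers=MARKERS):
    """Return (line_number, marker, text) for each line containing a marker.

    Only the first matching marker is reported per line. Line numbers are
    1-based so they can be looked up directly in a text editor.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        for marker in markers:
            if marker in line:
                hits.append((number, marker, line.rstrip("\n")))
                break
    return hits
```

Since a file object iterates over its lines, a formatted disassembly file opened with `open(path, errors="replace")` can be passed in directly. Because of the cascading effect described above, it is usually most productive to inspect the earliest hits first and re-run wcdatool before looking at later ones.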
    Usage: wcdatool.py [-wde|--wdump-exec PATH] [-ode|--objdump-exec PATH]
                       [-wdo|--wdump-output PATH] [-wao|--wdump-addout PATH]
                       [-od|--output-dir PATH] [-cm|--color-mode VALUE]
                       [-id|--interactive-debugger] [-is|--interactive-shell]
                       [-h|--help] FILE

    Tool to aid disassembling DOS applications created with the Watcom Toolchain.

    Positionals:
      FILE                            Path to input executable to disassemble (.exe file)

    Options:
      -wde PATH, --wdump-exec PATH    Path to wdump executable (default: 'wdump')
      -ode PATH, --objdump-exec PATH  Path to objdump executable (default: 'objdump')
      -wdo PATH, --wdump-output PATH  Path to file containing pre-generated wdump output
                                      to read/parse instead of running wdump
      -wao PATH, --wdump-addout PATH  Path to file containing additional wdump output
                                      to read/parse (mainly used for object hints)
      -od PATH, --output-dir PATH     Path to output directory for storing generated
                                      content (default: '.')
      -cm VALUE, --color-mode VALUE   Enable color mode (choices: 'auto', 'true', 'false')
                                      (default: 'auto')
      -id, --interactive-debugger     Drop to interactive debugger before exiting to
                                      allow inspecting internal data structures
      -is, --interactive-shell        Drop to interactive shell before exiting to
                                      allow inspecting internal data structures
      -h, --help                      Display usage information (this message)
If you want to get in touch with me, give feedback, ask questions or simply need someone to talk to, please open an issue here on GitHub. Make sure to provide an email address if you prefer personal/private contact.
Last updated: 10/24/24