Abstract:
Techniques for creating SFX archives and interpreters.
Created by Peter Kankowski
Last changed
Contributors: Ace
Filed under Win32 programming
Some programs extract data from themselves; e.g., an SFX archive reads the compressed data from its own exe file. The file consists of two parts, the executable and the archive, that are simply concatenated. You can append the archive to the exe with this command:
copy /b sfx.exe+archive.zip sfx_ready.exe
The program opens its own exe file, finds the zip file attached to it, and unpacks it. In a similar way, some programming languages append your program to an interpreter to make a stand-alone exe, which is easier to distribute (AutoIt and Rapid-Q are examples of this approach).
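For build scripts that cannot rely on copy /b, the same concatenation can be sketched in portable C. This is a minimal sketch, not the author's tooling: the helper names make_sfx and copy_stream are hypothetical, and the file names are whatever the caller passes in.

```c
#include <assert.h>
#include <stdio.h>

/* Copy everything from src to dst; returns 0 on success, -1 on error. */
static int copy_stream(FILE *src, FILE *dst)
{
    unsigned char buf[4096];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n)
            return -1;
    }
    return ferror(src) ? -1 : 0;
}

/* Hypothetical helper: writes the stub exe followed by the archive
   to out_path, like "copy /b stub+archive out". Returns 0 on success. */
int make_sfx(const char *stub_path, const char *archive_path,
             const char *out_path)
{
    assert(stub_path && archive_path && out_path);

    FILE *stub = fopen(stub_path, "rb");
    FILE *archive = fopen(archive_path, "rb");
    FILE *out = fopen(out_path, "wb");
    int rc = -1;

    if (stub && archive && out &&
        copy_stream(stub, out) == 0 &&
        copy_stream(archive, out) == 0)
        rc = 0;

    if (stub) fclose(stub);
    if (archive) fclose(archive);
    if (out && fclose(out) != 0)
        rc = -1;                /* a failed flush means a truncated file */
    return rc;
}
```

A call such as make_sfx("sfx.exe", "archive.zip", "sfx_ready.exe") would produce the same file as the command above.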
How can you do this?
The exe file stores information about its size in its headers, so you can get the size of the executable data from the headers and read the attached data starting at that position.
The chart above shows all the necessary details. First, you should reach the portable executable (PE) headers (the IMAGE_NT_HEADERS32 structure in winnt.h) by reading their offset from the e_lfanew field of the DOS header. The PE header contains the SizeOfCode and SizeOfInitializedData fields, but Windows doesn't require their values to be correct, so a linker may write wrong numbers there. We need a more reliable source to calculate the exe data size.
And here it is: the section table. Each element of the table stores the file offset of the section data and its size; the number of elements in the table can be found in the NumberOfSections field. There are two sections on the chart, ".data" and ".text". Real programs may have more. Some sections can have PointerToRawData set to zero, meaning the loader should initialize them to empty memory pages.
Let's walk through the section table and find the section with the maximum PointerToRawData value. The size of the executable data is then equal to PointerToRawData of this section plus its SizeOfRawData. In the sample file from the chart, the size is calculated as the offset of the ".data" section plus the size of the ".data" section.
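The first step, getting from the DOS header to the PE header, can be sketched without winnt.h by reading raw byte offsets. This is a minimal sketch: the helper name find_pe_header is hypothetical. The offsets come from the PE format itself: "MZ" at offset 0, e_lfanew at offset 0x3C, and the "PE\0\0" signature at the position e_lfanew points to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the file offset of the PE signature,
   or -1 if buf does not hold a valid DOS header / PE signature pair. */
long find_pe_header(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;                          /* no DOS header */

    uint32_t e_lfanew = read_u32le(buf + 0x3C);
    if (e_lfanew > len || len - e_lfanew < 4)
        return -1;                          /* points outside the buffer */

    if (read_u32le(buf + e_lfanew) != 0x00004550u)  /* "PE\0\0" */
        return -1;

    return (long)e_lfanew;
}
```

The bounds check on e_lfanew matters when only the first few kilobytes of the file have been read into the buffer.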
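The walk described above can be sketched in portable C over a buffer holding the start of the file. This is a sketch under stated assumptions: the helper name pe_exe_size is hypothetical, and the section table is located through the SizeOfOptionalHeader field rather than through sizeof(IMAGE_NT_HEADERS32). Sections with PointerToRawData equal to zero never exceed the running maximum, so they are skipped automatically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t u16le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8);
}

static uint32_t u32le(const unsigned char *p)
{
    return u16le(p) | (u16le(p + 2) << 16);
}

/* Hypothetical helper: given the first len bytes of a PE file, returns
   the offset where the executable data ends and the overlay begins,
   or 0 if the headers are malformed or truncated. */
uint32_t pe_exe_size(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;
    size_t pe = u32le(buf + 0x3C);                  /* e_lfanew */
    if (pe > len || len - pe < 24 || u32le(buf + pe) != 0x00004550u)
        return 0;

    uint32_t nsections = u16le(buf + pe + 6);       /* NumberOfSections */
    /* 4-byte signature + 20-byte file header + optional header */
    size_t table = pe + 24 + u16le(buf + pe + 20);

    uint32_t maxpointer = 0, exesize = 0;
    for (uint32_t i = 0; i < nsections; i++) {
        size_t entry = table + (size_t)i * 40;      /* 40-byte entries */
        if (entry > len || len - entry < 40)
            return 0;                               /* table truncated */
        uint32_t rawsize = u32le(buf + entry + 16); /* SizeOfRawData */
        uint32_t rawptr = u32le(buf + entry + 20);  /* PointerToRawData */
        if (rawptr > maxpointer) {
            maxpointer = rawptr;
            exesize = rawptr + rawsize;
        }
    }
    return exesize;
}
```

For a file with one section at offset 0x200 of size 0x400 and another at offset 0x600 of size 0x200, the function returns 0x600 + 0x200 = 0x800.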
Here is the program (error handling mostly stripped to make the example shorter and clearer):
int ReadFromExeFile(void)
{
    BYTE buff[4096];
    DWORD read;
    BYTE* data;

    // Open exe file
    GetModuleFileName(NULL, (CHAR*)buff, sizeof(buff));
    HANDLE hFile = CreateFile((CHAR*)buff, GENERIC_READ, FILE_SHARE_READ,
        NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if(INVALID_HANDLE_VALUE == hFile)
        return ERR_READFAILED;
    ReadFile(hFile, buff, sizeof(buff), &read, NULL);
    IMAGE_DOS_HEADER* dosheader = (IMAGE_DOS_HEADER*)buff;

    // Locate PE header
    IMAGE_NT_HEADERS32* header =
        (IMAGE_NT_HEADERS32*)(buff + dosheader->e_lfanew);
    if(dosheader->e_magic != IMAGE_DOS_SIGNATURE ||
       header->Signature != IMAGE_NT_SIGNATURE) {
        CloseHandle(hFile);
        return ERR_BADFORMAT;
    }

    // For each section
    IMAGE_SECTION_HEADER* sectiontable =
        (IMAGE_SECTION_HEADER*)((BYTE*)header + sizeof(IMAGE_NT_HEADERS32));
    DWORD maxpointer = 0, exesize = 0;
    for(int i = 0; i < header->FileHeader.NumberOfSections; i++) {
        if(sectiontable->PointerToRawData > maxpointer) {
            maxpointer = sectiontable->PointerToRawData;
            exesize = sectiontable->PointerToRawData +
                      sectiontable->SizeOfRawData;
        }
        sectiontable++;
    }

    // Seek to the overlay
    DWORD filesize = GetFileSize(hFile, NULL);
    SetFilePointer(hFile, exesize, NULL, FILE_BEGIN);
    data = (BYTE*)malloc(filesize - exesize + 1);
    ReadFile(hFile, data, filesize - exesize, &read, NULL);
    CloseHandle(hFile);

    // Process the data (terminate it after the bytes actually read)
    data[read] = '\0';
    MessageBox(0, (CHAR*)data, AppName, MB_ICONINFORMATION);
    free(data);
    return ERR_OK;
}
The sample program just reads the whole overlay and shows it in a message box. In a real program, consider reading long overlays in chunks of 32-64 KB or so. The code was tested with different compilers and section layouts.
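Chunked reading of a long overlay could look like the following portable sketch. The names read_overlay_chunks, chunk_fn, and sum_bytes are hypothetical, the 32 KB chunk size is simply the low end of the range suggested above, and the consumer here only adds up byte values where a real SFX would feed each chunk to its decompressor.

```c
#include <assert.h>
#include <stdio.h>

enum { CHUNK_SIZE = 32 * 1024 };    /* 32 KB per read */

/* Hypothetical callback type: receives each chunk as it is read. */
typedef void (*chunk_fn)(const unsigned char *chunk, size_t size, void *ctx);

/* Example consumer: adds all byte values into an unsigned long. */
void sum_bytes(const unsigned char *chunk, size_t size, void *ctx)
{
    unsigned long *sum = ctx;
    for (size_t i = 0; i < size; i++)
        *sum += chunk[i];
}

/* Hypothetical helper: reads the file from offset to the end in
   CHUNK_SIZE pieces and passes each piece to on_chunk.
   Returns the number of bytes read, or -1 on error.
   The static buffer keeps 32 KB off the stack but makes the
   function non-reentrant. */
long read_overlay_chunks(FILE *f, long offset, chunk_fn on_chunk, void *ctx)
{
    static unsigned char chunk[CHUNK_SIZE];
    long total = 0;
    size_t n;

    assert(f != NULL && on_chunk != NULL);

    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;
    while ((n = fread(chunk, 1, sizeof chunk, f)) > 0) {
        on_chunk(chunk, n, ctx);
        total += (long)n;
    }
    return ferror(f) ? -1 : total;
}
```

Here offset would be the executable size computed from the section table, so only the appended data reaches the callback.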
Download source code (7 Kb) with full error handling
You can find more information about the PE file format in the paper by B. Luevelsmeyer and in Iczelion's tutorial. See also this forum thread about self-extracting from memory, not from file (note that Wayside's code will not work if the last section has PointerToRawData==0).
Peter is the developer of Aba Search and Replace, a tool for replacing text in multiple files. He likes to program in C with a bit of C++, also in x86 assembly language, Python, and PHP.
BYTE buff[512];
while(not end of file) {
    ReadFile(512 bytes into buff)
    if(*(long*)buff == marker) {
        // Marker found!
    }
    // else read another 512-byte chunk in the loop
}
Btw, the zip file format allows placing the zip header at the end of the file, meaning you could have .exe + .zip in the same file. (Try just appending a zip to an exe: you will still be able to open it as a zip with 7z.)
But one remaining question is the digital signature: apparently it's placed at the end of the .exe as well.
I would like to tamper with the PE header to include the SFX data as dummy data, so the .exe can be signed afterwards.
Any ideas about this?
This has nothing to do with the zip header being located at the end of the archive. 7-Zip can open other SFX archive formats (RAR, 7z), where the headers are located at the beginning.
The SFX archive should be digitally signed after you have appended your ZIP file to the EXE.