Portable Executable (PE) Structure
A Portable Executables (PE) is the file format for executables on Windows. A few examples of PE file extensions are .exe
, .dll
, .sys
, and .scr
. In this blog psot, I’ll discuss the Windows PE structure, which is fundamental knowledge to those working in IT, specifically if you work with Windows-based malware in any capacity, as a developer or reverse engineer.
PE Structure
The diagram below shows a general, simplified structure of a PE file, with every header being a data structure.
DOS Header (IMAGE_DOS_HEADER)
The DOS header is always prefixed with two bytes, 0x4D
and 0x5A
, or 4D5A
, this is commonly referred to as the magic bytes MZ
. These bytes represent the DOS header signature, which is used to confirm that the file being parsed or inspected is a valid PE file. The DOS header is a data structure, defined as follows in the code block below:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
typedef struct _IMAGE_DOS_HEADER { // DOS .EXE header
WORD e_magic; // Magic number
WORD e_cblp; // Bytes on last page of file
WORD e_cp; // Pages in file
WORD e_crlc; // Relocations
WORD e_cparhdr; // Size of header in paragraphs
WORD e_minalloc; // Minimum extra paragraphs needed
WORD e_maxalloc; // Maximum extra paragraphs needed
WORD e_ss; // Initial (relative) SS value
WORD e_sp; // Initial SP value
WORD e_csum; // Checksum
WORD e_ip; // Initial IP value
WORD e_cs; // Initial (relative) CS value
WORD e_lfarlc; // File address of relocation table
WORD e_ovno; // Overlay number
WORD e_res[4]; // Reserved words
WORD e_oemid; // OEM identifier (for e_oeminfo)
WORD e_oeminfo; // OEM information; e_oemid specific
WORD e_res2[10]; // Reserved words
LONG e_lfanew; // Offset to the NT header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
The two members of the above struct which are the most important are the e_magic
and e_lfanew
e_magic
is as described above, the Magic Bytes (MZ) denoted as0x5A4D
here.e_lfanew
is a 4-byte value that holds an offset at the beginning of the NT Header, which is located at0x3C
DOS Stub
The DOS stub is a short piece of code that resides at the beginning of every Portable Executable (PE) file format within Windows operating systems.
This stub is executed when a user tries to run the file on MS-DOS or equivalent platforms. Since PE files can’t run on such platforms, the DOS stub usually prints out a message along the lines of “This program cannot be run in DOS mode.” By default, this is what the DOS stub does, but it can be modified to implement other routines, though it’s rarely done because modern systems do not use the MS-DOS environment for execution.
Therefore, the main purpose of the DOS stub in the modern context is to provide backward compatibility and prevent DOS-based systems from executing these PE files and potentially causing system instability.
NT Header (IMAGE_NT_HEADERS)
The NT header is essential for one key reason: it incorporates the other two image headers contained in a PE file, FileHeader
and OptionalHeader
, which together contains a large amount of information about a PE file.
Where the DOS header has a unique signiture to identify it, namely 0x4D
and 0x5A
, the NT header has a unique signiture too, equal to the “PE” string, which is 0x50
and 0x45
bytes. However, since this signiture is of the data type DWORD
, it’ll be represented as 0x50450000
. This is exactly the same as the “PE” string, execpt that it is padded with two null bytes. The e_lfanew
member is used to reach the NT header.
Depending on the machines’ architecture, the NT header structures will be different, see below:
32-Bit
1
2
3
4
5
typedef struct _IMAGE_NT_HEADERS {
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
64-Bit
1
2
3
4
5
typedef struct _IMAGE_NT_HEADERS64 {
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;
Functionally, the only difference between the two is the OptionalHeader
data structure, that being IMAGE_OPTIONAL_HEADER32
and IMAGE_OPTIONAL_HEADER64
respectively.
File Header (IMAGE_FILE_HEADER)
As previously mentioned, this header can be accessed via the NT header. The most important members of the FileHeader
are:
NumberOfSections
- The number of sections in the PE file.Characteristics
- Flags that specify certain attributes about the executable file, such as whether it is a dynamic-link library (DLL) or a console application.SizeOfOptionalHeader
- The size of the optional header, covered next.
The code block below shows the struct for the FileHeader
1
2
3
4
5
6
7
8
9
typedef struct _IMAGE_FILE_HEADER {
WORD Machine;
WORD NumberOfSections;
DWORD TimeDateStamp;
DWORD PointerToSymbolTable;
DWORD NumberOfSymbols;
WORD SizeOfOptionalHeader;
WORD Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;
Microsofts’ official documentation provides more information and context around the FileHeader
.
Optional Header (IMAGE_OPTIONAL_HEADER)
Contrary to assumption, the OptionalHeader
is important. This header is so named “optional” because some file ypes do not have it. The OptionalHeader
has two versions, a 32-bit and 64-bit version, they are nearly identical, aside from the fact that the 32-bit version has some extra members that the 64-bit version doesn’t have. In addition, ULONGLONG
is used in the 64-bit version, where DWORD
is used in the 32-bit version.
See the code blocks below for the distinction.
32-Bit
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
typedef struct _IMAGE_OPTIONAL_HEADER {
WORD Magic;
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint;
DWORD BaseOfCode;
DWORD BaseOfData;
DWORD ImageBase;
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
DWORD SizeOfStackReserve;
DWORD SizeOfStackCommit;
DWORD SizeOfHeapReserve;
DWORD SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
64-Bit
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
typedef struct _IMAGE_OPTIONAL_HEADER64 {
WORD Magic;
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint;
DWORD BaseOfCode;
ULONGLONG ImageBase;
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
ULONGLONG SizeOfStackReserve;
ULONGLONG SizeOfStackCommit;
ULONGLONG SizeOfHeapReserve;
ULONGLONG SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;
The OptionalHeader
has plenty of information that can be used, listed below are some of the most common struct
members:
Magic
- Describes the state of the image file (32 or 64-bit image)MajorOperatingSystemVersion
- The major version number of the required operating systemMinorOperatingSystemVersion
- The minor version number of the required operating systemSizeOfCode
- The size of the.text
section (Discussed later)AddressOfEntryPoint
- Offset to the entry point of the file (Typically the main function)BaseOfCode
- Offset to the start of the.text
sectionSizeOfImage
- The size of the image file in bytesImageBase
- It specifies the preferred address at which the application is to be loaded into memory when it is executed.DataDirectory
- One of the most important members in the optional header. This is an array ofIMAGE_DATA_DIRECTORY
, which contains the directories in a PE file (discussed next).
Data Directory
The data directory can be accessed from the last member of the OptionalHeader
, as described above. This is an array of the data type IMAGE_DATA_DIRECTORY
; which has the data structre shown in the code block below:
1
2
3
4
typedef struct _IMAGE_DATA_DIRECTORY {
DWORD VirtualAddress;
DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
The data directory array has a constant size value of 16
, with each element in the array represents a specific data directory. The specific data directories are shown below:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#define IMAGE_DIRECTORY_ENTRY_EXPORT 0 // Export Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT 1 // Import Directory
#define IMAGE_DIRECTORY_ENTRY_RESOURCE 2 // Resource Directory
#define IMAGE_DIRECTORY_ENTRY_EXCEPTION 3 // Exception Directory
#define IMAGE_DIRECTORY_ENTRY_SECURITY 4 // Security Directory
#define IMAGE_DIRECTORY_ENTRY_BASERELOC 5 // Base Relocation Table
#define IMAGE_DIRECTORY_ENTRY_DEBUG 6 // Debug Directory
#define IMAGE_DIRECTORY_ENTRY_ARCHITECTURE 7 // Architecture Specific Data
#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR 8 // RVA of GP
#define IMAGE_DIRECTORY_ENTRY_TLS 9 // TLS Directory
#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG 10 // Load Configuration Directory
#define IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT 11 // Bound Import Directory in headers
#define IMAGE_DIRECTORY_ENTRY_IAT 12 // Import Address Table
#define IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT 13 // Delay Load Import Descriptors
#define IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR 14 // COM Runtime descriptor
The next two sections Export Directory
and Import Address Table
are shown as item (0) and item (12) in the above table respectively.
Export Directory
The export directory contains information about the functions and variables that a Dynamic Link Library (DLL) or an executable makes available to other programs. This directory is part of the PE file’s data structure and provides the ability for external programs to use the library’s exported symbols (e.g., functions) (kernel32.dll
exporting CreateFileA
).
Import Address Table (IAT)
The Import Address Table (IAT) in a PE file plays a crucial role in dynamic linking, which allows an executable to use functions or data from external Dynamic Link Libraries (DLLs). Essentially, the IAT is a structure within the PE file that holds addresses of the functions imported from DLLs. During the loading of the executable, these addresses are resolved by the operating system’s loader. An exxample of this coule be Malware.exe
importing CreateFileA
from kernel32.dll
.
PE Sections
The PE sections contains the code needed to create an executable (.exe
). Each PE section is given a unique name which may contain executable code or resource information. PE sections can be dynamic, with it being possible add manually add sections later on. IMAGE_FILE_HEADER.NumberOfSections
will help to determine the total number of sections.
If we are talking about what sections exist in nearly every PE file, review the list below:
.text
- Contains the executable code which is the written code..data
- Contains initialized data which are variables initialized in the code..rdata
- Contains read-only data. These are constant variables prefixed with const..idata
- Contains the import tables. These are tables of information related to the functions called using the code. This is used by the Windows PE Loader to determine which DLL files to load to the process, along with what functions are being used from each DLL..reloc
- Contains information on how to fix up memory addresses so that the program can be loaded into memory without any errors..rsrc
- Used to store resources such as icons and bitmaps
Each PE section has an IMAGE_SECTION_HEADER
data structure that contains information about it. These structures are saved under the NT headers in a PE file and are stacked above each other where each structure represents a section.
Referring to the IMAGE_SECTION_HEADER
structure, as below:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
typedef struct _IMAGE_SECTION_HEADER {
BYTE Name[IMAGE_SIZEOF_SHORT_NAME];
union {
DWORD PhysicalAddress;
DWORD VirtualSize;
} Misc;
DWORD VirtualAddress;
DWORD SizeOfRawData;
DWORD PointerToRawData;
DWORD PointerToRelocations;
DWORD PointerToLinenumbers;
WORD NumberOfRelocations;
WORD NumberOfLinenumbers;
DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
the elements are as follows:
Name
- The name of the section. (e.g..text
,.data
,.rdata
).PhysicalAddress
orVirtualSize
- The size of the section when it is in memory.VirtualAddress
- Offset of the start of the section in memory.
Conclusion
This has been my fundamental look at the structure of a PE file. If your analysing malware, this knowledge will be helpful. Feel free to refer back to it when you need to. I haven’t gone into a ton of detail, there is a lot more we could discuss, but this blog would be much longer.