DOS Executable formats

See the disclaimer on the main page.

There are two main types of DOS executables: programs and device drivers. Programs exist as files with a .COM or .EXE extension. Device drivers may have any extension, but are usually given the .SYS extension. They are loaded at startup as specified in the CONFIG.SYS file. MS-DOS 6.0 and later also allows auto-loading of device drivers, but Microsoft appear to have patented this technique.

Some .COM and .EXE files are simply 'memory images'. That is, the file is a sequence of bytes to be loaded at offset 100h of any segment address (the first 100h bytes are left for the PSP), with execution beginning at (segment):0100h. These programs can never exceed 64KB in size, and are always allocated at least 64KB of memory (they are, in fact, allocated all of the memory block into which they are loaded).

Device driver (.SYS) files are also (generally) memory images, but follow a special format reserved for device drivers. Device driver files may also take the .EXE format, so that the same file can be run as a normal DOS program as well as loaded as a driver. When the file is loaded as a driver, the EXE header is ignored, and the driver header should immediately follow it. The device-driver format is beyond the scope of this text.

Program files may take another format which allows programs to have several logical segments (and thus exceed 64KB in size). All such executable files (and I stress that both .COM and .EXE files may take this format) begin with a two-byte 'signature' at the start of a variable-size header to indicate that they are in fact segmented. This is the ASCII string 'MZ', the initials of Mark Zbikowski, one of the principal designers of DOS. The header (which must be a multiple of 16 bytes in length, and is nearly always 512 bytes long, for no real reason) contains a 'relocation table': a table of pointers to segment values within the program which must be 'fixed' at load time to refer to the correct physical segment, by adding the actual physical segment at which the program was loaded (the segment immediately after the PSP).
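Since the format is decided by the signature rather than the extension, a tool can tell the two kinds of file apart by looking at the first two bytes. The following is a minimal sketch; the function name is made up for illustration.

```python
def is_segmented_exe(data: bytes) -> bool:
    """Return True if the file contents begin with the 'MZ' signature.

    Anything else is treated here as a plain memory image, whatever
    the file's extension happens to be.
    """
    return data[:2] == b"MZ"
```

A caller would typically read the first few bytes of the file and pass them in, for example `is_segmented_exe(open(path, "rb").read(2))`.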

Files in this format may also specify 'minimum' and 'maximum' amounts of extra memory to allocate. If 'maximum' is greater than the memory available in the block into which the program is being loaded, all the memory in the block will be allocated. If the minimum required memory cannot be allocated, the program will not be executed. If the maximum is less than the amount available, only that much is allocated. If both the minimum and the maximum are zero, the program is loaded so that it is at the high end of the memory block.
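The allocation rules can be sketched as a small function. This is only an illustration of the description above, not DOS's actual code: all sizes are in 16-byte paragraphs, the function and parameter names are hypothetical, the PSP is ignored for simplicity, and treating a maximum smaller than the minimum as equal to the minimum is an assumption.

```python
from typing import Optional


def paragraphs_allocated(available: int, image: int,
                         min_extra: int, max_extra: int) -> Optional[int]:
    """Return the paragraphs allocated, or None if the program cannot run.

    available -- size of the free memory block
    image     -- size of the program's load image
    min_extra -- 'minimum' extra memory from the header
    max_extra -- 'maximum' extra memory from the header
    """
    if image + min_extra > available:
        return None  # minimum cannot be satisfied: program is not executed
    # Assumption: a maximum below the minimum is treated as the minimum.
    wanted = image + max(min_extra, max_extra)
    # If the maximum exceeds what is available, the whole block is taken.
    return min(available, wanted)
```

With a maximum of FFFFh paragraphs (a very common setting), the function always returns the full size of the block.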

Offset  Contents
0000h   Signature 'MZ' (marks a segmented EXE header)
0002h   File length remainder
0004h   File length (512 byte pages)
0006h   Count of relocation table items
0008h   Header size in paragraphs
000Ah   Minimum extra memory (paragraphs) to allocate
000Ch   Maximum extra memory (paragraphs) to allocate
000Eh   Initial SS (before fixup)
0010h   Initial SP
0012h   Checksum
0014h   Initial IP
0016h   Initial CS (before fixup)
0018h   Relocation table offset
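As an illustration, the fields listed above can be unpacked with Python's struct module. This is a minimal sketch: the class, field and function names are invented here, and every field is assumed to be an unsigned little-endian word.

```python
import struct
from typing import NamedTuple


class ExeHeader(NamedTuple):
    signature: bytes         # 0000h
    length_remainder: int    # 0002h
    length_pages: int        # 0004h
    reloc_count: int         # 0006h
    header_paragraphs: int   # 0008h
    min_extra: int           # 000Ah
    max_extra: int           # 000Ch
    initial_ss: int          # 000Eh
    initial_sp: int          # 0010h
    checksum: int            # 0012h
    initial_ip: int          # 0014h
    initial_cs: int          # 0016h
    reloc_offset: int        # 0018h


# Two signature bytes followed by twelve little-endian words (26 bytes).
HEADER_FORMAT = "<2s12H"


def parse_header(data: bytes) -> ExeHeader:
    """Unpack the fixed part of the header from the start of a file."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("file too short to hold a segmented EXE header")
    header = ExeHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))
    if header.signature != b"MZ":
        raise ValueError("no 'MZ' signature: not a segmented executable")
    return header
```

The header size in bytes is then `header.header_paragraphs * 16`.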

0004: The file length in 512 byte 'pages' (rounded up). This is not necessarily the actual file length, but specifies the amount to be loaded. The previous word, at offset 0002h, gives the remainder of the file length (again, only the portion that should be loaded).
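The two length words combine into a byte count as sketched below. Because the page count is rounded up, a non-zero remainder means the last page is only partly used. Treating a remainder of zero as "the last page is full" is the usual convention, but it is an assumption here, as is the function name.

```python
def loaded_length(pages: int, remainder: int) -> int:
    """Bytes of the file to be loaded, header included.

    pages     -- word at offset 0004h (512-byte pages, rounded up)
    remainder -- word at offset 0002h (bytes used in the last page)
    """
    if remainder == 0:
        # Assumption: zero means the last page is completely used.
        return pages * 512
    return (pages - 1) * 512 + remainder
```

Subtracting the header size (the word at offset 0008h times 16) from this value gives the size of the program image itself.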

0008: The header length in 16 byte 'paragraphs'.

0012: File checksum. The negative sum of all words in the file ignoring overflow. (To calculate the checksum, add all the words in the EXE file together - assuming this word at offset 12h to be 0 - then take the low word of the result and make it negative). A value of 0 indicates that the checksum has not been computed. The checksum is not checked by DOS on loading.
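The calculation described above can be sketched as follows. Words are taken as little-endian, and padding an odd-length file with a zero byte is an assumption made only so that the last word is complete; the function name is hypothetical.

```python
def exe_checksum(data: bytes) -> int:
    """Negative sum of all words in the file, ignoring overflow.

    The word at offset 12h (the checksum field itself) is counted as 0.
    """
    if len(data) % 2:
        data += b"\x00"  # assumption: pad an odd-length file with zero
    total = 0
    for pos in range(0, len(data), 2):
        if pos == 0x12:
            continue  # treat the checksum field as zero
        total += int.from_bytes(data[pos:pos + 2], "little")
    # Keep only the low word of the negated sum.
    return (-total) & 0xFFFF
```

Once the result is stored at offset 12h, the low word of the sum of all words in the file comes out as zero.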

0018: Relocation table offset. The offset within the file of the relocation table. The table is usually stored within the header, and contains a series of far pointers (segment:offset, with the offset stored 'below' the segment, that is, at the lower address). Each one refers to a location within the program which must be fixed up to point to the correct segment address. This is done by adding the segment address at which the program was actually loaded.
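The fix-up process can be sketched as below. The sketch works on a copy of the program image (the file contents after the header) held in a bytearray, so each table entry is turned into a position within that buffer. The function names and the use of a buffer in place of real memory are assumptions for illustration.

```python
import struct
from typing import List, Tuple


def read_relocation_table(data: bytes, reloc_offset: int,
                          reloc_count: int) -> List[Tuple[int, int]]:
    """Read (offset, segment) pairs; the offset word comes first in the file."""
    return [struct.unpack_from("<HH", data, reloc_offset + 4 * i)
            for i in range(reloc_count)]


def apply_relocations(image: bytearray, entries: List[Tuple[int, int]],
                      load_segment: int) -> None:
    """Add load_segment to every word that a table entry points at."""
    for offset, segment in entries:
        # Position of the word inside the image buffer.
        pos = segment * 16 + offset
        (value,) = struct.unpack_from("<H", image, pos)
        # Segment arithmetic wraps at 16 bits.
        struct.pack_into("<H", image, pos, (value + load_segment) & 0xFFFF)
```

Here `reloc_offset` and `reloc_count` would come from the words at offsets 0018h and 0006h of the header, and `load_segment` is the segment at which the image was placed.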