CHM Signature Format Specification & Recovery Example

CHM Signature Format: Specification & CHM Recovery Example

Microsoft Help CHM files start with a signature ITSF (characters 'I','T','S','F' or bytes 0x49, 0x54, 0x53, 0x46). Except signature, it begins with a short initial header (56 bytes). This is followed by the header section table and the offset to the content. File version is in initial header - four bytes at offset 3, little-endian order. Initial header size is defined at offset 8: 4 bytes, little-endian order (lowest byte first). Next to the initial header is located Header Section 0. It has a signature 0x01 0xFE, and CHM total file size is located at offset 8 at this header section, 4 bytes, little-endian order.

Let's examine the example

When inspecting example.chm file's binary data using any Hex Viewer, like Active@ Disk Editor we can see it starts with a signature ITSF (hex: 49, 54, 53, 46). Version check confirms that it is a valid CHM file v. 3 (4 bytes at offset 4: 0x03). Initial header total size is 96 bytes (4 bytes at offset 8, little-endian, hex: 60 00 00 00). Next to the initial header we see Header Section 0, starting from signature 0xFE 0x01. Total CHM file size is 815,211 bytes (6B 70 0C hex), four bytes, little-endian order, at offset 8 in the section (or offset 0x68 from the beginning of the file).

Thus reading of all 815,211 consecutive bytes starting from the position of detected ITSF header provide us with all CHM file data, provided that file is not fragmented.

CHM Signature inspection

More info:

The CHM file initial header:

offset size description
0 4 signature, must be 49, 54, 53, 46 hex ("ITSF")
4 4 version of Microsoft Help CHM file, 3 in most cases
8 4 total initial header size
12 4 must be one
16 4 timestamp, big-endian DWORD, contains seconds (MSB) and fractional seconds (second byte).
20 4 Windows Language ID, for example 0x0409 = LANG_ENGLISH/SUBLANG_ENGLISH_US
24 16 GUID: 7C01FD10-7BAA-11D0-9E0C-00A0-C922-E6EC
40 16 GUID: 7C01FD11-7BAA-11D0-9E0C-00A0-C922-E6EC

Header Section 0:

offset size description
0 4 signature, must be hex: 01 FE
4 4 must be zero
8 8 CHM file size, little-endian

Active@ File Recovery Custom Scripting Example

This example does some validation calculations for CHM header's parameters beyond simple file size extraction.
Syntax of the signature definition language you can read here.

[CHM_HEADER]
DESCRIPTION=Microsoft CHM Help
EXTENSION=chm
BEGIN=CHM_BEGIN
SCRIPT=CHM_SCRIPT

[CHM_BEGIN]
ITSF=0|0

[CHM_SCRIPT]
	version = read(dword, 4)
	if (version == 0) goto exit
	header = read(dword, 8)
	if (header <= 1Ch) goto exit
	temp = read(qword, header)
	if (temp != 1FEh) goto exit
	temp = sum(header, 8)
	size = read(qword, temp)
	temp = sum(header, 10h)
	if (size > temp) goto exit
	size = 0