<--

Apple QuickTime Image Description Atom Sign Extension Memory Corruption

Aleph Research Advisory

Identifier

Severity

High

Product

Apple Quicktime

Vulnerable Version

Verified on QuickTime 7.6

Mitigation

Install v7.6.2 or later.

Technical Details

According to QuickTime’s specification, The sample description atom (STSD) stores information that allows QuickTime to decode samples in the media. It has the following structure:

0   DWORD   Size
4   DWORD   Type
8   BYTE    Version
9   BYTE[3] FLAGS
12  DWORD   Number of entries
16  DWORD   Sample description table

The structure of each entry in the sample description table varies by the media type, however the first four fields are the same for all media types:

0   DWORD   Sample description size
4   DWORD   Data format
6   BYTE[6] Reserved
12  WORD    Data reference index

These four fields may be followed by additional data specific to the media type and data format. For video media, the general sample description format is extended by the following structure:

14 WORD     Version
6 WORD     Revision level
18 DWORD    Vendo
22 DWORD    Temporal quality
26 DWORD    Spatial quality
30 WORD     Width
32 WORD     Height
34 DWORD    Horizontal resolution
38 DWORD    Vertical resolutio
42 DWORD    Data size
46 WORD     Frame count
48 BYTE[32] Compressor name
80 WORD     Depth
82 WORD     Color table ID

When the data format field (offset 4 of the sample description table extension) is ‘RPZA’ (Apple Video), it is possible to trigger a sign extension vulnerability which leads to a buffer underflow. The following is the faulty sign extended MOV:

MOVSX ECX,WORD PTR SS:[ESP+4C]

[ESP+4C] contains a user controlled input, which is equal to ‘((width+(4-width%4))*4 & 0xFFFF’ where ‘width’ is taken from the RPZA sample description entry (offset 30).

If width >= 0x5FFD, then [ESP+4C] >= 0x8000. Sign-extending such values results in very large unsigned values, as their most significant word becomes 0xFFFF (so 0x8000 is sign-extended to 0xFFFF8000).

Deeper in the code, the user controllable sign-extended value is treated as the size of a structure.

A vector of this structure is walked over:

  • At each iteration the base pointer is incremented by the user’s controlled sign-extended value. This means that it is possible to force the pointer to reference memory regions below the vector’s VA:
    ADD EAX,EDX ; EAX = vector, EDX = sign extended value
    
  • At each iteration values are written to an element in the vector (a single structure) which is referenced by the incremented pointer. This means that it is possible to write to memory regions below the buffer’s VA.
    MOV DWORD PTR DS:[EAX],EBX
    MOV DWORD PTR DS:[EAX+4],EBX
    MOV DWORD PTR DS:[EAX+4],EBX
    MOV DWORD PTR DS:[EAX],EBX 
    

By writing to memory regions below the buffer’s VA, An attacker may overwrite crucial data such as function pointers, flags, heap structures and so forth. Doing so may allow an attacker to alter the normal control flow of the application and execute arbitrary code. A simple attack vector would be to lure the victim to browse to a web site controlled by the attacker, which serves a malicious QuickTime file that exploits this vulnerability.

Timeline

Credit

External References