Windows Heap Internals learning

The Windows user-mode heap comes in two implementations: the NT heap and the segment heap.

  • NT heap
    • Exists since early versions of Windows NT.
    • The only user-mode heap implementation up through Windows 7/8, and still the default for traditional desktop processes.
  • Segment heap (internals not publicly documented)
    • Introduced in Windows 10 as the modern heap manager.
    • Default for apps built with the Universal Windows Platform (UWP), Microsoft Edge, and a few system processes.

Heap structure

  1. A heap consists of one or more segments. Each segment is a contiguous range of virtual memory.
  2. A segment consists of a series of entries (_HEAP_ENTRY, also called chunks or blocks).

Here is the structure of heap:

NT heap

Key structures

  1. _HEAP
//0x2c0 bytes (sizeof)
struct _HEAP
{
    union
    {
        struct _HEAP_SEGMENT Segment;                                       //0x0
        struct
        {
            struct _HEAP_ENTRY Entry;                                       //0x0
            ULONG SegmentSignature;                                         //0x10
            ULONG SegmentFlags;                                             //0x14
            struct _LIST_ENTRY SegmentListEntry;                            //0x18
            struct _HEAP* Heap;                                             //0x28
            VOID* BaseAddress;                                              //0x30
            ULONG NumberOfPages;                                            //0x38
            struct _HEAP_ENTRY* FirstEntry;                                 //0x40
            struct _HEAP_ENTRY* LastValidEntry;                             //0x48
            ULONG NumberOfUnCommittedPages;                                 //0x50
            ULONG NumberOfUnCommittedRanges;                                //0x54
            USHORT SegmentAllocatorBackTraceIndex;                          //0x58
            USHORT Reserved;                                                //0x5a
            struct _LIST_ENTRY UCRSegmentList;                              //0x60
        };
    };
    ULONG Flags;                                                            //0x70
    ULONG ForceFlags;                                                       //0x74
    ULONG CompatibilityFlags;                                               //0x78
    ULONG EncodeFlagMask;                                                   //0x7c
    struct _HEAP_ENTRY Encoding;                                            //0x80
    ULONG Interceptor;                                                      //0x90
    ULONG VirtualMemoryThreshold;                                           //0x94
    ULONG Signature;                                                        //0x98
    ULONGLONG SegmentReserve;                                               //0xa0
    ULONGLONG SegmentCommit;                                                //0xa8
    ULONGLONG DeCommitFreeBlockThreshold;                                   //0xb0
    ULONGLONG DeCommitTotalFreeThreshold;                                   //0xb8
    ULONGLONG TotalFreeSize;                                                //0xc0
    ULONGLONG MaximumAllocationSize;                                        //0xc8
    USHORT ProcessHeapsListIndex;                                           //0xd0
    USHORT HeaderValidateLength;                                            //0xd2
    VOID* HeaderValidateCopy;                                               //0xd8
    USHORT NextAvailableTagIndex;                                           //0xe0
    USHORT MaximumTagIndex;                                                 //0xe2
    struct _HEAP_TAG_ENTRY* TagEntries;                                     //0xe8
    struct _LIST_ENTRY UCRList;                                             //0xf0
    ULONGLONG AlignRound;                                                   //0x100
    ULONGLONG AlignMask;                                                    //0x108
    struct _LIST_ENTRY VirtualAllocdBlocks;                                 //0x110
    struct _LIST_ENTRY SegmentList;                                         //0x120
    USHORT AllocatorBackTraceIndex;                                         //0x130
    ULONG NonDedicatedListLength;                                           //0x134
    VOID* BlocksIndex;                                                      //0x138
    VOID* UCRIndex;                                                         //0x140
    struct _HEAP_PSEUDO_TAG_ENTRY* PseudoTagEntries;                        //0x148
    struct _LIST_ENTRY FreeLists;                                           //0x150
    struct _HEAP_LOCK* LockVariable;                                        //0x160
    LONG (*CommitRoutine)(VOID* arg1, VOID** arg2, ULONGLONG* arg3);        //0x168
    union _RTL_RUN_ONCE StackTraceInitVar;                                  //0x170
    struct _RTL_HEAP_MEMORY_LIMIT_DATA CommitLimitData;                     //0x178
    VOID* FrontEndHeap;                                                     //0x198
    USHORT FrontHeapLockCount;                                              //0x1a0
    UCHAR FrontEndHeapType;                                                 //0x1a2
    UCHAR RequestedFrontEndHeapType;                                        //0x1a3
    WCHAR* FrontEndHeapUsageData;                                           //0x1a8
    USHORT FrontEndHeapMaximumIndex;                                        //0x1b0
    volatile UCHAR FrontEndHeapStatusBitmap[129];                           //0x1b2
    struct _HEAP_COUNTERS Counters;                                         //0x238
    struct _HEAP_TUNING_PARAMETERS TuningParameters;                        //0x2b0
}; 
Selected members of _HEAP, explained:
  • SegmentSignature: Identifies the heap type (NT heap or segment heap).
  • SegmentFlags: Not publicly documented.
  • NumberOfPages: Reserved size of the segment, in pages (multiply by the page size to get bytes).
  • FirstEntry: The address of the first traversable block header within the segment.
  • LastValidEntry: The logical end point of segment traversal.
  • NumberOfUnCommittedPages: Total number of pages not yet committed.
  • UCRSegmentList: TODO.
  • Flags: TODO.
  • ForceFlags: Debug bits that are forcibly enabled by tools such as gflags or AppVerifier. They are merged with Flags at runtime and change heap behavior.
  • Encoding: XOR key used to encode each chunk's header (Size and PreviousSize).
  • SegmentReserve: Reserved size of a segment, in bytes.
  • SegmentCommit: Committed size of a segment, in bytes.
  • TotalFreeSize: Total free space in the heap, counted in 16-byte units rather than bytes.
  • BlocksIndex: The core structure the back-end allocator uses to manage chunks.
  • FreeLists: A linked list that collects all freed chunks in the back end. It is similar to the unsorted bin in glibc, except that it is sorted by size.
  • FrontEndHeap: Pointer to the front-end heap structure.
  • FrontEndHeapUsageData: Records how many chunks of each size are in use. When the count for a size reaches a certain level, the front-end allocator is enabled for that size.

Here is a real example:

Back-End

Data structure

  1. _HEAP_ENTRY (chunk), which comes in three kinds:
  • Allocated chunk
  • Freed chunk
  • VirtualAlloc chunk
//0x10 bytes (sizeof)
struct _HEAP_ENTRY
{
    union
    {
        struct _HEAP_UNPACKED_ENTRY UnpackedEntry;                          //0x0
        struct
        {
            VOID* PreviousBlockPrivateData;                                 //0x0
            union
            {
                struct
                {
                    USHORT Size;                                            //0x8
                    UCHAR Flags;                                            //0xa
                    UCHAR SmallTagIndex;                                    //0xb
                };
                struct
                {
                    ULONG SubSegmentCode;                                   //0x8
                    USHORT PreviousSize;                                    //0xc
                    union
                    {
                        UCHAR SegmentOffset;                                //0xe
                        UCHAR LFHFlags;                                     //0xe
                    };
                    UCHAR UnusedBytes;                                      //0xf
                };
                ULONGLONG CompactHeader;                                    //0x8
            };
        };
        struct _HEAP_EXTENDED_ENTRY ExtendedEntry;                          //0x0
        struct
        {
            VOID* Reserved;                                                 //0x0
            union
            {
                struct
                {
                    USHORT FunctionIndex;                                   //0x8
                    USHORT ContextValue;                                    //0xa
                };
                ULONG InterceptorValue;                                     //0x8
            };
            USHORT UnusedBytesLength;                                       //0xc
            UCHAR EntryOffset;                                              //0xe
            UCHAR ExtendedBlockSignature;                                   //0xf
        };
        struct
        {
            VOID* ReservedForAlignment;                                     //0x0
            union
            {
                struct
                {
                    ULONG Code1;                                            //0x8
                    union
                    {
                        struct
                        {
                            USHORT Code2;                                   //0xc
                            UCHAR Code3;                                    //0xe
                            UCHAR Code4;                                    //0xf
                        };
                        ULONG Code234;                                      //0xc
                    };
                };
                ULONGLONG AgregateCode;                                     //0x8
            };
        };
    };
}; 
Selected members of _HEAP_ENTRY, explained:
  • PreviousBlockPrivateData: Usually meaningless to this chunk, because it overlaps the previous block's data. The heap manager stores metadata here in certain situations.
  • Size: Size of the chunk in 16-byte units (shift left by 4, Size << 4, to get bytes).
  • Flags: Indicates whether the chunk is busy.
  • SmallTagIndex: Checksum, the XOR of the first three header bytes: (Size & 0xFF) ^ (Size >> 8) ^ Flags. It is used to verify the chunk header.
  • PreviousSize: Size of the previous block in 16-byte units (shift left by 4 to get bytes).
  • SegmentOffset: Sometimes used to locate the segment.
  • UnusedBytes: The number of bytes left unused in the chunk after allocation. It can also be used to tell whether the chunk belongs to the front end or the back end.
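The Size, Flags, and SmallTagIndex rules above, combined with the Encoding XOR key from _HEAP, can be sketched as a small header decoder. This is a simplified model and not ntdll's code: chunk_header_t and every function name are hypothetical, only the fields discussed above are modelled, and it assumes the checksum is verified on the decoded header. The assert stands in for the heap's corruption handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified view of the _HEAP_ENTRY fields discussed above (hypothetical type). */
typedef struct {
    uint16_t size;            /* chunk size in 16-byte units */
    uint8_t  flags;           /* busy bit and other flags */
    uint8_t  small_tag_index; /* checksum over the three bytes above */
    uint16_t previous_size;   /* previous chunk size in 16-byte units */
} chunk_header_t;

/* XOR a header with the heap's Encoding key; XOR is its own inverse,
 * so the same function both encodes and decodes. */
chunk_header_t decode_header(chunk_header_t in, chunk_header_t key)
{
    chunk_header_t out;
    out.size            = (uint16_t)(in.size ^ key.size);
    out.flags           = (uint8_t)(in.flags ^ key.flags);
    out.small_tag_index = (uint8_t)(in.small_tag_index ^ key.small_tag_index);
    out.previous_size   = (uint16_t)(in.previous_size ^ key.previous_size);
    return out;
}

/* XOR of the first three header bytes: low size byte, high size byte, flags. */
uint8_t header_checksum(chunk_header_t h)
{
    return (uint8_t)((h.size & 0xFF) ^ (h.size >> 8) ^ h.flags);
}

bool header_checksum_ok(chunk_header_t h)
{
    return header_checksum(h) == h.small_tag_index;
}

/* Size in bytes of a decoded chunk; refuses headers with a bad checksum. */
size_t checked_size_bytes(chunk_header_t h)
{
    assert(header_checksum_ok(h));
    return (size_t)h.size << 4;
}
```

For a decoded header with Size = 3 and Flags = 1, the checksum is 0x03 ^ 0x00 ^ 0x01 = 0x02 and the chunk occupies 0x30 bytes.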
  1. Allocated chunk

  2. Freed chunk

  3. VirtualAlloc chunk

BlocksIndex(_HEAP_LIST_LOOKUP)

Manages freed chunks of various sizes so that a suitable chunk can be found quickly.

Data structure

//0x38 bytes (sizeof)
struct _HEAP_LIST_LOOKUP
{
    struct _HEAP_LIST_LOOKUP* ExtendedLookup;                               //0x0
    ULONG ArraySize;                                                        //0x8
    ULONG ExtraItem;                                                        //0xc
    ULONG ItemCount;                                                        //0x10
    ULONG OutOfRangeItems;                                                  //0x14
    ULONG BaseIndex;                                                        //0x18
    struct _LIST_ENTRY* ListHead;                                           //0x20
    ULONG* ListsInUseUlong;                                                 //0x28
    struct _LIST_ENTRY** ListHints;                                         //0x30
}; 
Selected members of _HEAP_LIST_LOOKUP, explained:
  • ExtendedLookup: Points to the next _HEAP_LIST_LOOKUP. (The next BlocksIndex manages larger chunks.)
  • ArraySize: The maximum chunk size managed by this BlocksIndex, in 16-byte units. For the first BlocksIndex it is 0x80, which corresponds to 0x800 bytes.
  • ExtraItem: Some non-standard blocks are placed in the extra slot, for example abnormal sizes or sizes that are not 16-byte aligned.
  • ItemCount: The number of chunks in this BlocksIndex.
  • OutOfRangeItems: The number of chunks that exceed the size range managed by this BlocksIndex.
  • BaseIndex: The index in the heap-wide FreeLists[] at which the free lists managed by this lookup start.
  • ListHead: Head of the free list.
  • ListsInUseUlong: A bitmap used to quickly tell whether a given free list is non-empty (comparable to glibc's binmap).
  • ListHints: Cached hints used to find the list that is most likely to satisfy the next request. Consecutive entries correspond to chunk sizes 0x10 bytes apart (comparable to glibc's tcache).
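The bitmap idea behind ListsInUseUlong can be illustrated with a short sketch. It assumes one bit per free list, stored least-significant-bit first in an array of 32-bit words; that layout and the function names are assumptions made for illustration, not taken from ntdll.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit i set means free list i holds at least one chunk (assumed layout). */
bool list_in_use(const uint32_t *bitmap, uint32_t index)
{
    return ((bitmap[index / 32] >> (index % 32)) & 1u) != 0;
}

/* Return the first non-empty list at or after `start`, or -1 if there is none.
 * `list_count` is the number of lists covered by the bitmap. */
int find_first_in_use(const uint32_t *bitmap, uint32_t start, uint32_t list_count)
{
    assert(bitmap != 0 && start <= list_count);
    for (uint32_t i = start; i < list_count; i++) {
        if (list_in_use(bitmap, i))
            return (int)i;
    }
    return -1;
}
```

Scanning the bitmap lets an allocator skip empty lists without touching the list heads themselves, which is the point of keeping such a bitmap next to the free lists.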

Some examples:

  1. BaseIndex
BaseIndex = 0x20
ArraySize = 0x10

This lookup manages FreeLists[0x20] to FreeLists[0x2F]
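The range arithmetic in this example can be written down as a few helpers. The sketch follows the example's reading, in which ArraySize counts the lists that start at BaseIndex; lookup_range_t and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The two fields of _HEAP_LIST_LOOKUP used in the example (hypothetical type). */
typedef struct {
    uint32_t base_index; /* first FreeLists[] index managed by this lookup */
    uint32_t array_size; /* number of lists, as read in the example above */
} lookup_range_t;

uint32_t lookup_first_index(lookup_range_t r)
{
    return r.base_index;
}

uint32_t lookup_last_index(lookup_range_t r)
{
    assert(r.array_size > 0);
    return r.base_index + r.array_size - 1;
}

/* True if FreeLists[index] falls inside the range managed by this lookup. */
bool lookup_covers(lookup_range_t r, uint32_t index)
{
    return index >= r.base_index && index - r.base_index < r.array_size;
}
```

With base_index = 0x20 and array_size = 0x10, lookup_last_index returns 0x2F, matching the range stated above.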

The NT heap is the default memory allocator. It is divided into a front-end and a back-end allocator.

  • FrontEnd Allocator
    • Low Fragmentation Heap (LFH, covered later).
    • Handles small allocations (size <= 0x4000).
    • To limit memory fragmentation, the LFH is enabled for a given size only after a certain number of chunks of that size have been allocated.
  • BackEnd Allocator
    • Handles large allocations.
    • Core allocator responsible for demanding memory from OS.
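The division of labor between the two allocators can be modelled roughly as follows. This is a toy model under stated assumptions: the activation threshold is a caller-supplied parameter because the text does not give a number, size_bucket_t and route_allocation are hypothetical names, and the real LFH activation logic is more involved than a single counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LFH_MAX_SIZE 0x4000u /* from the text: the front end handles size <= 0x4000 */

typedef enum { BACK_END, FRONT_END } allocator_t;

/* Per-size bookkeeping (hypothetical, stands in for FrontEndHeapUsageData). */
typedef struct {
    unsigned alloc_count; /* allocations of this size seen so far */
    bool     lfh_enabled; /* set once the threshold is reached */
} size_bucket_t;

/* Decide which allocator serves a request of `size` bytes.
 * `threshold` is an assumed parameter, not a value taken from Windows. */
allocator_t route_allocation(size_bucket_t *bucket, size_t size, unsigned threshold)
{
    assert(bucket != NULL);
    if (size > LFH_MAX_SIZE)
        return BACK_END;            /* large requests always go to the back end */
    if (bucket->lfh_enabled)
        return FRONT_END;           /* LFH already active for this size */
    if (++bucket->alloc_count >= threshold)
        bucket->lfh_enabled = true; /* later requests of this size use the LFH */
    return BACK_END;
}
```

In this model the request that reaches the threshold is still served by the back end, and only the following requests of that size are routed to the front end.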

Allocation details

Free details

Segment Heap

  • New memory allocator introduced in Windows 10.
  • Default for apps built with the Universal Windows Platform (UWP), Microsoft Edge, and newer apps.

Segment Heap Architecture

Defaults and Configuration

The Segment Heap is currently an opt-in feature. Windows apps are opted-in by default and executables with a name that matches any of the following (names of system executables) are also opted-in by default to use the Segment Heap:

  • csrss.exe
  • lsass.exe
  • runtimebroker.exe
  • services.exe
  • smss.exe
  • svchost.exe

To enable or disable the Segment Heap for a specific executable, the following Image File Execution Options (IFEO) registry entry can be set:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\
Image File Execution Options\(executable)
FrontEndHeapDebugOptions = (DWORD)
Bit 2 (0x04): Disable Segment Heap
Bit 3 (0x08): Enable Segment Heap

To globally enable or disable the Segment Heap for all executables, the following registry entry can be set:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Segment Heap
Enabled = (DWORD)
0 : Disable Segment Heap
(Not 0): Enable Segment Heap
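The two registry settings above can be interpreted with a few bit tests. The sketch below uses only the values listed (0x04, 0x08, and zero versus non-zero); the macro and function names are hypothetical, and how Windows resolves the case where both IFEO bits are set is not covered by the text, so the helper that builds a value simply never sets both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IFEO_DISABLE_SEGMENT_HEAP 0x04u /* bit 2 of FrontEndHeapDebugOptions */
#define IFEO_ENABLE_SEGMENT_HEAP  0x08u /* bit 3 of FrontEndHeapDebugOptions */

bool ifeo_disables_segment_heap(uint32_t options)
{
    return (options & IFEO_DISABLE_SEGMENT_HEAP) != 0;
}

bool ifeo_enables_segment_heap(uint32_t options)
{
    return (options & IFEO_ENABLE_SEGMENT_HEAP) != 0;
}

/* Return `options` with exactly one of the two Segment Heap bits set. */
uint32_t ifeo_request_segment_heap(uint32_t options, bool enable)
{
    options &= ~(IFEO_DISABLE_SEGMENT_HEAP | IFEO_ENABLE_SEGMENT_HEAP);
    options |= enable ? IFEO_ENABLE_SEGMENT_HEAP : IFEO_DISABLE_SEGMENT_HEAP;
    assert(ifeo_enables_segment_heap(options) != ifeo_disables_segment_heap(options));
    return options;
}

/* Global setting: the Enabled value is 0 for off, anything else for on. */
bool global_segment_heap_enabled(uint32_t enabled_value)
{
    return enabled_value != 0;
}
```

The resulting DWORD would then be written to the registry value named above with whatever registry tool is at hand.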

If the heap's signature equals 0xddeeddee, it is a segment heap; if it equals 0xffeeffee, it is an NT heap.
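The signature test can be sketched as a small classifier. The two constants come from the text; the offset at which the signature is read is left to the caller, since this sketch does not assume a particular structure layout, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEGMENT_HEAP_SIGNATURE 0xddeeddeeu
#define NT_HEAP_SIGNATURE      0xffeeffeeu

typedef enum { HEAP_KIND_UNKNOWN, HEAP_KIND_NT, HEAP_KIND_SEGMENT } heap_kind_t;

/* Map a signature value to the heap implementation it identifies. */
heap_kind_t classify_signature(uint32_t signature)
{
    if (signature == SEGMENT_HEAP_SIGNATURE)
        return HEAP_KIND_SEGMENT;
    if (signature == NT_HEAP_SIGNATURE)
        return HEAP_KIND_NT;
    return HEAP_KIND_UNKNOWN;
}

/* Read a 32-bit signature from a copy of the heap header.
 * `offset` is supplied by the caller and is an assumption of the caller. */
uint32_t read_signature(const unsigned char *heap_bytes, size_t offset)
{
    uint32_t signature;
    assert(heap_bytes != NULL);
    memcpy(&signature, heap_bytes + offset, sizeof signature); /* avoids unaligned reads */
    return signature;
}
```

A debugger script or memory-dump tool could call read_signature on the first bytes of a heap and pass the result to classify_signature.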

The heap is allocated and initialized via a call to RtlpHpSegHeapCreate(). NtAllocateVirtualMemory() is used to reserve and commit the virtual memory for the heap. The reserve size varies depending on the number of processors, and the commit size is the size of the _SEGMENT_HEAP structure.

Block Allocation

When a block is allocated via HeapAlloc() or RtlAllocateHeap(), the request is eventually routed to RtlpHpAllocateHeap() if the heap is managed by the Segment Heap. RtlpHpAllocateHeap() has the following function signature:

PVOID RtlpHpAllocateHeap(_SEGMENT_HEAP* HeapBase, SIZE_T UserSize, ULONG Flags, USHORT Unknown)

UserSize (the user-requested size) is the size passed to HeapAlloc() or RtlAllocateHeap(). The return value is a pointer to the newly allocated block (called UserAddress from here on). The diagram below shows the logic of RtlpHpAllocateHeap():

The purpose of RtlpHpAllocateHeap() is to call the allocation function of the appropriate Segment Heap component based on AllocSize. AllocSize (allocation size) is UserSize adjusted according to Flags. By default, AllocSize equals UserSize unless UserSize is 0, in which case AllocSize is 1. Note that the logic starting where AllocSize is checked actually lives in a separate RtlpHpAllocateHeapInternal() function; it is only inlined in the diagram for brevity. Also note that if the LFH allocation returns -1, the LFH bucket corresponding to AllocSize is not yet activated, so the allocation request is eventually passed to the VS allocation component.
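The two rules spelled out above (AllocSize is 1 when UserSize is 0, and an LFH result of -1 falls through to the VS component) can be sketched in isolation. This is not the real RtlpHpAllocateHeap(): the function names, the callback type, and the stub components are hypothetical, and the size checks that pick between the other Segment Heap components are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The "-1" result that signals an LFH bucket that is not yet activated. */
#define LFH_NOT_ACTIVATED ((void *)(intptr_t)-1)

/* Hypothetical callback type standing in for a component's allocation function. */
typedef void *(*component_alloc_fn)(size_t alloc_size);

/* Default adjustment: AllocSize equals UserSize, except that 0 becomes 1. */
size_t adjust_alloc_size(size_t user_size)
{
    return user_size == 0 ? 1 : user_size;
}

/* Try the LFH first; if its bucket is not activated, hand the request to VS. */
void *allocate_with_fallback(size_t user_size,
                             component_alloc_fn lfh_alloc,
                             component_alloc_fn vs_alloc)
{
    assert(lfh_alloc != NULL && vs_alloc != NULL);
    size_t alloc_size = adjust_alloc_size(user_size);
    void *block = lfh_alloc(alloc_size);
    if (block == LFH_NOT_ACTIVATED)
        block = vs_alloc(alloc_size);
    return block;
}

/* Stub components used only to exercise the sketch. */
unsigned char stub_vs_block[64];
size_t stub_vs_last_size;

void *stub_lfh_inactive(size_t alloc_size)
{
    (void)alloc_size;
    return LFH_NOT_ACTIVATED; /* pretend the bucket is never activated */
}

void *stub_vs_alloc(size_t alloc_size)
{
    stub_vs_last_size = alloc_size; /* remember what the VS stub was asked for */
    return stub_vs_block;
}
```

Calling allocate_with_fallback(0, stub_lfh_inactive, stub_vs_alloc) returns stub_vs_block, and the VS stub sees a request size of 1.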

Block Freeing

Different kinds of heap

Process Heap

The system creates a default heap when each process starts. Functions such as malloc, new, HeapAlloc(GetProcessHeap(), ...), LocalAlloc, and GlobalAlloc usually allocate from this heap unless told otherwise. The heap is shared by the entire process and is also used internally when you call Windows APIs. Its address is stored in the _PEB.

Data structure: _PEB

//0x7c8 bytes (sizeof)
struct _PEB
{
    UCHAR InheritedAddressSpace;                                            //0x0
    UCHAR ReadImageFileExecOptions;                                         //0x1
    UCHAR BeingDebugged;                                                    //0x2
    union
    {
        UCHAR BitField;                                                     //0x3
        struct
        {
            UCHAR ImageUsesLargePages:1;                                    //0x3
            UCHAR IsProtectedProcess:1;                                     //0x3
            UCHAR IsImageDynamicallyRelocated:1;                            //0x3
            UCHAR SkipPatchingUser32Forwarders:1;                           //0x3
            UCHAR IsPackagedProcess:1;                                      //0x3
            UCHAR IsAppContainer:1;                                         //0x3
            UCHAR IsProtectedProcessLight:1;                                //0x3
            UCHAR IsLongPathAwareProcess:1;                                 //0x3
        };
    };
    UCHAR Padding0[4];                                                      //0x4
    VOID* Mutant;                                                           //0x8
    VOID* ImageBaseAddress;                                                 //0x10
    struct _PEB_LDR_DATA* Ldr;                                              //0x18
    struct _RTL_USER_PROCESS_PARAMETERS* ProcessParameters;                 //0x20
    VOID* SubSystemData;                                                    //0x28
    VOID* ProcessHeap;                                                      //0x30  Default heap (same as GetProcessHeap())
    struct _RTL_CRITICAL_SECTION* FastPebLock;                              //0x38
    union _SLIST_HEADER* volatile AtlThunkSListPtr;                         //0x40
    VOID* IFEOKey;                                                          //0x48
    union
    {
        ULONG CrossProcessFlags;                                            //0x50
        struct
        {
            ULONG ProcessInJob:1;                                           //0x50
            ULONG ProcessInitializing:1;                                    //0x50
            ULONG ProcessUsingVEH:1;                                        //0x50
            ULONG ProcessUsingVCH:1;                                        //0x50
            ULONG ProcessUsingFTH:1;                                        //0x50
            ULONG ProcessPreviouslyThrottled:1;                             //0x50
            ULONG ProcessCurrentlyThrottled:1;                              //0x50
            ULONG ProcessImagesHotPatched:1;                                //0x50
            ULONG ReservedBits0:24;                                         //0x50
        };
    };
    UCHAR Padding1[4];                                                      //0x54
    union
    {
        VOID* KernelCallbackTable;                                          //0x58
        VOID* UserSharedInfoPtr;                                            //0x58
    };
    ULONG SystemReserved;                                                   //0x60
    ULONG AtlThunkSListPtr32;                                               //0x64
    VOID* ApiSetMap;                                                        //0x68
    ULONG TlsExpansionCounter;                                              //0x70
    UCHAR Padding2[4];                                                      //0x74
    VOID* TlsBitmap;                                                        //0x78
    ULONG TlsBitmapBits[2];                                                 //0x80
    VOID* ReadOnlySharedMemoryBase;                                         //0x88
    VOID* SharedData;                                                       //0x90
    VOID** ReadOnlyStaticServerData;                                        //0x98
    VOID* AnsiCodePageData;                                                 //0xa0
    VOID* OemCodePageData;                                                  //0xa8
    VOID* UnicodeCaseTableData;                                             //0xb0
    ULONG NumberOfProcessors;                                               //0xb8
    ULONG NtGlobalFlag;                                                     //0xbc
    union _LARGE_INTEGER CriticalSectionTimeout;                            //0xc0
    ULONGLONG HeapSegmentReserve;                                           //0xc8
    ULONGLONG HeapSegmentCommit;                                            //0xd0
    ULONGLONG HeapDeCommitTotalFreeThreshold;                               //0xd8
    ULONGLONG HeapDeCommitFreeBlockThreshold;                               //0xe0
    ULONG NumberOfHeaps;                                                    //0xe8
    ULONG MaximumNumberOfHeaps;                                             //0xec
    VOID** ProcessHeaps;                                                    //0xf0 Array of heap handles
    VOID* GdiSharedHandleTable;                                             //0xf8
    VOID* ProcessStarterHelper;                                             //0x100
    ULONG GdiDCAttributeList;                                               //0x108
    UCHAR Padding3[4];                                                      //0x10c
    struct _RTL_CRITICAL_SECTION* LoaderLock;                               //0x110
    ULONG OSMajorVersion;                                                   //0x118
    ULONG OSMinorVersion;                                                   //0x11c
    USHORT OSBuildNumber;                                                   //0x120
    USHORT OSCSDVersion;                                                    //0x122
    ULONG OSPlatformId;                                                     //0x124
    ULONG ImageSubsystem;                                                   //0x128
    ULONG ImageSubsystemMajorVersion;                                       //0x12c
    ULONG ImageSubsystemMinorVersion;                                       //0x130
    UCHAR Padding4[4];                                                      //0x134
    ULONGLONG ActiveProcessAffinityMask;                                    //0x138
    ULONG GdiHandleBuffer[60];                                              //0x140
    VOID (*PostProcessInitRoutine)();                                       //0x230
    VOID* TlsExpansionBitmap;                                               //0x238
    ULONG TlsExpansionBitmapBits[32];                                       //0x240
    ULONG SessionId;                                                        //0x2c0
    UCHAR Padding5[4];                                                      //0x2c4
    union _ULARGE_INTEGER AppCompatFlags;                                   //0x2c8
    union _ULARGE_INTEGER AppCompatFlagsUser;                               //0x2d0
    VOID* pShimData;                                                        //0x2d8
    VOID* AppCompatInfo;                                                    //0x2e0
    struct _UNICODE_STRING CSDVersion;                                      //0x2e8
    struct _ACTIVATION_CONTEXT_DATA* ActivationContextData;                 //0x2f8
    struct _ASSEMBLY_STORAGE_MAP* ProcessAssemblyStorageMap;                //0x300
    struct _ACTIVATION_CONTEXT_DATA* SystemDefaultActivationContextData;    //0x308
    struct _ASSEMBLY_STORAGE_MAP* SystemAssemblyStorageMap;                 //0x310
    ULONGLONG MinimumStackCommit;                                           //0x318
    struct _FLS_CALLBACK_INFO* FlsCallback;                                 //0x320
    struct _LIST_ENTRY FlsListHead;                                         //0x328
    VOID* FlsBitmap;                                                        //0x338
    ULONG FlsBitmapBits[4];                                                 //0x340
    ULONG FlsHighIndex;                                                     //0x350
    VOID* WerRegistrationData;                                              //0x358
    VOID* WerShipAssertPtr;                                                 //0x360
    VOID* pUnused;                                                          //0x368
    VOID* pImageHeaderHash;                                                 //0x370
    union
    {
        ULONG TracingFlags;                                                 //0x378
        struct
        {
            ULONG HeapTracingEnabled:1;                                     //0x378
            ULONG CritSecTracingEnabled:1;                                  //0x378
            ULONG LibLoaderTracingEnabled:1;                                //0x378
            ULONG SpareTracingBits:29;                                      //0x378
        };
    };
    UCHAR Padding6[4];                                                      //0x37c
    ULONGLONG CsrServerReadOnlySharedMemoryBase;                            //0x380
    ULONGLONG TppWorkerpListLock;                                           //0x388
    struct _LIST_ENTRY TppWorkerpList;                                      //0x390
    VOID* WaitOnAddressHashTable[128];                                      //0x3a0
    VOID* TelemetryCoverageHeader;                                          //0x7a0
    ULONG CloudFileFlags;                                                   //0x7a8
    ULONG CloudFileDiagFlags;                                               //0x7ac
    CHAR PlaceholderCompatibilityMode;                                      //0x7b0
    CHAR PlaceholderCompatibilityModeReserved[7];                           //0x7b1
    struct _LEAP_SECOND_DATA* LeapSecondData;                               //0x7b8
    union
    {
        ULONG LeapSecondFlags;                                              //0x7c0
        struct
        {
            ULONG SixtySecondEnabled:1;                                     //0x7c0
            ULONG Reserved:31;                                              //0x7c0
        };
    };
    ULONG NtGlobalFlag2;                                                    //0x7c4
}; 
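
Conceptually, GetProcessHeap() does nothing more than read this field:

HANDLE GetProcessHeap() {
    // The TEB points to the PEB, whose ProcessHeap field holds the default heap.
    return NtCurrentTeb()->ProcessEnvironmentBlock->ProcessHeap;
}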

CRT Heap (uses the process heap since VS2015)

C++ programs often allocate memory with malloc and new, both of which are provided by the CRT library. According to research, up to and including VS2010 the CRT library called HeapCreate to create a heap for its own use. Since VS2015, the CRT library no longer creates a separate heap and instead uses the process's default heap.

Private Heap

A private heap is created with HeapCreate. APIs such as HeapAlloc then operate on that heap, for example to allocate memory from it.

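// Requires <windows.h>.
HANDLE customHeap = HeapCreate(0, 0, 0);            // growable heap with default sizes
if (customHeap != NULL) {
    void* mem = HeapAlloc(customHeap, 0, 1024);     // 1024 bytes from the private heap
    if (mem != NULL) HeapFree(customHeap, 0, mem);  // return the block to the heap
    HeapDestroy(customHeap);                        // releases the heap and all its blocks
}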

Reference

[1] https://mrt4ntr4.github.io/Windows-Heap-Exploitation-dadadb/
[2] https://www.slideshare.net/slideshow/windows-10-nt-heap-exploitation-english-version/154467191#1
[3] https://www.freebuf.com/articles/web/287403.html
[4] https://www.vergiliusproject.com/kernels/x64/windows-10/1809
[5] https://www.cnblogs.com/XiuzhuKirakira/p/17993207
[6] https://blackhat.com/docs/us-16/materials/us-16-Yason-Windows-10-Segment-Heap-Internals-wp.pdf
[7] https://blackhat.com/docs/us-16/materials/us-16-Yason-Windows-10-Segment-Heap-Internals.pdf