Friday 15 November 2013

Debugging Stop 0x1A - Working Set Free List Corrupt

Another simple memory management case to debug. I thought I had explained the working set before, but it seems I hadn't, so I'll also be explaining the working set and the Working Set Manager.

The first parameter indicates the subtype of the bugcheck. In this case, 0x5003 corresponds to a corrupt working set free list, which is usually the result of a hardware problem. The second parameter (undocumented) contains the address of the working set list for a given process.
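To illustrate how the first parameter selects a subtype, here is a small Python sketch. The lookup table is deliberately partial, listing only a few subtype codes from Microsoft's Bug Check 0x1A documentation, and the helper name describe_subtype is my own invention rather than part of any debugger API.

```python
# Partial lookup of Stop 0x1A (MEMORY_MANAGEMENT) subtypes, keyed by parameter 1.
# Only a handful of documented codes are listed; anything else is reported as unknown.
KNOWN_SUBTYPES = {
    0x41284: "A PTE or the working set list is corrupt",
    0x41790: "A page table page has been corrupted",
    0x5003: "The working set free list is corrupt (probably a hardware error)",
}


def describe_subtype(param1: int) -> str:
    """Return a short description for bugcheck parameter 1 (hypothetical helper)."""
    return KNOWN_SUBTYPES.get(param1, f"Unknown or undocumented subtype 0x{param1:X}")


if __name__ == "__main__":
    print(describe_subtype(0x5003))
```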

Before we go directly into explaining the bugcheck, let's take a look at the working set.

Working Set:

A working set is simply the group of virtual pages allocated to a process which are present within physical memory (RAM). By default, Windows sets the working set limits to a minimum of 50 pages and a maximum of 345 pages. However, these limits have little effect, since a process can exceed the maximum page limit as long as there is enough physical memory. The minimum and maximum working set sizes can be set by the user with the SetProcessWorkingSetSize function. However, this isn't recommended by Microsoft or by other developers; a few good threads can be found here and here.
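To put the default limits into more familiar units, the page counts can be converted to kilobytes. This is a quick sketch which assumes the usual 4 KB page size on x86 and x64; the helper name pages_to_kb is only for illustration.

```python
PAGE_SIZE = 4096  # assumption: standard 4 KB pages on x86/x64


def pages_to_kb(pages: int, page_size: int = PAGE_SIZE) -> int:
    """Convert a page count into kilobytes."""
    return pages * page_size // 1024


if __name__ == "__main__":
    print("Default minimum:", pages_to_kb(50), "KB")   # 50 pages  -> 200 KB
    print("Default maximum:", pages_to_kb(345), "KB")  # 345 pages -> 1380 KB
```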

The !process extension can give us the working set for a given process.

These working set limits are still governed by the hard working set limits set by Windows. On x86 systems the hard limit is 2,047.9 MB and on x64 systems it is 8,192 GB.

So, when a process causes a page fault, the Memory Manager adds more pages to the working set. However, there may be times when the system begins to run low on memory, and so instead of adding pages, existing pages are replaced.

There may also be times when physical memory drops to a certain level and the Working Set Manager must intervene, scanning through each process's working set in order to free some memory. This is known as trimming. You can run working set trimming on your own processes with SetProcessWorkingSetSizeEx.
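As a rough sketch of trimming your own process, the function can be called from Python through ctypes. This relies on the documented behaviour that passing (SIZE_T)-1 for both the minimum and the maximum asks Windows to remove as many pages as possible from the working set. The wrapper name trim_own_working_set is my own, and on anything other than Windows it simply returns False.

```python
import ctypes
import sys


def trim_own_working_set() -> bool:
    """Ask Windows to trim the calling process's working set.

    Returns True on success, False on failure or on non-Windows platforms.
    """
    if sys.platform != "win32":
        return False  # the API only exists on Windows

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    kernel32.SetProcessWorkingSetSizeEx.argtypes = [
        wintypes.HANDLE,   # hProcess
        ctypes.c_size_t,   # dwMinimumWorkingSetSize
        ctypes.c_size_t,   # dwMaximumWorkingSetSize
        wintypes.DWORD,    # Flags
    ]
    kernel32.SetProcessWorkingSetSizeEx.restype = wintypes.BOOL

    # (SIZE_T)-1 for both the minimum and the maximum requests a trim.
    all_ones = ctypes.c_size_t(-1).value
    ok = kernel32.SetProcessWorkingSetSizeEx(
        kernel32.GetCurrentProcess(), all_ones, all_ones, 0
    )
    return bool(ok)


if __name__ == "__main__":
    print("Trimmed:", trim_own_working_set())
```

Trimming like this only moves pages out of the working set; the process will simply fault them back in when it touches them again, which is one of the reasons manual trimming is rarely worthwhile.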

Generally, the Working Set Manager will look at processes which are above their working set minimums, and then check the Accessed bit in the PTE for each virtual page (use !pte). If the page has not been accessed (the bit is clear), then the page is said to be aged. If the page still hasn't been accessed on the second run, then the page is freed from the working set of the process. Otherwise, the process repeats itself.

On the other hand, if the Accessed bit has been set, then it is cleared by the Working Set Manager. The Working Set Manager then scans the page again (if needed) on its second run, and checks if the page has been accessed again. If the page hasn't been accessed again, then the page is aged too, and liable to be removed from the working set.

We can view the entries of the working set for a process with the !wsle extension.

The parameters and bit flags associated with this extension are all documented within WinDbg. As you can see, we can only gather very basic information at the moment: that the Working Set Free List is corrupt. This can be further seen with the dt nt!_MMWSLE command.

The data structure applies to each individual page within a working set for a process. 

It contains one field named Age, which is a counter that the Working Set Manager increments each time it finds that the page hasn't been accessed. The counter's maximum is 7, and this is when a page is removed from the working set. The Working Set Manager is called by a system thread called the Balance Set Manager, which waits upon two event objects: an event which is signaled when a timer object expires once every second, and a working set manager event which is signaled under certain memory conditions.
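The aging behaviour described above can be sketched as a toy simulation. This is only a simplified model and not the actual Memory Manager logic: the Page class and scan function are hypothetical names, resetting Age to zero when a page has been accessed is my assumption, and each call to scan stands in for one pass of the Working Set Manager.

```python
from dataclasses import dataclass

MAX_AGE = 7  # maximum value of the Age counter, as given in the text


@dataclass
class Page:
    vpn: int                # virtual page number (illustrative only)
    accessed: bool = False  # models the Accessed bit in the PTE
    age: int = 0            # models the Age field of the working set entry


def scan(working_set):
    """Run one toy trimming pass and return the pages that stay resident."""
    kept = []
    for page in working_set:
        if page.accessed:
            # Accessed since the last pass: clear the bit and keep the page.
            page.accessed = False
            page.age = 0  # assumption: an access resets the age
            kept.append(page)
        else:
            # Not accessed: age the page, and drop it once Age reaches the maximum.
            page.age += 1
            if page.age < MAX_AGE:
                kept.append(page)
    return kept


if __name__ == "__main__":
    ws = [Page(vpn=0x10, accessed=True), Page(vpn=0x20)]
    for run in range(1, MAX_AGE + 2):
        ws = scan(ws)
        print(run, [(hex(p.vpn), p.age) for p in ws])
```

In this model, a page that is never touched is removed on the seventh pass, while a page that was touched just before the first pass survives one pass longer.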


Code Machine - _MMWSLE 
