Friday 28 June 2013

Handling IRPs - Driver Stacks

This is going to be quite a brief blog post. I was reading up on Stop 0x9F bugchecks and noticed a few interesting calls highlighted by another BSOD debugger, so I did some research and would like to explain a little about Driver Stacks and the Device Tree.

I'm assuming you know about IRPs and the I/O Manager used by the Windows Kernel.

Okay, when a device driver is asked to perform some kind of operation, the request is usually sent to it as an IRP. The device that the driver belongs to is represented by something called a Device Node, which is a structure used to represent a device connected to the system. The Device Nodes for all the devices connected to the system are stored within another structure called the Device Tree.

Each Device Node also has its own Device Stack, which is an ordered list of Device Objects. Each Device Object is associated with a driver, and that driver performs operations for the device on behalf of that Device Object. For example, a PCI Bus may have two Device Objects in the stack for its Device Node, and the drivers associated with these Device Objects are Pci.sys and Acpi.sys.

So getting back to the point, the drivers that take part in handling an IRP form a Driver Stack: the IRP is passed from one driver to the next, in order, until the request has been processed.
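
To make the idea a little more concrete, here is a minimal C++ sketch of a Device Node that owns a Device Stack, with a request travelling down the stack from top to bottom. Please treat it as an illustration only: SimpleIrp, SimpleDeviceObject, SimpleDeviceNode, SendIrp and MakePciBusNode are made-up names and are not the real kernel structures or routines, and the order of Pci.sys above Acpi.sys is my assumption for the PCI Bus example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only. These are NOT the
// real IRP or DEVICE_OBJECT structures used by the Windows kernel.
struct SimpleIrp {
    std::string operation;              // what is being requested, e.g. "power"
    std::vector<std::string> handledBy; // drivers that saw the request, in order
};

struct SimpleDeviceObject {
    std::string driverName;             // the driver associated with this device object
};

// A device node owns one device stack: an ordered list of device objects.
struct SimpleDeviceNode {
    std::string deviceName;
    std::vector<SimpleDeviceObject> stack; // index 0 is the top of the stack
};

// Hand the request to the top of the stack and let each driver pass it down.
void SendIrp(const SimpleDeviceNode& node, SimpleIrp& irp) {
    for (const SimpleDeviceObject& deviceObject : node.stack) {
        irp.handledBy.push_back(deviceObject.driverName);
    }
}

// Build the PCI Bus example. The ordering of the two drivers is an assumption.
SimpleDeviceNode MakePciBusNode() {
    SimpleDeviceNode node;
    node.deviceName = "PCI Bus";
    node.stack.push_back(SimpleDeviceObject{"Pci.sys"});
    node.stack.push_back(SimpleDeviceObject{"Acpi.sys"});
    return node;
}
```

In a real system a driver can also complete the request itself instead of passing it further down, which this sketch leaves out to keep things short.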

*Note* I'm no expert by any means, and as a result, some of the information may be incorrect or misunderstood, so please check the references for a more thorough guide.


References:

Device nodes and device stacks - http://msdn.microsoft.com/en-us/library/windows/hardware/ff554721(v=vs.85).aspx

Driver Stacks - http://msdn.microsoft.com/en-us/library/windows/hardware/hh439632(v=vs.85).aspx

Thursday 27 June 2013

Learning Debugging Resources

Right, I've been reading some great resources on Windows Debugging, and the internals of the Windows operating system. Remember my 5 learning tips? I still feel it's very important to understand the mechanics of the operating system when debugging, especially when we usually tend to only have Kernel-based dump files to work with.

Okay, so enough of the chatter, and let's get some debugging resources written up for you to enjoy:

MSDN Library (Debugging) - http://msdn.microsoft.com/library/windows/hardware/hh833791

MSDN Library (Drivers) - http://msdn.microsoft.com/library/windows/hardware/gg581061

MSDN Blogs (NT Debugging) - http://blogs.msdn.com/b/ntdebugging/

Mark Russinovich’s Blog - http://blogs.technet.com/b/markrussinovich/

Channel 9 | Defrag Tools (Debugging Videos) - http://channel9.msdn.com/Shows/Defrag-Tools

OSR Online (Drivers/Debugging) - http://www.osronline.com/index.cfm

There are many resources out there which are free and available to use, so get searching and reading about debugging, driver development and operating system mechanisms.




Monday 24 June 2013

Stop 0x19 - Some Theory About Corrupt Pool Headers

A Stop 0x19 will typically mention that a pool header has become corrupt, so I wanted to explain a little bit about the theory behind what a pool header is and how it is used in the Windows memory allocation system.

A device driver or process will often request a chunk of memory, and the Memory Manager will allocate the requested block to it. Here is where the header comes in: most memory managers allocate a block which is slightly larger than the requested amount, and this extra memory is known as the header. The header contains useful information such as the size of the allocation and a pool tag. A pool tag is a short four-character label that identifies which driver or component made the allocation.
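
Here is a rough C++ sketch of the idea, assuming a toy allocator built on top of malloc. SketchPoolHeader, SketchAllocate, HeaderOf, HeaderLooksValid and SketchFree are hypothetical names made up for this post, and the layout is not the real pool header used by Windows; it only shows a header sitting directly in front of the memory handed back to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical header layout for illustration; the real Windows pool header differs.
struct SketchPoolHeader {
    std::uint32_t tag;       // four-character tag naming the owner of the block
    std::uint32_t blockSize; // number of bytes the caller asked for
};

// Allocate the header plus the requested bytes, then return the memory
// that starts immediately after the header.
void* SketchAllocate(std::size_t bytes, const char tag[4]) {
    void* raw = std::malloc(sizeof(SketchPoolHeader) + bytes);
    if (raw == nullptr) {
        return nullptr;
    }
    SketchPoolHeader* header = static_cast<SketchPoolHeader*>(raw);
    std::memcpy(&header->tag, tag, 4);
    header->blockSize = static_cast<std::uint32_t>(bytes);
    return header + 1;
}

// Step back from the caller's pointer to find the header in front of it.
SketchPoolHeader* HeaderOf(void* block) {
    return static_cast<SketchPoolHeader*>(block) - 1;
}

// A simple sanity check: do the tag and size still match what we expect?
bool HeaderLooksValid(void* block, const char tag[4], std::size_t bytes) {
    const SketchPoolHeader* header = HeaderOf(block);
    return std::memcmp(&header->tag, tag, 4) == 0 && header->blockSize == bytes;
}

void SketchFree(void* block) {
    std::free(HeaderOf(block));
}
```

Because the header lives right next to the data, a stray write just outside a block's own bytes can land in a header, and a check like HeaderLooksValid would then fail. That is the general flavour of problem a corrupt pool header points to.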

References:

What is an Object?






Update: Linked Lists

Okay, you know that ongoing discussion about linked lists? There's been some further input from other debuggers - 


"Just to clarify, linked lists that use both forward links and back links are called doubly linked lists. A regular linked list will just have forward links that proceed to the next entry in the list, whereas doubly linked will link to the entry before and after it." ~ Vir Gnarus

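To illustrate the quote above, here is a minimal C++ sketch of a circular doubly linked list. The Flink (forward link) and Blink (back link) member names follow the Windows LIST_ENTRY structure, but ListEntry, InitHead, InsertTail and LinksAreConsistent are hypothetical names written just for this example.

```cpp
#include <cassert>

// Same shape as the Windows LIST_ENTRY structure: one forward link and one back link.
struct ListEntry {
    ListEntry* Flink; // next entry in the list
    ListEntry* Blink; // previous entry in the list
};

// An empty circular list: the head points at itself in both directions.
void InitHead(ListEntry* head) {
    head->Flink = head;
    head->Blink = head;
}

// Link a new entry in just before the head, which is the tail of the list.
void InsertTail(ListEntry* head, ListEntry* entry) {
    ListEntry* last = head->Blink;
    entry->Flink = head;
    entry->Blink = last;
    last->Flink = entry;
    head->Blink = entry;
}

// In a healthy list, following Flink and then Blink (or the reverse)
// always brings you back to the entry you started from.
bool LinksAreConsistent(const ListEntry* head) {
    const ListEntry* entry = head;
    do {
        if (entry->Flink->Blink != entry || entry->Blink->Flink != entry) {
            return false;
        }
        entry = entry->Flink;
    } while (entry != head);
    return true;
}
```

The consistency check is the interesting part for debugging: if something overwrites one of the links, the forward and back links no longer agree with each other.
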
Wednesday 19 June 2013

Learning BSOD Debugging - 5 Tips

Okay, most computers will get a BSOD at some point in their lifetime, much like we will almost certainly become ill at some point. A BSOD can be a scary and frustrating predicament for most users who may not have the knowledge of what a BSOD is and how to resolve it.

They will tend to search YouTube or Google, and here's what really irritates me: the usual advice given to them is to clean install Windows and hope that the problem doesn't happen again. That works until they reinstall the program or driver which was causing the problem.

Okay, that's enough venting of my annoyance with posts like that. In this blog post, I'm going to give you future debuggers some quick learning tips which will help you improve your responses when debugging BSODs, and when attempting to understand the technical concepts of computers.

Tip #1

Grab a programming book on C++ or find a website which offers some good tutorials for learning how to program in C++. Drivers for the Windows operating system are written in C or C++, so if you wish to understand what some of the driver function calls do, learning C++ will certainly be an asset.

Here's a good reference - 

-http://www.cplusplus.com/doc/tutorial/


Tip #2

You will soon see that the parameters of a bugcheck give us some very useful information about exactly how the BSOD was caused. I feel that learning how the Windows operating system works internally will help you understand what these parameters mean and what steps should be taken when debugging the crash.

Tip #3

I suggest learning how hardware works and how we can test different hardware. Many BSODs can be caused by hardware, and having a good background in hardware diagnosis will certainly help you with the debugging process.

Tip #4

Read the Windows Debugger documentation, and learn how to use different extensions and commands to extract more useful information from the dump files. Please note that many of the commands need a Kernel memory dump rather than a Minidump to return useful output.

Tip #5

Sign up to a forum and start reading different debugging blogs. There is a huge amount of knowledge which many users share openly, and many will be willing to answer any questions you have about debugging BSODs.

Understanding Memory Descriptor Lists

Originally, I was going to write up my own explanation of what MDLs are, but Vir Gnarus (excellent debugger on numerous forums) has written up a great tutorial for understanding what they are, and what they do:

Fun with MDLs -- Sysnative Forums

Hope this helps anyone who wishes to understand MDLs.

Understanding Blink and Flink Lists (Stop 0x19)

Hey fellow BSOD Kernel Dump Analysts,


I noticed that in some dumps there seems to be a listing of addresses related to something called a Flink and Blink Free-List. I was curious about what it meant and how it was used by the Windows operating system, and therefore started a discussion on the topic at Sysnative.com.

Here's the discussion, which is still on-going:

[Question] What's the flink and blink free-list? - Sysnative.com



Sunday 16 June 2013

Understanding Page Frame Number Lists

I'm in the mood to write a proper blog post this time, with a real-life example of a Stop 0x4E which is still in the process of being debugged.

Okay, so let's begin by explaining what a PFN List is and how the operating system uses it.

The Windows Memory Management system organizes all physical pages of memory into one large one-dimensional array called the Page Frame database, which allows the operating system to conveniently access the state of each page; each physical page is described by one entry in this array, and is referenced by its Page Frame Number (PFN).

If you have ever programmed in almost any kind of modern programming language, then you will understand what a one-dimensional array is. For those who do not, I will briefly explain. Windows is written largely in C and C++, so I will explain it in terms of C++. An array is a data structure used in programming to store variables of the same type in one contiguous memory block, which makes it efficient to access any element directly by its index.
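
As a small illustration, here is a C++ sketch of a page frame array that is indexed by Page Frame Number. It assumes 4 KB pages, and PfnEntry, PageState, PfnDatabase, PfnFromPhysicalAddress and EntryFor are hypothetical names; the real entries kept by Windows hold far more information than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kPageSize = 4096; // assumption: 4 KB pages

// Hypothetical per-page record for illustration only.
enum class PageState { Free, Active };

struct PfnEntry {
    PageState state = PageState::Free;
    std::uint32_t referenceCount = 0;
};

// The "database" is one contiguous array with one entry per physical page.
using PfnDatabase = std::vector<PfnEntry>;

// The page frame number is simply the physical address divided by the page size.
std::uint64_t PfnFromPhysicalAddress(std::uint64_t physicalAddress) {
    return physicalAddress / kPageSize;
}

// Use the page frame number as the array index to reach the page's entry.
PfnEntry& EntryFor(PfnDatabase& database, std::uint64_t physicalAddress) {
    const std::uint64_t pfn = PfnFromPhysicalAddress(physicalAddress);
    return database.at(static_cast<std::size_t>(pfn));
}
```

For example, with a PfnDatabase of 16 entries, EntryFor(database, 0x3000) and EntryFor(database, 0x3ABC) both refer to entry 3, because both addresses fall inside the fourth page.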

A Stop 0x4E can be caused by device drivers writing to and accessing invalid memory locations. We can use Driver Verifier to check for any drivers which may be causing issues.

Other causes are corrupted RAM and incorrect handling of MDLs (Memory Descriptor Lists). An MDL is a data structure used in I/O requests to describe the physical pages which back a buffer in a process' virtual address space, so that those pages can be locked in memory for the driver. More Information about MDLs.
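
To give a feel for what an MDL describes, here is a simplified C++ sketch, again assuming 4 KB pages. SketchMdl, PagesSpanned and PhysicalAddressOf are hypothetical names and this is not the real MDL structure; it only shows a buffer recorded as an offset into its first page, a length, and the list of page frame numbers behind it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kPageSize = 4096; // assumption: 4 KB pages

// Hypothetical, simplified descriptor for illustration only.
struct SketchMdl {
    std::uint64_t byteOffset;                    // where the buffer starts inside its first page
    std::uint64_t byteCount;                     // length of the buffer in bytes
    std::vector<std::uint64_t> pageFrameNumbers; // one physical page per virtual page spanned
};

// How many pages does a buffer touch, given its start address and length?
std::uint64_t PagesSpanned(std::uint64_t virtualAddress, std::uint64_t byteCount) {
    return ((virtualAddress % kPageSize) + byteCount + kPageSize - 1) / kPageSize;
}

// Translate an offset into the buffer to a physical address using the page list.
std::uint64_t PhysicalAddressOf(const SketchMdl& mdl, std::uint64_t offsetIntoBuffer) {
    const std::uint64_t offsetFromFirstPage = mdl.byteOffset + offsetIntoBuffer;
    const std::size_t pageIndex = static_cast<std::size_t>(offsetFromFirstPage / kPageSize);
    const std::uint64_t pfn = mdl.pageFrameNumbers.at(pageIndex);
    return pfn * kPageSize + offsetFromFirstPage % kPageSize;
}
```

Notice that the physical pages do not need to be next to each other: a buffer which looks contiguous in virtual memory can be backed by page frames scattered anywhere in RAM, which is why the list of page frame numbers is needed.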

If you would like to follow my current debugging efforts on a Stop 0x4E, then you're more than welcome to do so, and can view this thread for any updates - Memory_Management and PFN_LIST_CORRUPT.




Friday 14 June 2013

Debugging Stop 0x116/Stop 0x117

This bugcheck is another common BSOD, and I tend to find it is related to a bad graphics card or a buggy graphics card driver:


STOP 0x116: VIDEO_TDR_ERROR troubleshooting


Debugging Stop 0x119

Hey everyone, this is another link to a tutorial post again, this time related to Stop 0x119's, which is usually a graphics card/graphics driver related problem.

0x119 VIDEO_SCHEDULER_INTERNAL_ERROR & Fence IDs


Thursday 13 June 2013

Debugging 0x19 BAD_POOL_HEADER

This is going to be another post with a link to a useful tutorial blog post. I'll post a few other one-link blog posts soon:

Debugging 0x19 BAD_POOL_HEADER 
Have fun reading!

EDIT: I've added the link again, so hopefully it now works, I'm going to paste the URL down below, just in case the link still doesn't work:

http://www.sevenforums.com/bsod-help-support/299517-bsod-bad-pool-header-caller-while-using-cubase-after-closing.html

Thursday 6 June 2013

Windows Checked Builds

I've noticed that some users have Checked builds of Windows, and that a certain BSOD named STOP 0x00000001: APC_INDEX_MISMATCH only appears on Checked builds of Windows operating systems.

From what I understand, Checked builds of Windows are used by Driver Developers to get more detailed information when bugs occur in the drivers they are developing. I tend to think of it as a test environment for Driver Developers to check the compatibility and stability of their drivers with the Windows operating system.

A Checked build is commonly referred to as a Debug build of Windows, whereas the Retail versions - the versions most of us are currently running - are referred to as Free builds.

Here are the key differences between the two builds as quoted from Microsoft:

"The free build (or retail build):

The free build of Microsoft Windows is used in production environments. The free build of the operating system is built with full compiler optimizations. When the free build discovers correctable problems, it continues to run.

The distribution media that contain the free build of the operating system do not have any special labels--in other words, the CD that contains the free build is labeled with the Windows version name, without any reference to the type of build. 
The checked build (or debug build):

The checked build of Microsoft Windows makes identifying and diagnosing operating-system-level problems easier.

The checked build differs from the free build in the following ways:
  • Many compiler optimizations (such as stack frame elimination) are disabled in the checked build. This makes it easier to understand disassembled machine instructions, and therefore it is easier to trace the cause of problems in system software.
  • The checked build enables a large number of debugging checks in the operating system code and system-provided drivers. This helps the checked build to identify internal inconsistencies and problems as soon as they occur.
  • Distribution media that contain the checked build are clearly labeled as "Debug/Checked Build." The checked build distribution medium contains the checked version of the operating system, plus the checked versions of HALs, drivers, file systems, and even many user-mode components."

Please note that Checked Builds are much larger and slower than their Free Build counterparts.

More Information - Checked Builds and Free Builds