WHEA General Structure and Reporting:
WHEA has the following general component structure:
Low Level Hardware Error Handlers (LLHEHs) perform hardware error source discovery, gather information about the error in the form of Hardware Error Packets, and then notify the operating system of the error. The Windows Kernel then formats these Hardware Error Packets into Error Records.
Here is the general format of an Error Record for WHEA:
When an error condition occurs, the LLHEH may communicate with the Platform Specific Hardware Error Driver (PSHED) to obtain platform-specific information about that error condition.
Hardware Error Classification
There are two groups of hardware errors: corrected and uncorrected. These classifications are quite self-explanatory, but I'll explain them nevertheless.
Corrected Errors: These are errors which have been corrected by the hardware. The operating system is then notified of the correction.
Uncorrected Errors: These are errors which can't be corrected by the hardware, and they fall into two further categories: Fatal and Non-Fatal.
Fatal: An uncorrected error which the hardware has determined to be unrecoverable, and which will result in a bugcheck.
Non-Fatal: An uncorrected error which the operating system can attempt to recover from. If the recovery attempt fails, the result is a bugcheck.
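The classification above can be sketched as a small decision function. This is only an illustration of the rules described in this post: the names hw_error_class, hw_error_action and decide_action are hypothetical and are not part of the WDK, which has its own severity type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; these are not WDK types. */
typedef enum {
    HW_ERR_CORRECTED,             /* hardware already fixed the error      */
    HW_ERR_UNCORRECTED_NONFATAL,  /* the OS may attempt recovery           */
    HW_ERR_UNCORRECTED_FATAL      /* unrecoverable, always ends in bugcheck */
} hw_error_class;

typedef enum {
    ACTION_CONTINUE,  /* error is logged and the system keeps running */
    ACTION_BUGCHECK   /* system is halted to contain the error        */
} hw_error_action;

/* Map an error class to the outcome described in the text.
   os_recovery_succeeded only matters for the non-fatal class. */
hw_error_action decide_action(hw_error_class cls, bool os_recovery_succeeded)
{
    switch (cls) {
    case HW_ERR_CORRECTED:
        return ACTION_CONTINUE;
    case HW_ERR_UNCORRECTED_NONFATAL:
        return os_recovery_succeeded ? ACTION_CONTINUE : ACTION_BUGCHECK;
    case HW_ERR_UNCORRECTED_FATAL:
        return ACTION_BUGCHECK;
    }
    /* Unknown class: treat it as the most severe case. */
    assert(!"unknown hardware error class");
    return ACTION_BUGCHECK;
}
```

For example, decide_action(HW_ERR_UNCORRECTED_NONFATAL, false) yields ACTION_BUGCHECK, matching the rule that a failed recovery attempt ends in a bugcheck.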
Error Sources
An error source is the hardware which detected and reported a hardware error; it is not necessarily the component which is at fault. At boot, the PSHED (Platform Specific Hardware Error Driver) returns a list of WHEA_ERROR_SOURCE_DESCRIPTOR structures to the Windows Kernel to indicate all the supported error sources for that hardware platform. With this information, the operating system is able to load and set up the necessary LLHEHs (Low Level Hardware Error Handlers) for the hardware error sources.
The error sources for the x86 and x64 hardware platforms are:
- Machine Check Exceptions
- Corrected Machine Checks
- Non Maskable Interrupts
- Boot Errors
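The boot-time step described above, where the reported list of error sources decides which handlers get set up, can be pictured roughly as follows. This is a simplified sketch and not how the kernel is actually implemented: source_type, source_descriptor and install_handlers are hypothetical names, and the "installed" flags merely stand in for real handler setup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the four x86/x64 error sources listed above. */
typedef enum {
    SRC_MCE,        /* Machine Check Exception  */
    SRC_CMC,        /* Corrected Machine Check  */
    SRC_NMI,        /* Non Maskable Interrupt   */
    SRC_BOOT,       /* Boot Error               */
    SRC_TYPE_COUNT
} source_type;

/* Greatly reduced descriptor: only the fields this sketch needs. */
typedef struct {
    source_type type;
    unsigned    id;
} source_descriptor;

/* Walk the list reported at boot and mark one handler per source type
   as installed. Returns the number of distinct handler types set up. */
size_t install_handlers(const source_descriptor *list, size_t count,
                        int installed[SRC_TYPE_COUNT])
{
    size_t n = 0;
    assert(list != NULL || count == 0);

    for (size_t t = 0; t < SRC_TYPE_COUNT; t++)
        installed[t] = 0;

    for (size_t i = 0; i < count; i++) {
        source_type t = list[i].type;
        if (t < SRC_TYPE_COUNT && !installed[t]) {
            installed[t] = 1;  /* a real system would set up the LLHEH here */
            n++;
        }
    }
    return n;
}
```

Only source types that actually appear in the reported list get a handler, which is the point of having the platform describe its error sources up front.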
The general structure is as follows:
The Type member indicates the type of error source, and the ErrorSourceId member holds the unique identifier for the error source. More information on this structure can be found in the WDK documentation.
The union holds the descriptor which is specific to that type of error source; which member of the union is valid depends on the value of Type.
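The pairing of a Type member with a union is the usual "tagged union" pattern in C. Here is a minimal sketch of that pattern, assuming a heavily cut-down descriptor: err_src_type, err_src_descriptor, source_bank_count and the payload fields are hypothetical, and the real WHEA_ERROR_SOURCE_DESCRIPTOR in the WDK has more members and different names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; see the WDK for the real definitions. */
typedef enum { ERR_SRC_MCE, ERR_SRC_CMC, ERR_SRC_NMI } err_src_type;

typedef struct { uint8_t bank_count; } mce_info;  /* assumed payload */
typedef struct { uint8_t bank_count; } cmc_info;  /* assumed payload */
typedef struct { uint8_t reserved;   } nmi_info;  /* assumed payload */

typedef struct {
    err_src_type type;             /* says which union member is valid */
    uint32_t     error_source_id;  /* unique identifier of this source */
    union {
        mce_info mce;
        cmc_info cmc;
        nmi_info nmi;
    } info;
} err_src_descriptor;

/* Read the type-specific part of a descriptor. Returns the number of
   banks for MCE/CMC sources, or -1 for types without such a field. */
int source_bank_count(const err_src_descriptor *d)
{
    assert(d != NULL);
    switch (d->type) {
    case ERR_SRC_MCE: return d->info.mce.bank_count;
    case ERR_SRC_CMC: return d->info.cmc.bank_count;
    case ERR_SRC_NMI: return -1;
    }
    return -1;
}
```

The important habit is to check the type before touching the union, since only the member matching that type contains meaningful data.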