The NT Insider 2004 Archive
Published: 23-Mar-05 | Modified: 23-Mar-05



Locking Down Drivers - A Survey of Techniques

Almost Like Being There - Virtual Server 2005

File System Filter Context - Observations and Comments

Keep Version Resources Up-To-Date

You've Typed !analyze -v, Now What? -- Next Steps in Debugging

What's Your Test Score -- Best Practices for Driver Testing

Try This -- Interactive Driver Testing

Trust Yet Verify -- All About Driver Verifier

Testing from the Ground Up -- Getting a Good Start

Test Lab Basics -- Helpful Hardware Accessories

Test Lab Basics -- Choosing Machines for Your Lab

Sometimes You Have to Write Your Own -- Case Study: ActGen IO Utility

One Special Case -- Testing File Systems

On the Right Path -- Testing with Device Path Exerciser

Just Checking Revisited -- Installing a Partially Checked Build

It's Easy to be Hard -- Testing with HCTs

It's a Setup -- What You Need to Start Developing Drivers

Go Diskless -- Using the Microsoft Symbol Servers

Easy Once You've Done It -- Setting Up the Debugger

Brand New 'Bag -- The Latest on WinDBG

WDF PnP/Power Interview with Microsoft's Jake Oshins

A New Framework

Beware the Guarded Mutex

The Future Is Now -- The WDF Kernel Mode Framework

Don't Blow Your Stack -- Clever Ways to Save Stack Space

Service Pack or Dot Release? -- Test With XP SP2 Now

Finding File Contents in Memory

Debugging A Sound Driver

Caching in the Pentium 4 Processor

Emerging Issues in IoCancelFileOpen

Locking Down Drivers - A Survey of Techniques
The NT Insider, Vol 11, Issue 5, November-December 2004 | Published: 09-Dec-04| Modified: 09-Dec-04



Recently, we've seen a series of questions concerning the retrieval of executable image
names. Presumably, these developers believe that by comparing the name of the executable
image with the name of their service, they can provide some additional level of security.
Unfortunately, doing so not only relies upon undocumented techniques, but also provides no
real additional security. We spent some time talking about this at OSR and decided to provide
a quick survey of the techniques that are available for accomplishing this.

What to Consider

In deciding how to provide secure communications between your service and driver, there are
several points to consider:


         The goals of securing this communications channel
         The operations to be secured - note that it is possible to allow some operations
          while disallowing others
         The environment in which the driver operates
         The type of driver being secured

The Image Name

First, let us note that using the name of the executable image is not secure. Indeed, with the
increasing accumulation of service threads within svchost.exe, it is quite common for the name
to be "svchost.exe". This name can be spoofed by even the most rudimentary of Trojan horse
programs simply by using the same name. Note that this is not a theoretical attack vector as it
has been used in recent computer attacks.

Admin Access Only

In considering the design of your communications, it is important to clearly articulate the goals
of securing the service-to-driver communications channel. For example, when we first started
discussing this issue, one perspective someone had was, "Just lock it down so only Admin
Group people can access it". In discussing this, we actually talked about reasons why, while
simple, this might not be the best security policy:


         It grants rights to your service far beyond what it needs to "get the job done"
         In modern PCs, many users log in using accounts that are members of the admin
          group
         Doing so does not grant you network (or domain) level permissions that are often
          useful or necessary
Use a Separate Account

How about the use of a separate, dedicated account for the service? Doing so allows the
developer to determine exactly the set of rights required, establishing those credentials during
installation. An interesting side-effect of this is that it can circumvent "generic" attacks from an
account with Admin group privileges (of course, a specific attack can be constructed by
changing the shared password between the service and driver). This is because by using a
specific account, you can remove Admin group access. This can be overcome, but requires an
additional, specific step on the part of a would-be attacker.

To use a separate account, the installation process would create a new account (either on the
local machine or within the domain) and then use the resulting account information (notably
the SID) to construct the specific security. Perhaps the simplest way to achieve this is by using
the SDDL semantics that are described for use in .INF files. The "trick" here is to use the
explicit SID of the account rather than one of the built-in, well-known SIDs. This becomes a
registry-based characteristic of the device for the driver, and will be automatically applied by
the Plug and Play Manager. This technique is only available to drivers that are installed or
started by the Plug and Play Manager.
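
For a PnP device, the pieces fit together roughly like this in the INF (a sketch only - the
section names are placeholders, and the SID shown is an invented stand-in for the one your
installer creates):

    [OsrDevice_Install.NT.HW]
    AddReg = OsrDevice_Security.AddReg

    [OsrDevice_Security.AddReg]
    ; "Security" is the per-device registry value the PnP Manager applies.
    ; The SDDL grants full access to SYSTEM and to the explicit SID of the
    ; dedicated service account; note there is no entry for Administrators.
    HKR,,Security,,"D:P(A;;GA;;;SY)(A;;GA;;;S-1-5-21-1111111111-2222222222-3333333333-1044)"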

Thus, for plug and play devices, this is likely to be the best and simplest solution. Note that
another avenue explored was the use of IoCreateDeviceSecure. Unfortunately, this routine
does not support arbitrary SID values in the SDDL string that is provided as the security
descriptor argument.

Use the Registry

A particularly clever solution that was suggested is to implement your own security by storing
the (binary) security descriptor in the registry. You can retrieve that security descriptor (in
REG_BINARY format) during system initialization, validate it (RtlValidSecurityDescriptor), and
then use it directly to perform an access check when your device is opened.

In this case, a driver can use a (validated) security descriptor in IRP_MJ_CREATE simply by
calling SeAccessCheck, passing in the security descriptor from the registry and the
necessary parameters from the create parameters themselves. The advantage of this
approach is it is freely available for any driver and is easy to implement with a few dozen lines
of code.
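
Here is a minimal sketch of that check (the Globals structure, its initialization, and the
generic mapping are our own inventions for illustration; error handling is abbreviated):

    #include <ntifs.h>

    typedef struct _GLOBALS {
        PSECURITY_DESCRIPTOR SecurityDescriptor; // read from the registry and validated
        GENERIC_MAPPING GenericMapping;          // maps generic rights to specific rights
    } GLOBALS;

    extern GLOBALS Globals;  // hypothetical; set up in DriverEntry

    NTSTATUS
    MyCreateDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp)
    {
        PIO_STACK_LOCATION irpSp = IoGetCurrentIrpStackLocation(Irp);
        PACCESS_STATE accessState =
            irpSp->Parameters.Create.SecurityContext->AccessState;
        PPRIVILEGE_SET privileges = NULL;
        ACCESS_MASK grantedAccess;
        NTSTATUS status;
        BOOLEAN granted;

        UNREFERENCED_PARAMETER(DeviceObject);

        //
        // Compare the caller's credentials (captured by the I/O Manager in
        // the create parameters) against the registry-supplied descriptor.
        //
        SeLockSubjectContext(&accessState->SubjectSecurityContext);
        granted = SeAccessCheck(Globals.SecurityDescriptor,
                                &accessState->SubjectSecurityContext,
                                TRUE,   // subject context already locked
                                irpSp->Parameters.Create.SecurityContext->DesiredAccess,
                                0,      // no access previously granted
                                &privileges,
                                &Globals.GenericMapping,
                                Irp->RequestorMode,
                                &grantedAccess,
                                &status);
        SeUnlockSubjectContext(&accessState->SubjectSecurityContext);

        if (privileges != NULL) {
            SeFreePrivileges(privileges);
        }

        Irp->IoStatus.Status = granted ? STATUS_SUCCESS : status;
        Irp->IoStatus.Information = 0;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
        return granted ? STATUS_SUCCESS : status;
    }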

Check Against Multiple SDs

A more sophisticated implementation of the security check might be to perform that check
against multiple security descriptors. Your driver might have query interfaces that protect those
operations at one level (relatively low), while other interfaces represent state changes,
protected at a different (higher) level. Still other interfaces represent irreversible operations
("initiate the self-destruct sequence on the device") that must be protected at the highest levels.
Thus, a driver might extract three different security descriptors from the registry. A driver could
compute the allowed access in its IRP_MJ_CREATE dispatch entry point (certainly the easiest
solution) or it might capture the necessary parameters and compute the access when the
operation is performed.
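
In rough outline (the access levels, array, and helper below are invented for this sketch),
the multi-level variant simply repeats the same check against each descriptor and records the
results for later use:

    typedef enum _ACCESS_LEVEL {
        AccessLevelQuery = 0,   // low-risk query interfaces
        AccessLevelModify,      // state-changing interfaces
        AccessLevelCritical,    // irreversible operations
        AccessLevelCount
    } ACCESS_LEVEL;

    // One registry-supplied security descriptor per level (hypothetical globals).
    extern PSECURITY_DESCRIPTOR LevelSd[AccessLevelCount];
    extern GENERIC_MAPPING LevelMapping;

    //
    // Called from IRP_MJ_CREATE. Returns a bitmask of the levels the caller
    // passed; the driver stores this in its per-open context and tests it
    // when each operation actually arrives.
    //
    ULONG
    ComputeGrantedLevels(PSECURITY_SUBJECT_CONTEXT Subject,
                         ACCESS_MASK DesiredAccess,
                         KPROCESSOR_MODE Mode)
    {
        ULONG mask = 0;
        ULONG i;

        for (i = 0; i < AccessLevelCount; i++) {
            PPRIVILEGE_SET privileges = NULL;
            ACCESS_MASK granted;
            NTSTATUS status;

            if (SeAccessCheck(LevelSd[i], Subject,
                              FALSE,      // let SeAccessCheck lock the context
                              DesiredAccess, 0, &privileges,
                              &LevelMapping, Mode, &granted, &status)) {
                mask |= (1ul << i);
            }
            if (privileges != NULL) {
                SeFreePrivileges(privileges);
            }
        }
        return mask;
    }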

What About File System Software?

Drivers in the file system space (linking against ntifs.h) can theoretically directly change the
security descriptor on their device object. We wrote code to do this, using documented
routines, but ultimately concluded that it was far too much effort, particularly given the power of
simply performing the access checks directly - a far simpler, equally workable approach.

Another technique worthy of consideration is to use privileges to protect the driver. The call to
check privileges is in the DDK header files. Unfortunately, the actual values of the various
privileges are not, and are only available for those drivers in the IFS Kit environment. Thus,
this technique has more limited usability. For file system drivers, it might be useful to check
specific privileges (e.g., SeManageVolumePrivilege). This is done using the
SeSinglePrivilegeCheck API. Again, it is important that this type of check be done using the
correct set of credentials (generally in IRP_MJ_CREATE).
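
For example (a sketch that assumes the IFS Kit headers, where SE_MANAGE_VOLUME_PRIVILEGE
is defined):

    #include <ntifs.h>

    //
    // Returns TRUE if the caller holds SeManageVolumePrivilege. Note that the
    // check is made against the requestor mode from the IRP, so a kernel-mode
    // caller passes automatically.
    //
    BOOLEAN
    CallerHasManageVolumePrivilege(PIRP Irp)
    {
        LUID manageVolume = RtlConvertLongToLuid(SE_MANAGE_VOLUME_PRIVILEGE);

        return SeSinglePrivilegeCheck(manageVolume, Irp->RequestorMode);
    }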

Key Overkill

It is possible to augment the security of the system in other ways as well. One of the most
interesting we have observed resulted in the driver dynamically generating an AES key that
was then directly written into a memory location of the service. The service and driver then
used that key to encrypt all subsequent operations between them. Thus, even a filter driver
inserted between the two of them could not directly interfere in their specific operations - this
was clearly an environment in which any tampering with the information was to be prevented.
For most drivers (and services) this level of protection is likely to be far more than the
circumstances require.

Summary

Security is one of those aspects of development that seems (on the surface) to be
straightforward, and yet as you learn more you find out how easy it is to get wrong - and the
cost of getting it wrong today is compromised systems, lost information, and denial of service
attacks. (See the sidebar, Other Security Considerations.)

With that said, we know there are far better methods available to secure your control device
object for your service's use than by checking the service executable name - use them!


                            Other Security Considerations

         Never trust anything from an unauthenticated caller. If it is your service, track it. If your
          service terminates, ensure you can handle that state so that a subsequent caller is not
          inadvertently granted access.
         Don't be afraid to use exclusive access on your control device. This simplifies tracking
          who (or what) is calling your driver.
         Do not rely upon statically coded information. This is too easily compromised. Do not
          embed passwords in your software. Trust that the secure pieces of the OS are secure.
         There is no "security through obscurity". The advantage of obscuring information is to
          make it more difficult to compromise. This does not protect against compromise and is
          not sufficient by itself.
         There are no secure systems. Ultimately the biggest enemy to security is the user. Try
          to restrict what users can (and do) perform in normal activities. Use the event log for
          anything strange - like attempts to access your device that you deny.
         Always assume your driver (and service) are operating in a hostile environment.
          Assume that everything outside the trusted computing base is being written
          specifically to compromise your driver (and service).
         Don't worry about the trusted computing base being compromised - if it is, there is
          nothing your driver can do to resolve that fundamental underlying issue.
         When considering if something is done "securely," never trust your own judgment.
          Ensure that you get multiple opinions.
         Consider your security implementation from the "adversary" position - how would you
          compromise your own system?


Related Articles
Keeping Secrets - Windows Security (Part III)

Keeping Secrets - Windows NT Security (Part II)

Keeping Secrets - Windows NT Security (Part I)

You've Gotta Use Protection -- Inside Driver & Device Security

Still Feeling Insecure? - IoCreateDeviceSecure( ) for Windows 2K/XP/.NET

Securing Device Interfaces - A Better Approach than Sending an SD

Security During Create Operations

What is Coming with Vista - Limited User Access
Almost Like Being There - Virtual Server 2005
The NT Insider, Vol 11, Issue 5, November-December 2004 | Published: 09-Dec-04| Modified: 09-Dec-04



We've previously discussed various virtual machine solutions available for Windows including
both VMWare and Connectix (The NT Insider, January-February 2002). Shortly after acquiring
Connectix, Microsoft released "Virtual PC 2004". Our view of Virtual PC is that it suffers from
many of the same failings as the Connectix product did - most notably, its performance lags
behind that of the VMWare offering. On the plus side, one benefit of Virtual PC is its support
for kernel debugging.

Recently, folks at Microsoft have been singing the praises of "Virtual Server," a new product
(currently available for download from the MSDN download site, with appropriate level of
membership, of course). We've started playing with it here at OSR - and our initial impressions
are favorable.

The Feature Set

The number one improvement is clearly performance - most noticeable when you go through
the inevitable installation step. The other obvious change is the administration interface, which
is entirely IIS based. If you want to run this on your workstation class system (like Windows XP)
then you will need to install the IIS server component before installing Virtual Server. From
reading the documentation, the rationale for this appears to be that Virtual Server is primarily
intended both to be hosted on and to host server-class operating systems (be it
Windows 2000 Server or Windows Server 2003). The documentation does not even mention
workstation operating systems (Windows XP) nor did we get a chance to try that configuration
out prior to press time.

One advantage of the administration interface being hosted on IIS is that all you need to
manage a Virtual Server installation is Internet Explorer - it does require an ActiveX control
that must be installed, and it complains bitterly if your connection to the IIS server is not
secured with an SSL certificate, but it seems happy to use an internally generated certificate
(assuming you have set up a Certification Authority).

From the initial configuration screen you can choose to administer a different server, connect
to an existing running virtual machine, create a new virtual hard disk, or create a new virtual
network.

Individual machines can be configured with a variety of virtual devices - it comes standard with
two IDE controllers, a variable amount of memory (generally up to about 80% of the physical
memory on the computer system), and a virtual CD drive that can be connected to a physical
CD (or DVD) drive, or redirected to an ISO image. The ISO image option is nice for those of us
using the downloaded ISO images from MSDN. One disadvantage that I found with using ISO
images is that I have an existing library of such images, and in order to use them, I had to type
in the full path name to their location - there's no browse dialog to ease that process. This
limitation also applied when I wanted to create a second virtual hard disk drive - I had to
manually type in the entire path to its location because I wanted to store it to a different
location on my system.

Setup

Configuring a new machine is actually pretty simple. Choose "Create" from the main control
menu. You are then presented with a menu of choices.

You can choose a name for your new Virtual Server, configure it with some memory (default is
128MB), and establish the size of the new virtual hard disk (this is a maximum size, not a
pre-allocated size; the initial hard disk file is about 40KB, but grows quickly once you
install the OS into it). Finally, you need to decide how you want this to appear on the network -
you can assign no network (a bad choice for server platforms usually), you can associate it
with an internal (local to this machine) network, or bind to one of the attached physical network
cards on the system. This can be convenient when testing networking software (for
instance), but don't expect activation over the internet to work very well if you are connected to
a "local to this machine" network!

Having completed the small list of choices, click on OK and you are now presented with your
new virtual machine!

The Reality of Virtual Server

As with Virtual PC, the Virtual Server product does not have any virtual USB ports. This is
unfortunate, since it can be useful to connect (and test) USB hardware within a virtual machine
environment, rather than requiring a physical machine. Thus, if USB support is still a
requirement for your virtual machines, the Microsoft offerings are not going to meet your needs.
For debugging purposes, you can use the serial port attached to a named pipe. Details of how
to establish the debugger end of this connection are in the documentation for the current
debugger.
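
For what it's worth, the usual recipe (assuming a virtual machine whose COM1 is attached to a
named pipe called "testvm"; the pipe name here is illustrative) looks something like this on
the host:

    windbg -k com:port=\\.\pipe\testvm,pipe,resets=0

with the guest booted using the /debug /debugport=com1 /baudrate=115200 switches in its
boot.ini.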

The documentation for Virtual Server hints at some of the rather interesting features supported
within the server environment itself - support for connection via a SAN, for example. One nice
feature (from the testing perspective) is that you can actually configure two virtual machines to
run in a clustered configuration - substantially easing the testing of what is, (generally
speaking), a rather demanding configuration to set up and test. The documentation clearly
says that this is for testing and training only and should not be used to provide reliability in "real
world" environments.

Many of the features available for Virtual Server are geared towards use in production
environments - support for up to 3.6GB of memory within a virtual machine, resource
management, etc. Surprisingly, Virtual Server does not appear to provide support for virtual
multiprocessor machines, giving the VMWare server product an edge. Of course, for
those of us developing and testing software, it means we still must use real computers to test -
we can't ship without considerable testing and exposure on SMP.

For both Virtual Server 2005 and VMWare, an emerging challenge is how to deal with the
newer 64-bit processor platforms. With AMD shipping and Intel planning to ship 64-bit variants
of their processors, more and more machines will begin to use these new 64-bit platforms.
Neither product currently supports AMD-64 in long mode, probably due to AMD's decision not
to provide hardware virtual machine support. No doubt over the next year or two this is an area
that will need to be addressed.

Red Pill or Blue Pill?

For those developers who have an existing MSDN subscription and need to do server-level
testing, Virtual Server 2005 is definitely worth considering. Its performance is much improved
over Virtual PC, although still a bit slower than VMWare and there's no additional cost in using
it. Around here, we'll likely continue using a variety of different virtual solutions for the
foreseeable future - Virtual Server is terrific, but so is VMWare, and it supports Windows XP as
well as the server platforms.


Related Articles
"Fixed in the Next Release?" -- Product Review Update: VMWare and Connectix
File System Filter Context - Observations and Comments
The NT Insider, Vol 11, Issue 5, November-December 2004 | Published: 09-Dec-04| Modified: 09-Dec-04



Within the past few weeks we discovered a file system that would report support for filter
contexts, and yet when we attempted to use it the system would crash. While we've since
overcome the problem, in analyzing it we found that there are some important considerations
to keep in mind when using the Windows (XP and later) filter context mechanism. For a primer
on support for filter context, see http://www.osronline.com/article.cfm?id=33.

Here at OSR, we have taken a more complex implementation approach to using file system
filter contexts in our own filters. This is because we must support a broad range of OS versions,
including those on which filter contexts are not available. We prefer to provide a uniform
context model, so that even if the underlying file system does not support filter contexts on
some files (for example the paging file) we fall back to the old model.

Of course, when filter contexts are not supported, we use a separate lookup table with either
the FsContext or SectionObjectPointers structure address as the lookup key - each is
unique on a per-file basis. Separately, we also track per-instance information by using the
address of the FileObject itself - again, via a lookup table. Thus, on a system that generally
supports filter contexts, we first check to see if we can find the requisite information from the
filter context and if not, we then check our separate lookup table. This dual lookup is wrapped
with our own functions so as to keep the interface clean - and isolates our drivers from the
differences between various versions of Windows, including service packs.
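
In rough outline (the names here are invented, and the real implementation carries more state
and locking), the wrapper looks something like this:

    // Our per-stream context embeds the FsRtl bookkeeping structure.
    typedef struct _OSR_STREAM_CONTEXT {
        FSRTL_PER_STREAM_CONTEXT PerStreamContext;
        // ... per-stream data ...
    } OSR_STREAM_CONTEXT, *POSR_STREAM_CONTEXT;

    static ULONG OsrOwnerId;  // the address of this global is our owner key

    // Hypothetical fallback lookup, keyed on the FsContext address.
    POSR_STREAM_CONTEXT OsrFindContextInTable(PVOID Key);

    POSR_STREAM_CONTEXT
    OsrLookupStreamContext(PFILE_OBJECT FileObject)
    {
        if (FsRtlSupportsPerStreamContexts(FileObject)) {
            PFSRTL_PER_STREAM_CONTEXT ctx =
                FsRtlLookupPerStreamContext(
                    FsRtlGetPerStreamContextPointer(FileObject),
                    &OsrOwnerId,   // our owner key
                    NULL);         // match any instance
            if (ctx != NULL) {
                return CONTAINING_RECORD(ctx, OSR_STREAM_CONTEXT, PerStreamContext);
            }
        }

        //
        // Fall back to the private table (paging files, older systems, file
        // systems without per-stream context support).
        //
        return OsrFindContextInTable(FileObject->FsContext);
    }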

Back to the issue at hand. Filter context support is determined by using the function
FsRtlSupportsPerStreamContexts. This is actually a macro found in ntifs.h, shown below.
below.




                    #define FsRtlSupportsPerStreamContexts(_fo)                \
                        ((NULL != FsRtlGetPerStreamContextPointer(_fo)) &&     \
                         FlagOn(FsRtlGetPerStreamContextPointer(_fo)->Flags2,  \
                                FSRTL_FLAG2_SUPPORTS_FILTER_CONTEXTS))




Note that this macro really doesn't validate that it is looking at an advanced header - it just
checks bits assuming that it is an advanced header. What we found is that under some
circumstances this macro would return TRUE, but when we called
FsRtlLookupPerStreamContext, the system would crash. After doing some digging around, we
found that one of the standard Microsoft file system drivers (no name calling here) was actually
setting FsContext to point to a UNICODE_STRING structure. We discussed this issue for a
while and since we were in "fix it and keep going" mode, we decided to special case this
particular file system so that we know we can't rely upon stream contexts.

However, after further discussion and reflection it occurs to us that there's a fundamental
(implicit) assumption in the filter context model that bears additional caution here: the filter
context model assumes that the FsContext pointer will always refer to a structure arranged in
the common header format. Thus, either this now becomes a requirement for all file systems
(something that Microsoft developers have publicly stated in the past, but which, for the record,
is not the case) or the filter context model has a fundamental flaw.

In the case of the Microsoft FSD we looked at, the FsContext value actually referred to a
UNICODE_STRING structure. On the x86 platform, this structure consists of 64 bits of
information; casting it to an FSRTL_ADVANCED_FCB_HEADER and then testing the
Flags2 field to determine if the FSRTL_FLAG2_SUPPORTS_FILTER_CONTEXTS bit is set,
means that the macro is testing a bit somewhere in the address of the buffer itself. Of course,
whether this bit is set or cleared does seem to be a bit arbitrary - change the pool allocation
function, and code that previously worked might stop working because these bits no longer are
set quite the same way.

We spent some time thinking about how we could make this more robust. We want to
guarantee that our own filters can take advantage of filter contexts on file systems where they
are supported, and yet ensure we do not rely upon a feature that might not be present on other
file systems. Naturally, we try to be conservative in our development efforts in this area
because a failure to do this properly will typically lead to a system crash - a situation that we
consider to be unacceptable. Thus, tracking just the exceptions to this rule works only if we
assume that we will only be called upon to filter the Windows file systems.

One possibility is that we could add additional checks (or validation) to the FCB header itself.
For example, we note there is "node" information: a NodeTypeCode and NodeByteSize field.
Compiling a list of valid values for this might at least provide us with some additional surety (for
example, ensuring the NodeByteSize field is at least large enough to contain the stream
context field). If we examine the FAT file system code, we can find that it uses several different
values for both of these fields. This makes validation a rather costly affair since we would need
to perform this check each time we wanted to consider looking at the stream context
information associated with a given FILE_OBJECT.

For example, the FAT file system defines its "node types" in the header file nodetype.h. The
values range from 0x500 through 0x508. This field would correspond to the Length field in our
UNICODE_STRING structure. While all of these values are large, the even values among them
would still be valid lengths, so they cannot be ruled out. Presumably we can also find the comparable range for the other file systems,
either by inspecting the source code (CDFS uses the range from 0x301 through 0x309) or by
observing them in actual use (NTFS appears to use values in the 0x700 range).

Heuristic data structure validation techniques might work most of the time, but they are merely
a way of "papering over" the underlying problem - that there really is no way that we can
determine, by looking at the file object, if the file system is using the
FSRTL_COMMON_FCB_HEADER. Ideally, we'd want to have some sort of flag set in the file
object itself (perhaps FO_USES_COMMON_FCB_HEADER) that we could use to validate the
assumption that the common header is present. Once we know the common header is present,
the other techniques used for validation (reading Flags2 for example) work correctly.
Unfortunately, to implement this would require changing all of the existing file systems to
support this flag.

In the interim, we've discussed this at length and come to the conclusion that filter contexts in
their current incarnation are not really safe to use. This is very unfortunate because using a
lookup table is an expensive operation. We optimize the lookup operation as best we can to
minimize this cost, but in our development efforts we've concluded that expensive is always
preferable to incorrect.

For those of you who are willing to rely upon heuristic approaches, we have come up with a
variety of suggestions on how to mitigate these issues. They include:


        Checking for the type of the file system against a list of drivers known to support filter
         contexts. The alternative (checking for file systems known not to support filter contexts)
         isn't reliable, but might be sufficient if you never filter anything other than a standard
         Windows file system.
        Checking the node information; one observation is that the NodeTypeCode value is
         generally larger than the NodeByteSize field and this would at least allow us to rule
         out the case we saw - where a UNICODE_STRING structure was being used.
        Checking to ensure the NodeByteSize field is at least as large as
         sizeof(FSRTL_ADVANCED_FCB_HEADER). If it is not, then no matter what the Flags2
         field indicates, the structure cannot be big enough to contain the requisite context
         field. (The last two checks are sketched in code following this list.)
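
A sketch of the last two checks (heuristics only, for all of the reasons discussed above):

    //
    // Heuristic sanity check applied before trusting Flags2. This can
    // reduce, but not eliminate, the risk that FsContext points at
    // something other than an FSRTL_ADVANCED_FCB_HEADER.
    //
    BOOLEAN
    OsrLooksLikeAdvancedHeader(PFILE_OBJECT FileObject)
    {
        PFSRTL_ADVANCED_FCB_HEADER hdr =
            (PFSRTL_ADVANCED_FCB_HEADER)FileObject->FsContext;

        if (hdr == NULL) {
            return FALSE;
        }

        //
        // In the FCBs we have observed, NodeTypeCode is larger than
        // NodeByteSize; in a UNICODE_STRING the corresponding fields are
        // Length and MaximumLength, where Length <= MaximumLength.
        //
        if (hdr->NodeTypeCode <= hdr->NodeByteSize) {
            return FALSE;
        }

        //
        // The structure must at least be large enough to contain the
        // per-stream context fields, no matter what Flags2 claims.
        //
        if ((USHORT)hdr->NodeByteSize < sizeof(FSRTL_ADVANCED_FCB_HEADER)) {
            return FALSE;
        }

        return TRUE;
    }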

No doubt there are other heuristic approaches that can also be concocted, and yet each of
them leaves us with the fear that some other file system - whether from Microsoft or from some
third party vendor - will not fit into our heuristic model. This will result in either incorrect
behavior or a system crash.

Thus, for the foreseeable future we have reached the unfortunate conclusion that filter
contexts, while good in theory, can't be used in the hostile world in which we operate. Other
developers should make sure they are also aware of these hazards and (at a minimum) take
precautions or eschew using filter contexts until such time as Microsoft's development team
finds a way to make them more robust.
Related Articles
IFS FAQ

Filtering the Riff-Raff - Observations on File System Filter Drivers

Tracking State and Context - Reference Counting for File System Filter Drivers

In Context: Understanding Execution Context for NT Drivers

Caching in Network File Systems

Multi-Version Functionality

OSR + MS Team Up: Mini-Filters and Plugfest Combo - Again!
Keep Version Resources Up-To-Date
The NT Insider, Vol 11, Issue 5, November-December 2004 | Published: 09-Dec-04| Modified: 09-Dec-04



As any savvy developer knows, the only way to keep track of what version of your software a
customer is running is by adding a version resource to your image. This version resource,
once created, is something that is understood by and can be viewed with Explorer. Until now,
keeping version resource information up to date required manual editing. This article
introduces the reader to version resources and to OSR's now publicly available automatic
Increment Versioning (IV) kit. Of course, this kit is available as yet another free download at
www.osronline.com.

Adding a Version Resource

A Version Resource is created by entering a VERSIONINFO resource definition statement into
a resource script file, usually named with an ".RC" suffix. This ".RC" file is a text file that is
specified in the "SOURCES=" line of your project "SOURCES" file and is compiled with the RC
compiler. The resulting output then becomes part of your binary image.

A VERSIONINFO resource can contain the following information about your image:

         CompanyName
         FileDescription
         FileVersion
         InternalName
         LegalCopyright
         OriginalFilename
         ProductName
         ProductVersion

This information can be viewed from Explorer by right-clicking on the image file, selecting the
"Properties" menu item, and then selecting the "Version" tab. For example, by using a simple
resource file and included config.h file, (see Figures 1 and 2 immediately below), we see the
results shown in Figure 3 on (further below).
Figure 1 -- resource script (.rc file)

Figure 2 -- config.h

Figure 3 -- Version information as displayed by Explorer
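
Since the figures do not reproduce here, a representative resource script along the lines of
Figure 1 might look something like this (all of the values below are invented for illustration):

    #include <winver.h>

    VS_VERSION_INFO VERSIONINFO
    FILEVERSION     1,0,0,37
    PRODUCTVERSION  1,0,0,37
    FILEOS          VOS_NT
    FILETYPE        VFT_DRV
    FILESUBTYPE     VFT2_DRV_SYSTEM
    BEGIN
        BLOCK "StringFileInfo"
        BEGIN
            BLOCK "040904B0"
            BEGIN
                VALUE "CompanyName",      "OSR Open Systems Resources, Inc."
                VALUE "FileDescription",  "IV example driver"
                VALUE "FileVersion",      "1.0.0.37"
                VALUE "InternalName",     "testdrv.sys"
                VALUE "LegalCopyright",   "Copyright (c) 2004"
                VALUE "OriginalFilename", "testdrv.sys"
                VALUE "ProductName",      "IV Example Project"
                VALUE "ProductVersion",   "1.0.0.37"
            END
        END
        BLOCK "VarFileInfo"
        BEGIN
            VALUE "Translation", 0x0409, 1200
        END
    END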

What many people do not realize is that this version resource information can be generated,
maintained, and updated programmatically, and that is what OSR's IV kit does.

The IV Kit

IV is a kit that consists of three components: the IV program, which programmatically maintains
the version information for your project and its component pieces; a set of template files that
you use to integrate IV with your project; and an example project, named TestDrv, which uses
IV. Let us discuss each component.


       IV Program - The IV program is an image that is run each time your project is
        compiled. Its responsibility is to update the version information of your image if
        directed to via the IV command line, or if any project component has changed.
       IV Templates - The IV templates consist of the following components to be integrated
        with the project to be built:

            o a set of ".h" files
            o a "SOURCES" file template with include files
            o a MAKEFILE.INC template
            o an ".RC" template


       TestDrv - TestDrv is an example project which utilizes the components of the IV kit.
        This should give developers a good start at adding IV to their projects.

Let us explore each of these in more detail.

IV Program

IV is a console mode program designed to update the version information for a component of a
project based on information contained within the file "Version.h".

Its command line is shown below with parameter descriptions:

IV [-r] [-h] [-n] [-a] [-i] [-b buildType] [-f headerfile] [-v version] fileList



        -r reports everything done by IV to stdout

        -h prints out the help

        -n no increment; just prints the current version to stdout

        -a always modifies the version file, even if it is read-only. The default is not
           to modify a read-only "version.h".

        -i always increments the version

        -b indicates what type of build is being done, where

           "M" means VER_MAJOR++, VER_MINOR=0, VER_BASELEVEL=0, VER_PRODUCTBUILD=0

           "I" means VER_MINOR++, VER_BASELEVEL=0, VER_PRODUCTBUILD=0

           "A" means VER_BASELEVEL=1, VER_PRODUCTBUILD=0, or VER_PRODUCTBUILD++
           if VER_BASELEVEL is unchanged since the last build

           "B" means VER_BASELEVEL=2, VER_PRODUCTBUILD=0, or VER_PRODUCTBUILD++
           if VER_BASELEVEL is unchanged since the last build

        -f name of the version file to be updated. This is usually specified as
           ".\version.h"

        -v sets VER_MAJOR to the indicated value

        fileList a list of the files making up this component. Usually "$(SOURCES)"
           is specified.
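
For example, the template makefile.inc invokes IV with something along these lines (the exact
switches shown are illustrative):

    iv -r -b A -f .\version.h $(SOURCES)

which reports its actions, performs an Alpha-type update, and bumps VER_PRODUCTBUILD in
version.h whenever any file listed in SOURCES is newer than version.h.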

IV Template Files

The IV Template files are files that you add to both your project and to all the components
(exe, sys, dll) that make up your project. The template files allow you to define project-wide
version information as well as component-specific version information.

Table 1 contains a list of the template files included with the kit and their purpose.


Template File Name    Purpose

Resource.rc           The resource file template that is compiled as part of a project
                      component you are building (a program or a driver, for example).
                      This file pulls in Prjresource.h.

Prjresource.h         Contains component-specific version information. This file pulls
                      in GlobalResource.h and Verrc.h, as well as the version.h file,
                      which is maintained by the IV program.

GlobalResource.h      Contains the global version information. For a project consisting
                      of multiple components (executables, DLLs, and drivers, for
                      example), this file contains the product version information
                      common to all of them.

Makefile.inc          A template makefile.inc that is added to each project component
                      that will contain version information. It runs the IV program,
                      which maintains the version information for that component.

Sources               A template sources file showing how a project utilizes the IV
                      kit. Note where the .RC file is specified and the
                      NTTARGETFILE0=version line, which results in makefile.inc being
                      run prior to project compilation. This file references the
                      "Versionmake.def" file for common build definitions.

Verrc.h               A template include file that is added to each project component
                      that will contain version information. The developer must update
                      it, since it contains component-specific information such as file
                      type and file subtype. It then includes Microsoft's "common.ver",
                      which constructs the Version Resource for the component.

Versionmake.def,      Build definition files used in conjunction with a "Sources" file
Versionmake-net.def,  to provide platform-specific definitions to the IV versioning
Versionmake-w2k.def,  program. These files are usually put at the root of your project,
Versionmake-wxp.def   and all components of your project reference Versionmake.def.
                      See the template "Sources" file for details.


Table 1

TestDrv

The TestDrv component included in the kit demonstrates the use of the versioning technology
discussed in this article. It should give the user ample information on how to add the IV
components to any project.

How IV works

IV works by maintaining the file "VERSION.H" (see Figure 4), which it creates upon first being
run as part of a component compilation.




Figure 4 -- version.h

The "VERSION.H" file contains the 5 lines which IV uses for maintaining versioning
information. The lines are:


         VER_MAJOR - contains the Major Version number of the component
         VER_MINOR - contains the Minor Version number of the component
         VER_BASELEVEL - 0 = Major, 1 = Minor, 2 = Alpha, 3 = Beta
         VER_PRODUCTBUILD - contains the build generation number. This number would
          typically be updated if the modification time of a file in the project is later than the
          modified time of the "Version.h" file.
         VER_PLATFORM - contains the name of the platform that this product was built on.
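
As a rough illustration (the actual generated file may differ in detail), a version.h
maintained by IV would look something like this:

    /* version.h -- maintained automatically by the IV program */
    #define VER_MAJOR         1
    #define VER_MINOR         0
    #define VER_BASELEVEL     2      /* 0 = Major, 1 = Minor, 2 = Alpha, 3 = Beta */
    #define VER_PRODUCTBUILD  37     /* bumped when a project file is newer */
    #define VER_PLATFORM      "wxp"  /* platform the product was built on */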

For each subsequent compilation, IV updates the information contained within this file if a
project file has changed or if directed to by a command line option. IV is run as part of building
a component which uses the template "makefile.inc" and "sources" files distributed with this kit.
For example, consider the sources in Figure 5.
Figure 5 -- sources

IV is invoked when building a project because the "Sources" file contains the line
"NTTARGETFILE0=version". This line causes "build" to run the "version" dependency rule
contained in the "makefile.inc" file (see Figure 6) prior to compiling the component.




Figure 6 -- makefile.inc

In Figure 6 you will notice the "-b $(BUILDTYPE)" option, which indicates the type of build being
done. The value of the "$(BUILDTYPE)" variable is set from the PROJECT_BUILDTYPE
environment variable, which the developer can set prior to doing a project build (the default
value is "A", for an Alpha release, and can be changed in the file "versionmake.def"). IV
understands four build types: Major, Intermediate (Minor), Beta, and Alpha (the default). The
different settings of this variable result in a product version number prefixed by "V" (if "M" or
"I" is specified), "B" (if "B" is specified), or "A" (if "A" is specified). The setting of this
variable at the time of the build is reflected in the setting of the VER_BASELEVEL variable
within "VERSION.H".

Summary

While IV may not be the optimal solution for your project, this article should give you the
information you need to programmatically create and maintain your version resources and
eliminate the need for manual editing of your version information.

Check the Downloads section of www.osronline.com to download this kit.
You've Typed !analyze -v, Now What? -- Next Steps in Debugging
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




We often see people asking what to do when their driver doesn’t work – this question arises on
an almost daily basis. While the actual setup of the debugger now seems to be reasonably well
understood, the steps beyond that appear to be shrouded in mystery. This article will warn
about common issues and delve into some of the basic steps that you should follow when
debugging.

For the purposes of this article, we will assume that you either have a live debugging setup
(and have set up the appropriate symbols - see the sidebar, Accessing the Microsoft
Symbol Server, at the conclusion of this article) or you are looking at a post-mortem dump in the
debugger. For this article we are using the current (as of this article) publicly available version
of the Windows debugger (WinDBG) from Microsoft (6.3.17.0, May 27, 2004). If you are using
a different version of the debugger, the details may vary from what we describe in this article.

The first place to start with any crash is the information provided by the command "!analyze
-v". This command invokes the "analyze" routine embedded within the ext.dll library shipped
with the current debugger. This library encapsulates a tremendous body of knowledge for the
causes of most common failures – and each new version of the debugger reflects the
continuous learning process at Microsoft on how to perform automated fault isolation.

For example, we looked through our archives and found a recent crash dump to describe the
use of the analyze utility as the starting point of our investigation, see Figure 1.

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                            Bugcheck Analysis                                *
*                                                                             *
*******************************************************************************

UNEXPECTED_KERNEL_MODE_TRAP (7f)
This means a trap occurred in kernel mode, and it's a trap of a kind
that the kernel isn't allowed to have/catch (bound trap) or that
is always instant death (double fault). The first number in the
bugcheck params is the number of the trap (8 = double fault, etc)
Consult an Intel x86 family manual to learn more about what these
traps are. Here is a *portion* of those codes:
If kv shows a taskGate
          use .tss on the part before the colon, then kv.
Else if kv shows a trapframe
          use .trap on that value
Else
          .trap on the appropriate frame will show where the trap was taken
          (on x86, this will be the ebp that goes with the procedure KiTrap)
Endif
kb will then show the corrected stack.
Arguments:
Arg1: 00000008, EXCEPTION_DOUBLE_FAULT
Arg2: 80042000
Arg3: 00000000
Arg4: 00000000

Debugging Details:
------------------



BUGCHECK_STR:    0x7f_8

TSS:    00000028 -- (.tss 28)
eax=00030e01 ebx=812ca368 ecx=8117f300 edx=00000001 esi=8117f3ac edi=00010000
eip=baf3f9d7 esp=f5031fc0 ebp=f503205c iopl=0         nv up ei ng nz na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010286
atapi!IdeSendCommand+0x3:
baf3f9d7 53                    push      ebx
Resetting default scope

DEFAULT_BUCKET_ID:     DRIVER_FAULT

LAST_CONTROL_TRANSFER:     from baf40776 to baf3f9d7

STACK_TEXT:
f503201c baf40776 812ca368 8117f3ac 00010000 atapi!IdeSendCommand+0x3
f503205c baf4291c 812ca368 8117f3ac 812ca1c8 atapi!AtapiStartIo+0x22b
f5032088 80530f57 012ca030 00000002 0dd36002
atapi!IdeStartIoSynchronized+0x162
812ca030 00000000 812fdc18 00000000 00000000
nt!KeSynchronizeExecution+0x17
FOLLOWUP_IP:
atapi!IdeSendCommand+3
baf3f9d7 53                         push        ebx

SYMBOL_STACK_INDEX:           0

FOLLOWUP_NAME:         MachineOwner

SYMBOL_NAME:        atapi!IdeSendCommand+3

MODULE_NAME:        atapi

IMAGE_NAME:        atapi.sys

DEBUG_FLR_IMAGE_TIMESTAMP:             3d6ddb04

STACK_COMMAND:         .tss 28 ; kb

BUCKET_ID:       0x7f_8_atapi!IdeSendCommand+3

Followup: MachineOwner
---------


                                            Figure 1

Note that the output from !analyze -v does not tell us the specific cause of the crash, but it
does give us a place to start looking. First, make sure to read the text that describes the bug
check code. It is also worthwhile to read the online documentation with WinDBG in order to
better understand the bug check code.

In this example, the bug check code (0x7F) indicates that this is a "kernel mode trap". Note
that the example text indicates 8, which represents a double fault – exactly the case in our
crash example (this is also a hint to us that this is probably a common case, since it was used
as the example). The documentation here also refers us out to the Intel reference manual –
these manuals are useful when debugging (on the x86) and can be found on the Intel web site
(as of the writing of this article the Pentium 4 documentation can be found at
http://developer.intel.com/design/Pentium4/documentation.htm). This includes information
about the various types of traps supported by the x86 processor family. It confirms that trap 8
is the "double fault" trap.

The WinDBG documentation also provides further insight here:

0x00000008, or Double Fault, is when an exception occurs while trying to call the handler for a
prior exception. Normally, the two exceptions can be handled serially. However, there are
several exceptions that cannot be handled serially, and in this situation the processor signals a
double fault. There are two common causes of a double fault:


         A kernel stack overflow. This occurs when a guard page is hit, and then the kernel
          tries to push a trap frame. Since there is no stack left, a stack overflow results, causing
          the double fault. If you suspect this has occurred, use !thread to determine the stack
          limits, and then use kb (Display Stack Backtrace) with a large parameter (for
          example, kb 100) to display the full stack.
         A hardware problem. This also points out an important attribute of analyzing any crash:
          understand the context in which the crash occurred. If this were a machine running
          your experimental driver, it would seem likely that a software bug, and not a hardware
          problem was the cause. That allows us to ignore the less likely cases.

Continuing on, let’s follow the advice of the documentation. So, in our sample crash we do as
we’re told, see Figure 2.

kd> !thread
THREAD ffb6fda8          Cid 0314.0330         Teb: 7ffdb000 Win32Thread: e1785120
RUNNING on processor 0
IRP List:
        81662e28: (0006,01d8) Flags: 40000404                  Mdl: 00000000
        81714f68: (0006,0094) Flags: 40000000                  Mdl: 00000000
        81d1af68: (0006,0094) Flags: 40000000                  Mdl: ffab9d78
        81d5af68: (0006,0094) Flags: 40000000                  Mdl: ffabbde0
        81784f68: (0006,0094) Flags: 40000000                  Mdl: ffabbe60
        81990f68: (0006,0094) Flags: 40000000                  Mdl: ffabc6f8
        81760f68: (0006,0094) Flags: 40000900                  Mdl: 811b3bd0
Not impersonating
DeviceMap                             e1001098
Owning Process                        ffb750b0
Wait Start TickCount                  36960                 Elapsed Ticks: 0
Context Switch Count                  251                          LargeStack
UserTime                              00:00:00.0100
KernelTime                            00:00:00.0130
Start Address 0x77e7d342
Win32 Start Address 0x77f95b06
Stack Init f5035000 Current f50326f0 Base f5035000 Limit f5032000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0 DecrementCount 16
ChildEBP RetAddr            Args to Child
805367e0 8052f283 0000007f 00000008 80042000 nt!KeBugCheckEx+0x19 (FPO:
[Non-Fpo])
805367e0 baf3f9d7 0000007f 00000008 80042000 nt!KiTrap08+0x52 (FPO:
TaskGate 28:0)
f503201c baf40776 812ca368 8117f3ac 00010000 atapi!IdeSendCommand+0x3
(FPO: [EBP 0xf503205c] [2,24,4])
f503205c baf4291c 812ca368 8117f3ac 812ca1c8 atapi!AtapiStartIo+0x22b
(FPO: [Non-Fpo])
f5032088 80530f57 012ca030 00000002 0dd36002
atapi!IdeStartIoSynchronized+0x162 (FPO: [Non-Fpo])
812ca030 00000000 812fdc18 00000000 00000000
nt!KeSynchronizeExecution+0x17


                                           Figure 2

We have emphasized the line that contains the actual stack limits. It is important to remember
that each successive push moves the stack pointer to a lower address, so we should think of
stacks as "growing down". Thus, the address of the stack base is greater than the limit of the
stack.

If we look at the stack information provided by the !thread command we can also see that the
stack addresses (the "ChildEBP" values along the far left edge) begin with 0x805367e0, but a
few lines below we find 0xf503201c - the first address is not within the stack range for this
thread (0xf5032000 to 0xf5035000), but the later address is - and just above the stack limit.
This does seem to suggest some sort of stack overflow.

The debugger documentation suggests using kb with a large numeric value. We tried this, see
Figure 3.

kd> kb 100
ChildEBP RetAddr        Args to Child
805367e0 8052f283 0000007f 00000008 80042000 nt!KeBugCheckEx+0x19
805367e0 baf3f9d7 0000007f 00000008 80042000 nt!KiTrap08+0x52
f503201c baf40776 812ca368 8117f3ac 00010000 atapi!IdeSendCommand+0x3
f503205c baf4291c 812ca368 8117f3ac 812ca1c8 atapi!AtapiStartIo+0x22b
f5032088 80530f57 012ca030 00000002 0dd36002
atapi!IdeStartIoSynchronized+0x162
812ca030 00000000 812fdc18 00000000 00000000
nt!KeSynchronizeExecution+0x17


                                           Figure 3

However, this doesn’t provide us with any more information than we had with the !thread
command. At this point it would be easy to give up and say, "I don’t know. This documentation
is wrong, the debugger is broken, and this is too hard!" DO NOT GIVE UP YET!

Note that we can still exploit the steps suggested by the debugger. We wandered off looking at
the documentation, but the debugger suggested that we look at the results from the kv
command, see Figure 4.
kd> kv
ChildEBP RetAddr         Args to Child
805367e0 8052f283 0000007f 00000008 80042000 nt!KeBugCheckEx+0x19 (FPO:
[Non-Fpo])
805367e0 baf3f9d7 0000007f 00000008 80042000 nt!KiTrap08+0x52 (FPO:
TaskGate 28:0)
f503201c baf40776 812ca368 8117f3ac 00010000 atapi!IdeSendCommand+0x3
(FPO: [EBP 0xf503205c] [2,24,4])
f503205c baf4291c 812ca368 8117f3ac 812ca1c8 atapi!AtapiStartIo+0x22b
(FPO: [Non-Fpo])
f5032088 80530f57 012ca030 00000002 0dd36002
atapi!IdeStartIoSynchronized+0x162 (FPO: [Non-Fpo])
812ca030 00000000 812fdc18 00000000 00000000 nt!KeSynchronizeExecution+0x17


                                             Figure 4

Note that this is similar in output to the kb command, but includes additional information about
how the particular call frame was constructed. We can see that the call to KiTrap08 includes a
TaskGate – and the debugger advised that if we found a task gate we should use .tss and
then obtain a second stack list. This gives us what we see in Figure 5.

kd> kv
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr         Args to Child
f503201c baf40776 812ca368 8117f3ac 00010000 atapi!IdeSendCommand+0x3
(FPO: [EBP 0xf503205c] [2,24,4])
f503205c baf4291c 812ca368 8117f3ac 812ca1c8 atapi!AtapiStartIo+0x22b
(FPO: [Non-Fpo])
f5032088 80530f57 012ca030 00000002 0dd36002
atapi!IdeStartIoSynchronized+0x162 (FPO: [Non-Fpo])
812ca030 00000000 812fdc18 00000000 00000000 nt!KeSynchronizeExecution+0x17


                                             Figure 5

Alas, this doesn’t seem to tell us much more than we knew before. Again, it would be easy for
us to quit at this point, but we can still look farther and see if we can find the underlying cause.
Unfortunately, once the automated part of the analysis is done, we have to actually start
poking around – and most of the time, what we write about in The NT Insider is that process of
"poking around".

So, what do we do next? We already suspect that this is a stack overflow, so it would be nice
to see if we can find potential culprits for this stack consumption. Such large consumers would
show up as big gaps between the ChildEBP values we see on the stack – and would suggest
where to look in the guilty driver to resolve the underlying problem.
Unfortunately, sometimes the debugger cannot unwind the stack. In such cases, we must
manually walk the stack looking for another call frame and see if we can continue the stack
unwind. We normally analyze the stack from the last valid frame (in this case the third line, not
the final line). The final line lists a ChildEBP value that lies outside the stack range and thus is
suspect. We sometimes see this in the debugger output and have learned over the years to
treat that final entry with suspicion.

We’re specifically looking for a pair of addresses: the first will be a ChildEBP (pushed on the
stack upon entry to a function) and the second will be a return address for the function that
called. We can use the information from these two to reconstruct the call sequence. In the
listing below we highlighted the prospective addresses – keep in mind that a ChildEBP would
normally be an address above the current location, but still within the stack – so it will be
numerically close, see Figure 6.

kd> dd f5032088
f5032088 812ca030 80530f57 012ca030 00000002
f5032098 0dd36002 baf42b57 812fd970 baf427ba
f50320a8 812ca030 ffa50e94 813034c0 f5032104
f50320b8 0dd36000 baf43827 00000000 00000000
f50320c8 00000000 812ca030 f9ad3c75 812ca030
f50320d8 00000000 806adee0 81303408 00000000
f50320e8 812ddbd8 813034c0 ffa25000 806ab698
f50320f8 00004000 81322220 00000000 f5032134


                                             Figure 6

Thus, our first candidate is f5032104. The corresponding candidate return address is actually
0dd36000, but we know that can’t be a valid kernel address (since it is below the kernel
address space). Interestingly, when we first did this we accidentally skipped this value and
used the next stack value - baf43827. As it turns out, this was the correct value, but we arrived
at it accidentally.

To determine if a value is a valid return address, we must remember that the return address
will be the instruction after the call. So, in order to see the call instruction we must back up in
the instruction stream. In general, we try to back up five or six bytes from the return address in
order to hit the call – presumably, using that Intel manual, one can figure out the various
possible sizes for a call operation, but nobody around here has done that analysis yet! This
yields the results seen in Figure 7.

kd> u baf43827-5
atapi!IdePortAllocateAccessToken+0x12:
baf43822 e803f3ffff                 call       atapi!CallIdeStartIoSynchronized
(baf42b2a)
baf43827 eb0d                    jmp atapi!IdePortAllocateAccessToken+0x26
(baf43836)
baf43829 682a2bf4ba                   push        0xbaf42b2a
baf4382e ffb084000000                 push        dword ptr [eax+0x84]
baf43834 ffd1                     call        ecx
baf43836 c20400                   ret         0x4
baf43839 cc                      int          3
atapi!IdeProcessCompletedRequest:
baf4383a 55                    push     ebp


                                              Figure 7

So since we can see a call, we really do have a return address. We can feed this into WinDBG
using the kb or kv command plus the optional arguments (the EBP, ESP, and EIP values at that
point in the call sequence). We know the EBP – that was on the stack (f5032104), and we
know the EIP – that was the call instruction (baf43822). The stack pointer must have been the
location on the stack of the return address, since the processor pushed the return address
onto the stack at the current value of ESP. Thus, ESP was f50320bc (we took the address of
the first location listed – f50320b8 and added 4 to it – and this is when we noticed we grabbed
the wrong address earlier!).

Just to touch on our erroneous guess before (and confirm that we "got lucky") we looked at the
code of the called function seen in Figure 8.

kd> u baf42b2a
atapi!CallIdeStartIoSynchronized:
baf42b2a 53                      push         ebx
baf42b2b 55                      push         ebp
baf42b2c 8b6c2418                 mov             ebp,[esp+0x18]
baf42b30 56                      push         esi
baf42b31 8b7528                   mov         esi,[ebp+0x28]
baf42b34 57                      push         edi
baf42b35 8dbee0000000                 lea         edi,[esi+0xe0]

baf42b3b 8bcf                          mov          ecx,edi


                                              Figure 8

Note that the first value pushed was the value in EBX and the second was the value in EBP.
While we are not sure why the debugger could not unwind this, we know that we can do so –
and we’ll capitalize on our earlier mistake! This gives us the results seen in Figure 9.

kd> kv = f5032104 f50320bc baf43822
ChildEBP RetAddr         Args to Child
f50320b8 baf43827 00000000 00000000 00000000
atapi!IdePortAllocateAccessToken+0x12 (FPO: [1,0,0])
f50320cc f9ad3c75 812ca030 00000000 806adee0
atapi!IdePortAllocateAccessToken+0x17 (FPO: [1,0,0])
f50320d8 806adee0 81303408 00000000 812ddbd8
PCIIDEX!BmReceiveScatterGatherList+0x21 (FPO: [4,0,1])
f5032104 806ae0b4 00000000 81303408 ffa50e68
hal!HalBuildScatterGatherList+0x19c (FPO: [Non-Fpo])
f5032134 f9ad3d17 81322220 81303408 ffa50e68
hal!HalGetScatterGatherList+0x24 (FPO: [Non-Fpo])
f5032168 baf44114 813034c0 ffa25000 00004000 PCIIDEX!BmSetup+0x5d (FPO:
[Non-Fpo])
f50321a0 804eb04f 812ca030 81ab6f20 812fd578 atapi!IdePortStartIo+0xe6
(FPO: [Non-Fpo])
f50321c0 baf434fc 812ca030 81ab6f20 00000000 nt!IoStartPacket+0x7b (FPO:
[Non-Fpo])
f50321ec 804ea221 812ca030 00ab6f20 806ac214
atapi!IdePortDispatch+0x4c0 (FPO: [Non-Fpo])
f50321fc 8062c190 81ab6f20 bafb910c 00000000 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5032220 bafb9133 bafb910c bafb9898 812fd2a8 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5032228 bafb9898 812fd2a8 81ab6f20 812fd2a8
ACPI!ACPIDispatchForwardIrp+0x27 (FPO: [2,0,1])
f5032258 804ea221 812fd2a8 bafcd68c 806ac214 ACPI!ACPIDispatchIrp+0x158
(FPO: [Non-Fpo])
f5032268 8062c190 81ab6fd8 8117f300 812fd2a8 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f503228c f98a4028 ffa25000 81a18e28 00000000 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f503229c f98a3c4b 8117f300 81328998 81a18f28
CLASSPNP!SubmitTransferPacket+0x7e (FPO: [1,0,3])
f50322c8 f98a3b5a 00004000 00004000 813288e0
CLASSPNP!ServiceTransferRequest+0xe0 (FPO: [Non-Fpo])
f50322ec 804ea221 813288e0 00000000 806ac214
CLASSPNP!ClassReadWrite+0xfd (FPO: [Non-Fpo])
f50322fc 8062c190 813286b8 812fd040 81328770 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5032320 f9adb36c 812f7b00 81a18f4c f5032364 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])


                                             Figure 9

Of course, this still isn’t enough stack information. The previous trick (providing a frame count)
won’t work this time around because we overrode the stack parameters. Thus, we must use a
different debugger trick – the .kframes directive – which changes the number of frames the
debugger shows by default. We set it to 0x100, repeated the stack backtrace, and now have
what you see in Figure 10.

kd> .kframes 100
Default stack trace depth is 0n256 frames
kd> kv = f5032104 f50320bc baf43822
ChildEBP RetAddr       Args to Child
f50320b8 baf43827 00000000 00000000 00000000
atapi!IdePortAllocateAccessToken+0x12 (FPO: [1,0,0])
f50320cc f9ad3c75 812ca030 00000000 806adee0
atapi!IdePortAllocateAccessToken+0x17 (FPO: [1,0,0])
f50320d8 806adee0 81303408 00000000 812ddbd8
PCIIDEX!BmReceiveScatterGatherList+0x21 (FPO: [4,0,1])
f5032104 806ae0b4 00000000 81303408 ffa50e68
hal!HalBuildScatterGatherList+0x19c (FPO: [Non-Fpo])
f5032134 f9ad3d17 81322220 81303408 ffa50e68
hal!HalGetScatterGatherList+0x24 (FPO: [Non-Fpo])
f5032168 baf44114 813034c0 ffa25000 00004000 PCIIDEX!BmSetup+0x5d (FPO:
[Non-Fpo])
f50321a0 804eb04f 812ca030 81ab6f20 812fd578 atapi!IdePortStartIo+0xe6
(FPO: [Non-Fpo])
f50321c0 baf434fc 812ca030 81ab6f20 00000000 nt!IoStartPacket+0x7b (FPO:
[Non-Fpo])
f50321ec 804ea221 812ca030 00ab6f20 806ac214
atapi!IdePortDispatch+0x4c0 (FPO: [Non-Fpo])
f50321fc 8062c190 81ab6f20 bafb910c 00000000 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5032220 bafb9133 bafb910c bafb9898 812fd2a8 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5032228 bafb9898 812fd2a8 81ab6f20 812fd2a8
ACPI!ACPIDispatchForwardIrp+0x27 (FPO: [2,0,1])
f5032258 804ea221 812fd2a8 bafcd68c 806ac214 ACPI!ACPIDispatchIrp+0x158
(FPO: [Non-Fpo])
f5032268 8062c190 81ab6fd8 8117f300 812fd2a8 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f503228c f98a4028 ffa25000 81a18e28 00000000 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f503229c f98a3c4b 8117f300 81328998 81a18f28
CLASSPNP!SubmitTransferPacket+0x7e (FPO: [1,0,3])
f50322c8 f98a3b5a 00004000 00004000 813288e0
CLASSPNP!ServiceTransferRequest+0xe0 (FPO: [Non-Fpo])
f50322ec 804ea221 813288e0 00000000 806ac214
CLASSPNP!ClassReadWrite+0xfd (FPO: [Non-Fpo])
f50322fc 8062c190 813286b8 812fd040 81328770 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5032320 f9adb36c 812f7b00 81a18f4c f5032364 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5032330 804ea221 813286b8 81a18e28 806ac214 PartMgr!PmReadWrite+0x93
(FPO: [Non-Fpo])
f5032340 8062c190 81a18e28 81a18f70 812f7bb0 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5032364 baf781c6 812f7af8 812fbc88 81a18e00 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5032380 804ea221 812f7af8 81a18e28 806ac214
ftdisk!FtDiskReadWrite+0x194 (FPO: [Non-Fpo])
f5032390 8062c190 81a18f8c 81a18fb0 81a18e28 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f50323b4 f9883a4b 81a18e00 812f9020 812fdf38 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f50323c8 804ea221 812f90d8 81a18e28 806ac214 VolSnap!VolSnapWrite+0xb9
(FPO: [Non-Fpo])
f50323d8 8062c190 812c7738 81a18e28 81a18e0c nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f50323fc bae8fcd5 f50327ec 81a18e28 f50325e4 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f503240c bae8f77d f50327ec 812f9020 3f95d000 Ntfs!NtfsSingleAsync+0x6b
(FPO: [Non-Fpo])
f50325e4 bae90b3a f50327ec 81a18e28 812c5a98 Ntfs!NtfsNonCachedIo+0x363
(FPO: [Non-Fpo])
f50327dc bae904f1 f50327ec 81a18e28 0110070a
Ntfs!NtfsCommonWrite+0x1847 (FPO: [Non-Fpo])
f5032950 804ea221 812c7658 81a18e28 806ac214 Ntfs!NtfsFsdWrite+0xf3 (FPO:
[Non-Fpo])
f5032960 8062c190 81a18e28 812f9350 812c70d8 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5032984 baf2c3b8 812c7020 812c5c00 f50329c8 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5032994 804ea221 812c7020 81a18e28 806ac214 sr!SrWrite+0xa8 (FPO:
[Non-Fpo])
f50329a4 8062c190 81a18fdc 81a19000 812c5c90 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f50329c8 f468222f ffb5bc98 81280b60 ffb5bc00 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f50335f4 f4682377 ffb5bd50 81a18e28 804ea221 IOTRAP!IOTrapHookProc+0x27f (FPO:
[Non-Fpo]) (CONV: stdcall) [c:\dev\iotrap\iotrap.c @ 2728]
f5033600 804ea221 ffb5bc98 81a18e28 806ac214 IOTRAP!IOTrapDispatch+0x1d (FPO:
[2,0,0]) (CONV: stdcall) [c:\dev\iotrap\iotrap.c @ 2844]
f5033610 8062c190 812c5c90 007e3000 ffb5bc98 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5033634 804eb2d3 f5033828 f5033670 00000000 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5033648 80504af6 812c5c0a f5033670 f5033704
nt!IoSynchronousPageWrite+0xad (FPO: [Non-Fpo])
f5033720 80505466 e14e6f8c e14e6f9c 00000000
nt!MiFlushSectionInternal+0x37a (FPO: [7,44,3])
f503375c 804e048d 00000000 e14e6f8c 000007e3 nt!MmFlushSection+0x1e0
(FPO: [Non-Fpo])
f50337e4 baeb1f83 00004000 f503385c 00004000 nt!CcFlushCache+0x363 (FPO:
[Non-Fpo])
f50338ac baeb2044 e144d8c0 e15f38a0 e144d8c0 Ntfs!LfsFlushLfcb+0x227
(FPO: [Non-Fpo])
f50338d0 baeb6d35 e144d8c0 e15f38a0 e14b8fd8 Ntfs!LfsFlushLbcb+0x7f (FPO:
[Non-Fpo])
f50338f8 baeb23d1 e144d8c0 0c0fceb4 00000000
Ntfs!LfsFlushToLsnPriv+0xf1 (FPO: [Non-Fpo])
f5033938 804df58e e14b8fd8 0c0fceb4 00000000 Ntfs!LfsFlushToLsn+0x8e
(FPO: [Non-Fpo])
f50339e0 804e03ba ffa655e8 00000000 00000001
nt!CcAcquireByteRangeForWrite+0x7dc (FPO: [Non-Fpo])
f5033a6c baec6055 00001000 00000000 00000000 nt!CcFlushCache+0x290 (FPO:
[Non-Fpo])
f5033acc baec5fe2 f5033e3c e1274008 f5033b7c
Ntfs!NtfsDeleteAllocationFromRecord+0x125 (FPO: [Non-Fpo])
f5033bfc baecc97d f5033e3c e1274008 e1676d20 Ntfs!NtfsDeleteFile+0x25a
(FPO: [Non-Fpo])
f5033e18 baeafac0 f5033e3c 81662e28 812c7658
Ntfs!NtfsCommonCleanup+0xaaf (FPO: [Non-Fpo])
f5033f90 804ea221 812c7658 81662e28 806ac214 Ntfs!NtfsFsdCleanup+0xcf
(FPO: [Non-Fpo])
f5033fa0 8062c190 812c70d8 812f9350 81662e28 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5033fc4 baf3066f 812c7020 812d1400 f5034008 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5033fd4 804ea221 812c7020 e175cdb0 806ac214 sr!SrCleanup+0xb1 (FPO:
[Non-Fpo])
f5033fe4 8062c190 81662fdc 81663000 812d14c8 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5034008 f468222f ffb5bc98 81280b60 81662e00 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
Page f4000 too large to be in the dump file.
f5034c34 f4682377 ffb5bd50 81662e28 804ea221 IOTRAP!IOTrapHookProc+0x27f (FPO:
[Non-Fpo]) (CONV: stdcall) [c:\dev\iotrap\iotrap.c @ 2728]
f5034c40 804ea221 ffb5bc98 81662e28 806ac214 IOTRAP!IOTrapDispatch+0x1d
(FPO: [2,0,0]) (CONV: stdcall) [c:\dev\iotrap\iotrap.c @ 2844]
f5034c50 8062c190 812d14c8 81662e28 81662e38 nt!IopfCallDriver+0x31 (FPO:
[0,0,1])
f5034c74 80560d4d 812d14b0 8130b560 00000001 nt!IovCallDriver+0x9e (FPO:
[Non-Fpo])
f5034ca8 80597c03 ffb750b0 ffb5bc98 00110080 nt!IopCloseFile+0x261 (FPO:
[Non-Fpo])
f5034cd8 805975bb ffb750b0 812d14c8 8130b560
nt!ObpDecrementHandleCount+0x119 (FPO: [Non-Fpo])
f5034d00 80597651 e157e280 812d14c8 00001a0c
nt!ObpCloseHandleTableEntry+0x14b (FPO: [Non-Fpo])
f5034d48 80597777 00001a0c 00000001 00000000 nt!ObpCloseHandle+0x85 (FPO:
[Non-Fpo])
f5034d58 8052d571 00001a0c 00530020 00740065 nt!NtClose+0x19 (FPO:
[1,0,0])
f5034d58 7ffe0304 00001a0c 00530020 00740065 nt!KiSystemService+0xc4
(FPO: [0,0] TrapFrame @ f5034d64)

005df220 00000000 00000000 00000000 00000000
SharedUserData!SystemCallStub+0x4 (FPO: [0,0,0])


                                               Figure 10

This draws our attention to the IOTRAP!IOTrapHookProc frame, which does appear to consume
an unhealthy amount of stack space – 3116 bytes, to be exact. We also note that this driver is on
the stack twice, and the second invocation uses another 3116 bytes. Thus, a single driver in this
one stack has used 6232 bytes of stack space. While not itself fatal, it certainly suggests where
we should be turning our attention!

The key to debugging is perseverance. Follow all of the clues and all of the possible avenues.
It is difficult when you first get started, but each time you do this, your understanding of the
system improves as does your repertoire of scenarios. Eventually you will be debugging like a
pro!

Accessing the Microsoft Symbol Server
One of the most common problems that new users have when using the debugger is that their
symbols are not properly configured. Current versions of WinDBG have a new command that
will reset your debugger configuration to retrieve symbols from the Microsoft symbol
server. This is the .symfix command, and it adjusts the symbol search path to
default to srv**http://msdl.microsoft.com/download/symbols -- this is the default public
location for symbols. If you do not have Internet access, this will not work (duh), but it is a
terrific fix for the vast majority of configurations.
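
For example, to repair a broken configuration from the debugger prompt (follow with .reload
so the new path takes effect):

kd> .symfix
kd> .reload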
User Comments


"Article is Really Good"
The article is really good, and the comments too. I am a beginner in Windows driver
development and debugging, and it has helped me a lot in analyzing real crash dumps.


08-Jun-07, Mohamed Jaffar (xxxx@readytestgo.com)




"DDS command"
The part of the analysis discussed in relation to Fig 6 is made easier by using the dds esp
command. This command lists symbols for those dwords on the stack that can be translated to
a symbol, which makes it easy to pick out potential return addresses.

Additionally, I have usually found it sufficient to use kv = and leave off the EIP and ESP values.
Plus, kv can take a frame count, so you don't have to use the .kframes command.

04-Feb-05, Chip Webb (xxxx@dell.com)




"Great article"
I think it would be nice to finish the article by proving that the problem indeed was a stack
overflow problem. This way we have at least proven that we need to look into the unhealthy
stack memory consumption of iotrap:

TSS: 00000028 -- (.tss 28)
eax=00030e01 ebx=812ca368 ecx=8117f300 edx=00000001 esi=8117f3ac edi=00010000
eip=baf3f9d7 esp=f5031fc0 ebp=f503205c iopl=0  nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
atapi!IdeSendCommand+0x3:
baf3f9d7 53               push    ebx    <- pushing data onto the stack beyond its limit

Current ESP = f5031fc0

Thread stack limits:

Stack Init f5035000 Current f50326f0 Base f5035000 Limit f5032000 Call 0

Note that we are already over the limit before the push instruction is even executed, which
means the routine probably corrupted another component's memory by writing more data to the
stack.
I am still looking for an explanation of why it failed at ESP = f5031fc0 and not at ESP = f5032000.
Could it be that the page the stack was written to was paged out? A 'dd f5031f00' would show
us whether or not the memory was valid. But perhaps this is not the reason for the
double fault.

Loved the KV tip to unwind the stack with manual parameters. Never used the K commands
like this before. A tip may be to use the !stack command which is, I think, included with the
kdex2x86.dll extension. I think it would show a similar result.

Keep up the good work!


28-Aug-04, Erwin Zoer (xxxx@zonnet.nl)




"Great article!"
Good work guys! Personally, I prefer the following steps: 1) use kv to capture the trap frame or
TSS, 2) .trap or .tss to where it died, 3) dds esp l200, 4) then feed kv with what I saw in the
output from 3).

Love to see more crash analysis.


27-Aug-04, Guan Calvin (xxxx@yahoo.ca)
What's Your Test Score -- Best Practices for Driver Testing
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




In this article, we’ll try to lay out what we consider to be the best practices for testing any driver.
This list, and its associated comments, is based on our 12+ years of experience writing drivers
here at OSR, as well as lots of information we’ve seen shared in the driver development
community.

This list isn’t a "wish list" or a collection of the types of testing that one would perform in a
perfect world. Rather, we’ve tried to keep the list pragmatic. If you aren’t doing the tests
described in the following list – at least items 1 through 6 – then you are not doing an adequate
job of testing your code. What we’re attempting to establish here are the basic standards of
acceptable professional practice. The only reason your mileage should vary is if you’re doing
more tests or more stringent testing than what’s described in this article.

1) Test on the Latest OS
Regardless of the intended target for your code, you should always test aggressively on the
most recently released version of Windows. As of this writing, the right system to use
would be Windows Server 2003. This is true even if you only intend your driver to run (today)
on Windows 2000 client systems! This is because there are more capabilities and checks in
Driver Verifier in the most recent versions of Windows. Also, there may be more (or more
appropriate) checks in the checked build.

This isn’t to say that you should never test on the OS that’ll be your main deployment target.
Obviously, you need to test aggressively on that too.

Also, this isn’t a suggestion to base your testing on a pre-release or beta release of Windows.
It’d be insane to base testing of a driver you’re building today on Windows Longhorn – There
are just too many unknowns. Stick with released builds for the majority of your testing.

2) Enable All Driver Verifier Options
At all times, during all testing you perform (functional, regression, and stress) always have
Driver Verifier enabled for your driver. And be sure that you’ve selected all the Driver Verifier
options, except low resource simulation.

Remember, (as described in Trust Yet Verify), Driver Verifier watches to ensure that
whatever your driver does, it does correctly. The more situations you put your driver through
with Driver Verifier enabled, the better testing you’re getting.

3) Test Under The Checked Kernel and HAL
This is a must. I remember one time talking to a developer who was having trouble with his
driver. I asked him if he had run it under the checked build. "Oh no, man," he said, "I tried that
once and the system crashed. I never did use the checked build again." Duh!

The checked build of Windows contains a set of reasonableness tests for various parameters
passed and actions taken by your driver. These tests are in addition to those provided by
Driver Verifier. Therefore, it’s important to test with both the checked build and Driver Verifier.
If you’re not testing with the checked build, you’re not adequately testing your driver.

4) Test on MultiProcessor Systems
I can’t believe I have to write this, but let’s be thorough: You must test your code on
multiprocessor systems. And, the more CPUs the better. At the very least, and regardless of
what you think about hyperthreading, get your test lab a few systems with hyperthreaded
CPUs. These are inherently multiprocessor systems.

The more processors you have, the more quickly you’ll uncover your MP-specific bugs. So, it’s
a good idea to test on a dual-processor, hyperthreaded system, which effectively acts like it’s a
quad-processor.

You cannot ignore testing on MP systems. Most Intel chips sold today are
hyperthreaded, so regardless of what you may think, if your code is running on new
hardware it’ll be running in an MP environment.

5) Use Call Usage Verifier
Test your driver, at least periodically, with Call Usage Verifier (CUV). We recommend using
the most recently released version of CUV, such as that provided in the Windows Server 2003
SP1 DDK. This version has had a number of enhancements and improvements made to it.

CUV can find a lot of ugly problems that are otherwise extremely hard to detect – for example,
problems with invalid I/O Stack Locations, which Driver Verifier can’t even find. A good example
of this type of problem: calling IoAllocateIrp and then filling in the parameters of the I/O Stack
Location obtained via IoGetCurrentIrpStackLocation, when what you want is
IoGetNextIrpStackLocation. Ooops!
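
Here's a minimal sketch of that mistake (the read setup is illustrative only):

PIRP irp = IoAllocateIrp(DeviceObject->StackSize, FALSE);

if (irp != NULL) {

    //
    // WRONG: a freshly allocated IRP's "current" stack location
    // pointer sits one location above the first usable entry, so
    // this writes parameters where no lower driver will ever look.
    //
    PIO_STACK_LOCATION ioStack = IoGetCurrentIrpStackLocation(irp);

    //
    // RIGHT: fill in the *next* stack location -- that's what the
    // target driver sees as its current location after IoCallDriver.
    //
    ioStack = IoGetNextIrpStackLocation(irp);

    ioStack->MajorFunction = IRP_MJ_READ;
}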

Get a checked version of your driver built with CUV, enable Driver Verifier, and run it on the
checked build. Really put the driver through its paces.

6) PreFAST and Lint the Code
You must run your driver through PreFAST. At the very least, fix any bugs that show up when
the "winpft" filter is selected. Yes, even if the bugs are PreFAST complaining about something
stupid. Just fix it (so you don’t have to look at it the next time you run PreFAST).

Here at OSR, we’ve also found it very useful to run drivers through pcLint. We find that lint
finds different errors than PreFAST, and one does not replace the other. We published an
article on how to use lint for driver development a while back (All About Lint). If you haven’t
read it, you should.
7) Segment Your Testing
Divide and conquer. Any proper driver testing program will need to separate out at least three
kinds of tests:


         Tests for functional correctness
         Tests for implementation correctness
         Long-run and stress testing

Never confuse these types of tests.

Functionality tests check to see if your driver does what your customers expect it to do. These
tests answer the question, "Does your driver properly fulfill its mission?" Yes, you need to test
whether or not the satellite dish turns right five degrees when you send the "turn right"
command, with a parameter indicating that the turn should be five degrees. But it would be a
serious error to simply exercise all the functionality in your driver, and declare it to be fully
tested.

On the other hand, tests for implementation correctness aim at exercising your driver’s
robustness. These tests answer the question, "In fulfilling its mission, does your driver do what
it does in a way that’s valid and plays well with the Windows operating system?" You need to
check to ensure that your driver robustly validates any parameters passed to it from user mode.
You must validate that your driver can properly handle any invalid, inopportune, or malformed
requests it might receive. Driver Verifier, CUV, and other test tools (such as DC2, pnpdtest,
and the ACPI tests) that are provided as part of the DDK can help with this category of test.

Finally, there are long-run and stress tests. These tests answer the question, "Can your driver
keep it up for a long time, and under heavy system loads?" We can’t believe the number of
colleagues we run into who run tests for a few minutes and go home, confident that their driver
works. Hullo? How about letting it run for two weeks under three times the normally expected
load? You must perform these tests to ensure that your driver’s code paths are properly
exercised.

8) Consider Code Coverage
Consider getting a tool that tracks or measures driver code coverage. Here at OSR, we
haven’t actually used any of these tools. However, we do know that both Compuware and
Bullseye make code coverage tools that’ll work for drivers. We haven’t heard anything positive
or negative about the Compuware tool. We’ve heard very little about the Bullseye tool, but
what we have heard has been positive. Maybe somebody with experience with either or both
of these tools will volunteer to write up an article one of these days.

Having a good quality tool that can provide code coverage analysis for your driver would be
nothing but a good thing. That way, you would know for sure that all key code paths are being
exercised. It’s something to consider, right?
Try This -- Interactive Driver Testing
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




Drivers can be tough to test, especially early in their development cycle. Often, you need to get
a ton of code written and running just so you can get to the point where you can start testing to
see if some of the basic assumptions you’ve made about how your hardware works are valid.

Well, maybe not. One of the most unique ideas I’ve seen in driver testing was recently
proposed by NTDEV member (and DDK MVP) Don Burn.

Let’s say you want to do some basic tests to see if you can actually get your hardware
initialized. Why not stick your code into the world’s most minimal driver skeleton using a
DriverEntry routine like that shown in Figure 1?



#include <ntddk.h>

#define DRIVER_NAME   "MyDriver: "
#define SHARED_SIZE   4096             // size of the shared-memory window (assumed)

NTSTATUS HwInit(VOID);                 // your hardware-init code under test
NTSTATUS TestInitStatus(ULONG Ms);     // your status-check code under test

static ULONG Port = 0;                 // port address entered by the tester
static PVOID MappedMem = NULL;         // mapped shared-memory window

//
// DbgPrompt returns the reply as a string -- it does no scanf-style
// conversion -- so this helper prompts via the debugger and parses
// the reply as a hex value.
//
static ULONG
PromptForHex(PCSZ Prompt)
{
    CHAR  reply[32] = {0};
    ULONG value = 0;

    DbgPrompt(Prompt, reply, sizeof(reply));
    (VOID) RtlCharToInteger(reply, 16, &value);

    return value;
}

NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObj, PUNICODE_STRING RegistryPath)
{
    NTSTATUS status;
    ULONG    function;
    BOOLEAN  done = FALSE;

    UNREFERENCED_PARAMETER(DriverObj);
    UNREFERENCED_PARAMETER(RegistryPath);

    DbgPrint("\n MyDriver -- Compiled %s %s\n", __DATE__, __TIME__);

    while (done == FALSE) {

        DbgPrint("\nMyDriver TEST -- Functions:\n\n");
        DbgPrint("\t1. Input Port Address\n");
        DbgPrint("\t2. Input Shared Memory Address\n");
        DbgPrint("\t3. Call HwInit\n");
        DbgPrint("\t4. Test InitStatus\n");
        DbgPrint("\n\t0. Exit\n");

        function = PromptForHex("\n\tSelection: ");

        DbgPrint("\n");

        switch (function) {

            case 0:
                done = TRUE;
                break;

            case 1:
                Port = PromptForHex("\nPort to use: ");
                break;

            case 2: {
                PHYSICAL_ADDRESS pa;

                pa.LowPart  = PromptForHex("Shared Memory low-part: ");
                pa.HighPart = (LONG) PromptForHex("Shared Memory high-part: ");

                MappedMem = MmMapIoSpace(pa, SHARED_SIZE, MmNonCached);
                break;
            }

            case 3:
                DbgPrint("Calling HwInit\n");
                status = HwInit();
                DbgPrint("HwInit returned 0x%0x\n", status);
                break;

            case 4: {
                ULONG ms = PromptForHex("How many ms to wait? ");

                DbgPrint("Calling TestInitStatus\n");
                status = TestInitStatus(ms);
                DbgPrint("TestInitStatus returned 0x%0x\n", status);
                break;
            }
        }
    }

    DbgPrint(DRIVER_NAME "DriverEntry: Leaving\n");

    if (MappedMem != NULL) {
        MmUnmapIoSpace(MappedMem, SHARED_SIZE);
    }

    return STATUS_UNSUCCESSFUL;
}
                                            Figure 1

The trick here is that you write a DriverEntry routine that uses DbgPrint and DbgPrompt in a
loop to interact with you via the debugger. When you’re done playing (er, testing), your code
breaks out of the loop, reverses any changes it made to the system, and returns an error.
That’ll cause your driver to unload. Note that DbgPrompt returns the response as a string, so
the skeleton above parses each reply itself.

Given that you’ll be performing the majority of this testing on one or two test machines at the
most, with a limited set of hardware, you will almost certainly know the port and/or shared
memory address assigned to your device. If you don’t know these from your hardware
specification, you can always use one of the many PCI bus examination utilities and read the
information from your device’s BARs. After all, these registers were assigned to your device
during system initialization. You can very happily use them on a limited basis and in a test
environment only without doing the usual StartDevice song-and-dance.

Before you go all frothy at the mouth, understand that this whole idea is limited to early test
scenarios. We’re talking about a technique that’ll help you get something going fast, with the
least possible annoyance and the smallest possible investment in test code. Once you get some
solid code in your driver, you’ll want to change over to properly allocating the ports and shared
memory address like you’re supposed to, and to using a user-mode test app to send testing
IOCTLs to your driver.

And, be warned, this idea might not work with SoftICE.

But for early stage testing, where you need to continuously vary timer values and other test
information, this might be just the trick you need. Have fun!
Trust Yet Verify -- All About Driver Verifier
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




Can you spot the bug in this root enumerated WDM function driver’s DriverEntry? (see Bug #1)

We can guarantee that the driver this came from does not blue screen, does not hang,
and functions 100% properly in every way. Nonetheless, this code has a bug in it (of course,
the bug was artificially added; no code that we write has bugs, so we have to go back and add
them in for illustrative purposes).

NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObj, PUNICODE_STRING RegistryPath)
{
     DriverObj->MajorFunction[IRP_MJ_CREATE] = NothingCreateClose;
     DriverObj->MajorFunction[IRP_MJ_CLOSE]  = NothingCreateClose;
     DriverObj->MajorFunction[IRP_MJ_READ]   = NothingRead;
     DriverObj->MajorFunction[IRP_MJ_PNP]    = NothingPnp;
     DriverObj->MajorFunction[IRP_MJ_POWER]  = NothingPower;

     DriverObj->DriverExtension->AddDevice = NothingAddDevice;

     DriverObj->DriverUnload = NothingUnload;

     return(STATUS_SUCCESS);
}


                                                       Bug #1

The bug is that this code does not supply an IRP_MJ_SYSTEM_CONTROL handler, which is
required for any WDM driver. The reason the driver this came from shows no signs of
being broken is that its PDO does not implement any WMI functionality. But, because we’re all
lazy, code like this continues to show up all over the place. Slap this DriverEntry into a new
driver whose PDO does support WMI and you’ve got a bonafide bug on your hands.

OK, so you install your function driver and the WMI functionality of the bus driver is
inaccessible. You look, see you don’t have an IRP_MJ_SYSTEM_CONTROL handler, and
five minutes later your code is working properly. Not exactly the kind of bug you’re going to be
burning a weekend trying to track down, so who cares?

If you didn’t catch the last bug, here’s another chance. In a disk filter driver, you decide that
you really want a count of the number of outstanding reads on the disk. This means that you
want to bump your count in your read dispatch entry point and set up a completion routine so
you can decrement it when the read is done. Easy enough, check out Bug #2—Increment
Read Dispatch Entry Point.

NTSTATUS DiskFilterRead(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    PDISK_FILTER_EXT devExt =
        (PDISK_FILTER_EXT)DeviceObject->DeviceExtension;

    InterlockedIncrement(&devExt->OutstandingReadCount);

    IoCopyCurrentIrpStackLocationToNext(Irp);

    IoSetCompletionRoutineEx(DeviceObject,
                             Irp,
                             DiskFilterReadComplete,
                             devExt,
                             TRUE,
                             TRUE,
                             TRUE);

    return IoCallDriver(devExt->DeviceToSendIrpsTo, Irp);
}


                      Bug #2 -- Increment Read Dispatch Entry Point

How can this be wrong? You were even careful to use InterlockedIncrement and
IoSetCompletionRoutineEx! Let’s check out the completion routine, see Bug
#2—Completion Routine.

NTSTATUS DiskFilterReadComplete(PDEVICE_OBJECT DeviceObject, PIRP Irp,
PVOID Context)
{
    PDISK_FILTER_EXT devExt = (PDISK_FILTER_EXT)Context;


    InterlockedDecrement(&devExt->OutstandingReadCount);


    return STATUS_CONTINUE_COMPLETION;

}


                                Bug #2 -- Completion Routine

Returning STATUS_CONTINUE_COMPLETION? This was obviously written by a true DDK
savant. Running it seems to work OK, for a while that is. Then all of a sudden things start to get
weird. First, maybe Explorer starts to hang, and then another app, and another, and another
until nothing works at all.

You have absolutely no idea what’s going on, but you notice that just about every other
completion routine that you can find has this line in it:

if (Irp->PendingReturned) {
    IoMarkIrpPending(Irp);
}


You don’t really know why it would fix it, but you add it anyway and like magic the problem
disappears. So, you’ve fixed all of the bugs that you’ve chosen to acknowledge that day and
head for home.

Real Bugs Never Die
Being the good developer that you are, you realize that you obviously have some deficiencies
in your WDM knowledge (and who doesn’t) so you need to go back and make sure that these
bugs aren’t in the twenty other drivers that you cut and pasted into existence. Did we mention
developers are lazy? Wouldn’t it be nice to have had someone ask you to turn around so they
could smack you in the back of the head the first time you wrote the code, instead of finding
the bugs days, weeks, or months later? Or, wouldn’t it be nice to receive detailed errors and fix
instructions when an application hangs after your filter is loaded, instead of being left
scratching your head to find the problem?

Lesson: You should have been using Driver Verifier. If you had been running Driver Verifier
on your target machine while you were debugging your driver, you would have found these
problems right off the bat. Live and learn? We hope so.

Not Just a Testing Tool
Don’t be deceived by the fact that the documentation on Driver Verifier is listed under Tools for
Testing Drivers in the DDK. A more appropriate section title would be, Tools for Testing and
Developing Drivers. It is definitely not something that you just want to flip on two weeks before
your product ships in the final round of testing. It can find problems ranging from the innocuous
and easy to fix, like what we saw in Bug #1, to show-stopping, architectural bugs like broken
locking hierarchies or application hangs.

Before we move on, gather around for a quick tale. Not too long ago in a popular driver
development newsgroup far, far away, some poor dev wrote in saying that he had enabled
Driver Verifier for his driver, but his driver wasn’t really being exercised in any way. Why wasn’t
Driver Verifier calling him with all sorts of messed up requests? Much to my surprise he wasn’t
derided into a new profession, but in case you’re not aware: for the most part, Driver Verifier is
passive in its bug finding. The majority of its tests will only test code paths that you exercise
while it is monitoring your driver’s activity.
Using our two earlier examples, the IRP_MJ_SYSTEM_CONTROL bug would have been
discovered by just enabling Driver Verifier on your driver, because it would send you a bogus
WMI request and fail if you mishandled it. The second bug, however, would only be flagged by
Driver Verifier when the driver below you returned STATUS_PENDING. Therefore, Driver
Verifier is only a part of a balanced testing and developing breakfast (an entire discussion on
testing would take up a whole issue of The NT Insider and then some…Oh wait, it did!).

It’s the O/S in Verification Mode, not a Separate Entity
Driver Verifier is actually part of the Windows O/S, it is not a separately loaded module. This
puts it in a position to have an extraordinary level of power in monitoring the interaction
between drivers and the kernel. Driver Verifier basically places a wrapper around your driver to
closely monitor how it manages all of the various kernel objects and resources. Because of
this, it should be pretty obvious that at some point, you will want to run your tests without Driver
Verifier enabled to ensure that it is not masking any subtle timing bugs.

Starting Driver Verifier
Because Driver Verifier is so tightly integrated with the O/S, new versions of it ship with new
versions of the O/S. One thing to note, then: even if you’re targeting versions of the operating
system as far back as Windows 2000, you should try to do your development-time testing on
the latest version of the O/S to get the latest Driver Verifier checks. We’ll be using an XP
system throughout this article, so there might be some differences in options when running on
2000 or Server 2003. These will be called out when appropriate. Also, almost all of the Driver
Verifier options that we’ll see are also accessible from the command line. Run Driver Verifier
from a command window with the "/?" switch to get detailed info.
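
For example, a command-line session might look something like this. The flag mask shown –
everything except Low Resources Simulation on an XP-era verifier – is our reading of the
documented bit values, so double-check against verifier /? on your system; a reboot is required
for the settings to take effect:

C:\> verifier /flags 0xFB /driver mydriver.sys
C:\> verifier /querysettings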

To start Driver Verifier, simply run verifier.exe, located in the SystemRoot\System32 directory,
and select the Create custom settings option (see Figure 1).
Figure 1 -- Verifier Manager: Create Custom Settings

Next, select individual settings from a full list, as shown in Figures 2 and 3.




Figure 2 -- Verifier Manager: Individual Settings
From the resulting dialog (see Figure 3), you’ll generally want to select all of the available
options except for Low Resources Simulation and, sometimes during development, Special
Pool. We’ll see why in a minute.




Figure 3 -- Verifier Manager: Enabling Test Types

The Tests
So what exactly do the tests listed in the settings dialog do? We’ll talk about each test in turn to
find out.

Automatic Tests
There are some tests that you’ll get just for the price of admission, and they’re worth every
penny.

The automatic checks catch all sorts of "Oops!" errors that are easy to make and difficult to
track down. A short list of interesting checks:


           Attempting to allocate zero bytes of memory
           Freeing a non-pool address
           Freeing a previously freed block of pool
           Marking an allocation as MUST_SUCCEED, which is deprecated
           Releasing a spinlock twice
           Unloading with outstanding timers, lookaside lists, worker threads, etc.
Special Pool
By enabling the Special Pool option, you enable two safeguards for one of the most insidious
types of driver error: memory corruption.

The first set of potentially memory corrupting errors that this option will catch is buffer overruns
- accessing memory after a valid address range. Driver Verifier catches these by adding what
are called "guard pages" to the tail of every allocation that the driver makes. Driver Verifier
then marks these pages as "no access" so that an access violation will occur if these pages
happen to be touched by the driver. If the access violation does trigger, Verifier traps it and
bugchecks the system in a more controlled way than usual. By that we mean the bugcheck
code and stack trace will be very explicit about the error and the stack trace will pinpoint the
offending code exactly. This is important, because it is very common for a driver that is writing
to a random location off the end of its buffer to corrupt another driver in the system. When a
situation like that happens, the system will bugcheck and typically blame the wrong driver.
These types of blue screens are extremely hard to debug and even harder to explain to your
customers. Note that according to the DDK documentation, you can use the GFlags utility to
alternatively choose to have the guard pages added to the head of the allocations instead of
the tail. This would allow you to catch buffer underrun errors, (accessing memory before a
valid address range), which are less common.

The other set of potentially memory corrupting errors that Special Pool will catch are accesses
to memory after it has been freed. This is another problem that is particularly tricky to track
down in the wild, because it can easily go undetected for long periods of time. It generally only
causes a problem if the system is under heavy load and the address is quickly recycled to
another driver (or even the same driver!) in the system. Driver Verifier plays a pretty cool trick
in order to catch these errors. What it does is free the memory that is backing the allocation,
but leaves the virtual to physical address mapping (i.e. the PTE) active but marked as "no
access". This means that if the driver then attempts to access the memory, an access violation
will occur and the system will bugcheck.

Special Pool is not a magic bullet for a couple of reasons though. First of all, it will not catch
stray pointer accesses that point to valid allocations. It is such a common practice for one
component in the system to allocate memory and pass it for use in another component that
checking for something like this would be impossible. Also, as has been previously reported in
The NT Insider, when enabling Special Pool for your driver, your pool allocation tags are not
preserved. This means that if you are trying to track down memory leak issues, it’s probably
best to not test with Special Pool enabled.
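
To make the overrun case concrete, here's the sort of one-byte slip (the tag is arbitrary) that
runs silently on a normal system but faults the instant it executes under Special Pool:

PUCHAR buffer = (PUCHAR) ExAllocatePoolWithTag(NonPagedPool, 16, 'tseT');

if (buffer != NULL) {

    //
    // One byte past the end of the allocation. Under Special Pool
    // this lands on the no-access guard page and bugchecks right
    // here, instead of silently corrupting a neighboring allocation.
    //
    buffer[16] = 0;
}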

Pool Tracking
The Pool Tracking option enables one check that is similar to the Special Pool overrun check
and another to track resource cleanup.

The overrun check in Pool Tracking does essentially what the Special Pool check does – it
adds a page to the tail of memory allocations, except the guard pages are not marked as "no
access." Instead, they are filled in with a particular pattern. If the pattern is modified when the
memory block is freed, the system bugchecks. This is slightly less helpful than the special pool
option because it only catches the corruption after the fact, making it more difficult to find the
true source.

The other check that Pool Tracking enables concerns driver unloading. When the driver is
unloaded, Pool Tracking makes sure that all of the resources allocated by the driver have been
freed. If the driver is unloaded and it has not freed all of its memory resources, the system
bugchecks and indicates how much memory has been leaked. Further, if pool tagging has
been enabled, the pool tags of the leaked memory allocations are also indicated. This option is
extremely helpful if your driver supports being unloaded, but if you are a file system driver, for
example, this check does not provide any additional help.

Force IRQL Checking
We always stress in our classes that you cannot write a driver if you do not understand IRQLs.
If you spend a few minutes browsing the NTDEV and NTFSD newsgroups, it will quickly
become obvious that not everyone has taken an OSR seminar. But, even if you know the rules
like the palm of your hand, you still need to obey them and the Force IRQL Checking option
can help you do just that.

Force IRQL Checking enforces the number one IRQL rule: you must not touch any pageable
memory at IRQL DISPATCH_LEVEL or above. The reason for this, of course, is that if the
pageable memory happens to not be resident, a DISPATCH_LEVEL software interrupt must
be executed to bring the page into memory. If the code that is currently running is already at
DISPATCH_LEVEL or above, the DISPATCH_LEVEL interrupt cannot run and the page fault
cannot be satisfied. Because the Memory Manager aggressively caches pages, it is entirely
possible that this bug will go unnoticed during your testing because the pages have already
been faulted in at an earlier time by a thread running at a proper IRQL.

The way that Driver Verifier enforces the pageable memory and IRQL rule is by paging out all
pageable memory after every IRQL raised to DISPATCH_LEVEL or above. This ensures that
all accesses to memory regions marked as pageable at an elevated IRQL generate a
DRIVER_IRQL_NOT_LESS_OR_EQUAL bugcheck.
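
A minimal sketch of the kind of latent bug this option makes immediately visible (the routines
here are hypothetical):

VOID MyPageableHelper(VOID);

#pragma alloc_text(PAGE, MyPageableHelper)

VOID
MyPageableHelper(VOID)
{
    PAGED_CODE();   // asserts on the checked build if IRQL >= DISPATCH_LEVEL

    // ... touch pageable code or data ...
}

VOID
MyDpcRoutine(PKDPC Dpc, PVOID Context, PVOID SysArg1, PVOID SysArg2)
{
    //
    // DPCs run at DISPATCH_LEVEL. With Force IRQL Checking enabled,
    // this call faults every time -- not just on the unlucky day the
    // page happens to be out of memory.
    //
    MyPageableHelper();     // WRONG
}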

I/O Verification
I/O Verification gets broken down into two creatively named levels: Level 1 and Level 2. On
Windows XP you always get both Level 1 and Level 2 when you select I/O Verification from the
Driver Verifier GUI, but on Windows 2000, Level 2 must be explicitly enabled (see the DDK
docs for details on how to do this).

When Level 1 I/O Verification is enabled, all IRPs are allocated out of special pool, which is
helpful in catching some common errors (if you’ve ever tried to fill in the current stack location
of an IRP that you’ve allocated, then you definitely want to flip on Level 1 I/O Verification).
Other Level 1 checks include:


       Calling IoCompleteRequest on an IRP with a cancel routine still set
       Calling IoCallDriver from a dispatch routine at a different IRQL than you were called at
       Calling IoCallDriver with an invalid device object

Level 2 I/O Verification expands upon Level 1 I/O Verification with one difference: If a kernel
debugger is attached, Level 2 I/O Verifications will not bugcheck the system. Instead, an
ASSERT is issued with a detailed description of the error and, in some cases, even a URL
where you can get more information. If you choose to ignore these errors, the machine will
continue to run, potentially giving you the ability to fix your code and reload your driver without
a reboot. This is quite a convenience, to say the least. Also, Level 2 I/O Verification comprises
over fifty I/O checks. Here are some good ones:


       Calling IoCallDriver on an IRP with a cancel routine still set
       Deleting a device that is attached to a lower device without first calling
        IoDetachDevice
       Completing IRP_MJ_PNP requests that you don’t handle, instead of passing them
        down
       Manually copying a stack location instead of using
        IoCopyCurrentIrpStackLocationToNext and not clearing the upper driver’s completion
        routine

Enhanced I/O Verification
Enhanced I/O Verification is a feature added to Driver Verifier in Windows XP to add to the
laundry list of I/O checks done by Driver Verifier. These checks are reported in the same way
as the Level 2 I/O Verifications in that they appear as ASSERTs when a kernel debugger is
attached and can be ignored without bug checking the system.

Does the golden rule we violated in Bug #2, "If you mark the IRP as pending you must return
STATUS_PENDING," sometimes escape you? If so, Enhanced I/O Verification is your friend
as it monitors your IRPs and ensures that you follow this rule. Another neat trick that this
option enables is mixing up the PnP load order of devices in the system. This ensures that just
because driver A starts before driver B on every system you’ve run your driver on, you don’t
code to that fact.

This is also the option that will trap Bug #1, by sending bogus PnP, Power, and WMI IRPs to
your stack to check for proper processing of IRPs of each type.

Deadlock Detection
Deadlock Detection is another Driver Verifier option that was added to Windows XP. Enabling
it causes Verifier to track all of your driver’s acquires and releases of spinlocks, mutexes and
fast mutexes and ensures that a locking hierarchy is in place and is followed. An interesting
thing to note here is that Deadlock Detection is constantly monitoring your acquires and
releases, and building a large graph of the use of your locks throughout the driver. If it finds a
potential deadlock condition, it will bugcheck the system. What this means is that your code,
as written, may never hit a deadlock, but if there’s a potential for it, the system will still
bugcheck.
The thinking here is that you should have a locking hierarchy in place and always follow it,
even if in some places you "know better." This provides a more robust code base that is less
prone to develop locking issues in the future.

If your system does bugcheck due to Deadlock Detection, the !deadlock WinDBG command
may be used to get detailed information revealing why the bugcheck occurred.
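
A minimal sketch (with hypothetical locks) of the kind of ordering inversion Deadlock Detection
flags, even if the two paths never actually collide on your test machine:

KSPIN_LOCK LockA, LockB;

VOID PathOne(VOID)
{
    KIRQL oldIrql;

    KeAcquireSpinLock(&LockA, &oldIrql);
    KeAcquireSpinLockAtDpcLevel(&LockB);        // order: A, then B

    // ... work ...

    KeReleaseSpinLockFromDpcLevel(&LockB);
    KeReleaseSpinLock(&LockA, oldIrql);
}

VOID PathTwo(VOID)
{
    KIRQL oldIrql;

    KeAcquireSpinLock(&LockB, &oldIrql);
    KeAcquireSpinLockAtDpcLevel(&LockA);        // order: B, then A -- Verifier
                                                // flags the inverted hierarchy
    // ... work ...

    KeReleaseSpinLockFromDpcLevel(&LockA);
    KeReleaseSpinLock(&LockB, oldIrql);
}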

DMA Checking (a.k.a. DMA Verification a.k.a. HAL Verification)
Also only available in XP and later, DMA Checking enables a wide array of checks that ensure
proper use of the DMA APIs. One nice feature that you get with DMA Checking is that it
causes all DMA transfers to be double-buffered by Verifier. Though the chances are small that
this will discover bugs in your code, it guarantees that your driver will work properly on PAE
systems with greater than 4GB of RAM.

An exhaustive list of the checks can be found in the DDK documentation, but here are some of
the more interesting ones:


       Catch buffer overruns and underruns on the DMA buffer
       Check proper allocation and destruction of adapters, common buffers and scatter
        gather lists
       Proper use of map registers
       Use of valid DMA buffers (i.e. ensuring they are not NULL or pageable)

The DDK documentation lists over twenty checks that it makes to your DMA operations when
enabled, so this option is something that you’re, without a doubt, going to want to enable if
you’re writing a driver that supports DMA.

The !dma WinDBG extension knows about DMA Checking and can be used to get extended
information about DMA adapters currently being verified.

Low Resources Simulation
Low Resources Simulation is the one test that we generally recommend that you do not enable
until your final rounds of testing. Enabling this option will result in random failures for memory
requests. What your driver does in these situations and how gracefully it must handle them is
entirely device and environment specific, but at the very least the system should not bug check
because of a NULL pointer dereference.

How your driver handles low resource conditions is device specific because, for most drivers,
just checking for NULL and returning an error is sufficient. However, there are several types of
drivers that need to be fully capable of handling these situations by falling back to memory that
was previously allocated when resources were not scarce. No testing cycle is complete until
your driver has proven to not bring down the entire system because of a call to
ExAllocatePoolWithTag failing.
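
At a minimum, that means checking every allocation before use; a trivial sketch (the size and
tag are illustrative):

PVOID buffer = ExAllocatePoolWithTag(NonPagedPool, bufferSize, 'fuBT');

if (buffer == NULL) {

    //
    // Low Resources Simulation will drive you through this path at
    // random -- fail the request cleanly rather than dereferencing
    // a NULL pointer.
    //
    return STATUS_INSUFFICIENT_RESOURCES;
}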
Disk Integrity Checking
Disk Integrity Checking was added to Driver Verifier in Server 2003. If you are working with a
driver in the storage stack, this option can be extremely helpful in finding data corruption errors.
Every time a sector is read from or written to the disk, this check computes the CRC and, if it
has been previously accessed, compares it to its previous CRC. If the CRCs don’t match, the
system will bugcheck. As you can imagine, enabling this option puts a serious strain on the
resources of the system, so it should generally not be enabled during day-to-day testing and
developing.

IRP Logging
IRP Logging was also added in Server 2003 and is sort of an oddball Driver Verifier option.
What it does is keep a copy of the last twenty IRPs that the driver being verified has received
in a circular buffer. You can then extract the information about the last twenty IRPs to a text file
by using the DC2WMIParser utility. There doesn’t appear to be anything documented about
how to retrieve this info from the debugger, and Driver Verifier usually lets you know that
something went wrong by bugchecking the system, so we’re not quite sure enabling this option
is very useful. But, it’s there so check it out and see if it suits any of your needs.

Conclusion
If you aren’t running with Driver Verifier enabled from day one of development, you’re not
being a responsible member of the driver development community. Period. This is not some
esoteric hardcore developer utility that no one but you will ever run, it’s a standard O/S utility
that ships in-box. This means that if you don’t run it, your users will, and we all know how much
it sucks when customers start complaining (for some reason the sales and marketing people
just hate that).

We didn’t mention it here, but there are special Driver Verifier options for SCSI and graphics
drivers so if you’re in one of those spaces, check out the documentation in the DDK. Also, be
on the lookout for Static Driver Verifier (SDV), a new utility that will be released at a date TBD
that will be able to do static analysis on your driver and find bugs actively at compile time
(which should help out my aforementioned newsgroup buddy). For the latest information on
SDV you can keep hitting refresh at http://www.microsoft.com/whdc/devtools/tools/sdv.mspx
or just stay tuned here and we’ll let you know what we know, when we know it.
Testing from the Ground Up -- Getting a Good Start
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




Driver testing starts right at the stage of driver development. The best time to ensure your
driver is both testable and diagnosable is during development. Making a few good decisions,
and spending a little extra time, will pay off in major dividends in the long run.

So, here are some guidelines for building testability and diagnosability into your driver.

1) Practice Defensive Driver Writing
As one NTDEV participant wrote recently, "Be a paranoid developer!" This is excellent advice.
As you write code, aggressively add ASSERT statements to verify your assumptions. On
those days when you’re sort of burned out, why not take an hour or two to add yet more
aggressive assertions. Put these tests in the checked build of your driver and you won’t have
to worry about performance penalties in your final code. I can’t imagine such a thing as too
many ASSERTS.

The concept behind using ASSERTS is that you need to make explicit any assumptions that
you have. For example: Let’s say you have a function that takes a pointer to a
UNICODE_STRING as an input parameter. As you write your code, you might know that the
caller will never pass a NULL pointer into this function, so you don’t allow for this in your code.
If that’s the case, at the very least, place an ASSERT statement that validates your
assumption:

ASSERT(DevName != NULL);


That way, if something strange happens and a NULL is passed into your function by mistake,
you’ll at least catch the problem. There are some obvious cross-checks you can perform on
UNICODE_STRING structures as well. For example, if the string’s length is not zero, the buffer
pointer cannot be NULL. Also, the maximum length must always be greater than or equal to the
length of the string (duh!). ASSERT these things:

ASSERT((DevName->Length != 0) ? (DevName->Buffer != NULL) : TRUE);

ASSERT(DevName->MaximumLength >= DevName->Length);


Perhaps as you write your function, you know that the string’s length must fall within a certain
range. At the very least, you’ll want to ASSERT this, too:

ASSERT((DevName->Length > 0) && (DevName->Length <= 3));
You’ll also want to ASSERT other things you know about the string, including any assumptions
you make about the string’s format.

Before you go all crazy on me and whine about the overhead and about how Windows
kernel-mode components are supposed to trust each other, remember this: We’re talking
about checks that’ll only be in the checked build of your code. As soon as we’re talking
checked build, we know that performance isn’t an issue.

2) You Need Tracing
When you’ve got a driver problem at a field test site in another city (or, more likely, in another
country) you need a way to know what was going on just before your driver crashed.

Yes, your driver needs to have tracing, and it probably needs to have tracing in the free build.
Not too long ago, the way we’d get this tracing is that we’d each have to write our own circular
buffer in-memory trace log package. But, no more. Now we can use Event Tracing For
Windows – and specifically Windows Pre-Processor (WPP) Tracing – to add low-overhead
trace points into our code, and have it work all the way back to Windows 2000.

In using WPP Tracing, don’t forget that you can specify both Flags and Levels. Flags are
typically used to specify the trace "path". Levels are typically used to indicate trace "volume",
that is, how much spew is generated in a particular path.

Remember, there’s nothing wrong with using both WPP Tracing and DbgPrint(Ex) in the same
driver. DbgPrint output is nice for your normal debugging work. WPP Tracing excels at
providing "traceability" in your driver.
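
Getting WPP going takes a little boilerplate. Here's a minimal sketch; the GUID is a placeholder
(generate your own), the flag names are ours, and the .tmh file is generated by the WPP
preprocessor (enabled via RUN_WPP in your SOURCES file):

//
// In a header -- define the control GUID and the "path" flags:
//
#define WPP_CONTROL_GUIDS                                             \
    WPP_DEFINE_CONTROL_GUID(MyDriverTraceGuid,                        \
        (11111111,2222,3333,4444,555555555555),  /* placeholder */    \
        WPP_DEFINE_BIT(FLAG_INIT)                                     \
        WPP_DEFINE_BIT(FLAG_IO))

//
// In each source file, after the other #includes:
//
#include "mysource.tmh"

//
// In DriverEntry and in DriverUnload, respectively:
//
WPP_INIT_TRACING(DriverObject, RegistryPath);
WPP_CLEANUP(DriverObject);

//
// At a trace point -- the first argument selects the flag (the "path"):
//
DoTraceMessage(FLAG_IO, "Read completed with status 0x%x", status);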

3) Get With The 64-Bit Program
As you write your code, write it to be 64-bit compatible. This means that you should use the
standard 64-bit compatible type names (ULONG_PTR, for example) by default. We also
recommend that you build the AMD-64 version of your driver and (at the very least) correct any
compilation errors.

We will be in a 64-bit world sooner than you think, and building in 64-bit support will save you a
lot of hassle later. Further, using the 64-bit types makes more explicit your intent: Do you mean
something you’re casting to be 32-bits wide specifically, or do you mean it to be the width of a
pointer on the target system?
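
A two-line illustration of the difference (the context pointer is hypothetical):

ULONG     cookie32 = (ULONG) ContextPointer;      // truncates a pointer on 64-bit Windows
ULONG_PTR cookie   = (ULONG_PTR) ContextPointer;  // pointer-width on both 32- and 64-bit systems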

4) You Need Infrastructure
Any testing or diagnosability infrastructure that you create for your driver will usually provide an
enormous payback in saved time and less annoyance. Here at OSR, we build macros for
frequently performed operations in a complex project. Even if the operation requires nothing
more than calling a function, we’ll typically create an encapsulating macro.

This approach makes it easy to create code that automatically and predictably checks status
return values and outputs a tracing statement. When you get into the habit of using macros this
way, you can’t "forget" to put in the status check or add the trace statement when you start to
get deadline pressure.
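
To make the idea concrete, here's a sketch of such a macro (the names are illustrative, and it
assumes an NTSTATUS variable named status is in scope):

#define CALL_CHECKED(expr)                                            \
    do {                                                              \
        status = (expr);                                              \
        if (!NT_SUCCESS(status)) {                                    \
            DbgPrint("MyDriver: %s failed with 0x%08x (%s:%d)\n",     \
                     #expr, status, __FILE__, __LINE__);              \
        }                                                             \
    } while (0)

Every operation invoked through the macro gets its status checked and traced automatically,
deadline pressure or not.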

One other piece of infrastructure that we’d traditionally recommend is a lock tracking package.
However, note that Driver Verifier actually does a pretty darn good job of tracking spin lock
usage for you.

5) You Need Type Checking
While we don’t recommend writing Windows drivers in C++, we do strongly recommend that
you name your driver source files with .CPP file types. This results in the C++ compiler being
used, by default, to compile your driver. The advantage you get is strong type checking. You’d
be surprised at how many errors this avoids.

Here at OSR, we write our drivers with CPP file types by default. Every once in a while, we
have to convert a project (an old one done here or one written by a client) from using .c file
types to using .CPP file types. Believe it or not, we have never done this conversion process
without finding at least one significant error.
Test Lab Basics -- Helpful Hardware Accessories
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




Lab needs for driver testing can vary widely depending on what your specific driver is designed
to do. The number of platforms, required peripherals, size of your company, and number of
products to be tested all factor in to what you’ll need to effectively test your driver. We’re not
about to propose a setup that will work for everyone. However, there are a couple of small
hardware additions that will prove to be useful to any driver testing lab, regardless of the setup.

KVM Sharing Switch
As you add test systems to your test lab, it quickly becomes apparent that having a separate
monitor, keyboard, and mouse for each system requires a lot of space. A 4 port
keyboard/video/mouse sharing switch will enable the connection of 4 different systems to one
monitor, one keyboard and one mouse, drastically reducing the desktop space required per
system. Be sure to select a switch that allows port switching using a simple keyboard hot-key
sequence in addition to pressing a button on the unit. One such example is the StarView
4-Port KVM Switch with 3-in-1 Cables (SV411K) from StarTech.com. The benefits of this
switch are its small footprint (3 ¼" x 7 ¼"), low cost (around $150), the 3-in-1 cables to reduce
cord tangling, and its ability to prevent boot-up errors by providing keyboard and mouse
emulation to the non-selected PC. To switch between systems simply press Ctrl-Ctrl-# of
system. If you want to go to system 3 press Ctrl-Ctrl-3. To go to system 1, press Ctrl-Ctrl-1.

USB Serial Adapter
Now that you have 4 test systems set up using the KVM sharing switch – maybe running
different versions of the operating system or different drivers under test – you need separate
debugger systems to attach to these test systems (debuggees) for testing/debugging
purposes. But do you really need four different debuggers? Keyspan has a USB 4-port Serial
Adapter (around $140) that will enable the use of one debugger system for debugging 4
different debuggee systems. Obviously, the debugger system must have a USB port to use
this handy device. You connect the serial adapter via the USB cable to your USB port, connect
the test systems to the four serial ports on the adapter with serial cables, install the software
that comes with the device and you now have four additional serial ports available for use.
Assuming you already have two serial ports on the debugger system, the four new ports will be
COM3, COM4, COM5 and COM6.

Running multiple versions of WinDbg on the same system is not a problem. Simply specify the
port to be used for each test system and you can have four debugging sessions going at any
one time. Hmmm, you may need to invest in a large monitor for this system…
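
For example, if the adapter’s ports come up as COM3 through COM6, the four kernel
debugging sessions might be started like this (the baud rate shown is just a typical choice):

C:> start windbg -k com:port=com3,baud=115200
C:> start windbg -k com:port=com4,baud=115200
C:> start windbg -k com:port=com5,baud=115200
C:> start windbg -k com:port=com6,baud=115200
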
Do you use any indispensable pieces of hardware in your test lab that you’d like to share with
the driver community? Tell us about them and we’ll pass on the information in a future edition
of The NT Insider.
Test Lab Basics -- Choosing Machines for Your Lab
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




One of the most important decisions that you need to make when provisioning a test lab is
what machines to use. Obviously, much of the answer will be dictated by the type of drivers
you write and your customer base – If you make software that’s primarily used in laptop
environments, we’re thinkin’ that putting a bunch of Data Center capable systems in your lab
probably isn’t a priority. However, if you’re doing general driver testing, there are a few
common issues that you might want to consider.

What’s the goal of outfitting a test lab with test machines? Well, if you spend some time
thinking about it, you’ll realize there are at least two separate goals, dictated by the types of
testing you want to accomplish:

         Goal #1: You want to create environments that closely mirror those of your customers.
          In this case, you want to be able to create a facility that will allow you to reproduce
          customer scenarios (including the ability to repro problems that are reported in the
          field).
         Goal #2: You want to create environments that will allow you to achieve the broadest
          possible test coverage of your drivers. Here, your goal is to be able to perform general
          Q/A tests, including stress and regression.

One mistake that is often made in addressing goal #1 is believing that it’s more important, or
even sufficient, to have systems that replicate the customer’s actual environment. While the
ability to replicate customer environments is important, you have to remember that (for most
test labs) you’ll never be able to replicate every possible customer environment. Also, don’t
forget that customer environments will usually change and evolve. When this is the case, it’s
usually very hard to keep a test lab outfitted with the latest and greatest systems that your
customers might use.

That’s why it’s typically much more important to consider goal #2, and outfit a test lab with
machines that’ll help you exercise and stress your code in the broadest possible ways. For
example, at OSR we usually rotate machines from development use into our test lab. This
helps keep the developers happy, and provides test machines that are at least slightly
under-configured relative to the state of the art. We sometimes even remove memory from
these machines. The goal for these machines is to stress our code in ways that it wouldn’t
normally be stressed on a developer’s test system.

Putting your driver under heavy stress on a system that has a slow CPU and/or limited
memory configuration, while also running other background applications, is very effective at
forcing your driver to page, stress pool usage, and in general change the timing of your code
execution drastically. How far do we take this? Well, would you believe that we have a
dual-processor 90MHz Pentium with 128MB of memory that we use for testing? We do. Yes,
it’s annoying to use. But, we catch a lot of errors on this system when we crank-up IOMETER
or our internal ActGen tool and start pounding!

On the other hand, you can’t just throw your cast-off junk into the test lab and claim you’re
doing the right thing. If you’re going to find those tough timing problems, you need systems of
many different speeds and capabilities, and that includes the high-end. It should go without
saying that multi-processor testing is a must. The more CPUs that you can afford to equip
your test systems with, the better. A quad-processor system will be much more efficient at
finding locking problems than a dual-processor system. The only large memory
quad-processor system we’ve ever had at OSR lived in our test lab. These days, we’re waiting
for our friends at Unisys to send us one of those awesome 32-processor systems that they
make. Not that they ever said they would send us one, mind you. But, we can always keep
hoping that one will spontaneously appear, right?

Another thing to consider is 64-bit support. Even if you’re not yet convinced that 64-bit support
is important in your market (trust me, you will be convinced eventually), when you buy new test
systems, why not get systems that can at least potentially perform double-duty? Here at OSR,
when we recently expanded our test lab we added a dozen AMD64 systems. These systems
offer the fastest 32-bit Windows performance available (for testing current releases of NT V4,
Win2K, XP, and Server 2003 SP1) and let us validate and test our drivers on the pre-releases
of Windows-64 just by rebooting. It’s like getting a two-for-one deal.

In summary, getting a set of machines that will allow you to exercise your drivers in the
broadest and most thorough manner is the key to building a comprehensive test facility. This
means low-end as well as high-end systems, with a variety of processor speeds, processor
counts, and architectures.
Sometimes You Have to Write Your Own -- Case Study: ActGen IO Utility
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




The ability to generate native Windows I/O API requests is invaluable when testing
functionality in Windows drivers. For file system drivers in particular, this level of testing
is absolutely necessary. Alas, it isn’t possible to test all variations of the native Windows API
using Win32 calls, since Win32 is implemented as a subsystem. That is, you’re really talking to
an emulation layer when you use the Win32 API, which precludes you from testing things such
as extended attributes, case sensitivity, open by file id, and the enumeration of streams.

So where does that leave us? Well, we do know that the Win32 API, used by applications and
services, is translated into native system calls – the infamous "native" Windows API. However,
many of the native function calls beginning with NtXxx are undocumented and unsupported by
Microsoft, which makes working with the native Windows API complicated. Also (and more to
the point) who wants to write a program every time you wanna test a specific feature? A tool to
simplify the process of generating native I/O system service requests would be most welcome.

A Potential Solution
The OSR I/O Activity Generator (ActGen) is a development support and test utility. It is one
possible solution to the above stated need. ActGen facilitates the development of tests,
provides an infrastructure to directly access the native Windows API and has proven to be an
invaluable tool in testing Windows drivers here at OSR. The foundation of ActGen is a set of
prototypes for the native Windows NtXxx functions, accessed via a Windows Console
application and driven by script-based tests. Hopefully, this introduction to ActGen will
provide fodder for your own investigations in this area.

API Prototypes
Given the undocumented nature of much of the native Windows API, securing prototypes for
the functions used to develop ActGen required a bit of effort. Trial and error, as well as careful
observation of system behavior, went a long way; there are also more public sources available
these days that provide insight (and in some cases, the prototypes themselves) for generating
a useful list to work with. In the end, we determined prototypes for the NtXxx
functions as best we could. With native Windows API prototypes "in hand" we were ready to
develop our script-based tool.
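
By way of example, here is the prototype for NtCreateFile, whose form mirrors that of the
documented ZwCreateFile – one of the few places where the DDK itself confirms the shape of
a native call:

//
// The native NtCreateFile prototype, identical in form to the
// documented ZwCreateFile. Most of the other NtXxx prototypes had to
// be reconstructed by the trial and error described above.
//
NTSYSAPI
NTSTATUS
NTAPI
NtCreateFile(
    OUT PHANDLE FileHandle,
    IN ACCESS_MASK DesiredAccess,
    IN POBJECT_ATTRIBUTES ObjectAttributes,
    OUT PIO_STATUS_BLOCK IoStatusBlock,
    IN PLARGE_INTEGER AllocationSize OPTIONAL,
    IN ULONG FileAttributes,
    IN ULONG ShareAccess,
    IN ULONG CreateDisposition,
    IN ULONG CreateOptions,
    IN PVOID EaBuffer OPTIONAL,
    IN ULONG EaLength
    );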

The Scripting Language
The scripting language we developed is very easy to understand and, hence, easy to use.
ActGen is started from a Windows Console Prompt and processes script commands from a
user-written script file, which is passed to ActGen upon invocation. Commands are processed
from the test script until the end of the script file is reached or a script-terminating failure
occurs. Using a lexical parser, ActGen reads an input line from the test script and calls the
appropriate functions to get command-specific parameters and then actually performs
processing for the command. Since it’s important that ActGen is able to generate invalid
requests, the tool does not in any way attempt to validate the commands sent to it.
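
To give a feel for the structure, here is a minimal sketch of such a dispatch loop (hypothetical
structure and names – this is not the actual ActGen source):

#include <stdio.h>
#include <string.h>

/* Hypothetical handlers: each parses its own parameters and issues the
   corresponding NtXxx call, deliberately without validating anything. */
static void HandleCreate(char *Args) { /* parse args; call NtCreateFile */ }
static void HandleClose(char *Args)  { /* parse args; call NtClose      */ }
static void HandleMWrite(char *Args) { /* parse args; loop NtWriteFile  */ }

static const struct {
    const char *Name;
    void (*Handler)(char *Args);
} CommandTable[] = {
    { "CREATE", HandleCreate },
    { "CLOSE",  HandleClose  },
    { "MWRITE", HandleMWrite },
    /* ...one entry per script command... */
};

static void RunScript(FILE *Script)
{
    char line[512];
    size_t i;

    while (fgets(line, sizeof(line), Script) != NULL) {
        if (line[0] == '!' || line[0] == '\n') {
            continue;                       /* comment or blank line */
        }
        for (i = 0; i < sizeof(CommandTable) / sizeof(CommandTable[0]); i++) {
            if (_strnicmp(line, CommandTable[i].Name,
                          strlen(CommandTable[i].Name)) == 0) {
                CommandTable[i].Handler(line);  /* no validation here! */
                break;
            }
        }
    }
}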

An Example Test Script
So what kinds of things can you do with ActGen?

Here is a simple example of how one would run the test script mytest.act and indicate that it
should be run on drive F:

C:> actgen mytest.act -d F:

Let’s dissect a sample test script written to test files opened with read-only access. We’ll start
by looking at the beginning of the script in Figure 1.

!----------------------------------------------------------------------------------------------------
!   Before we begin, make sure the directory being used in this script exists.
!   Create the directory if it does not exist.
!-----------------------------------------------------------------------------------------------------


CREATE            77, "$DISK1$\OSRTEST", GENERIC_ALL, DIRECTORY, FILE_OPEN_IF
CLOSE            77


!-------------------------------------------------------------------------
!   Create a file, write some data to it, and close it
!-------------------------------------------------------------------------


Create       1    "$DISK1$\OSRTEST\TEST1.DAT", GENERIC_WRITE, CREATE_ALWAYS
MWrite       1    100, 512, 0
Close        1


Figure 1 -- Sample Script: Testing Create

The Create command can be used to create/open directories or files. To override the default of
creating a file, use the flag DIRECTORY to specify that a directory is to be created. Notice that
file numbers are specified when a file/directory is created/opened. This number is then used to
identify which file a command is to be run against.

You’re probably wondering about the presence of $DISK1$. This is the string that will be
replaced with the drive letter passed to ActGen with the -d option. In the ActGen invocation
example above, the drive letter F: is passed in, so $DISK1$ will be replaced with F:. It is
important to note that in addition to drive letters, a UNC path specification can also be passed
with the -d option. Also, if your script requires more than one drive letter or UNC path, you
can supply up to 26 drive strings to replace $DISKxx$ values in the test script.

Create dispositions can be specified to indicate how to create/open a file. In the above
example, FILE_OPEN_IF is used to ensure that the directory where data files will be stored
already exists. If it doesn’t exist, the directory will be created. For the creation of the data
file, the disposition CREATE_ALWAYS is used to ensure that a new file will be created, even if
a file with that name already exists.

Once we’ve ensured that the directory exists, and we’ve created a file in the test directory, we
can write some data to the file to be used for this test. In this example we use MWrite to
perform multiple write requests. A total of 100 writes of 512 bytes each will be performed, and
the first write will be at offset 0 – the beginning of the file.

Let’s continue to look at this script in Figure 2.

!-------------------------------------------------------------------------------------------------
! Now open the file as read-only and make sure that you can read but not
! write or delete.
!---------------------------------------------------------------------------------------------------


Create       1    "$DISK1$\OSRTEST\TEST1.DAT", GENERIC_READ, OPEN_EXISTING
MRead        1    100, 512, 0
ReadAsync    1    512, 0
Read         1    512, 512
Expect       1    0xc0000022
Write        1    512, 0
WriteAsync   1    512, 512
MWrite       1    2, 512, 0
Delete       1
Expect       1    Success
Close        1


!---------------------------------------------------------------------------------------
!   Check to be sure the file can be opened to ensure delete failed
!-----------------------------------------------------------------------------------------


Create 1       "$DISK1$\OSRTEST\TEST1.DAT", GENERIC_ALL, OPEN_EXISTING


Close 1


Figure 2 -- Multiple Read Requests (MRead)

This time the file is opened with only read access by specifying GENERIC_READ as a create
access option. Notice that the file is opened with the create disposition of OPEN_EXISTING
because we expect this file to exist since it was just created. If it does not exist, there’s a
problem and we need to know about it.

At this point we want to ensure that reads succeed and writes fail. To do this, we test the
different kinds of read requests to ensure they complete with STATUS_SUCCESS and test
different writes to ensure they complete with STATUS_ACCESS_DENIED.

Like its counterpart MWrite, MRead performs multiple read requests. To mirror the writes done
when this file was created, 100 reads of 512 bytes each are performed, and the first read is at
offset 0 – the beginning of the file. Both a synchronous read, Read, and an asynchronous read,
ReadAsync, are performed, each specifying that 512 bytes be read but starting at different
offsets in the file.

By default, ActGen assumes that the expected command completion status will be
STATUS_SUCCESS (0x00000000). However, if you know the command you are about to issue
is going to fail, you can specify the expected completion status with the Expect command. For
all the write tests and the delete to be performed on this file opened with read access, we
expect STATUS_ACCESS_DENIED to be returned. Therefore, we tell ActGen that we expect
a completion code of 0xC0000022. Once we have completed this set of tests we can set the
expected status back to the default with Expect 1 Success.

To double-check that the Delete command did not cause the file to be deleted on close, we
open the file with the create disposition of OPEN_EXISTING to ensure that the file still exists.
If the file has been deleted and does not exist, the script will fail because the expected status
differs from the actual completion status.

Mapped reads can also be tested with ActGen (see Figure 3).

!-------------------------------------------------------------------------
!   Check MAPPED_READ access as well...
!-------------------------------------------------------------------------
Create       1    "$DISK1$\OSRTEST\TEST1.DAT", MAPPED_READ, OPEN_EXISTING
MRead        1    100, 512, 0
ReadAsync    1    512, 0
Read         1    512, 512


Close        1


Figure 3 -- Testing Mapped Read Access

Memory mapped read access is tested by specifying MAPPED_READ as the desired access
on open of the file. As in the previous code section, we test the various kinds of read requests
to ensure they all succeed.

Cleaning up files used during the test is the last step in this script and can be seen in Figure 4.

!--------------------------------------------------------------------------
!   Clean up files so this script will work next time
!---------------------------------------------------------------------------


Create 1       "$DISK1$\OSRTEST\TEST1.DAT", GENERIC_ALL, FILE_OPEN, DELETE_ON_CLOSE
Close 1


CREATE            77, "$DISK1$\OSRTEST", GENERIC_ALL, DIRECTORY, FILE_OPEN
DELETE           77
CLOSE            77


:end


!END OF SCRIPT


Figure 4 -- Clean Up of Files

Other Features of the Scripting Language
In addition to the commands described in the example, the ActGen scripting language is
capable of testing many other native interfaces. For example, ActGen can be used to get and
set extended attributes (see Figure 5).

!--------------------------------------------------------------------------------------------------------
!   Add three new EAs to the file. After this, there should be 4 EAs on the file.
!---------------------------------------------------------------------------------------------------------
SetEa 1                 "FRED",300
SetEa 1                 "DICK",200
SetEa 1                 "HARRY",225


!---------------------------------------------------------------------------------
!   List all EAs on the file, there should be 4 EAs on the file.
!---------------------------------------------------------------------------------
GetEa 1


!---------------------------------------------------------------------------------
!   Now try to list DICK and HARRY
!---------------------------------------------------------------------------------


GetEa 1                "HARRY,DICK"


Figure 5 -- Get and Set Extended Attributes

Extended Attributes can be set on an open file using the SetEa command and then listed using
the GetEa command. All extended attributes defined for the file are listed by default with the
GetEa command. To list just certain extended attributes, simply specify the name of the
extended attribute(s) to be listed as the second parameter of the GetEa command.
Script flow-control statements allow jumping to other areas of the script, either with the
stand-alone Goto command or based on the file system name and the file system mount
characteristics. File locking can also be tested with ActGen (see Figure 6).

!---------------------------------------------------------------------------------
! Open the file for READ
!---------------------------------------------------------------------------------


CREATE       1    "$DISK1$\OSRTEST\lock1.dat", GENERIC_READ, OPEN_EXISTING, SHARE_ALL


!---------------------------------------------------------------------------------
! Open the file a second time
!---------------------------------------------------------------------------------


CREATE       2    "$DISK1$\OSRTEST\lock1.dat", GENERIC_READWRITE, OPEN_EXISTING, SHARE_READ


!---------------------------------------------------------------------------------
! Now take a lock out on the first open instance of the file
!---------------------------------------------------------------------------------


!                 Length, Starting Offset
Lock         1,   200, 0


!--------------------------------------------------------------------------------------------------------------
! Now attempt to read the locked region, using the second open instance of the file
!-------------------------------------------------------------------------------------------------------------


Expect       2,   0xc0000054
READ         2    100, 0
Expect       2,   Success
READ         2    100, 300
!---------------------------------------------------------------------------------
! Unlock and close the file
!---------------------------------------------------------------------------------


Unlock       1,   200, 0
CLOSE        1
CLOSE        2


Figure 6 -- File Locking

We open two instances of an existing file and lock a range of the file. We can then use the
second open instance of the file to ensure that the locked portion of the file cannot be read and
that unlocked portions of the file can be read.

Multi-Threaded Operation and Synchronization
One of ActGen’s more powerful features is the ability to spawn multiple threads and
synchronize the operation of these spawned threads. Test scripts can be run synchronously or
asynchronously and they can be grouped to allow for synchronization using SYNC points. Let
us show you with an example.

We’ll use three scripts in this simple example: a master script (master.act), see Figure 7, an
asynchronous script (async.act), see Figure 8 and a synchronous script (sync.act), see Figure
9.

!-------------------------------------------------------------------
!     Define the start of a group and provide a name
!-------------------------------------------------------------------
GroupStart "SimpleGroup"
!---------------------------------------------------------------------------
!    This script is run asynchronously in its own name space
!----------------------------------------------------------------------------
run "async.act"
!--------------------------------------------------------------------------
!    This script is run synchronously within the name space
!    of the master script
!---------------------------------------------------------------------------
@sync.act
!-------------------------------------------------------------------
!     End of group
!-------------------------------------------------------------------
GroupEnd
!----------------------------------------------------------------------------
! End of script
!----------------------------------------------------------------------------


Figure 7 -- MASTER.ACT

!-------------------------------------------------------------------
!     Create a file in an existing directory
!-------------------------------------------------------------------
CREATE            77, "C:\OSRTEST", GENERIC_ALL, DIRECTORY, FILE_OPEN_IF
CLOSE            77


Create 1      "C:\OSRTEST\TEST5.DAT", GENERIC_ALL, CREATE_ALWAYS
MWrite 1                  100, 512, 0
Delete 1
Close 1


!----------------------------------------------------------------------------------------------
! Set a sync point and wait "forever" for sync wait conditions to be met
!----------------------------------------------------------------------------------------------


SYNC         "CompletionPoint", -1


!-------------------------------------------------------------------------------------------
! Once the sync point has been reached, we can delete the directory
!---------------------------------------------------------------------------------------------
CREATE            77, "C:\OSRTEST", GENERIC_ALL, DIRECTORY, FILE_OPEN_IF
DELETE           77
CLOSE            77


!----------------------------------------------------------------------------
! End of script


!----------------------------------------------------------------------------


Figure 8 -- ASYNC.ACT

!-------------------------------------------------------------------
!    Create a file in an existing directory
!-------------------------------------------------------------------
CREATE            77, "C:\OSRTEST", GENERIC_ALL, DIRECTORY, FILE_OPEN_IF
CLOSE            77


Create 1      "C:\OSRTEST\TEST6.DAT", GENERIC_ALL, CREATE_ALWAYS
MWrite 1                  100, 512, 0
Delete 1
Close 1


!----------------------------------------------------------------------------------------------
! Set a sync point and wait "forever" for sync wait conditions to be met
!----------------------------------------------------------------------------------------------


SYNC         "CompletionPoint", -1


!----------------------------------------------------------------------------
! End of script
!----------------------------------------------------------------------------


Figure 9 -- SYNC.ACT

The GroupStart command begins a new group and names it "SimpleGroup". The group name
is optional and will only be used for debug output purposes if an error occurs. Subsequent
scripts that are executed prior to the GroupEnd command are considered to be within this
group and can, therefore, synchronize on SYNC points.

The next command we issue, run "async.act", executes one instance (more can be specified)
of the script async.act asynchronously and control returns immediately to the calling script.
The instance of this script runs in its own namespace rather than in the namespace of the
master script.

Before ending the group, we synchronously run the script sync.act with the command
@sync.act. This causes the script to be run within the namespace of the master script and
control does not return to the calling script until this script finishes executing.

Let’s look at the content of async.act in Figure 8 and sync.act in Figure 9.

Notice that both scripts have a synchronization point named "CompletionPoint" and the
timeout value is set to -1, which results in an infinite wait. The script async.act will be run and
then control will return to master.act. However, async.act will halt execution at the SYNC
command and wait for sync.act to reach the same SYNC point because they’re in the same
group. This simple example just has each script create a different file in the same directory,
write to it, delete it and then one script (async.act) waits for the other to complete before
deleting the directory.

Conclusion
Developing a tool to exercise the native Windows API can be a useful addition to any Windows
driver testing suite. When used in conjunction with Driver Verifier and Call Usage Verifier
(CUV), both available from Microsoft and discussed elsewhere in this issue, you have a
powerful combination of test tools for any driver under test. Don’t let unforeseen errors cripple
the success of your driver!
One Special Case -- Testing File Systems
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




When the recent double issue’s theme of testing was announced, one of the topics we knew
had to be covered was testing in the file systems space. This is a difficult topic to discuss
intelligently because there is so little actually available for testing in this arena.

Part of this is the nature of this type of driver – most people developing in the file systems
space are building filter drivers, and the typical testing paradigm for any filter driver is to
test the underlying device and make sure the filter driver doesn’t break anything! For
those actually building a file system, the task is more difficult because there are few "new" file
systems built, many of the APIs that must be tested are not supported by all file systems, and
this is such a highly specialized area that it seldom receives much attention.

In this article, we’ll start by discussing what already exists for testing file systems, since that
will be directly applicable to those building both new file systems and file system filter drivers.
Next, we’ll cover a couple of key areas of testing to consider. Finally, we’ll finish up by
discussing additional areas to explore in testing, which will be most useful to those building a
new file system.

Existing Test Tools
The existing test tools that are available from Microsoft are all included in the Hardware
Compatibility Test kit. At the time of this article, you can download this kit from Microsoft’s web
site (http://www.microsoft.com/whdc/DevTools/HCTKit.mspx). Most of these are tests that have
been around for a while – in one form or another – and are either generic driver tests (e.g., the
DC2 tool that tests whether a driver handles IOCTLs properly) or specific tests for file systems.

The IFS Tests are most directly applicable, but there are numerous other, non-specialized
tests that should also continue to be useful – tests for verifying that I/O works correctly, that
disks respond as expected, etc. Be careful to read the published errata for the IFS Kit tests in
particular – there are a number of test cases that are known to fail under circumstances where
the failure is not considered to be "incorrect". This can happen because a file system filter
driver might actually change the behavior of the underlying file system. For example, an
anti-virus scanning filter might scan a file after it has been modified, causing the access time of
the file to change from the value expected by the test.

At OSR, we have our own in-house test suite (see Sometimes You Have to Write Your Own).
In addition, we have augmented this with the Microsoft HCT tests and a number of other
third-party benchmark and test suites, such as IOMETER (a program that performs I/O and
measures its performance) and an industry benchmark (NetBench) that we have found
exercises another commonly observed range of functionality.

Environment Testing
Just as important as which test suite to use is an understanding of the environment in which
your software will operate. A common issue overlooked by those performing testing in the
Windows environment for the first time is the simple fact that how a file system is exercised
depends heavily on the specific type of access used.

The important point, then, is that testing be done against the type of environment in which
the product will be used. If this is primarily a server-side product, there is generally little benefit
from focusing all of your testing energy on the behavior under local access. Instead, you
should focus on remote (network) access, preferably using the same configurations you would
anticipate from your customers. Thus, you should determine which of the various file servers
would be used: SRV (for LanManager/CIFS access), SFM (for Macintosh access via Apple
File Protocol), SFU (for UNIX client access via NFS) or some other file server component,
whether provided by Microsoft or by a third-party vendor. At OSR, we find that each of these
sometimes produces subtly different behavior or uses interfaces unique to that particular
server product.

Testing for Interoperability
In addition, it is essential to test interoperability. Microsoft’s IFS web site
(http://www.microsoft.com/whdc/devtools/ifskit/testing.mspx) includes a list of known products
that can be used as a form of interoperability test matrix. There are so many of these products
that it is unlikely to be feasible to test against all of them. However, it is important to identify
which of these products is likely to be present within your target environment. For instance,
unless you are developing your own anti-virus product, it is quite likely that you will find an
anti-virus product present on your target system. Thus, testing with one (or more) anti-virus
products is essential to ensuring your own product will interoperate properly.

Keep in mind that the order of driver loading can also make a difference. This is especially true
with file system filter drivers, where the functionality of one filter might interfere with another
filter. However, it is also true with file system drivers, where we have seen instances in which a
failed device mount by one file system blocks another from even noticing the mount attempt –
thus looking like a failure of the underlying product. To test these characteristics, you must
control the load order of your driver relative to other drivers. You can use the DeviceTree utility
to observe the attachment order of filter drivers.

Future Areas to Explore
We are always looking for ways to further expand our own ability to test file systems and file
system filter drivers. One interesting technique suggested by the new Filter Manager model is
to use an "encapsulation" test mechanism, where I/O operations are intercepted both before
and after they are sent to a particular file system filter driver. This provides tremendous insight
into how the subject filter driver is handling the I/O operations.

For the file systems developed here at OSR, we use a core set of tests (common functionality)
and then build specialized tests that exercise the unique functionality of the file system. Our
normal yardstick for "common functionality" is that functionality present in the FAT file system
(for read/write) or the CDFS file system (for read-only). Thus, when we are not sure how
something should behave, we check to see what happens when we run the test on one of
these file systems. Most of the time, we get the same results – sometimes we find differences
between the way that FAT and NTFS process a particular operation. In those cases, we
usually choose one or the other for that particular file system. Occasionally, we find a feature
that we decide should work differently on our file system than it does on FAT or NTFS.

Testing Methodology
In soliciting feedback on this topic, one person suggested that we should review the basics of
testing methodology. Since our advice for those developing file system products is to develop
their own tests, we think it worthwhile to reiterate these points. Specifically, when developing
tests (and developing code, for that matter):

       Pay particular attention to "edge" conditions (e.g., counter overflow conditions,
        unusual circumstances, etc.).
       Ensure you handle extremes properly (e.g., buffer sizes that are too small or too big).
       Ensure that failure cases and error paths properly handle all conditions – most of
        the fantastic failures we see are actually in error paths, even within our own code.
       Consider the behavior of resource limitations that lead to allocation failure or process
        starvation. This is a serious issue within file systems because they must not only
        compete for resources, but must also cooperate with the rest of the system to ensure
        fairness.
       Ensure that there are adequate resources for testing – particularly for file systems,
        testing is usually more resource intensive than the actual development. At OSR
        (where we are responsible for both) we spend far more time testing than we do
        performing just "raw" development.
       Look for ways to force timing issues to surface, for example through load testing or
        holding scarce resources for a long time. Our observation here is that as the code
        base becomes more mature, race conditions begin to surface. Ensure that you test on
        multi-processor machines, with both large and small memory configurations. One trick
        we have used (for example) is to force context switches at inconvenient times – e.g.,
        drop a spin lock and then sleep (a minimal sketch follows this list).
       Build your own test harness, be it a scripting language, GUI script builder, or whatever
        other tools that you can find.
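
As promised, a minimal sketch of that last trick (a hypothetical helper of ours – sleeping is
only legal at PASSIVE_LEVEL, which is exactly why the spin lock must be dropped first):

//
// Force a context switch at an inconvenient moment: release the lock,
// then sleep for 10ms so another thread can race into the region the
// lock was protecting.
//
VOID ForceInconvenientContextSwitch(PKSPIN_LOCK SpinLock, KIRQL OldIrql)
{
    LARGE_INTEGER interval;

    KeReleaseSpinLock(SpinLock, OldIrql);    // back to PASSIVE_LEVEL

    interval.QuadPart = -(10 * 1000 * 10);   // 10ms, relative (100ns units)
    KeDelayExecutionThread(KernelMode, FALSE, &interval);
}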

The important thing to keep in mind is that no matter what you do in terms of testing, finding
the problems before you ship the product is less expensive for the organization than finding
them after you ship the product. Supporting users in the field that are having problems is
painful, difficult and expensive – and nobody is happy with the outcome!

As much as we wish there were a simple suite of tests to which we could point people building
file systems and file system filter drivers, there is not. However, we hope that this article
provides you with some basic pointers and gets things moving in the right direction. We trust
that in the future we will see more initiatives in this area so that those of us building file
systems and file system filter drivers will be able to better ensure our products work properly
before we ship them out to our customers!
On the Right Path -- Testing with Device Path Exerciser
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




The Device Path Exerciser, DC2, is one of the most powerful and important test tools in the
DDK. Unfortunately, because of its power, it’s also one of the most complex tools in the DDK.
Until recently, there was no good documentation for DC2. But, if you check the Windows
Server 2003 (or later) DDK, you’ll see outstanding documentation! Check it out.

In this article, we’ll expand on that documentation a bit. We’ll also try to provide you with a few
guidelines for how you can best use DC2 as a test tool for your driver. Pay attention, now! This
is good stuff!

How Does DC2 Work?
DC2 is a user mode application that sends a wide variety of valid, unusual but valid, and
absolutely perverse and invalid requests to a "Device Under Test". DC2’s tests are highly
iterative. That is, tests are performed repeatedly, each time with (perhaps) slightly varied
parameters. In a typical test run, DC2 might send a few hundred thousand requests to a driver
that it’s testing!

One of the keys to understanding DC2 is realizing that there are two different categories of
operational controls. They are:


         How a Device Is Opened.
         What Tests Are Performed on the Device (once it has been opened).

The first category above, How a Device Is Opened, is controlled by the "Basic Open
Operations" and any additional open operations that you specify.

The second category above, What Tests Are Performed, indicates the tests that are run on the
device for each successful open operation. For example, if 4 of 5 open operations succeed,
then all the tests you specify will be executed on the Device Under Test 4 times: Once for each
time one of the open operations succeeds.

Running Tests
DC2 tests are run from the command line. Its command line syntax is as follows:

                     DC2 [switches] [[/dr driver] | [devicename]]


The device(s) to be tested may be specified by referencing the driver name with the /dr switch.
When this is selected, all devices created by the driver are tested. Alternatively, one device
name may be specified on the command line. The syntax for this name must be the native
Windows NT device name (that is, the name with the \device\ prefix attached).
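
For example (driver and device names here are hypothetical):

C:> dc2 /dr mydriver
C:> dc2 \device\mydevice0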

While the test runs, the title bar of the command window displays the active test. The
criteria for passing any DC2 test are that the system does not crash, DC2 does not hang, and
the Device Under Test behaves as expected by the developer. The fact that DC2 completed
successfully (without hanging or crashing) is confirmed by the message "ALL TESTS
COMPLETE" appearing as the last text line of DC2.LOG and on the console.

Logging
DC2’s logging options are both highly flexible and complex. This is because DC2 creates four
different log files, each with a different purpose. The contents of three of these log files can be
controlled by the user. In fact, the test logging options that are chosen can have a dramatic
effect on the amount of time required to run a set of DC2 tests. So, choose wisely! In this
section, we’ll describe the options that control two of the most common logs: The main test log
(DC2.LOG) and the diagnostic log (DIAGS.LOG).

The main log that DC2 creates is named, quite reasonably, DC2.LOG. This file logs the tests
that are performed. DC2’s tests are organized into Categories. Test categories are what you
select via the testing options. Within each test category there may be one or more test Groups.
Test groups comprise sets of related tests within a single category. Within a test group, there
are individual tests.

You can use the /ltX option (where "X" varies) to specify the level of logging performed by DC2
in the DC2.LOG file. /ltc writes one entry to the log indicating the start and end of each test
category. /ltg writes one entry to the log indicating the start and end of each test group. /ltt
writes an entry to the log for each individual test performed. Each of these logging options
causes DC2 to take successively longer to run, as more data is logged. The default logging
level for DC2.LOG is /ltc.

The diagnostic log, named DIAGS.LOG, was designed to contain diagnostic information about
each individual test that is run. You have to be very careful about how much detail you ask for
here. Asking for too much information to be logged can change your test run time from 2
seconds to 2 hours (no joke).

The /ldX switch (where "X" varies) controls diagnostic logging levels. /ldn specifies that no
diagnostic log be created. Using this option, obviously, results in the fastest possible DC2 test
runs. The other possible options for the value for "X" are: F to log fatal errors only, E to log any
errors, and I to log information.

Controlling diagnostic logging levels is most useful when you’re attempting to isolate one or
two tests that are causing your driver to fail. Unless you’ve got days to wait, you’ll never want
to specify /ldi when you’re running multiple categories of tests!
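
For example (device name hypothetical), compare a fast smoke-test invocation with the
verbose form you’d reserve for isolating a failure:

C:> dc2 /ltc /ldn \device\mydevice0
C:> dc2 /ltt /ldi \device\mydevice0
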
Open Operations
DC2’s operations begin by opening the Device Under Test in a number of different ways and
with several different options. Basic Open Operations comprise the standard ways in which
DC2 opens the Device Under Test for testing. Basic Open Operations are performed by
default – that is, they take place without specifying any command line switches. There are no
command line switches that can disable any of the Basic Open Operations during DC2 testing.

During Basic Open Operations, if any of the following operations fail with
STATUS_SHARING_VIOLATION, progressively more liberal ShareAccess is specified and
the Open is repeated. If an operation fails with STATUS_ACCESS_DENIED, progressively
less RequestedAccess is specified and the Create is repeated. In either of these cases, an
appropriate message is logged (with severity Error) to diags.log if this logging has been
enabled.

The Basic Open Operations are:


       1. Normal (IRP_MJ_CREATE). The device is opened, specifying the native device name
        only, for asynchronous operation. No CreateOptions are specified, and no
        ShareAccess is specified. The EaBuffer pointer and EaLength are set to zero.
        DesiredAccess is MAXIMUM_ALLOWED. CreateDisposition is specified as
        FILE_OPEN.
       2. With Added Backslash (IRP_MJ_CREATE). The device is opened as indicated in
        item 1 above, except specifying the native device name followed by a backslash. This
        test specifically probes how device drivers and file systems deal with a root
        directory path being supplied along with the device name. Device drivers that do not
        support a namespace in addition to their device name (that is, almost every driver)
        should typically fail this request. Note that if you have specified
        FILE_DEVICE_SECURE_OPEN in your Device Object characteristics, this open
        request will be (properly) failed.
       3. As Named Pipe (IRP_MJ_CREATE_NAMED_PIPE). The device is opened as a
        named pipe, using the ZwCreateNamedPipeFile function. In this test, ShareAccess is
        initially set to share read and share write. Most device drivers will fail this request.
        What would the driver do with such a strange open?
       4. As Mailslot (IRP_MJ_CREATE_MAILSLOT). The device is opened as a mailslot
        using ZwCreateMailslotFile. As with item 3, above, most device drivers should fail
        this request.
       5. As Tree Connection (IRP_MJ_CREATE). The device is opened by ZwCreateFile
        specifying CreateOption FILE_CREATE_TREE_CONNECTION. In this test, share
        read and share write are initially specified for ShareAccess. Drivers might either
        ignore or fail these requests.

If one of the Basic Open Operations succeeds, a few requests are sent to the Device Under
Test using the opened handle. This is to verify that the device can in fact be used for an I/O
operation. These requests are:
       ZwQueryObject on the returned file handle for ObjectNameInformation
       ZwQueryInformationFile for FileNameInformation
       ZwQueryVolumeInformationFile for FileFsDeviceInformation
       For the Normal open, an IOCTL_TDI_QUERY_INFORMATION is sent (presumably
        to see if the device identifies itself as a TDI device)

The above Basic Open Operations form the basis upon which other DC2 tests build. The
contents of the test log (DC2.LOG) created (with /ltc) for these operations appear in Figure 1.

DC2 V3.2 Starting at Mon Jul 26 13:06:40 2004
Command Line: dc2 /ltc /ldn /lrn \device\null

START TESTING: device \device\null
Trying to open \device\null
Trying to open \device\null                with added backslash
Trying to open \device\null                 as NamedPipe
Trying to open \device\null                 as Mailslot
Trying to open \device\null                with option FILE_CREATE_TREE_CONNECTION
END TESTING: device \device\null
ALL TESTS COMPLETE at Mon Jul 26 13:06:40 2004


                    Figure 1 -- DC2.LOG for Basic Opens with /ltc Switch

Because the command line defaults to no tests being performed, only the 5 Basic Open
Operations (and the associated queries) are performed. If further tests had been requested,
either by specifying /hct (which selects a pre-defined group of tests) or by specifying the
individual test category to be performed, those tests would be performed while the Device
Under Test was open. For most test categories, tests are repeated for each successful open
mode.

Note that you can see every operation DC2 performs, and its associated results, by changing
the logging options to /lta /lda /lra. In this case, every operation DC2 performs will be logged
to DC2.LOG. The operations performed by DC2, and the results of those operations, will be
logged to DIAGS.LOG. For example, when DC2 is run with no tests selected (just the basic
opens), the information logged to DC2.LOG appears in Figure 2.

DC2 V3.2 Starting at Mon Jul 26 13:09:22 2004

Command Line: dc2 /lta/lda /lra \device\null

START TESTING: device \device\null
Trying to open \device\null
\device\null Open
ZwQueryObject, Buffer=0x2650b8, Length=1032., Type=FileNameInformation
ZwQueryInformationFile, Buffer=0x264c40, Length=1032.,
Type=FileNameInformation
ZwQueryVolumeInformationFile, Buffer=0x6e958, Length=8.,
Type=FileFsDeviceInformation
ZwDeviceIoControlFile, ControlCode=0x210012
(IOCTL_TDI_QUERY_INFORMATION), InBuf=0x6f220, 24. OutBuf=0x6f23c, 40.
Trying to open \device\null              with added backslash
\device\null Open          with added backslash
ZwQueryObject, Buffer=0x2659a8, Length=1032., Type=FileNameInformation
ZwQueryInformationFile, Buffer=0x265530, Length=1032.,
Type=FileNameInformation
ZwQueryVolumeInformationFile, Buffer=0x6e958, Length=8.,
Type=FileFsDeviceInformation
Trying to open \device\null                as NamedPipe
\device\null Open            as NamedPipe
Trying to open \device\null                as Mailslot
\device\null Open            as Mailslot
Trying to open \device\null               with option FILE_CREATE_TREE_CONNECTION
\device\null Open            with option FILE_CREATE_TREE_CONNECTION
ZwQueryObject, Buffer=0x266298, Length=1032., Type=FileNameInformation
ZwQueryInformationFile, Buffer=0x265e20, Length=1032.,
Type=FileNameInformation
ZwQueryVolumeInformationFile, Buffer=0x6e958, Length=8.,
Type=FileFsDeviceInformation
END TESTING: device \device\null

ALL TESTS COMPLETE at Mon Jul 26 13:09:22 2004


             Figure 2 -- DC2.LOG with No Tests Selected (Basic Opens Only)

Additional Open Operations
By specifying additional switches, other types of open operations, in addition to the Basic
Open Operations, can be performed by DC2. These additional opens are selected with the
switches described below.


       /k – This switch causes synchronous Open variants to be added to the 5 Basic Open
        Operations above. All 5 Basic Open Operations are performed, and in addition, the
        Basic normal, mailslot and named pipe open operations are repeated specifying
        CreateOptions with FILE_SYNCHRONOUS_IO_ALERT and AccessMode with
        SYNCHRONIZE. An additional open, termed a "direct open", is performed if Direct
        Device Open (/dd) is also specified. In this "direct open", AccessMode is specified as
        SYNCHRONIZE | FILE_READ_ATTRIBUTES | READ_CONTROL | WRITE_OWNER |
        WRITE_DAC, and CreateOptions is set to zero. As a result of specifying /k, a total of 8
        Open operations are performed, as shown in Figure 3 (assuming /dd was not also
        specified).
       /dd – As described above, when specified with /k this switch causes an additional
        synchronous Open variant, the "direct device open," to be performed. If /dd is
        specified but /k is not specified, this Open is not performed.

DC2 V2.4 Starting at Mon Oct 29 18:44:27 2001

Command Line: dc2 /k /ltc /ldn /lrn \device\null

START TESTING: device \device\null
Trying to open \device\null              synchronous
Trying to open \device\null
Trying to open \device\null               with added backslash
Trying to open \device\null              synchronous       as NamedPipe
Trying to open \device\null              synchronous       as Mailslot
Trying to open \device\null                 as NamedPipe
Trying to open \device\null                 as Mailslot
Trying to open \device\null                with option FILE_CREATE_TREE_CONNECTION
END TESTING: device \device\null

ALL TESTS COMPLETE at Mon Oct 29 18:44:28 2001 (elapsed time 00:00:01)


             Figure 3 -- DC2.LOG with /k Switch (Synchronous Open Variants)

Selecting Tests
Now that you understand how DC2 performs open requests, you’ll need to decide which tests
you want DC2 to perform once the device has been opened. DC2 has the capability of
performing a large range of tests. We’ll list a few of the most interesting here.

Miscellaneous Test (Part 1 and Part 2)
These tests are requested using the /m switch. The tests in this category are performed for:


       Each successful open in the Basic Open Operations, plus
       Each successful open in the Additional Open Operations, plus
       Each successful open in the Sub-Open (Relative Open) tests (if this test category is
        selected)

The Miscellaneous Tests are performed in two parts. During Miscellaneous Tests Part 1, a
series of ZwReadFile and ZwWriteFile operations are performed, specifying valid data buffer
pointers and varying lengths (including zero). The byte offsets specified include both a zero
byte offset and varying 64-bit byte offsets. ZwWriteFile operations are also performed explicitly
specifying a ByteOffset value of -1. ZwCancelIoFile and ZwFlushBuffersFile functions are also
issued.

Miscellaneous Tests Part 1 also issues a series of ZwQueryDirectoryFile operations for a list of
common information types, including:


       FileNameInformation
       FileDirectoryInformation
       FileFullDirectoryInformation
       FileObjectIdInformation
       FileQuotaInformation
       FileReparsePointInformation

Each of the above requests is issued many times, with valid user data buffer pointers and
varying buffer lengths (including zero). As part of Miscellaneous Tests Part 1, a series of
unusual system services are issued with the indicated device as the target. These include
ZwCreateSection, ZwLockFile, and ZwNotifyChangeDirectoryFile. Most device drivers will fail
the IRPs that result from these requests.

Miscellaneous Tests Part 2 tests a series of requests, including ZwQueryAttributesFile,
ZwQueryFullAttributesFile, and ZwDeleteFile. Each of these requests is issued with valid
pointers to an ObjectAttributes structure.

IOCTL/FSCTL Zero Length Buffer Tests
This test category is selected with the switch /in (for IOCTL) or /fn (for FSCTL). The tests in this
category are performed for:

       Each successful open in the Basic Open Operations, plus
       Each successful open in the Additional Open Operations

These tests issue a sequence of IRP_MJ_DEVICE_CONTROL (for /in) or
IRP_MJ_FILE_SYSTEM_CONTROL (for /fn) IRPs with zero specified for the InBuffer and
OutBuffer lengths. The InBuffer and OutBuffer pointers are specified as addresses high in
kernel virtual address space (0xfffffc00).

The IOCTL or FSCTL control code used is dynamically determined according to DC2
parameters. For each function from ioctl_min_function to ioctl_max_function, an IOCTL or
FSCTL is issued with that function and a device type ranging from ioctl_min_devtype to
ioctl_max_devtype. For each device type and function pair, each buffering method (i.e.,
transfer type: METHOD_BUFFERED, METHOD_IN_DIRECT, METHOD_OUT_DIRECT,
METHOD_NEITHER) and each type of requested access (FILE_READ_ACCESS,
FILE_WRITE_ACCESS, FILE_ANY_ACCESS) is used.

The range ioctl_min_function to ioctl_max_function is determined by the /fl and /fu switches,
which default to 0 and 400, respectively. The range ioctl_min_devtype to ioctl_max_devtype,
determined by the /dl and /du switches, defaults to 0 and 200, respectively.

IOCTL/FSCTL Random Tests
The IOCTL/FSCTL Random Tests are the heart of the testing that DC2 performs.

This test category is selected with the switch /ir (for IOCTL) or /fr (for FSCTL). The tests in this
category are performed for:


       Each successful open in the Basic Open Operations, plus
       Each successful open in the Additional Open Operations

These tests issue a sequence of IRP_MJ_DEVICE_CONTROL (for /ir) or
IRP_MJ_FILE_SYSTEM_CONTROL (for /fr) IRPs with random functions, device types,
methods, and access specifications. IRPs are issued with both valid and invalid InBuffer and
OutBuffer pointers and lengths. The contents of the InBuffer and OutBuffer, when valid, are
random. The number of IOCTLs generated may be controlled by the /t switch. If the /t switch is
not specified, DC2 will generate 100,000 random IOCTLs per test sequence.

The range over which the device type is chosen is specified by ioctl_min_devtype (specified
via the /dl switch) to ioctl_max_devtype (specified via the /du switch). When running this test,
you’ll want to be sure that the device type for your device falls within the ranges you specify
with the /dl and /du switches. The IOCTL function code range that’s tested is specified using /fl
for the ioctl_min_function and /fu for the ioctl_max_function. Again, you’ll want to be sure
that your driver’s IOCTL function codes fall within the specified range.
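
For example, if your device type were 8000 and your function codes ran from 800 to 8ff
(hypothetical values – and check the DDK documentation for the radix these switches
expect), a focused random run might look like:

C:> dc2 /ir /dl 8000 /du 8000 /fl 800 /fu 8ff /t 50000 \device\mydevice0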

The control code, in-buffer, and out-buffer information are logged to DC2.LOG when /ltt is
specified.

Tailored IOCTL/FSCTL Tests
These tests are automatically selected whenever the IOCTL/FSCTL Random Tests are
selected. The tailored tests attempt to further probe IOCTL or FSCTL requests using a
heuristic based on the results obtained during the basic IOCTL/FSCTL Random Tests.
Depending on the results observed during testing, DC2 may issue zero, few, or many
additional IOCTL or FSCTL requests in this category.

In and out buffer pointers and lengths may be valid or invalid, and a variety of transfer types
(METHOD_BUFFERED, METHOD_IN_DIRECT, METHOD_OUT_DIRECT, and
METHOD_NEITHER) may be used.

The control code, in-buffer, and out-buffer information are logged to DC2.LOG when /ltt is
specified.

Open/Close Tests
This test category is selected with the switch /oc. The tests in this category are performed for
only two of the Basic Open Operations – the Normal open and the open With Added Backslash.

This test spawns several threads, each of which performs several thousand create/close
sequences. Each create issued is an IRP_MJ_CREATE, specifying RequestedAccess as
MAXIMUM_ALLOWED, ShareAccess as FILE_SHARE_READ | FILE_SHARE_WRITE |
FILE_SHARE_DELETE, and OpenDisposition as FILE_OPEN.

When selected, the Open/Close Tests are run twice: just prior to the Normal open, and just
prior to the open With Added Backslash.

Other Tests
As previously mentioned, there are many more tests that DC2 can perform. Refer to the
documentation in the Windows Server 2003 DDK (which is exceptionally well written, by the
way) for more information on other tests that DC2 can perform.

Wrapping It Up
DC2 is a flexible and powerful tool. No driver should be considered "done" until it has passed a
complete set of DC2 tests! If you combine the information in this article with the info in the
Windows Server 2003 or later DDK documentation, you should be able to take maximal
advantage of this cool tool.
Just Checking Revisited -- Installing a Partially Checked Build
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 01-Jun-05




The checked build of Windows has lots of additional cross-checks, what we at OSR call
"reasonableness checks," that are performed on operating system data structures and
function call arguments passed from drivers. No driver has been sufficiently tested for release
until it runs on the checked build without displaying obvious problems. Let me repeat myself for
emphasis: No driver has been sufficiently tested until it has run on the checked build without
problems. This means: You’ve got to test on the checked build. Period.

Of course, there are a few disadvantages to running the complete checked build of Windows.
One disadvantage is that the checked build is both larger and slower than the free build.
Another issue is that to install the entire operating system checked build, you must maintain a
completely separate operating system installation. This means you have to keep things (like
registry parameters and versions of the driver being tested) in-synch between the free and
checked installations on your system. This can get downright annoying, fast.

One solution to these problems is to install just part of the checked build. In this updated article
(Just Checking, The NT Insider,Jan/Feb 2001), we describe how to install just a checked
operating system image and checked HAL. See the Windows DDK documentation, under the
topic, "Using the Checked Build of Windows," for more information on using the checked build.

First, You Have To Find It
Obviously, you can't install the checked build of the OS if you can't find it. You can usually
find the checked build for the latest version of the OS in your MSDN distribution. Alternatively,
you can go to www.osronline.com and look for the little block labeled "The Latest" on the home
page. In this block, you'll see a link that reads "Checked Build Downloads" – this link is kept
up to date with the latest pointers to downloads of the checked build.

Of course, these files are likely to be executable install files. No matter, just open the file
(either the one you download or the one from MSDN) with WinZip or a similar utility and you
can extract the specific checked files that you want.

Choosing What To Install
An alternative to installing the complete checked build on the target system is to manually
install only the checked versions of the operating system image and HAL. This procedure will
result in an additional boot option that lets you start the system using just the checked
operating system image and HAL, but the free versions of all other system components.
One advantage to this approach is that drivers get the benefit of the operating system and HAL
debug cross-checks while performance impact on the entire system is minimized (due to the
fact that free versions of system components other than the operating system image and HAL
are being used). Another advantage is that it allows a single installation (and thus one system
directory, one set of executable components, and one set of registry parameters) to utilize
either the checked or the free versions of the operating system image and HAL as determined
at boot time.

Installing a Checked OS Image and HAL
Installing the checked versions of the operating system image and HAL involves copying the
appropriate files from the checked distribution kit to new, unique, file names in
the %SystemRoot%\system32\ directory. There are two important guidelines to keep in mind
when installing a partially checked build on an otherwise free installation:

        The operating system image and HAL must be kept in synch at all times. Therefore, if
         a checked version of the operating system image is used, the checked version of the
         HAL must also be used (and vice versa). Failure to keep the operating system
         image and HAL in synch can result in rendering the system on which they are installed
         unbootable.
        Take special care not to overwrite the free versions of the operating system image and
         the HAL that are installed by default in the installation of the free build. Overwriting the
         free versions of the operating system image and HAL can result in the system
         becoming unbootable, and can make it difficult to recover from errors. Therefore,
         always be careful to copy the checked versions of the operating system image and
         HAL to unique file names in the %SystemRoot%\system32\ directory.

As long as you keep the above guidelines in mind (and edit carefully!), installing the checked
versions of the operating system image and HAL is easy. The steps are described in detail below.

Step 1: Identifying the Files to Install
There are several different versions of the operating system and HAL images supplied as part
of the Windows distribution kit. These different versions exist to properly support specific
combinations of processor and system hardware. When Windows is installed, the installation
procedure automatically identifies which operating system image and HAL image to use, and
copies the appropriate files to the %SystemRoot%\system32 directory of the system being
installed.

The original names of the operating system image files as stored on the distribution media
may include one or more of the following:


        NTOSKRNL.EXE - Uniprocessor x86 architecture systems with 4GB of physical
         memory or less.
        NTKRNLPA.EXE - Uniprocessor x86 architecture systems with PAE support, that is,
         more than 4GB of physical memory installed.
        NTKRNLMP.EXE - Multiprocessor x86 architecture systems with 4GB of physical
         memory or less.
        NTKRPAMP.EXE - Multiprocessor x86 architecture systems with PAE support, that is,
         more than 4GB of physical memory.

When copied to the %SystemRoot%\system32\ directory by the Windows installation
procedure, the operating system image files and HAL use fixed, well-known names. This
makes it easy for the loader to locate these files at boot time. The well-known names for these
files are:


        NTOSKRNL.EXE - Operating system image for x86 systems with 4GB or less of
         physical memory;
        NTKRNLPA.EXE - Operating system image for x86 systems with PAE support, that is,
         more than 4GB of physical memory.
        HAL.DLL - Loadable HAL image.

During system installation, the installation procedure creates each of these files by copying the
file appropriate for the system's hardware from the distribution kit, and renaming it from its
original name to one of the fixed names above.

The first step in installing the checked operating system image and HAL is to determine the
original names of the images that were copied to your system during system installation. You
do this by examining the file %SystemRoot%\repair\setup.log. An example of this file is
shown in Figure 1. This file is used during the system installation process to copy files from the
distribution medium to the %SystemRoot%\system32 directory.

[Paths]
TargetDirectory = "\WINNT"
TargetDevice = "\Device\Harddisk0\Partition1"
SystemPartitionDirectory = "\"
SystemPartition = "\Device\Harddisk0\Partition1"
[Signature]
Version = "WinNt5.1"
[Files.SystemPartition]
NTDETECT.COM = "NTDETECT.COM","f41f"
ntldr = "ntldr","3e8b5"
arcsetup.exe = "arcsetup.exe","379db"
arcldr.exe = "arcldr.exe","2eca9"
[Files.WinNt]
\WINNT\system32\drivers\kbdclass.sys = "kbdclass.sys","e259"
\WINNT\system32\drivers\mouclass.sys = "mouclass.sys","7e78"
\WINNT\system32\drivers\uhcd.sys = "uhcd.sys","10217"
\WINNT\system32\drivers\usbd.sys = "usbd.sys","5465"
(...several similar lines omitted...)
\WINNT\system32\framebuf.dll = "framebuf.dll","10c84"
\WINNT\system32\hal.dll = "halmacpi.dll","2bedf"
\WINNT\system32\ntkrnlpa.exe = "ntkrpamp.exe","1d66a6"
\WINNT\system32\ntoskrnl.exe = "ntkrnlmp.exe","1ce5c5"
\WINNT\inf\mdmrpci.inf = "mdmrpci.inf","96a3"


Figure 1 -- Example of %SystemRoot%\repair\setup.log

Again, regardless of which of the above files is installed, the non-PAE operating system image
file is always called NTOSKRNL.EXE and the PAE operating system image file is always
called NTKRNLPA.EXE when they are copied to the %SystemRoot%\system32\ directory.
You can identify which operating system image files were installed on your system by
searching setup.log for the above file names. Make a note of which operating system image
files are used on your system, because you will need to copy the checked versions of these
same files from the checked distribution kit. You will find the standard, well-known name of the
operating system image file on the left of the equals sign, and its original name from the
distribution medium immediately to the right of the equals sign on the same line.

In the example setup.log file shown in Figure 1 you can see that two operating system image
files were copied to the \winnt\system32\ directory (which is %SystemRoot%\system32\)
during installation. The file ntkrpamp.exe is copied from the distribution medium to
ntkrnlpa.exe and the file ntkrnlmp.exe is copied from the distribution medium to
ntoskrnl.exe.

Like the file for the operating system image, the file containing the HAL in
the %SystemRoot%\system32 directory is always named hal.dll. And, again like the
operating system image, because the HAL varies with the hardware platform on which
Windows is installed, the HAL file may have been renamed during the installation process. To
find the original name of the HAL file, examine setup.log just as you did for the operating
system image file. You will see the file name hal.dll on the left of the equals sign, and the
original file name on the right of the equals sign on the same line. For example, in Figure 1 you
will see that the hal.dll installed on this system was originally named halmacpi.dll. Carefully
make note of the original name of the HAL file.

Once you have identified which operating system image and HAL files are installed on your
machine, you're ready to copy the checked versions of these files to
the %SystemRoot%\system32\ directory.

                                              Tip!

                         Some HAL files have deceptively similar names.
                         For example, halacpi.dll and halapic.dll are two
                         commonly used HALs. Be careful to use the correct
                         version of the HAL for your system. Selecting the
                         wrong HAL will result in a system that is not
                         bootable.
Step 2: Getting and Copying the Checked Files
Now that you know which files to copy, you will need to copy the checked versions of these
files to unique file names in the %SystemRoot%\system32\ directory. Find the files you have
identified in the checked distribution kit.

Apparently, not everybody finds locating the appropriate files straightforward. Note that the
checked distribution kit is most likely provided as a compressed executable archive named
setup.exe or something similarly helpful. Thus, you'll need to use a utility (I use Windows
Commander, a knock-off of the venerable Norton Commander, but I'm sure there are others)
to extract the files from the executable archive. In Windows Commander, you just select the
executable and type ctrl/page-down to open the executable archive. Alternatively, you can just
run the executable on the wrong version of the system, causing it to fail, perhaps aborting it in
the process. This will leave the files from the checked build in a temporary install directory
where you can grab them. You can thank SNoone for that last suggestion. I'm sure none of
you are surprised that he would come up with such an idea.

Once you've found the files you want, copy them to the %SystemRoot%\system32\ directory
of your system, giving them new, unique file names. The copies of these files may be named
just about anything you like; however, the file names must adhere to the 8.3 naming
convention.

One way to ensure unique, 8.3-compliant file names is to change the file extension from the
original (.dll or .exe) to .chk when the files are copied. Thus, using the example started
previously, we would copy files from the checked distribution kit as seen in Figure 2.


                    Original File Name on          File Name Copied to in
                    Checked Distribution           %SystemRoot%\system32\

                    ntkrnlmp.exe                   ntkrnlmp.chk

                    ntkrpamp.exe                   ntkrpamp.chk

                    halmacpi.dll                   halmacpi.chk


Figure 2 -- Renaming Checked Distribution Files
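
Continuing the example, and assuming the checked files have already been extracted (and
expanded if necessary – see below) into the current directory, the copies could be made like
this:

        copy ntkrnlmp.exe %SystemRoot%\system32\ntkrnlmp.chk
        copy ntkrpamp.exe %SystemRoot%\system32\ntkrpamp.chk
        copy halmacpi.dll %SystemRoot%\system32\halmacpi.chk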

Some files in the checked distribution are provided in compressed form. These files are
indicated with an underscore character as the last character in their file type. For example, if
you look for the file halapic.dll in the checked build distribution, but you find the file
halapic.dl_ you have found the correct file but it is compressed.

To decompress compressed files from the checked distribution, use the expand utility found in
the %SystemRoot%\system32\ directory. For example, to expand halapic.dl_ and name the
expanded file halapic.chk you can use the following command from a command prompt
window:

                                   expand halapic.dl_ halapic.chk


Step 3: Editing 'boot.ini"
Once you have copied the checked files to the %SystemRoot%\system32\ directory, you will
need to create a boot-time option that allows the system to be started using these checked
files. You do this by editing the file boot.ini.

Boot.ini resides in the root directory of the boot volume of your system, and is normally set as
a "hidden" and "system" file. As a result, it typically will not be displayed by the default "dir"
command from a command prompt window. You may verify the existence and attributes of
boot.ini using the "attrib" command from a command prompt window. Once you have verified
that you can find boot.ini, you must change its attributes so that it is no longer marked as
being either "hidden" or "system". This can be done using the following command:

                                   attrib -s -h boot.ini


See the example in Figure 3.




Figure 3 -- Locating and Setting Attributes in "boot.ini"

Once the "system" and "hidden" attributes have been removed from boot.ini, you can edit the
file with a text editor such as Notepad. The goal of editing boot.ini is to create a new boot-time
option that allows you to start the system with the checked version of the operating system
image and HAL that you previously copied to %SystemRoot%\system32\.
When you begin your editing session, the unmodified version of boot.ini will look something
like the example shown in Figure 4.




Figure 4 -- Example of file "boot.ini" Before Editing

The file boot.ini controls the operating system options that are displayed at boot time. The
example file in Figure 4 shows a computer with a single operating system installed. It is
possible that the version of boot.ini on your system has multiple lines under the [operating
systems] heading. This will be the case if you have more than one operating system (or
operating system version) installed on the computer, such as if you have both Windows Server
2003 and Windows XP installed on the same computer, or you have a Beta and a released
version of Windows XP installed on the same computer.

Locate the line that describes the operating system for which you want to install the checked
operating system image and HAL. You can identify this system by the name of
the %SystemRoot% directory which appears after the first backslash in each of the lines in
the [operating systems] section. In Figure 4, the %SystemRoot% directory for the one system
image in boot.ini is named \WINDOWS.

Once you have located the line that describes your desired operating system, make a copy of
the line, and paste it at the end of boot.ini, in the [operating systems] section, on a line by
itself. The result is shown in Figure 5.




Figure 5 -- Copying the Desired Operating System Line

Continue editing boot.ini, and add to the end of the copied line the following options:

        /kernel=osfilename /hal=halfilename


where osfilename is the name (file name and type) of the checked version of the operating
system image file that you previously copied from the checked distribution kit, and halfilename
is the name (file name and type) of the checked version of the HAL that you previously copied
from the checked distribution kit.

If the line that describes your operating system contains the /PAE option, be sure to utilize the
checked version of the operating system image that supports PAE (as previously described). If
the line that describes your operating system does not have the /PAE option (as is the case in
the figures), utilize the checked version of the operating system image that you copied without
PAE support.

The text that appears within quotes on each operating system boot line in boot.ini is displayed
at boot time to identify the operating system to start. Before completing your edits, be sure to
change the text within quotes on the line you are editing to identify that the checked version of
the operating system image and HAL are being used (e.g., adding "DEBUG" to the name).
When your editing is complete, your version of boot.ini should look similar to that shown in
Figure 6.




Figure 6 -- "boot.ini" After Final Editing

Once these changes have been made, save the changes and exit from the editor. The next
time you boot this system, a new operating system boot option will be displayed that allows
you to select your checked operating system image and HAL.
It's Easy to be Hard -- Testing with HCTs
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




The Hardware Compatibility Tests, commonly called the HCTs, are a set of tests,
documentation and processes that hardware/software is required to pass for participation in
the "Designed for Microsoft Windows Logo Program"
(http://www.microsoft.com/whdc/winlogo/default.mspx). But, they are actually more than that.
They are a set of tests that many driver writers can use to directly or indirectly test their drivers
– even if they do not directly control hardware and even if they are not going to submit their
driver/device for certification.

This article explains what the HCTs are and how to use them.

What Are They?
The HCTs are provided to developers so that they can pretest/preview the requirements that
Microsoft has for hardware/software in order to ensure that quality standards necessary to
obtain a Designed for Microsoft Windows logo are being met.

The current HCTs support the testing of the hardware types seen in Figure 1.


         Audio (Adapter, Global Effects Filter, Synthesizer)
         Bus Controllers (Bluetooth, Cardbus/PCMCIA, 1394, IrDA, and USB Controllers or
          Hubs)
         Display (Monitors)
         Anti-Virus/File System Filter Drivers
         Imaging (Cameras, Printers, Scanners)
         Input and HID (Game Devices, Keyboards, KVM Switches, Pointing and Drawing
          Devices, Smart Card Readers)
         Modems (Analog, Wireless, ISDN)
         Network Devices (ATM, Cable Modems, DSL, LAN, NDIS IM, NDIS Universal,
          Winsock Direct, WAN Devices)
         Storage Controllers and Devices (ATA/ATAPI, ATA RAID, Fibre, iSCSI, RAID, SCSI,
          SATA, SATA RAID, CD-ROM, CD-RW, DVD, DVD-RW, DVD Combo, Tape, Hard
          Disks for all controllers, Removable Media, Cluster, Media Changer, Disk Storage
          Systems, Bridge Devices)
         Streaming Media and Broadcast (DVD Playback, DirectX VA DVD Playback, Video
          Input and Capture)
         Systems (Desktop, Mobile, Motherboard, Server, Data Center Server, Cluster, Fault
          Tolerant, HAL)
         Unclassified (UPS, Miscellaneous Device, Universal Server Device)
Figure 1 -- HCT-Supported Hardware Device Types

The types of tests to be run and the testing configuration to be used vary depending on the
category of the hardware/software. The HCT documentation outlines each category and the
set of tests to be run. Testing for a category may include both automatic and manual tests. For
example, if you want to test either a keyboard device or a piece of software that is in the
keyboard stack, you would use the HCTs for Keyboards as shown in Figure 2 (and as
documented in Microsoft HCT documentation).

        ACPI Stress Test – Examines devices in the system, and then puts the system into a
         sleep/hibernation state. When the system returns from sleep/hibernation, the test
         verifies that the devices function properly.
        ChkINF – Uses a Perl script that checks the structure and syntax of the device driver's
         INF files.
        Device Path Exerciser – Designed to crash a driver by calling in through various
         user-mode I/O interfaces. It tests driver robustness rather than driver functionality.
        Driver Verifier – Stresses device/driver combinations and tests the device's use of
         system resources, including the driver's memory paging behavior.
        Public Import – Verifies that no APIs called by drivers are off-limits.
        SysParse – Inventories the devices, drivers, and software on the test system and
         records it. This test always passes; the information it gathers is used as part of the
         Logo submission process.
        Winkey Media – Keyboard-specific tests.
        Winkey One-Key Combination – Keyboard-specific tests.
        Winkey Two-Key Combination – Keyboard-specific tests.
        Winkey Three-Key Combination – Keyboard-specific tests.
        IEEE 1394 – Run only the IEEE 1394 tests that are listed in Test Manager for your test
         system or device, because your system or device may not require all 1394 tests.
        Cardbus/PCMCIA – Run only the Cardbus/PCMCIA tests that are listed in Test
         Manager for your test system or device, because your system or device may not
         require all Cardbus/PCMCIA tests.
        PCI – Run only the PCI tests that are listed in Test Manager for your test system or
         device, because your system or device may not require all PCI tests.
        USB – Run only the USB tests that are listed in Test Manager for your test system or
         device, because your system or device may not require all USB tests.

Figure 2 -- Example: Keyboard-Specific HCT Tests

You might recognize a number of the tests above from the DDK where they’re also available.
In most cases, when a tool appears both in the DDK and the HCT, the tools are the same
(though this is not always true). And even when a tool does appear in both the DDK and the
HCT, when the tool is run in the HCT for logo testing, it’s run with a particular set of options
selected. These options may not be the ones you want to use, if you’re using the HCTs to
exercise your driver’s functionality (and not to submit your driver for the Designed for Windows
logo).
The HCT documentation describes each test in detail, including the approximate run time of
the test, the output log file name, the OS that the test is supported on, test assertions,
additional hardware requirements, and related links. So if you have a question on a test, all the
documentation is there to help you find your answer.

Note: Though you may have no desire to "logo" your product, use caution in shipping a driver
that has not passed the HCTs required for your specific device – your customer will take notice
of system instability caused by your driver, and they won’t be happy about it.

Oh yeah…if you have a device/software that you want to "logo" and it does not appear in the
provided device list, you can contact Microsoft and work with them to make it happen.

How Are They Used?
When the HCT kit is installed (make sure that you select all the tests you want installed; not all
tests are installed by default), it adds the HCT Test Manager, as seen in Figure 3, to your Start
Menu. The Test Manager is the GUI application that allows the tester to select test categories,
select tests, run tests, use test groups, view test logs, monitor test runs, and, in general,
manage the test process. It will automatically scan your test system for the test categories you
installed and allow you to select, run, and review the results of the relevant tests.




Figure 3 -- HCT Test Manager
For some categories of tests, it is possible to run tests from the command line. For example,
the file system test, IFSTEST, the storage device stress utility, SdStress, and the NDIS test,
NDTest, can be run from the command line.

Another command line test worthy of mention is "devpathexer.exe" (fraternal twin to DC2 in
the DDK). This nasty little program exercises your driver's IRP_MJ_DEVICE_CONTROL entry
point. Many driver developers who support this entry point fail to perform all the necessary
checks to ensure that each request they receive is entirely valid, especially if they are
supporting METHOD_NEITHER. Be it zero-length buffers or invalid buffer addresses, most
developers miss something, and fail to do the full set of rigorous tests necessary to truly
shake out their driver. DevPathExer will test this part of your code by sending it a variety of
malformed requests. Passing this test will give you that extra level of confidence in your driver
– before you're made a fool of by your customer.

For more information on DC2, see the article On the Right Path.

The possible results for any one test are:


         A device required for this test is missing. The test cannot be run.
         The test has not been run.
         The test has been run but has not successfully passed.
         The test has been run and has successfully passed.

After completing the tests, the "Test Log Information" tab in the Test Manager allows the user
to select logs for viewing. Figure 4 shows the log file for the "Disk Stress" test run on our test
system.
Figure 4 -- Disk Stress HCT Log File

Not Just For Logo Testing
Of course, the HCTs are useful for testing beyond fulfilling the Windows logo requirements.
They can also be useful for doing general functionality testing for a driver.

Here at OSR, we do a lot of file system development. Thus, we use the HCT File System Test
Specifications to verify the correctness of our file systems and file system filter drivers. As you
probably know, file systems and file system filter drivers are some of the most difficult drivers
to get right in Windows, so having a test suite is invaluable. The File System Test Descriptions
provide a description of test "suites", the test "groups" that are included in the suite, and a
description of the "tests" in each test group. So if you looked up "TD-21.15 ReadWrite File
System Test Group" in Chapter 21 of the Anti-virus/File System Filter Driver Test
Specification in the HCT Documentation, it would list the tests in this group and give a
description of each of the tests.

The descriptions in the documentation are quite detailed, actually. In fact, they're detailed
enough that if you had to, you could write a test program to perform these tests yourself, if your file
system was having problems in these areas. But why bother? The HCTs give you the ability to
run individual tests to help you isolate problems. Why re-invent the wheel?
Things to Remember
Don’t forget, it is also a good idea to write your own tests if your driver has additional
functionality not tested by the HCTs. Further, remember that running and passing the HCTs
does not remove you from your professional obligation to utilize other standard debugging and
testing practices such as running with Driver Verifier, using the checked build, CUV, etc.

Summary
The HCTs are not only a requirement for the "Designed for Microsoft Windows" logo. They’re
also a pretty good toolset for either directly or indirectly testing your driver. Whether running
the tests via the HCT Test Manager or via the command line, the HCTs offer developers a
well-documented set of tests that can help ensure that you are developing the highest quality
driver possible.

The latest HCT downloads can be found by following links from
http://www.microsoft.com/whdc/hwtest.
It's a Setup -- What You Need to Start Developing Drivers
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




One of the most common questions that we get here at OSR is, "what do I need to get started
writing drivers?" This article addresses that question by providing a minimal level of guidance
to get you headed in the right direction.

Where Do You Start?
In order to start developing drivers for Windows, there are a few things that we need to discuss
in detail:

         The Windows Driver Development Kit (DDK)
         A code editor
         A development machine
         A target machine
         A debugger
         A source control system

The Driver Development Kit
The Windows DDK is required for building drivers. You can either order the DDK via
http://www.microsoft.com/whdc/devtools/ddk/orderddkcd.mspx (just pay shipping), or you can
get it as part of a Microsoft Developers Network (MSDN) subscription. MSDN has four
applicable subscription levels available to assist driver writers: Universal, Enterprise,
Professional, or the minimum Operating Systems subscription. Each of these subscriptions
gives you the latest DDK and checked and free versions of the Windows operating systems.
Which MSDN subscription you pick depends upon what extra features you require. You can
buy an MSDN subscription from a variety of resellers, including OSR.

The current versions of the DDK (XP and later) contain the header (".H") files and all the tools
you need to compile and link your driver. Thank your lucky stars that this has now all been
integrated into one kit. By the way, you always want to be building with the latest DDK
available, since Microsoft tends to improve the associated tools with each release.

The other thing that the DDK gives you is sample code. While once pretty sad, the DDK
samples have been cleaned up immeasurably and are great starting places for your driver.

A Code Editor
Let’s face it – Notepad and WordPad don’t cut it when it comes to code development. Whether
you use CodeWright, SlickEdit, Visual Studio, or Emacs, you need an editor that understands
code formatting. If it offers other features, like the ability to build from within the environment,
then that is another plus.

Here at OSR, we use Emacs, CodeWright, and Visual Studio, so pick your favorite. For those
of you using Visual Studio you may want to check out the article If you Build it – Visual
Studio and Build Revisited. This article talks about how to use the batch file ddkbuild.bat to
perform builds with Visual Studio and Visual Studio .NET.

A Development Machine
When selecting a development machine, you need to be aware of what you are going to be
using it for. Remember, you are going to be running the compiler, editor, debugger, source
control, DDK help, and web browser simultaneously – and let’s not forget the operating system!
These all take memory and processing power and none of us likes to wait. Personally, I
recommend at least 768 MB of memory on your development machine, and as for processing
power, the more the merrier. I have dual 1.5GHz Xeons which I am more than happy with.
Your mileage may vary…

Oh and don’t forget your debugging connection. For kernel debugging on Windows, you either
need a serial connection or an IEEE 1394 connection (only supported on Windows XP and
later OSes) on both your development and target machines.

A Target Machine
Now some people may say, "I can’t afford another machine", and will argue that they can use
SoftICE and do everything on one machine. We completely disagree! No matter how good a
driver developer you are, sooner or later you are going to either trash your disk or corrupt the
registry and the system will not boot. Do you realize how much time you’ll spend restoring your
system? Think about the changes you could lose. If you are getting paid $30 an hour
(good God, I hope not) and you lose 40 hours of development time because you are restoring
your system, that $1200 will go a long way towards buying a second machine. Have you
checked the prices on new PCs lately?

The driver-writer version of Murphy's Law states: "Your customer will be running your software
on a computer with more CPUs than you have." This means you had better be running at least a
dual-processor. You can’t test your software solely on a uni-processor system and expect it to
work automatically on a multi-processor.

Here at OSR, we highly recommend that your primary target machine be a dual-processor
AMD-64 based system. Why? This allows for 32-bit and 64-bit testing, as well as single and
dual processor testing.

By the way, no one likes a slow test machine. If you haven’t already read Just Checking -
Revisited, then I suggest that you do. It will save testing time and will provide a way to save
some of that precious processing power.
A Debugger
When it comes to picking a debugger there are really only two choices: Microsoft's WinDBG (or
kd.exe) and Compuware’s SoftICE. Which one you choose is a decision that falls somewhere
between a personal preference and religion. They both have their pros and cons.

WinDBG users for many years have lived with the serial port as the only way of connecting the
host to the target. But take note! As of Windows XP, this restriction was lifted so that a WinXP
host (i.e. the one running the debugger) could connect to a target running WinXP or later over
IEEE 1394. I'm thinking that a 400 Mbps transfer rate over 1394 is much better than
115.2 Kbps over serial.

OSR is a WinDBG shop. You can find the latest versions of the WinDBG debugger and other
debugging needs at http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx.

A Source Control System
Some people just don’t appreciate the value of a good source control system. Copying your
project into different directories and running WinDiff to determine what has changed is really
not acceptable. It doesn’t matter which one you use as long as you use one that allows you to
label base levels and easily keep track of changes. Some source control systems that you may
want to check out are Visual Source Safe, RCS, and CVS to name a few.

Common Sense and Plenty of Stimulants
Let’s face it – writing a Windows driver is not the easiest thing to do. If you make a mistake in
kernel mode it is usually fatal or soon will be. Having the right tools in-hand is a great first step
to simplifying the development process and preventing critical errors from popping up while
your software is in the hands of those that feed you. Of course, coffee, soda or your stimulant
of choice will help too.

Writing a driver takes patience and attention to detail. Remember, how your driver behaves
impacts the entire system. If your driver is awful, the entire system is awful.
Go Diskless -- Using the Microsoft Symbol Servers
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




By now, all Q/A engineers are intimately acquainted with Microsoft’s Symbol Server, right?
Well, if you’re not you should be! The use of symbol servers for accessing the symbols that
match the version, service pack or hot fix of the operating system you’re testing on saves loads
of time. Just think of the time you’ve spent trying to find the right symbols for the specific
version of the operating system you’re running – do you have the symbols on CD/DVD? Do
you have to download them? Are they stored on a network share? Here are some quick tips on
using the Microsoft Symbol Server to make your life easier.

Using Microsoft’s Symbol Server technology makes getting the correct symbols a snap. To
use a symbol server, you first set up what’s called a downstream store, which is any drive or
share that you want to use as a symbol destination. It can be a local folder or a share that’s
been set up for your engineering team. You then back this up with Microsoft’s symbol server
and add the following to your symbol path:

SRV*downstream store*http://msdl.microsoft.com/download/symbols



When searching for symbols during a debugging session, the downstream store will be
searched first and then the symbol server. If symbols are found on the symbol server they will
be cached in the downstream store for future access. This downstream store is your "local
symbol server".

But that’s not all. What about the symbols for the different versions of your driver? Let’s say
you’ve released multiple versions over the years and you have customers using different
releases. Why not add the symbols for your driver to the same symbol store that you’re using
as a downstream store? Well, you can – and should – do just that. SymStore is a tool included in
the Debugging Tools for Windows package that is used to create/add to symbol stores. For
example, let’s say I want to add the Windows XP checked symbols for mydriver.pdb to my
symbol store:

Symstore add /f c:\appdir\lib\wxp\chk\i386\mydriver.pdb /s \\symshare\symstore /t "My Driver" /v "V2.0 for WinXP Checked"



This command will add the symbol file mydriver.pdb, located in c:\appdir\lib\wxp\chk\i386 to
the symbol store \\symshare\symstore. The name of the product is "My Driver" and the version
is "V2.0 for WinXP Checked".
This is a very simple example and there are many other command line options available for
use with the SymStore utility, such as recursive symbol adding, use of pointers, and use of an
index file. See the documentation in the MSDN Library under Windows Development/Windows
Base Services/SDK Documentation/Debugging and Error Handling/Debug Help Library/About
DbgHelp/Symbol Servers and Symbol Stores for more detailed information on symbol stores
and using SymStore to add your own symbols to the downstream store.

Be on the lookout for Microsoft’s Source Server (Debugging Tools for Windows V6.3 or later) –
which sounds like the perfect companion to the symbol server. The source server will allow for
the retrieval of the exact version of source code files that were used to build the version of the
driver being debugged. Seems like the next logical step in simplifying testing and debugging.
Easy Once You've Done It -- Setting Up the Debugger
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04




For you experienced folks, setting-up a debugger is something done with your eyes closed.
But, for those developers new to debugging, setting-up a debug connection between two
machines can be an exercise in frustration. This article provides a walk-through of this process.
For this discussion, we will be talking about debugging using WinDBG. If you are using
another debugger, please consult its documentation.

Getting Started
The most important thing about setting-up debugging is ensuring connectivity between the
host and the target. If they cannot talk to each other, there is no way that you are going to be
using the debugger. How you test connectivity depends on what you plan to use as the
communication medium between the host and the target.

If you are planning on using the serial port, then we recommend you test the connection
between the 2 systems using HyperTerminal. On the host and the target, while in
HyperTerminal, select the serial port and baud rate to be used (it works much better if you
select the same baud rate on both systems). Once the connection is established, you should
be able to type characters on one system (input characters do not get echoed on the local
system by default) and see them appear on the other system. Again, if this doesn’t work, don’t
expect WinDBG to work.

If you are planning on using IEEE 1394, then my suggestion is to boot up both systems with the
1394 cable attached and see if Windows installs a 1394 network adapter on each system. If it
does, then you can assign each adapter a fixed IP address (for example 10.1.10.1 and
10.1.10.2), and then attempt to copy a file from one system to another: "copy x.txt
\\10.1.10.1\sharename". If this works, then there’s a pretty good chance that WinDBG will be
able to talk 1394 between your two systems.

Setting-up Debugging on the Target
Okay, now that you have proven that the two systems can talk to each other, you now have to
set up debugging on the target system. It does not matter whether the system is running a free
version of the OS, a checked version of the OS, or a "hybrid" version (see "Just Checking
Revisited", p 9 in this issue) – the system must have debugging set up in order for WinDBG to
be able to connect to it. For x86 and AMD64 based systems, debugging is enabled by modifying
the boot configuration file "boot.ini", as seen in Figure 1.
Figure 1 -- Normal Boot.ini File

Boot.ini is located in the root directory of your boot drive ("C:" for example) and is usually a
hidden, read-only file. So if you intend to enable debugging, you are going to have to modify
these attributes. To enable debugging, you would modify boot.ini by copying the currently
existing boot line to a new line, and appending to it the options for the desired communication method.
Figure 2 illustrates how the boot.ini file would be modified to use either the serial port (Com1)
or 1394 (channel 62) for debugging. Notice that we did not modify the existing boot line in the
file. Why? Well, boot.ini is a critical file for booting. If there is anything wrong with this file,
Windows will not boot. So to play it safe, I always leave the existing boot line, so that we have
a boot line to fall back to (if you have multiple lines in your boot.ini file, Windows will prompt
you to select the configuration to boot from). Always better to be safe than be sorry.




Figure 2 -- Debug Enabled Boot.ini File
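
The added lines take roughly the following form – a sketch only; your ARC path, COM port,
baud rate, and channel number may differ:

multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows XP DEBUG (serial)" /fastdetect /debug /debugport=com1 /baudrate=115200
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows XP DEBUG (1394)" /fastdetect /debug /debugport=1394 /channel=62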

One thing that we should mention before continuing: if you're going to be using a partially
checked build, then you would also add the "/kernel=" and "/hal=" options to the two lines
that we added.

Setting up the Host to Debug the Target
Once you have set up the target to be debugged, you now need to set up WinDBG on the host
machine so that it can debug the target. Assuming that you’ve already installed WinDBG,
clicking on the WinDBG icon should bring up the screen shown in Figure 3.
Figure 3 -- WinDBG

As you probably know, WinDBG is a symbolic debugger. It works based on symbol information
provided to it by the user. No symbols = no debugging – it’s that simple. So in order to have a
meaningful debugging session, you must provide WinDBG with the location of the symbols for
your driver and for the operating system. Setting the location of the symbols in WinDBG can be
done by selecting the "Symbol File Path…" submenu under the "File" menu and then entering
a path or multiple paths (separated by ";"). These paths should point to the symbols for your
driver and for the operating system itself.

Getting symbols for your driver is no problem, because when you build it, a ".PDB" file
containing the symbols of your driver is generated by the compiler and linker. Once built, you
can take the .PDB file and put it in a directory where WinDBG will look for it – in our case,
"C:\DEBUG". Locating symbols for the operating system is another matter.

They can either be obtained by downloading them from links off of
http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx or WinDBG can be set up
to obtain the symbols from a Symbol Server. As you can see in Figure 4, we used the symbol
server syntax to download symbols from Microsoft’s Symbol Server (See the WinDBG
documentation, or Go Diskless for details on using the Microsoft Symbol Server).
Figure 4 -- Setting the Symbol File Path
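
As a concrete example, a symbol path combining a local driver symbol directory with the
Microsoft Symbol Server (using c:\symbols as the downstream store) might look like this:

C:\DEBUG;SRV*c:\symbols*http://msdl.microsoft.com/download/symbols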

Once you set up your symbol file path, the next thing that you need to do is to set up the
WinDBG source file path. This path allows the debugger to display your source code in the
event that a crash occurs in your driver or in the event that it hits a dynamic breakpoint. The
source file path is set by selecting the "Source File Path…" submenu under the "File" menu.
The displayed dialog box allows the user to enter a path or multiple paths to source code as
seen in Figure 5.




Figure 5 -- Setting the Source File Path

Now that WinDBG knows where our symbols and sources are, we need to tell it what it is going
to be debugging. WinDBG is capable of debugging applications, crash dumps, doing remote
debugging, and doing kernel debugging. Since we are debugging a kernel driver, we would
select the "Kernel Debug…" submenu under the "File" menu. The selection displays the dialog
box shown in Figure 6 which allows us to select the method of communication with our target
system.
Figure 6 -- Selecting the Debugger Method

Of the three possible options, only the COM and 1394 options are of interest to us. The Local
option is a very restricted type of kernel debugging and is not useful for debugging a driver.

When selecting the COM or 1394 option, the user is required to enter in either the IEEE 1394
channel, for 1394 debugging, or the COM port and baud rate for serial port debugging. What
you enter depends on how you set up the target. If your target is set up for debugging via the
COM port, then you enter the COM port and baud rate that you used on this system when you
tested it with HyperTerminal. If your target is set up for IEEE 1394 debugging then you would
enter the channel number that you selected on the remote system.

Once you set your kernel debug communication parameters and hit the "OK" button, WinDBG
will typically display a message indicating that it is waiting to reconnect (see Figure 7). If the
target that you are going to debug is already rebooted in debug mode, then you can force
WinDBG to take control of that system by selecting the "Break" submenu under the "Debug"
menu. If, however, the target system has not been rebooted in debug mode, you can do so at
any time and WinDBG will automatically connect to the target as it boots.
Figure 7 -- WinDBG...Waiting to Connect

Once WinDBG has taken over control of the target, the target is halted until WinDBG releases
control. At this point you are going to have to spend some time to learn WinDBG command
syntax. Ahhhh….the joys of debugging…
Brand New 'Bag -- The Latest on WinDBG
The NT Insider, Vol 11, Issue 3&4, May-August 2004 | Published: 18-Aug-04| Modified: 18-Aug-04



What good would an entire issue about testing be without a discussion of our favorite kernel
debugger, WinDBG? As I’m sure most of you are aware, when you’re first starting out with
WinDBG it’s a good idea to keep a copy of the Serenity Prayer handy. But, once you’ve
worked the ‘Bag a bit and you’ve learned to accept the things that you cannot change about it,
its real power and elegance come through. Hopefully you haven’t thrown that Serenity Prayer
card away just yet though, because there’s a new ‘Bag in town and you just might need it for a
little while longer.

Judging the Book by its Cover
The major change in the latest version of WinDBG is kind of hard to miss. If you’ve already
tried out the latest version you know exactly what I’m talking about: the new UI. WinDBG has
moved from its old, one window workspace interface to a multi-dock, floating window, tabbed
interface. It sounds scary, and it is until you’ve recited your prayer, dealt with it as something
that you cannot change, and embraced it. Once you’ve done that, it actually turns out to be
kind of useful.

UI Change: Docks
Docks are basically containers for other windows (i.e. source windows, locals windows, etc.)
and are created by selecting Window->Open Dock from the WinDBG menu. In the old UI there
was the one dock, but now you can have as many as you’d like. I never thought I’d actually
want to use this, but it turns out to be extremely handy, especially if you’re running on dual
monitors. Right now, for example, my WinDBG workspace has three open docks: one for the
command window and call stack, one for the source window and locals, and one for
disassembly and registers.

UI Change: Windows
In the new UI, any window can either be floating or docked. You can dock and undock
windows either by clicking and dragging them around individually or by using the
Window->Dock All and Window->Undock All menu items. Manually docking the windows can
be wonky at best and the only advice that I can give is to take it slow. It generally takes a
couple of tries to get things docked exactly how you want them, but once you get everything
just the way you like it, you can save your workspace and, theoretically, never have to deal
with it again. I say "theoretically" because I occasionally have issues where the windows
appear to take on a life of their own and it takes me a few minutes to get them back to the way
they were. But, it just wouldn’t be WinDBG if it didn’t push back at your attempts to make it do
what you want every once in a while.
Probably the nicest thing about the new UI is the fact that it now supports tabbed windows.
When you drag a window into a dock, if you drag it over an existing window it will create tabs to
use to switch between the windows. This is fantastic for having multiple source windows open
simultaneously.

Another quick tip for the new windows - you can access different options for the different types
of windows by right clicking the window’s title bar or by clicking the bizarre little icon next to the
X.

Debugger Command Programs
The coolest feature in the new WinDBG is the addition of debugger command programs.
These can best be described as "ghetto debugger extensions," because they let you write
small, relatively easy scripts to automate tasks that would normally require a full debugger
extension. They support flow control operations such as if, then, while, and for – and pretty much
every command that you can run in WinDBG you can use in these programs (for a list of the
exceptions, check the WinDBG documentation). The documentation in the debugger even
provides a small sample script that will walk the active process list and print out the module
names.
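
To give you the flavor (this is our own trivial sketch, not the sample from the documentation),
here's a debugger command program that loops using a pseudo-register:

r $t0 = 0
.while (@$t0 < 5)
{
    .printf "iteration %d\n", @$t0
    r $t0 = @$t0 + 1
}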


Source Server
WinDBG has supported having a symbol server for a while, but it now supports the ability to
have a source server. The way this works is you add a few steps to your build process and the
resulting PDBs will contain information about the location of the source files in your source
control system. When a breakpoint is hit within a source module, WinDBG can use the
information in the PDB to pull down a copy of the source file that matches the one used to build
the currently running version of the driver. This means that if you have a local symbol server
set up on which the symbols have been source indexed, WinDBG will just "do the right thing"
when analyzing, say, a crash dump, and grab not only the proper symbols, but also the proper
source code.

To get more information on how to set up a source server, you need to do a custom installation
of WinDBG, select the SDK option, and read srcsrv.doc in the \sdk\srcsrv directory.

Actual Debugger Extension Documentation
If you’ve been hesitant to write debugger extensions because the documentation was, well,
pretty much useless, you might want to give it another look. Over the past couple of releases
of WinDBG the documentation has been improved vastly and is definitely worth checking out.

And the Wisdom to Know the Difference…
Found a bug in the ‘Bag? A new version of WinDBG is not released until all known issues are
resolved, so it behooves you to actually report your bug instead of just whining about it. If you
find a bug or if you have a feature request, you can submit it to the WinDBG team at
Windbgfb@microsoft.com.
WDF PnP/Power Interview with Microsoft's Jake Oshins
The NT Insider - Online Special Edition | Published: 14-Jun-04| Modified: 14-Jun-04



As part of our Video Update series from WinHEC 2004 in Seattle (May 2004), our own Peter
Viscarola sat down with Microsoft's Jake Oshins to find out the key message he wanted driver
developers to come away with about the new PnP and Power management design that Jake
was instrumental in creating. Here's the transcript from that interview:

Peter:     I'm here with Jake Oshins, the chief designer of WDF Plug and Play (PnP) and power
management. What can you tell us about WDF Plug and Play and Power
Management? What do you want people to know about the work that you have done over the
last nine months?

Jake:     Previous versions that we have shown at the DDC last fall, and previous betas that
people may have seen, have a very different model than the one we are showing this time [at
WinHEC in May 2004]. We have given out a CD here and it is downloadable off of the web as
part of our beta program. This is a new model, one where we have replaced what we did
before, with one that I think more closely matches with the primitives the driver developers are
interested in.

We had a set of design goals that I thought weren’t being met by the WDF pre-releases that
we had given out before. Specifically, I’m trying to enable automatic Plug and Play and
(particularly) Power Management behavior that leads to a better designed class of
systems. For instance, a mobile machine like a tablet PC or an ultra light laptop where you
would really like to see all of the devices in the machine just default to off, except when they
are actually being used. When they are being used, then they are automatically transitioned
back into a high power state where the driver doesn’t actually know or care that the device was
in a low power state while it wasn’t being used.

We think by changing the primitives involved and by spending more time on design, we have
reached something that driver developers would actually prefer to use because it abstracts
enough of the complexity. For instance, you don’t have to worry about previous state history,
[and] you don’t have to decide what to do in response to particular events based on what
happened in the past. By going for better design primitives, I think we just generally simplified
it. More than anything, I am hoping that people will take a look at it and decide for themselves
whether or not it actually solves their problems.

Peter:     That’s interesting. So, you basically automated a lot of the Plug and Play and Power
Management is what I hear you saying. I’m going to ambush you with a question you didn’t
expect and that is: Does this mean that because you automated a lot of stuff, as a driver writer
I can’t innovate anymore, and that I can’t get to the lowest level stuff that I want to because you
are doing it for me? So, I get [PnP and power management] the Jake Oshins way or no
way? Is that how the framework works now?

Jake:     No, actually. We have learned that lesson. As you might have noticed, we have lots
of mini-port models in our operating system. We don’t really want to continue down that
path. In fact, we would really like to make sure that absolutely any behavior you need to
accomplish is accomplishable. In order to do that, we created a system where you can
override absolutely anything. You can call outside of our model. If you choose to override
and go outside our model, you take on more responsibility for the complexity in your driver.

We also spent some time thinking about how to stage the learning curve for simple drivers,
particularly ones which don’t actually drive hardware. There are a lot of drivers that are really
software only. There are filters or kernel extensions of one sort or another that are really
there as a presence in kernel mode but not directly corresponding with any piece of
hardware. In our previous models, like WDM, it is often very hard to accomplish what you
want to do because there is a tremendous amount of boilerplate code associated with just
making these drivers work. We tried to get rid of nearly 100% of that. We showed earlier
today in a session that a keyboard filter sample can have a total of six lines of Plug and Play
and Power Management code using the current [version of] WDF. It is our goal to make it so
that additional functionality comes gradually and so that the drivers can be built up a piece at a
time. If you really need to take complete control, you can. You can handle Plug and Play and
Power Management directly if you choose to. You can augment the behavior of the
framework and you can influence by setting constraints. Or, you can completely ignore it, if
you care to.

Peter:     Sweet! Well Jake, thanks so much for your time, I really appreciate it.


Related Articles
Who Cares? You Do! - Implementing PnP for WDM/NT V5

Converting Windows NT V4 Drivers to WDM/Win2K

Lots of New PnP and Installation Information

Special Win2K PnP Tracing and Checks

A New Interface for Driver Writing -- The Windows Driver Framework

Updated SeaLevel DIO-24 Sample Driver Available

Updated SeaLevel DIO24 Driver Available

Power Play - Power Management Changes in Vista

Write No Code...Get a GUI - Vista Power Plan Integration

Starting Out: M Versus F
A New Framework
The NT Insider, Vol 11, Issue 2, Mar-Apr 2004 | Published: 03-May-04| Modified: 04-May-04




    Click Here to Download: Code Associated With This Article Zip Archive, 1,102KB


In this article, we’ll show you how to write a simple WDF driver. The hardware we’re going to
be writing this driver for is a simple 24-channel digital I/O PCI card.

If you want to try this driver, and perhaps experiment a little, it’s available for download above.
The driver is for the PIO-24.PCI card, Part #8008, made by Sealevel Systems, Inc.
(www.sealevel.com). You’ll probably also want to order a model TA01 "Terminal Block Relay
Rack Simulation Module", which is a fancy name for the 3 inch by 5 inch card, shown in Figure
1, with 24 LEDs and a set of switches. With this card, at least you’ll be able to play with the
driver and see it blink the lights. The PCI card and the lights/switches card will cost you just
under US$200 total, plus about $10 shipping within the US.

The driver we write for this device will support inputs and outputs, as well as interrupts. The
driver will also support powering down the device during periods when it has no work to do.
We’re not trying to show you how to write the simplest possible driver. Rather, we’re trying to
show you a broad range of WDF features, so you can get a feel for how WDF actually works.

Before we get started, let’s discuss the usual restrictions. We’re writing a driver example here,
not a production driver. The driver you can download, and the examples shown here, are for
instructional purposes only. The only thing you should use this sample for is as a learning
exercise.




Figure 1 - Pretty Lights and Switches
And, of course, there’s still plenty more development to be done on the Framework, so you
should expect some of the details to change over time.

Introducing The Kernel Mode Framework

The Windows Driver Foundation has a Kernel Mode Framework and a User Mode Framework.
Because we haven’t spent any time here at OSR trying to play with the User Mode Framework,
we’re going to confine our comments to Kernel Mode.

The Framework has an object-oriented approach. Its basis is, therefore, a set of objects. You
use methods to create these objects, and to set and get properties on the objects. The objects
are abstract - you refer to them via handles, not by pointers.

The main types of objects that we’ll be dealing with in this example are:

       WDFDRIVER - This object describes an instance of your driver in memory, including
        where it’s loaded, its attributes, and the devices that it serves. A WDFDRIVER object
        refers to, but is not identical to, a WDM DRIVER_OBJECT.
       WDFDEVICE - This object describes a single instance of a specific device that your
        driver supports. The WDFDEVICE may be named or unnamed. It may optionally be
        made available to users for access using either a device interface GUID or a device
        name. WDFDEVICE objects have a very rich set of attributes, including PnP and
        power-related event processing callbacks. A WDFDEVICE object refers to, but is not
        the same as a WDM DEVICE_OBJECT.
       WDFREQUEST - This object represents an I/O request. It refers to, but does not
        correspond directly to, an IRP in WDM.
       WDFQUEUE - The WDFQUEUE object describes a particular queue of I/O Requests.
        Each WDFQUEUE is associated with a WDFDEVICE object. Each queue has a set of
        event processing callbacks that you can specify to request that your driver be called
        back when a request of a given type arrives at the queue. There may be more than
        one WDFQUEUE per device, and you can very easily specify how incoming I/O
        requests are to be distributed among those queues. By default, WDFQUEUEs are
        power managed, which means that they are automatically paused when the
         WDFDEVICE with which they are associated is powered down.
       WDFINTERRUPT - This object represents the device’s interrupt(s). Assuming your
        device supports interrupts, you’ll need one of these. It associates the device interrupts
        with an ISR and DPCforISR. If your device supports interrupts, you’ll also need to
        specify interrupt enable and interrupt disable event processing callbacks when your
        WDFINTERRUPT is created.

Your job as yeoman driver writer is to appropriately initialize and create these WDF objects
and then "wire them up" so that they handle the work of your device correctly. Ready to go?
Let’s get started.

DriverEntry
Every Windows driver starts with a DriverEntry function. In the Framework, the only two
things you need to do in DriverEntry are instantiate your WDFDRIVER object and tell the
framework where to call you back each time one of your devices is discovered connected to
the system. You’ll see the code to do this in Figure 2.

NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObj, PUNICODE_STRING RegistryPath)
{
    NTSTATUS code;
    WDF_DRIVER_CONFIG config;
    WDFDRIVER hDriver;


    DbgPrint("\nWDFDIO Driver -- Compiled %s %s\n",__DATE__, __TIME__);


    //
    // Initialize the Driver Config structure:
    //      Specify our Device Add event callback.
    //
    WDF_DRIVER_CONFIG_INIT_NO_CONSTRAINTS(&config, DioEvtDeviceAdd);


    //
    // Create a WDFDRIVER object
    //
    // We specify no object attributes, because we do not need a cleanup
    // or destroy event callback, or any per-driver context.
    //
    code = WdfDriverCreate(DriverObj,
                              RegistryPath,
                              WDF_NO_OBJECT_ATTRIBUTES,
                              &config,     // Ptr to config structure
                              NULL);       // Optional ptr to get WDFDRIVER handle


    if (!NT_SUCCESS(code)) {


         DbgPrint("WdfDriverCreate failed with status 0x%0x\n", code);
    }


#if DBG
    DbgPrint("DriverEntry: Leaving\n");
#endif


    return(code);
}
Figure 2 – DriverEntry



As you can see in the code in Figure 2, we start by initializing a WDF_DRIVER_CONFIG
structure, specifying DioEvtDeviceAdd as the function that we’ll use as our device add event
processing callback. The initialization is performed using the
WDF_DRIVER_CONFIG_INIT_NO_CONSTRAINTS macro.

After the WDF_DRIVER_CONFIG structure has been initialized, we create our WDFDRIVER
object by calling WdfDriverCreate. We pass pointers to our Windows DRIVER_OBJECT and
Registry Path as the first two arguments, which were provided to us by the I/O Manager as
inputs to our DriverEntry function. We also pass a pointer to our initialized
WDF_DRIVER_CONFIG structure, and pass NULL as the optional pointer into which the
Framework will return us the handle of the WDFDRIVER object that it creates. Why NULL? We
simply don’t need the handle in this function. The Framework will pass it to us later at our
EvtDeviceAdd event processing callback.

Assuming our call to WdfDriverCreate returns success, when we exit from DriverEntry the
next call to our driver will be at its EvtDeviceAdd event processing callback.

Object Creation In General

Stop for a second and note the sequence of operations that we performed in DriverEntry to
create our WDFDRIVER object. This is important, because it’s a simple pattern that we’ll see
repeated many times in any WDF driver that we develop. The pattern is:


       Initialize a "config" structure using a "config init" macro
       Fill in any optional fields in the "config" structure
       Optionally create an object attributes structure if you need a callback on object
        destruction or if you need to associate context with the object
       Create the object, specifying the config structure and the object attributes structure if
        you used one.

First, we allocate (in DriverEntry, this allocation is on the stack) and initialize a configuration
structure. Configuration structures are typically named WDF_YYY_CONFIG, where YYY is the
name of the object or feature for which the configuration structure is used. The initialization is
performed with a macro. Initialization macros for configuration structures are typically named
something like WDF_YYY_CONFIG_INIT. For example, you saw in DriverEntry that we
initialized our WDF_DRIVER_CONFIG structure using the macro
WDF_DRIVER_CONFIG_INIT_NO_CONSTRAINTS. This is a newer, and simpler, form of the
original WDF_DRIVER_CONFIG_INIT macro (which seems to have been replaced).
After performing the initialization of the configuration structure with the macro, you may fill in
any optional attributes or properties. In the case of the WDFDRIVER
object that we create in DriverEntry in our example, we accept the defaults provided by the
initialization macro, so we don’t fill in any additional fields in the WDF_DRIVER_CONFIG
structure.

Prior to object creation, you may also optionally allocate and initialize a
WDF_OBJECT_ATTRIBUTES structure. You may optionally provide object attributes for any
WDF object that you create. WDF_OBJECT_ATTRIBUTES allows you to specify an object
specific context (you’ll see how this is used when we create our WDFDEVICE object, later), as
well as cleanup and destroy event processing callbacks for the object. The cleanup and
destroy callbacks act like destructors, and are called when the object is last closed or last
dereferenced.
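To make that concrete, here is a minimal sketch of the optional attributes step. The callback name and context type are our own illustrations (not from the downloadable sample), and the attributes field name follows the WDF headers we have seen; the exact names may shift as the beta evolves.

VOID DioEvtObjectDestroy(WDFOBJECT Object);   // hypothetical destroy callback

WDF_OBJECT_ATTRIBUTES attributes;

WDF_OBJECT_ATTRIBUTES_INIT(&attributes);

//
// Associate a per-object context type (type name is our invention)
//
WDF_OBJECT_ATTRIBUTES_SET_CONTEXT_TYPE(&attributes, DIO_DEVICE_CONTEXT);

//
// Ask for a destructor-like callback when the object is destroyed.
// (Field name is from the WDF headers we have seen and may change.)
//
attributes.EvtDestroyCallback = DioEvtObjectDestroy;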

Once the configuration and object attributes (if we’re using one) structures are properly
initialized, and any optional fields in them filled in as needed, we create the object. This is
typically done by calling a function named WdfYyyCreate, where Yyy is the type of object that
you’re creating. In our example DriverEntry function, we call the function WdfDriverCreate.
Assuming the call returns a success status, our WDFDRIVER object was successfully created.

Device Add

The Framework will call our driver’s EvtDeviceAdd event processing callback each time a
new device that is handled by our driver is discovered in the system. EvtDeviceAdd is called
with a handle to our WDFDRIVER object, and a pointer to a WDFDEVICE_INIT structure. The
WDFDRIVER object is the one that we created in our DriverEntry entry point by calling
WdfDriverCreate. The WDFDEVICE_INIT structure is a container for WDFDEVICE attributes
that can be set before the WDFDEVICE object is created (this will become a bit clearer
later, when we discuss how it’s used). Code for the EvtDeviceAdd event processing callback
for our sample driver is shown in Figure 3.

WDFSTATUS
DioEvtDeviceAdd(WDFDRIVER Driver, PWDFDEVICE_INIT DeviceInit)
{
    WDFSTATUS status = STATUS_SUCCESS;
    WDF_PNPPOWER_EVENT_CALLBACKS pnpPowerCallbacks;
    WDF_OBJECT_ATTRIBUTES objAttributes;
    WDFDEVICE device;
    PDIO_DEVICE_CONTEXT devContext;
    WDF_IO_QUEUE_CONFIG ioCallbacks;
    WDF_INTERRUPT_CONFIG interruptConfig;
    WDF_DEVICE_POWER_POLICY_IDLE_SETTINGS idleSettings;


    //
    // Initialize the PnpPowerCallbacks structure.
    //
    WDF_PNPPOWER_EVENT_CALLBACKS_INIT(&pnpPowerCallbacks);


    //
    // Set up the callbacks to manage our hardware resources.
    //
    // Prepare is called at START_DEVICE time
    // Release is called at STOP_DEVICE or REMOVE_DEVICE time
    //
    pnpPowerCallbacks.EvtDevicePrepareHardware = DioEvtPrepareHardware;
    pnpPowerCallbacks.EvtDeviceReleaseHardware = DioEvtReleaseHardware;


    //
    // These two callbacks set up and tear down hardware state that must
    // be done every time the device moves in and out of the D0-working
    // state.
    //
    pnpPowerCallbacks.EvtDeviceD0Entry = DioEvtDeviceD0Entry;
    pnpPowerCallbacks.EvtDeviceD0Exit  = DioEvtDeviceD0Exit;


    //
    // Register the PnP and power callbacks.
    //
    WdfDeviceInitSetPnpPowerEventCallbacks(DeviceInit,
                                           &pnpPowerCallbacks);

    //
    // Create our Device Object and its associated context
    //
    WDF_OBJECT_ATTRIBUTES_INIT(&objAttributes);


    WDF_OBJECT_ATTRIBUTES_SET_CONTEXT_TYPE(&objAttributes,
                                           DIO_DEVICE_CONTEXT);


    //
    // We want our device object NAMED, thank you very much
    //
    status = WdfDeviceInitUpdateName(DeviceInit, L"\\device\\WDFDIO");


    if (!NT_SUCCESS(status)) {
        DbgPrint("WdfDeviceInitUpdateName failed 0x%0x\n", status);
        return(status);
    }


    //
    // Because we DO NOT provide callbacks for Create or Close, WDF will
    // just succeed them automagically.
    //


    //
    // Create the device now
    //
    status = WdfDeviceCreate(&DeviceInit,    // Device Init structure
                             &objAttributes, // Attributes for WDF Device
                             &device);       // Returns handle to new WDF Device


    if (!NT_SUCCESS(status)) {
        DbgPrint("WdfDeviceCreate failed 0x%0x\n", status);
        return(status);
    }


    //
    // Device creation is complete
    //
    // Get our device extension
    //
    devContext = DioGetContextFromDevice(device);

    devContext->WdfDevice = device;


    //
    // Create a symbolic link for the control object so that usermode can
    // open the device.
    //
    status = WdfDeviceCreateSymbolicLink(device, L"\\DosDevices\\WDFDIO");


    if (!NT_SUCCESS(status)) {
        DbgPrint("WdfDeviceCreateSymbolicLink failed 0x%0x\n", status);
        return(status);
    }


    //
    // Configure our queue of incoming requests
    //
    // We only use the default queue, and we only support
    // IRP_MJ_DEVICE_CONTROL.
    //
    // Not supplying a callback results in the request being completed
    // with STATUS_NOT_SUPPORTED.
    //
    WDF_IO_QUEUE_CONFIG_INIT(&ioCallbacks,
                             WdfIoQueueDispatchSerial,
                             WDF_NO_EVENT_CALLBACK,    // StartIo
                             WDF_NO_EVENT_CALLBACK);   // CancelRoutine


    ioCallbacks.EvtIoDeviceControl = DioEvtDeviceControlIoctl;


    status = WdfDeviceCreateDefaultQueue(device,
                                         &ioCallbacks,
                                         WDF_NO_OBJECT_ATTRIBUTES,
                                         NULL);   // optional ptr to default queue


    if (!NT_SUCCESS(status)) {
        DbgPrint("WdfDeviceCreateDefaultQueue failed 0x%0x\n", status);
        return(status);
    }


    //
    // Create an interrupt object that will later be associated with the
    // device's interrupt resource and connected by the Framework.
    //
    // Configure the Interrupt object
    //
    WDF_INTERRUPT_CONFIG_INIT(&interruptConfig,
                              FALSE,              // auto-queue DPC?
                              DioIsr,
                              DioDpc);


    interruptConfig.EvtInterruptEnable  = DioEvtInterruptEnable;
    interruptConfig.EvtInterruptDisable = DioEvtInterruptDisable;


    status = WdfInterruptCreate(device,
                                &interruptConfig,
                                &objAttributes,
                                &devContext->WdfInterrupt);

    if (!NT_SUCCESS(status)) {
        DbgPrint("WdfInterruptCreate failed 0x%0x\n", status);
        return status;
    }


    //
    // Initialize our idle policy
    //
    WDF_DEVICE_POWER_POLICY_IDLE_SETTINGS_INIT(&idleSettings,
                                                 IdleCannotWakeFromS0);


    status = WdfDeviceUpdateS0IdleSettings(device, &idleSettings);


    if (!NT_SUCCESS(status)) {
         DbgPrint("WdfDeviceUpdateS0IdleSettings failed 0x%0x\n", status);
         return status;
    }




    return(status);
}




Figure 3 – Device Add Event Processing Callback



In the EvtDeviceAdd event processing callback, a WDF driver generally performs the
following operations:


        Initialize and create a WDFDEVICE object - This includes specifying any necessary
         PnP and power management callbacks to be used for this device.
        Initialize and create one or more WDFQUEUE objects associated with the
         WDFDEVICE. At the very least, we’ll need a default queue, and we’ll need to specify
         where our driver is to be called when I/O Requests arrive for our driver to process.
        Initialize and create a WDFINTERRUPT object. This object specifies the interrupt
         service routine (ISR) and deferred processing callback for ISR completion (DpcForIsr)
         to be used by our device.
        Initialize and specify any idle or wake properties associated with the WDFDEVICE.

Our example driver will fully support PnP and power management, and (just for fun) will even
support automatically powering down the device when it’s not actively being used.

Plug and Play Callbacks
Looking at Figure 3, the first thing you can see our example driver doing in its EvtDeviceAdd
event callback function is initializing a WDF_PNPPOWER_EVENT_CALLBACKS structure by
calling WDF_PNPPOWER_EVENT_CALLBACKS_INIT.

Next, we specify event processing callbacks in the WDF_PNPPOWER_EVENT_CALLBACKS
structure for EvtDevicePrepareHardware and EvtDeviceReleaseHardware. These two
important callbacks are used to manage our device’s hardware resources. The Framework
calls EvtDevicePrepareHardware with a WDFCOLLECTION of hardware resources for our
device, each time hardware resources are assigned to our device. These hardware resources
might be ports, registers, shared memory buffers, and interrupts, for example. If our device is
never removed (unplugged) from the system, and it never surrenders its resources for
potential use in resource rebalancing, our driver will be called exactly once at
EvtDevicePrepareHardware for each device our driver supports. This callback will occur after
EvtDeviceAdd has been called.

Typical things that you might do in your EvtDevicePrepareHardware event processing
callback include: Saving away your device’s I/O port addresses and mapping any shared
memory or registers on your device into kernel virtual address space. We’ll talk more about
EvtDevicePrepareHardware later.

The Framework calls EvtDeviceReleaseHardware any time your driver must surrender the
resources that had been previously allocated to your device. This can happen (a) when your
device is removed from the system, or (b) when the PnP manager needs you to return your
device’s resources temporarily in support of a "resource rebalancing" operation. Within your
EvtDeviceReleaseHardware, all your driver needs to do is undo any work it did during the
previous call to EvtDevicePrepareHardware for the device.

After we specify callbacks for EvtDevicePrepareHardware and EvtDeviceReleaseHardware,
we specify callbacks for EvtDeviceD0Entry and EvtDeviceD0Exit. The Framework calls
these event processing callbacks every time your device is about to move into or out of the D0
(fully powered, working) state. EvtDeviceD0Entry is first called during startup, after
EvtDevicePrepareHardware has been called. It is also called any time the device is about to
be powered up because the system (and thus your device) is returning from standby or
hibernate or (if your device supports it) when your device is being awakened from an idle state.

EvtDeviceD0Exit is called any time your device is about to be powered down due to system
shutdown, system suspend, or (if your device supports it) device idle.

With the WDF_PNPPOWER_EVENT_CALLBACKS structure initialized as necessary, we call
WdfDeviceInitSetPnpPowerEventCallbacks, passing a pointer to the WDFDEVICE_INIT
structure that was passed into our EvtDeviceAdd function by the Framework, and a pointer to
the WDF_PNPPOWER_EVENT_CALLBACKS structure that we just finished initializing. This
sets the attributes described by the WDF_PNPPOWER_EVENT_CALLBACKS structure into
the WDFDEVICE_INIT structure that we’ll later use to create our WDFDEVICE object.
Object Attributes

The next thing we do in our EvtDeviceAdd callback is initialize a standard
WDF_OBJECT_ATTRIBUTES structure, by calling WDF_OBJECT_ATTRIBUTES_INIT.

We then use the macro WDF_OBJECT_ATTRIBUTES_SET_CONTEXT_TYPE to associate
the context structure type DIO_DEVICE_CONTEXT with our WDFDEVICE. This context
structure is the non-paged per-device context that we’ll use for storage of most of the
information about our device. In WDM terms, the WDFDEVICE context is the equivalent of the
Device Extension.
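For illustration, here is a minimal sketch of what such a context declaration might look like. The structure members shown are our own assumptions, not necessarily those in the downloadable sample, and the DioGetContextFromDevice accessor is assumed to be declared by the driver.

typedef struct _DIO_DEVICE_CONTEXT {

    WDFDEVICE    WdfDevice;      // Handle back to our WDFDEVICE

    WDFINTERRUPT WdfInterrupt;   // Handle to our WDFINTERRUPT object

    PUCHAR       BaseAddress;    // Mapped base of the device's registers

} DIO_DEVICE_CONTEXT, *PDIO_DEVICE_CONTEXT;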

Name That Device!

Before we create our WDFDEVICE, we do one more thing. We call the rather strangely named
function WdfDeviceInitUpdateName to establish a name for our device. Note that we set the
name into the WDFDEVICE_INIT structure - which is the same place we set the PnP and
power callback functions. Just like in WDM, this is the internal name of the device. If you want
your device to be accessible to users by name, you’ll have to create a symbolic link. We
describe how to do this a little later.

Create The Device

Now that the WDFDEVICE_INIT structure has been fully initialized with data such as the
desired device name, and the PnP/power event processing callbacks, we’re ready to create
the WDFDEVICE. We do this by calling WdfDeviceCreate, passing a pointer to the
WDFDEVICE_INIT and WDF_OBJECT_ATTRIBUTES structures. We also pass a pointer to a
location where WdfDeviceCreate will return to us a handle for the WDFDEVICE that we
created.

Note that after creating the WDFDEVICE, we also create a symbolic link in the object
manager’s name space for the name \DosDevices\WDFDIO. This will allow a user program to
directly open our device using the Win32 function CreateFile and the name "\\.\WDFDIO".
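From the application side, opening the device via that symbolic link might look like the following hedged sketch (ordinary Win32 code, nothing WDF-specific; error handling is minimal):

#include <windows.h>

HANDLE hDevice = CreateFile(L"\\\\.\\WDFDIO",           // via our symbolic link
                            GENERIC_READ | GENERIC_WRITE,
                            0,                           // no sharing
                            NULL,                        // default security
                            OPEN_EXISTING,               // device must exist
                            0,
                            NULL);

if (hDevice == INVALID_HANDLE_VALUE) {
    // Driver not loaded, device not started, or link name mismatch
}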

Create A Queue

Now that we have a WDFDEVICE object that represents our device, and that WDFDEVICE
has a name and a symbolic link that users could open and send requests to (once the device
has been properly enabled), we should probably create a queue that can be used to present
those requests to the driver.

Once again, we follow the typical pattern: We build a configuration structure, which in this case
is a WDF_IO_QUEUE_CONFIG structure. We initialize that structure with an initializer macro,
which in this case is named WDF_IO_QUEUE_CONFIG_INIT. This macro takes a pointer to
the structure to be initialized, and three other data items:
       The queue serialization model to be used by the queue: WdfIoQueueDispatchSerial,
        WdfIoQueueDispatchParallel, or WdfIoQueueDispatchManual.
        WdfIoQueueDispatchSerial simply says the WDFQUEUE will send your driver one
        request at a time. When you complete that request, it will give you another if it’s
        available. WdfIoQueueDispatchParallel will result in the WDFQUEUE calling your
        driver’s I/O callbacks as each request arrives at the queue. In other words, you’ll get
        requests in parallel. WdfIoQueueDispatchManual allows you to manage the queue
        entirely by yourself, and relies on you calling WdfIoQueueGetRequest.
       An EvtIoStart event processing callback. If provided, this is the function that will be
        called if no more specific I/O request event processing callback applies.
       An EvtIoCancel event processing callback. If provided, the Framework calls this
        function when an I/O request is to be canceled.

In our example, we choose the simplest queue serialization model:
WdfIoQueueDispatchSerial. We do not provide either an EvtIoStart or EvtIoCancel event
processing callback function.

After calling WDF_IO_QUEUE_CONFIG_INIT, we fill in one additional field in the
WDF_IO_QUEUE_CONFIG structure. This is the EvtIoDeviceControl event processing
callback. Other possible callbacks that our driver could utilize include EvtIoRead, EvtIoWrite,
and EvtIoInternalDeviceControl. When we’ve filled in all the callbacks for the I/O events that
we want to handle, we call WdfDeviceCreateDefaultQueue to create the default queue for our
device.

Fun With Queues

We think you’ll find WDFQUEUEs to be one of the nicest features in the Framework, because
they give you lots of options for how I/O requests will be queued to your driver. It’s possible to
create multiple queues, and either manually or automatically forward requests from one queue
to the other. By default, queues are power managed. This means that when your device is
powered down, any queues associated with your device are automatically "held." Of course,
you can override this quite easily. Better yet, you can set things up to allow you to process, for
example, device control requests even when your device is powered down, but to have the
Framework automatically hold the queue that’s used for all the read and write requests for your
device. Sadly, space doesn’t permit a complete discussion of all the fun you can have with
WDFQUEUEs here. Perhaps we’ll have the time and energy to address this topic further in a
follow-up article.

Recall that our sample driver provides no EvtIoStart callback, but does provide an event
processing callback for EvtIoDeviceControl. Let’s examine the effect of this design. In
reading the description below, it’s important to keep in mind that IRP_MJ_CREATE and
IRP_MJ_CLOSE are currently handled in a separate way by the Framework. At present, these
requests succeed by default. Thus, the procedures described below do not apply to create or
close operations.
When the Framework gets a request for a device (other than a create or a close), it determines
which queue will receive the request, and which of the driver’s event processing callbacks will
be called, based on the request’s I/O function code. In identifying a queue, the Framework first
looks for a specific queue to use. If one is not provided, it will use the default queue. Our
example driver only creates the default queue, so that’s the one the Framework will use for all
incoming requests.

Once a queue has been identified, the Framework looks to see which of the driver’s event
processing callbacks it should call. If a specific I/O event processing callback that matches the
I/O request function code has been provided, the Framework places the request on the
identified queue, and calls the callback (if appropriate). Whether or not the callback is called
immediately depends on many factors, including the device’s state, whether or not the queue is
power managed, and the queue’s serialization method. If a specific I/O event processing
callback that matches the request’s I/O function code has not been provided, the Framework
checks to see if an EvtIoStart callback was specified. If so, then the request is placed on the
queue and the EvtIoStart is called (again, whether or not the callback is called immediately
depends on those same factors described previously). If no specific I/O event processing
callback matching the request is provided by the driver, and no EvtIoStart callback has been
provided, the request is completed with STATUS_NOT_SUPPORTED. Again, note that this
process does not apply to either create or close operations, which presently succeed
automatically.

So how does all this apply to our sample driver? Because we don’t supply an EvtIoStart
function in our driver, and the only specific I/O event processing callback that we provide is
EvtIoDeviceControl, any requests that arrive at our driver with I/O function codes other than
IRP_MJ_DEVICE_CONTROL will be rejected by the Framework. The user will get back
STATUS_NOT_SUPPORTED. Device control requests will be placed on the default queue,
and the driver’s EvtIoDeviceControl callback will be called eventually, depending on the state
of the device and driver. So, you can see that in our sample driver, the only requests we
support are create and close (because they are handled differently than described here) and
Device Control.

Interrupts

Next, our sample driver initializes and creates a WDFINTERRUPT object. Once again, we use
the same pattern we’ve been using all along: We create a configuration structure of type
WDF_INTERRUPT_CONFIG and we initialize it using the macro
WDF_INTERRUPT_CONFIG_INIT.

Note that there are two important callbacks that we need to fill into the
WDF_INTERRUPT_CONFIG structure before the WDFINTERRUPT object is created. These
are the EvtInterruptEnable and EvtInterruptDisable callbacks. These are called by the
Framework when it wants our driver to enable or disable interrupts, respectively, on our
device.

EvtInterruptEnable will be called to enable interrupts on our device after the Framework has
called our EvtDeviceD0Entry callback, and after the Framework has connected our ISR to
interrupts from our device.

You may be surprised to find that EvtInterruptDisable is called before your device enters any
low power (non-D0) state, just prior to the call to EvtDeviceD0Exit. This is because the
Framework both disables your device’s interrupts and disconnects them from your driver’s
interrupt service routine whenever your device is powered down. This helps to avoid potential
interrupt storms and other problems.

Power Down on Idle

As we mentioned earlier, just for fun we decided to add support for powering down our device
while it was not being used. In WDM, this isn’t exactly a simple feature to implement. But in
WDF, it’s a breeze. Note that, following the pattern once again, we initialize a
WDF_DEVICE_POWER_POLICY_IDLE_SETTINGS structure by calling
WDF_DEVICE_POWER_POLICY_IDLE_SETTINGS_INIT. We note that our device
implements idle support, and that it is not capable of waking itself while the system remains in
S0.

In our example, we let the rest of the parameters default, including the idle timer which defaults
to 5 seconds, and our idle power state which defaults to D3. We call
WdfDeviceUpdateS0IdleSettings to set the parameters on the WDFDEVICE.

As a result of the idle policy we requested, the Framework will automatically transition our
device to D3 after our device has been idle, receiving no I/O requests, for 5 seconds.

Wrapping Up EvtDeviceAdd

That’s all you have to do in your EvtDeviceAdd event processing callback to support a real
device, with real interrupts, and add idle support on top of it. The rest of the driver’s functions
are pretty straightforward. While we can’t go into the same level of detail that we have for
DriverEntry and EvtDeviceAdd, let’s briefly look at the rest of the callbacks in our driver.

EvtPrepareHardware

As we mentioned above, this callback is called to present a set of device hardware resources,
such as ports, shared memory addresses, and interrupts to your driver. Your driver’s job,
during this callback, is to save any information that you may need (such as port addresses), as
well as map any shared memory addresses into kernel virtual address space. The prototype
for this function is:
NTSTATUS
DioEvtPrepareHardware(WDFDEVICE Device,
                      WDFCOLLECTION Resources,
                      WDFCOLLECTION ResourcesTranslated)



Note that the resources arrive in a WDFCOLLECTION. That means, to walk through your
resources in this function, all you have to do is something like what’s shown in Figure 4 (note
this is a code snippet, not a working code sample).

for (ULONG i = 0; i < WdfCollectionGetCount(ResourcesTranslated); i++) {


        //
        // Get the i'th item from the collection of resources
        //
        hResourceTrans = (WDFRESOURCECM)
                                 WdfCollectionGetItem(ResourcesTranslated, i);


        //
        // From that, get the Partial Resource Descriptor
        //
        resourceTrans = WdfResourceCmGetDescriptor(hResourceTrans);


        if (!resourceTrans) {
             DbgPrint("NULL resource returned??\n");
             return(STATUS_DEVICE_CONFIGURATION_ERROR);
        }


        switch (resourceTrans->Type) {


             case CmResourceTypePort:
                           ...
                           break;


             case CmResourceTypeMemory:
                           ...
                           break;


             case CmResourceTypeInterrupt:
                           ...
                           break;
        }
}




    Figure 4 – Demonstration code for handling a WDFCOLLECTION

Another thing that’s worthy of note about your EvtPrepareHardware callback is that you do
not connect your ISR to your device’s interrupt resource in this function. As previously
mentioned, the Framework connects your ISR to your device’s interrupt itself, and then calls
your EvtInterruptEnable event processing callback.

EvtDeviceD0Entry

As we mentioned above, this callback is called each time your device is going to be
transitioned into D0. For you WDM pros, it’s important to note that your driver is called at this
entry point during the implicit transition to D0 that takes place during system startup. The
prototype for this function is as follows:

NTSTATUS
DioEvtDeviceD0Entry(WDFDEVICE Device,
                    WDF_POWER_DEVICE_STATE PreviousState)



The point of this callback is to let you do the things you need to do every time you enter D0.
This might be downloading microcode to your device, or restoring device state that you
previously saved in your EvtDeviceD0Exit event processing callback when your device
was being powered down.

EvtInterruptEnable

This callback is called with handles to your WDFINTERRUPT and WDFDEVICE objects when
the Framework wants you to enable interrupts on your device. The prototype for this function is:

BOOLEAN
DioEvtInterruptEnable(WDFINTERRUPT Interrupt,
                      WDFDEVICE Device)


Pretty straight forward, right? There’s not much more to say about this function.

Interrupt Service Routine

This routine is called when your device interrupts. The prototype for this function is:

BOOLEAN
DioIsr(WDFINTERRUPT Interrupt,
       ULONG MessageID)


Note the parameter MessageID. This function is ready for Message Signaled Interrupt support
in a future version of Windows.
Of course, your ISR runs at DIRQL. Therefore, there are very few functions that your driver
can call from your ISR. To complete interrupt processing at a lower IRQL, you will want to
queue a Deferred Procedure Call (DPC) for ISR completion, AKA a DpcForIsr. This is most
easily done by calling the function WdfInterruptQueueDpcForIsr, passing a handle to your
WDFINTERRUPT object.
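To make that concrete, here is a hedged sketch of an ISR body. The device-specific helpers are our own inventions; only WdfInterruptQueueDpcForIsr comes from the Framework, and its companion accessors may differ in the beta.

// Hypothetical device-specific helpers, assumed to be defined elsewhere:
BOOLEAN DioDeviceIsInterrupting(WDFINTERRUPT Interrupt);
VOID    DioAcknowledgeInterrupt(WDFINTERRUPT Interrupt);

BOOLEAN
DioIsr(WDFINTERRUPT Interrupt, ULONG MessageID)
{
    //
    // If our device isn't the one interrupting, say so and let the
    // system continue down the chain.
    //
    if (!DioDeviceIsInterrupting(Interrupt)) {
        return FALSE;
    }

    //
    // Quiesce the device so it stops asserting the interrupt line,
    // then defer the real work to DISPATCH_LEVEL via the DpcForIsr.
    //
    DioAcknowledgeInterrupt(Interrupt);

    WdfInterruptQueueDpcForIsr(Interrupt);

    return TRUE;
}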

DpcForIsr

Your DPC for Interrupt Service Routine completion will be called at IRQL DISPATCH_LEVEL.
The prototype for this function is:

VOID
DioDpc(WDFINTERRUPT Interrupt,
       WDFOBJECT Device)



Nice and clean, and no silly extra parameters like in WDM.

EvtIoDeviceControl

So, how exactly do you process I/O requests in WDF? Actually, it’s just as easy as everything
else we’ve discussed so far. Recall that our driver only supports device control operations. The
prototype for the EvtIoDeviceControl callback is:

VOID
DioEvtDeviceControlIoctl(WDFQUEUE Queue,
                         WDFREQUEST Request,
                         ULONG OutputBufferLength,
                         ULONG InputBufferLength,
                         ULONG IoControlCode)



Note that the function does not return a value (yay!), and that the most useful parameters are
provided as input to your driver on the function call. The WDFREQUEST contains all the
information needed to describe the I/O request.

You can get a pointer to a data buffer containing the user’s InBuffer from the
WDFREQUEST by calling WdfRequestRetrieveBuffer. This is the same function you would
call to get a user’s data buffer when processing a read or a write operation.

You can get a pointer to a data buffer to be used to return data to the user’s OutBuffer from the
WDFREQUEST by calling WdfRequestRetrieveDeviceControlOutputBuffer.

A couple of quick points about buffer handling in WDFREQUESTs are in order here. What’s
most interesting about the way buffers are handled for WDFREQUESTs is that their handling
is uniform regardless of transfer type. What does
this mean? Well, let’s say you’re processing a write request from a user. If you want a data
buffer that contains the user’s data within kernel virtual address space, all you have to do is
call WdfRequestRetrieveBuffer. It doesn’t matter if the request utilizes Direct I/O, or Buffered
I/O. Pretty cool, huh? What? You say you wanted an MDL that describes a buffer, and not the
buffer itself? Oh, in that case you just call WdfRequestRetrieveMdl. It really is just that
simple.

When it comes time to complete a request, you use the following two functions:

WdfRequestSetInformation - This function takes a WDFREQUEST handle and a value
to set into the request’s Information field (IoStatus.Information).

WdfRequestComplete - This function takes a WDFREQUEST handle and the status with
which you want to complete the request. Poof! It’s done.
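Putting those two calls together, here is a hedged sketch of a trivial EvtIoDeviceControl body. The IOCTL code is our invention, and the buffer retrieval step is elided because the beta's retrieval signatures may change:

VOID
DioEvtDeviceControlIoctl(WDFQUEUE Queue,
                         WDFREQUEST Request,
                         ULONG OutputBufferLength,
                         ULONG InputBufferLength,
                         ULONG IoControlCode)
{
    NTSTATUS status = STATUS_SUCCESS;
    ULONG bytesReturned = 0;

    switch (IoControlCode) {

        case IOCTL_DIO_GET_SWITCHES:    // hypothetical IOCTL code

            //
            // ... retrieve the OutBuffer here (for example, with
            // WdfRequestRetrieveDeviceControlOutputBuffer) and read
            // the switch state into it ...
            //
            bytesReturned = sizeof(ULONG);
            break;

        default:
            status = STATUS_INVALID_DEVICE_REQUEST;
            break;
    }

    //
    // Record the byte count, then complete the request. Poof! It's done.
    //
    WdfRequestSetInformation(Request, bytesReturned);
    WdfRequestComplete(Request, status);
}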

Need to return from one of your I/O event processing callbacks with the I/O request still in
progress? No problem, just do it. Well, you probably want to store a handle to that
WDFREQUEST someplace first, but all the complicated rules around returning
STATUS_PENDING are gone.

In Summary

The place is here. The time is now. The Windows Driver Foundation Kernel Mode Framework
will be replacing WDM. If you want a say in how it turns out, now’s the time to start learning the
Framework.

The folks here at OSR have made it as easy as we can. Get the sample driver we’ve written.
Join the WDF beta program; the Framework distribution comes with lots of samples of varying
complexity (yes, even DMA samples that work on real hardware).

We think once you try it you’ll agree WDF is pretty cool.

Want More Info?

We’ll be spending lots of space in The NT Insider talking about WDF in coming issues.

There is also a series of whitepapers coming from Microsoft’s Windows Hardware and
Driver Central team. Go check them out at www.microsoft.com/whdc. There’s even supposed
to be a pretty decent, and up to date, paper on WDF PnP and power management. I
recommend it highly.
Beware the Guarded Mutex
Blocking Special Kernel APCs at IRQL PASSIVE_LEVEL
The NT Insider, Vol 11, Issue 2, Mar-Apr 2004 | Published: 08-Apr-04 | Modified: 03-May-04



In the folklore of file systems, we know that IRQL APC_LEVEL is used to prevent the delivery
of all forms of APC - both normal and kernel mode APCs. This is because paging I/O
operations are always sent at APC_LEVEL. However, beginning in Windows Server 2003,
this is no longer the case. Instead, the Memory Manager now uses guarded mutexes. These
are similar to fast mutexes except that instead of raising the IRQL of the processor, they
merely block the delivery of all forms of kernel APC.

Thus, for those developing file systems and file system filter drivers, it is now important to use
KeAreApcsDisabled rather than just relying upon the IRQL of the system (it would be
wonderful if KeAreApcsDisabled returned TRUE if IRQL > PASSIVE_LEVEL but right now
that is not the case). This is particularly critical because if special kernel APCs are disabled,
no calls should be made that rely upon I/O completion processing - otherwise the operation
might hang and never complete.

For a network file system, no calls should be made to TDI if KeAreApcsDisabled returns
TRUE. Similarly, no kernel component should call any of the Zw API operations for I/O
operations if KeAreApcsDisabled returns TRUE.
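In practice the check is a one-liner. Here is a hedged sketch (our own helper routine, not from any kit) of how a file system component might guard an APC-dependent call:

#include <ntddk.h>

NTSTATUS
OsrCheckSafeForSynchronousIo(VOID)
{
    //
    // On Windows Server 2003, KeAreApcsDisabled returns TRUE when either
    // normal OR special kernel APCs are disabled on this thread (see the
    // disassembly below). In that state, a synchronous Zw I/O call might
    // never complete.
    //
    if (KeAreApcsDisabled()) {
        return STATUS_POSSIBLE_DEADLOCK;   // caller should defer the work
    }

    return STATUS_SUCCESS;
}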

Note that in the current release (Windows Server 2003, Build 3790), the documentation for
KeAreApcsDisabled is incorrect - it says "The system still delivers special kernel APCs even
if KeAreApcsDisabled returns TRUE". This was certainly true in Windows XP and earlier
versions, but is not the case in 2003. This can be seen from the disassembly (from Windows
Server 2003, Build 3790):

0: kd> u nt!KeAreApcsDisabled
nt!KeAreApcsDisabled:
80827bf0 64a124010000                  mov         eax,fs:[00000124]
80827bf6 83787000                      cmp         dword ptr [eax+0x70],0x0
80827bfa 0f95c0                        setne       al
80827bfd c3                            ret


Thus, this function tests the DWORD value at offset 0x70 in the KTHREAD (fs:[124] is always
the address of the current thread) and returns TRUE if it is nonzero. From the command dt
nt!_KTHREAD we can find what is at offset 0x70:
   +0x070 KernelApcDisable : Int2B
   +0x072 SpecialApcDisable : Int2B
   +0x070 CombinedApcDisable : Uint4B


This is quite different than on Windows XP:


lkd> u nt!KeAreApcsDisabled
nt!KeAreApcsDisabled:
804f3756 64a124010000                  mov         eax,fs:[00000124]
804f375c 33c9                          xor         ecx,ecx
804f375e 3988d4000000                  cmp         [eax+0xd4],ecx
804f3764 0f95c1                        setne       cl
804f3767 8ac1                          mov         al,cl
804f3769 c3                            ret

And at offset 0xd4:

+0x0d4 KernelApcDisable : Uint4B

There is no concept of SpecialApcDisable in Windows XP. In Windows Server 2003,
however, this function will return TRUE if special kernel APCs are disabled through
this explicit mechanism.

This can be a rather substantial change. (We found this here at OSR while testing a low-level
I/O routine in a file system: the system hung, we could see that the special kernel APC had
not been delivered, and yet the thread was blocked at PASSIVE_LEVEL.) It is an unfortunate
oversight that this was not documented in the IFS Kit for Windows Server 2003, although we
can hope that this oversight is addressed in the SP1 kit.
The Future Is Now -- The WDF Kernel Mode Framework
The NT Insider, Vol 11, Issue 2, Mar-Apr 2004 | Published: 03-May-04| Modified: 03-May-04




It’s the biggest thing to happen in the world of Windows drivers since PnP and power
management. It’s a totally new interface for writing drivers that will (not might, not could, but
will) replace the existing Windows Driver Model (WDM). It will apply to most driver types and
even allow many of the drivers you write today to run in user mode. It'll be supported on
operating systems from Win2K forward. It’s the Windows Driver Foundation. And its latest
revision is being previewed at WinHEC 2004.

Given the importance of this topic, you might rightly ask: "Why haven’t I read more about this
stuff over the past year?" The simple answer is that most of the details are covered by Non
Disclosure Agreements (NDA) with Microsoft. Sigh! However, because WinHEC’s sessions
are not under NDA, we’re happy to be able to share with you details of the latest WDF release.

We last discussed the Windows Driver Framework (WDF) in an article for last year’s WinHEC
(May-June 2003, http://www.osronline.com/article.cfm?article=212).

Since then, there have been very significant changes. Even the name’s been changed. Now,
it’s the Windows Driver Foundation, which comprises a Kernel Mode Framework and a User
Mode Framework. Hey, whatever. It’s all WDF to us.

What’s WDF?

If you remember a little from what we wrote last year, you already know that the Framework is
meant to be simple, conceptually scalable, highly diagnosable, flexible and extensible. And,
like I said before, it’s meant to replace WDM.

I can hear some of you now: "Microsoft is going to force some crappy beginner’s model on us,
and shove us in a box, and we’ll be back to the miniport game of trying to fit 10 pounds of
product differentiation into a model built to hold 5." I say: "Shut up." This model doesn’t suck,
and it’s not a trivial subset driver model designed to please newbies and losers. It’s a
C-language, object-oriented model that takes a lot of the mundane work out of driver writing by
doing the right things by default. Well, at least most of the time. And if you don't like the
defaults WDF provides you, you can drop right back on down to the direct use of WDM without
having to totally abandon WDF.

I’ve had time to play with WDF extensively, including some real quality time over the past
couple of months. Now remember, I already know WDM from having written tons of WDM
drivers. And guess what? Given my choice, I would very strongly prefer to write a driver using
WDF today than write a driver using WDM. Yes, on the whole, it’s that good.

What’s Changed From A Year Ago?

Well, lots of the bumps have been smoothed out, for one thing. But the really big news is that
WDF has an entirely new, integrated, PnP and power management model. It totally abstracts
you from the hideousness that is PnP and power management in WDM. That means that if you
never did get around to learning how to write a proper power policy owner in WDM, once the
Framework is released you’ll never have to.

The available serialization models in the Framework have significantly changed in the past
year, too. After a lot of thinking, the WDF team settled on a couple of models that either
provide you serialization by default, or basically let you do your own thing.

How Cool Is It?

If you’ve been paying attention, you already know that I think it’s pretty cool. I managed to write
a Digital I/O driver (a power policy owner yet) that fully supports PnP, power management,
power-down on idle, and the ability to wake itself and the system from sleep, in something like
1200 lines of code. And that includes copious comments.

This is about the same size as the simplest WDM power management implementation that I
could write to support those same features. Assuming, of course, I would even dare to try to
implement those features in a WDM driver (which, to be perfectly honest, I usually would not).
To me, that’s already cool.

What’s ice cold is that the WDF team is still working to smooth out the rough edges. They
recognize that they don’t know every type of driver that we write out here in the real world. So
they’re looking to the community for feedback. Does the model work for all types of devices?
Does the model meet your needs? What could be done better? These folks are serious about
wanting your feedback.

So, Show Me Some Code Already!

Well, check out the detailed article A New Framework - Writing a WDF Driver and download
our sample driver example and check it out. Join the WDF beta program. Write a driver or two
for your own favorite devices and see if the Framework works in the ways that you need it to.
And provide feedback.

This is the time when you can really influence how WDF works, what the interfaces look like,
and what parameters are passed on each function call. Take the time to get involved. If you
don’t, and you don’t like the way the Framework turns out, you’ll only have yourself to blame.
Or, if you’re just a bit more cynical, how’s this for a reason to get involved: Let’s say you do
take the time to provide good quality feedback to Microsoft. If they don’t listen to you, and it
turns out that you were right, you’ll be able to publicly gloat. How much fun could that be?


User Comments


"Questions and Answers"
Good questions! Thanks for asking.

1) Will source code be released? My understanding from talking informally with members of
the WDF team is that they are all strongly in favor of releasing the WDF source code. I haven't
talked to a single person involved in WDF at MS, in fact, who ISN'T in favor of releasing WDF
source code. So if the source code is NOT available at first release, my guess would be that it
will be because of some major legal obstacle. We'll just have to wait and see.

2) Will there be documentation? Yes. There will be standard DDK-type documentation and
there'll be white papers. The documentation is very time consuming to write, and the
Framework has gone through a tremendous amount of change in the past six months. So, the
documentation is lagging behind the release.

3) Where will it be supported? WDF drivers will be supported on Win2K forward. Nar
Ganapathy (the architect and dev mgr responsible for WDF) said in a talk at WinHEC that
supporting systems back to Win2K was a direct result of feedback from us folks in the
community.

4) Must You Choose? Can You Escape? In fact, ensuring that there's an escape path in WDF
(down to WDM) has been a major focus during the development, and again (according to Nar)
this was another input from the community that they heard loud and clear. You can ALWAYS
get to the "real" WDM underlying objects if you need to. You can even get callbacks within
WDF for low-level PnP/Power events.

06-May-04, Peter Viscarola (xxxx@osr.com)




"Where is WDF going to run?"
You neglect to mention which OS's will be running the WDF model. Will it be compatible with
existing OS's or is this for the future?

Will you be required to choose either say WDM or WDF? Or will there be an opportunity to
drop into the core api as needed?
Will there be documentation, or will we end up having to guess and do 1000's of experiments
to figure this new mess (er design) out?


06-May-04, Marley Lucy (xxxx@sbcglobal.net)




"Another blackbox"
If microsoft cannot release the source code of WDF, we just have another black box.

Judging from the current interface in the documentation, the idea is simple, but what about the
overhead it could introduce? Now even the simplest job might be implemented by a wrapper
function.

Instead of another black box, I would prefer more detailed samples.


05-May-04, Xinhai Kang (xxxx@promise.com)




"re: "Object oriented model in C!""
The prior comment about "full C++ VM" in the kernel is bogus. First of all, the MS C++ compiler
doesn't have a virtual machine; it generates real code. The "extra overhead" appears only when
you use certain features. If you take an existing C driver, rename the files to be .cpp, put in a few
"extern "C"" declarations, you now have a C++ driver that ends up with the same binary code.
Convert functions to member functions of classes, and you still have the same binary code.


04-May-04, David Harvey (xxxx@syssoftsol.com)




"re: "Object oriented model in C!""
Writing in C++ involves a lot more runtime overhead. They'd have to put a full C++ VM in the
kernel. Not worth it. You can write OO in any language. OOPLs simply make it easier to a point.
However, remember that earlier compilers used C as an intermediate language.


04-May-04, Scott Neugroschl (xxxx@yahoo.com)




"WDF"
I just took the WDM course. What percentage of that material is becoming obsolescent? Will it
still be possible to write a WDM driver?


04-May-04, Barry Morris (xxxx@honeywell-tsi.com)
Don't Blow Your Stack -- Clever Ways to Save Stack Space
The NT Insider, Vol 11, Issue 2, Mar-Apr 2004 | Published: 15-Apr-04| Modified: 09-Nov-04



A persistent problem that plagues filter drivers of all sorts is the limitation on the size of the
kernel stack - a svelte 12KB in the x86 kernel mode environment. When this is coupled with
the re-entrant nature of the storage stack in particular, it can lead to stack overflow conditions.
In this article we will suggest several different strategies you can consider when trying to
minimize the stack usage within your driver.

Meet the Stack
No doubt, the basic nature of the stack is familiar to most kernel-level developers, given that it
holds the execution history of the processor at hand and is thus one of the essential elements
used when debugging. The boundaries of an individual stack are tracked by the Windows
kernel in the InitialStack, StackLimit, and KernelStack fields of the KTHREAD structure.
Of course, the actual current location of the "top" of the stack is in the stack pointer
register (ESP on x86, RSP on AMD-64, and SP or R12 on the IA-64). Because of the way that
stacks are managed on all three processors - with the stack "growing down" - debugging
normally starts at the current stack location and displays information at successively
increasing stack addresses.

On the x86 and AMD-64, the stack is used for parameters, return addresses, local variables,
etc. The IA-64 circumvents much of this stack usage by taking advantage of its numerous
registers (a topic unto itself!).

As an example of how easy it is to exhaust the stack, someone recently asked us about
debugging a stack overflow they were seeing - by the time we explained what was happening
they said "never mind, my disk is now so full that SR (the system restore filter driver) has
stopped doing anything, so I am not stack overflowing." What we found when looking at the
stack usage was that the developer had declared large character arrays as local (stack)
variables! With several re-entrant calls he quickly exhausted the stack and the system
crashed.

STOP 0x7F
Normally, the manifestation of a stack overflow is an UNEXPECTED_KERNEL_MODE_TRAP
error (0x7F) and the first parameter is 0x8 (a "double fault"). Typically the instruction will be
something innocuous, such as "push esi" or other stack manipulation, although once in a
while we’ve seen memory references cause this (something like "mov [ebp-0x3c], eax")
because a field in a stack-based data structure is being set. In either case, it causes a page
fault. The CPU then attempts to push the CS, EFLAGS, and EIP values onto the stack, but
since the stack is not valid, the double fault handler is invoked. Trips
through the double fault handler on Windows are one-way - they always result in a termination
of the OS (the "blue screen of death") so of course it is best programming practice to try and
ensure they don’t happen.
Basically, there are two different mechanisms that we can use in the kernel environment in
order to eliminate stack overflow conditions:


       We can minimize our stack usage;
       We can detect when we’re running low on stack and take appropriate measures.

In the balance of this article we will discuss both of these techniques and provide some basic
guidance on how to implement these techniques in your driver.

Minimizing Stack Usage
The most important technique for minimizing stack utilization is to use the kernel memory
allocation functions for anything large, with the most likely candidates being data structures
and character buffers.

The simplest way to achieve this is to simply use ExAllocatePoolWithTag. It is simple but
does require that you ensure your code always frees the memory prior to exiting the function
(or, in the worst case, prior to unloading your driver). One trick for ensuring that you always
free the memory is to wrap the code within your function using the __try/__finally operation(s).
This ensures that all exit paths within the __try block must always execute the code within the
__finally block. The only downside to this is that (of course) this uses additional stack space
itself (since the termination block is stored on the stack). While __try/__finally ensures that you
do not leak memory, you can also gain that assurance through careful testing and review of
your code. The latter might be necessary in extreme cases where stack utilization must be
minimized as much as possible.
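Here is a minimal sketch of that approach, assuming a routine that would otherwise have declared a large character array on the stack. The routine name, buffer size, and pool tag are our own:

NTSTATUS
OsrBuildSomeName(VOID)
{
    PWCHAR buffer;

    //
    // Allocate the 2KB scratch buffer from pool instead of the stack.
    //
    buffer = (PWCHAR)ExAllocatePoolWithTag(PagedPool,
                                           1024 * sizeof(WCHAR),
                                           'mNsO');

    if (buffer == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    __try {

        //
        // ... use "buffer" here instead of a large local array ...
        //

    } __finally {

        //
        // Runs on every exit path out of the __try block, so the
        // allocation can never be leaked.
        //
        ExFreePoolWithTag(buffer, 'mNsO');
    }

    return STATUS_SUCCESS;
}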

The most extreme case of stack minimization, and one often taken when all unnecessary stack
usage must be eliminated, is to create a structure definition for all of the variables within a
given routine. Upon entry to the routine, you would allocate the necessary storage from the
pool (ideally tracked in a register-declared variable, so no additional stack space is used), use
the individual entries within your structure (rather than local variables) for temporary storage,
and then free the allocation upon exit from the function.

A few notes here are in order:


       This is an extreme approach, generally reserved only for those cases where stack is at
        a total premium. It does work reasonably well and given modern processor and cache
        designs works reasonably well in comparison to the use of local variables.
       You must be prepared for memory allocation failures. In other words, a call to
        ExAllocatePoolWithTag might fail. Your driver must handle this potential failure or
        otherwise you might cause the system to crash. In the case where your allocation fails,
        your driver should return STATUS_INSUFFICIENT_RESOURCES, just like any other
        memory allocation failure.
       If you call from a pageable code path, you should use paged pool. Note that except
        under very unusual circumstances the memory that you allocate in this fashion will not
        actually get paged out, but it will come from the larger paged pool address region in
        the kernel.
       If you call from a non-pageable code path, you must use non-paged pool. Similarly, if
        you are in a storage driver and your driver can be called to access the paging file, you
        must use non-paged pool (even though your driver might be called at IRQL
        PASSIVE_LEVEL or APC_LEVEL).

For frequent allocation/free operations on fixed size structures (e.g., some fixed size data
structure) it is best to use a lookaside list (check the DDK documentation for
ExInitializePagedLookasideList or ExInitializeNPagedLookasideList as
appropriate for your particular driver). In this case, the lookaside list that you create will
manage a list of available buffers of the given size. If there are no available buffers, one will be
allocated from pool (paged or non-paged, depending upon the way the lookaside list was
initialized). Periodically, a background thread in the OS will trim the buffer list so that it does
not become too large, with the goal of the OS to ensure you have a regular supply of
appropriately sized buffers and are not using too much memory in doing so.
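A hedged sketch of the lookaside pattern follows, using the non-paged flavor (the paged flavor is analogous). The context structure, routine names, and pool tag are illustrative:

typedef struct _OSR_WORK_CTX {
    LIST_ENTRY Links;
    ULONG      State;
} OSR_WORK_CTX, *POSR_WORK_CTX;

NPAGED_LOOKASIDE_LIST OsrCtxLookaside;

VOID
OsrInitLookaside(VOID)     // call once, e.g. from DriverEntry
{
    ExInitializeNPagedLookasideList(&OsrCtxLookaside,
                                    NULL, NULL,           // default alloc/free
                                    0,                    // flags
                                    sizeof(OSR_WORK_CTX),
                                    'xCsO',               // pool tag
                                    0);                   // depth: OS-tuned
}

NTSTATUS
OsrDoSomeWork(VOID)        // per-operation usage
{
    POSR_WORK_CTX ctx =
        (POSR_WORK_CTX)ExAllocateFromNPagedLookasideList(&OsrCtxLookaside);

    if (ctx == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    // ... use ctx as scratch storage instead of the stack ...

    ExFreeToNPagedLookasideList(&OsrCtxLookaside, ctx);
    return STATUS_SUCCESS;
}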

Lookaside lists are not as useful for drivers that allocate variable sized buffers. The choices in
this case are either to use lookaside lists with buffers that are large enough for all (or most)
cases, or to use ExAllocatePoolWithTag. Of course, if you use a buffer that is large enough
for "most" cases, your driver will need to fall back to allocating from pool when the buffer must
be larger. Still, this can be used to optimize the "common case" while still supporting the
extreme case.

For example, in a file system driver, the maximum path name is 65534 bytes long. However,
the Win32 API itself has a much smaller internal limit (around 1KB in current versions) and
thus we might use fixed size 2KB buffers (1024 16-bit characters) and allocate larger
buffers as needed. Or perhaps we’d instrument our driver to figure out what the 90% size is
and use that as our common buffer allocation size. In some versions of Windows the size of
lookaside lists is dynamically resized in order to improve the overall performance of the system
- a technique that would certainly work within your driver as well.

Finally, it is worth noting that recursive procedures are ill-advised in the kernel environment
because stack space is so scarce. Instead, you are better advised to implement your
functions iteratively (recursion and iteration can always be substituted for one another). In
other words: write your code to use a loop, rather than recursive function calls!
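
As a trivial illustration (the types and names are invented for the example), here is the
same list walk in both forms; the iterative version uses constant stack space no matter how
long the list is:

typedef struct _NODE {
    struct _NODE *Next;
    ULONG Value;
} NODE, *PNODE;

// Recursive form: every element costs another stack frame. Ill-advised
// in the kernel.
ULONG SumListRecursive(PNODE Node)
{
    if (Node == NULL) {
        return 0;
    }
    return Node->Value + SumListRecursive(Node->Next);
}

// Iterative form: one stack frame, regardless of list length.
ULONG SumListIterative(PNODE Node)
{
    ULONG sum = 0;

    while (Node != NULL) {
        sum += Node->Value;
        Node = Node->Next;
    }
    return sum;
}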

Stack Overflow Detection
The other technique for handling stack overflow conditions is to detect when the stack is not
large enough to handle subsequent calls down to lower level functions within this driver, or out
to other drivers. The key function call here is IoGetRemainingStackSize. This function
provides the caller with information on the amount of stack space that is remaining on the
stack for the current thread (remember, stack space is maintained on a per thread basis, so
once we switch context to a different thread we will have a different stack - we’ll exploit this in
just a minute!).

So, prior to performing some stack-consuming complex operation, your driver can check to
ensure there is sufficient stack space (a minimal sketch of such a check appears after the
list below). If the remaining stack is not enough (and "not enough" is a value likely to be
adjusted based upon your experience running your driver in a variety of environments), then
you can take appropriate measures. Such "appropriate measures" might include:

          Failing the request (STATUS_INSUFFICIENT_RESOURCES) and having the caller
           perform recovery, much like they would need to do in any other resource exhaustion
           case.
          Posting the work item to a worker thread; this trick essentially leverages the fact that a
           different thread will have a new stack to work with.
          Taking an alternative implementation path that conserves stack space.
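
Here is the check itself; the threshold is hypothetical and is exactly the sort of value you
would tune from experience:

#define STACK_THRESHOLD 0x800           // hypothetical "not enough" value

NTSTATUS DoComplexOperation(VOID)
{
    if (IoGetRemainingStackSize() < STACK_THRESHOLD) {
        // Take one of the measures above. The simplest: fail the request.
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    // ... sufficient stack remains: proceed with the deep call chain ...
    return STATUS_SUCCESS;
}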

In addition, we’ve seen one or two cases where kernel drivers allocate and switch to their own
stack. This approach is one we do not encourage - it is difficult to implement and has some
serious restrictions on calling back into the OS (the rest of the OS can’t use this alternative
stack safely). But, because you will sometimes see it used we wanted to mention that it is an
option other developers have pursued.

While failing the request is simple, it often leads to unsatisfactory results because the error
might manifest to the user in a rather unsavory fashion - for example, some application might
terminate prematurely, or print some unfathomable error code. If stack overflow is a rare
circumstance, this can be an excellent approach; but if it is a common occurrence in your
driver’s environment, we suggest that this is not a good solution.

The second approach of posting to a worker thread can be very effective, although we note
that a worker thread must not post into its own work queue, because that can lead to
deadlocks - a case where the cure is as bad as the disease! For example, both NTFS and
FAT will post work items to a worker thread when they detect low stack conditions. There is a
dedicated thread in the file system runtime library (the File System Stack Overflow Thread)
that handles these requests. You could construct such a worker thread within your own driver
as well, spawning it in your DriverEntry function, for instance. Another alternative is to use one
of the system work queues (again, provided that your driver cannot already be running on that
queue in the code where you check for stack overflow). To use a work item a driver:


          Allocates the work item by calling IoAllocateWorkItem;
          Allocates the context structure (if needed) to pass to the work routine;
          Calls IoQueueWorkItem;
          Waits for the work item to complete (this is optional, but is generally done when
           a work item is queued to handle stack overflow);
          Frees the work item in the work routine by calling IoFreeWorkItem.
A file system or file system filter driver may use the Ex versions of the work routines, but
because they are not safe for unloadable drivers, they are deprecated for use in normal
drivers.
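
Putting those steps together, here is a sketch of the pattern; the context structure and the
routine names are our own inventions:

typedef struct _OVERFLOW_CONTEXT {
    PIO_WORKITEM WorkItem;   // so the work routine can free it
    KEVENT Done;             // the original thread waits on this
    NTSTATUS Status;         // result handed back to the original thread
} OVERFLOW_CONTEXT, *POVERFLOW_CONTEXT;

VOID OverflowWorkRoutine(PDEVICE_OBJECT DeviceObject, PVOID Context)
{
    POVERFLOW_CONTEXT ctx = (POVERFLOW_CONTEXT) Context;

    // Perform the stack-hungry processing here, on this worker thread's
    // fresh stack...
    ctx->Status = STATUS_SUCCESS;

    IoFreeWorkItem(ctx->WorkItem);                     // free in the work routine
    KeSetEvent(&ctx->Done, IO_NO_INCREMENT, FALSE);    // wake the original thread
}

NTSTATUS PostAndWaitForOverflowWork(PDEVICE_OBJECT DeviceObject)
{
    OVERFLOW_CONTEXT ctx;                              // context on our stack

    ctx.WorkItem = IoAllocateWorkItem(DeviceObject);   // allocate the work item
    if (ctx.WorkItem == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    KeInitializeEvent(&ctx.Done, NotificationEvent, FALSE);

    IoQueueWorkItem(ctx.WorkItem, OverflowWorkRoutine, // queue it
                    DelayedWorkQueue, &ctx);

    // A KernelMode wait also keeps our stack (and thus ctx) resident
    // while the work routine runs.
    KeWaitForSingleObject(&ctx.Done, Executive, KernelMode, FALSE, NULL);

    return ctx.Status;
}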

Once running inside the work routine, your driver will complete the processing. Keep in mind
that there are some potential complications when posting to a worker thread in this fashion
including:

          Allocating memory when remaining stack is low might trigger stack overflow. We’ve
           seen cases where a driver has itself caused a stack overflow when trying to create the
           needed work item. This can happen in particular when driver verifier is enabled,
           because driver verifier uses additional stack space when your driver calls
           ExAllocatePoolWithTag!
          Serialization (locking) can become more complex in cases where your driver posts to
           a worker thread. One technique we’ve used in such cases is to indicate in the work
           item that routines should operate with the understanding that the necessary locks are
           held (by the original thread). That original thread is blocked waiting for the work item to
           complete execution. In this fashion we preserve the interlocking guarantees and can
           still handle stack overflow conditions. Getting this correct is not impossible, but may
           require some additional coding to ensure that it works. The typical symptom of
           incorrect serialization is either deadlock or data corruption, neither of which is
           desirable.
          Do not use the hypercritical queue thread. While this sounds wonderful, it isn’t:
           this thread is needed for certain OS-critical functions, and using it directly within
           your driver can cause unexpected system failures.

If you have problems overflowing the stack while allocating a work item to post to your worker
thread, we advise either allocating the work item earlier in your driver (so you have more stack
available when you are calling ExAllocatePoolWithTag) or minimizing your stack usage in
some other fashion before calling ExAllocatePoolWithTag!

For drivers that use a stack overflow thread, we would suggest that this should be an
uncommon enough case that a single thread is more than adequate. If this is not the case,
then you need to go back and find ways to save on stack space.

By combining these two techniques (minimizing stack usage and stack overflow detection) you
can minimize those nasty 0x7F bugchecks. However, even using these techniques, it is
always possible that a new combination of drivers within a single stack will lead to a stack
overflow condition. When this happens, it will require cooperation among the drivers, each
implementing these techniques to mitigate the total stack usage.

The storage stack suffers the most from stack overflow conditions. Future changes in
Windows will help minimize stack usage in the storage stack. For example, the new file system
filter driver model (mini-filters) in combination with the Filter Manager, will minimize the
amount of stack space used as more filters are added to the stack. This is because the filter
manager calls each mini-filter in turn, rather than having one filter call the next filter - an
implementation of converting "recursion" into "iteration" and thereby decreasing the overall
stack utilization.

Future versions of Windows will no doubt harbor new challenges in this area - as the number
of components in the various driver stacks increase, the need to save stack space will also
increase. The techniques we have described here should help driver writers for many years to
come to minimize their stack utilization, leading to a more stable Windows platform and better
user experience.
Service Pack or Dot Release? -- Test With XP SP2 Now
The NT Insider, Vol 11, Issue 2, Mar-Apr 2004 | Published: 03-May-04| Modified: 03-May-04




Unless you’ve been vacationing on Mars, you’ve certainly heard that Windows XP
Service Pack 2 (XP SP2) will be released later this year. "What do I care?" you say. "I write
drivers, I don’t worry about service pack releases."

Sorry, wrong answer, chum. XP SP2 is more like a dot release for XP than an ordinary service
pack. The good news is that if you write quality drivers, and test them thoroughly with Driver
Verifier, you probably won’t have any problems with XP SP2. On the other hand, if you like
living on the edge and bypassing the operating system whenever you feel like it, you might
have some work ahead of you.

Let’s look at the two changes in XP SP2 that are most significant to driver writers.

Pool Overrun Checking

If you’ve read the memo from Hector on OSR Online® you already know about this new feature.
Starting with XP SP2, ExFreePoolWithTag (you are using the "WithTag" variants of the pool
allocation functions, right?) was enhanced to check the integrity of the header used by the pool
block following the one being returned. If the header is corrupted, the system will immediately
blue screen. I think we can all agree that catching pool corruption is a good thing.
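
To make that concrete, here is a deliberately contrived sketch (tag and sizes invented) of
the kind of off-by-one overrun this check now catches at free time:

VOID OverrunExample(VOID)
{
    PUCHAR buffer = (PUCHAR) ExAllocatePoolWithTag(NonPagedPool, 16, 'gBeD');

    if (buffer != NULL) {
        RtlZeroMemory(buffer, 17);          // BUG: one byte past the block,
                                            // stomping the next pool header
        ExFreePoolWithTag(buffer, 'gBeD');  // XP SP2 validates the following
                                            // header here - and bugchecks
    }
}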

What you might not realize is that during beta testing, a number of drivers were discovered that
blue screened because of this new pool corruption check. The only problem was that they blue
screened while running at customer sites. Personally, I’d say this is good for the driver
community, and pretty bad for the dev who wrote and shipped those drivers without testing
them first.

If you regularly run your drivers with Driver Verifier enabled, including special pool, you’re
doing your due diligence. Your drivers are almost certainly safe. But do take a few minutes to
try out your driver code under XP SP2 just to be sure.

As for those devs who don’t test their drivers with Verifier before they ship them, well…. You
boneheads don’t read this publication anyhow. I mean, why would you take the time to read a
free publication that might help you increase the quality of your drivers, if you don’t even use
the free tools that are there to help you avoid turning your customers into QA testers?

No Execute Page Support
I bet you’re curious about how setting "no execute" protection on data pages could possibly
make a difference to driver writers. I guess it’s possible that you have a driver that
generates code at run time and then executes it, but I’m thinkin’ that this isn’t likely to be an
issue.

No, friends, the real problem for certain drivers is what has to be done to enable no execute
protection. NX is only available on processors that support it and then only when the system is
run in Physical Address Extension (PAE) mode.

Presently, the only processors that support NX are the AMD Athlon64 and Opteron. And, yes,
we’re talking about running these processors in their 32-bit mode (in case you haven’t noticed,
these processors are among the most popular CPUs on the market these days - even without
a shipping version of Windows 64).

PAE mode expands the addressing capability of the processor from 32 bits of physical
memory to 36 bits. It does this by increasing the size of the page table and page directory
entries to 64 bits, and adding a third level to the page table hierarchy. PAE mode was
previously only supported on Windows Server products, and only enabled on 32-bit Windows
systems that had more than 4GB of physical memory installed.

What this means is that starting with XP SP2, there will be 32-bit Windows XP systems with
less than 4GB of physical memory that will be run in PAE mode so that NX can be used. This
creates problems for drivers that:

         Think they’re smart enough to directly manipulate page tables ("I’m running on XP, or
          a system with less than 4GB of physical memory, therefore I know the PTE layout.")

         Check to see if the system is running in PAE mode and, if so, do something
          "different."

I don’t even want to think about why some drivers may try to understand the layout of the
system page tables. But apparently, there are drivers that do. It makes me feel dirty. But, if you
do it… now at least you’ve been warned.

There may be drivers that check to see if PAE mode is enabled and then refuse to load. This is
usually because the driver has been written with certain inherent restrictions, such as no
support for DMA to addresses longer than 32 bits. These drivers will have to be modified to
take into account the fact that PAE mode can be enabled on systems with less than 4GB of
physical memory. Note that we’re talking only about driver restrictions here. Whether or not
your device is 64-bit DMA capable is an entirely different story (the HAL has always handled
that for a well written driver).

The good news here is that if your driver doesn’t play with the page tables, or explicitly check
for PAE mode so that it can change its behavior, you’re likely to have no problems with XP
SP2. You might also be pleased to know that enabling PAE mode in Windows XP does not
cause intermediate buffering of 32-bit DMA transfers that would previously have been
unbuffered, nor does it further limit the number of map registers that are available to your
driver. You can thank the guys who write the HAL for these changes. Oh, one more thing:
Even when PAE mode is enabled, Windows XP systems are still limited to less than 4GB of
physical memory.

Summary

To sum up, if your driver is well written and well tested, you probably don’t have anything to
worry about. I don’t know how many times we have to say this: If you follow the rules, write a
well-engineered driver, don’t try cute tricks to bypass the operating system, and test your
driver thoroughly using Driver Verifier, your driver problems will be minimized. No matter: if I
were you, I’d probably get a couple of AMD systems, install XP with SP2, and make sure my
driver works the way it should. Here at OSR, we’ve already done that - we know we’re in the
clear.
Finding File Contents in Memory
The NT Insider, Vol 11, Issue 1, Jan-Feb 2004 | Published: 12-Mar-04| Modified: 19-Mar-04




In our debug classes, I always invite students to bring in crash dumps for us to analyze. This
allows people to see that the analysis we do during the course is not just canned, but rather
really reflects skills that can be applied to "real world" crash analysis. In one recent class, a
student brought in a dump in which their filter driver had crashed when accessing a file. The
question the student asked was "how can I tell which file was being accessed?" Once we
tracked that down, the student asked "how can I see the contents of the file - surely it is in
memory?"

Since this is an interesting exercise - and one that seems generally useful - I’ll walk through
that process in this article. Thus, the next time you need to track down a file in memory, you
can do so.

Before walking through the mechanism, it is important to understand the basic relationship of
files to the data structures used to track them. The key data structures for this analysis are:

          FILE_OBJECT - This is the object used by Windows to track a single open instance of
          a file. Thus, if you have multiple open instances of a file, you have multiple file objects.

          SECTION_OBJECT_POINTERS - This is the data structure that connects the specific
          FILE_OBJECT to the virtual memory control structures that keep track of the file
          contents when they are in memory - and allow Windows to fetch those contents when
          they are not.

          CONTROL_AREA - This is the data structure used by the Memory Manager to track
          the state information for a given section.

          SHARED_CACHE_MAP - This is the data structure used by the Cache Manager to
          track the state of individual cached regions of the file. There can be many of these for
          a single file, with each one describing a cached region of the file.

          VACB - This is the data structure used by the Cache Manager to describe the virtual
          address region in use within the cache to access data within a cached region.

All of these data structures are available in the public type information provided in current
versions of Windows - and thus we have everything that we need to find the virtual addresses
used for the cache. With those virtual addresses, it is trivial for us to find the actual file data
itself.
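
For orientation, the chain we are about to walk looks like this (this summary is ours, not
debugger output):

FILE_OBJECT.SectionObjectPointer --> SECTION_OBJECT_POINTERS
    .DataSectionObject --> CONTROL_AREA --> FilePointer (FILE_OBJECT used for paging)
    .SharedCacheMap    --> SHARED_CACHE_MAP --> VACB(s) --> BaseAddress (cached data)
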
In general, we either find the FILE_OBJECT by using the value in the I/O Request Packet (IRP)
or by using a handle and looking up the value in the object handle table. Thus, we can either
use the !irp command (to display an IRP and its stack locations) or we can use the !handle
command. In either case, we use the resulting information. Here, I’ve chosen to use the handle
and I’ve picked the "Microsoft Word" process that I was using to write the article itself. To do
this I used the command "!handle 0 3 86304020". The first argument indicates I want to see all
of the handles for the process (I am not sure which handle I want yet). The second is the flags
field, which indicates I want information for each handle (the default is 1, which would have
only provided me with rudimentary information; the additional flag means I am also shown the
names of the various objects). The third argument is the address of the process itself (so the
handle command uses the correct handle table). From this I am able to look through the table
to find the handle for this file:

0cf0: Object: 85df1028            GrantedAccess: 0012019f
Object: 85df1028            Type: (86fdbe70) File
     ObjectHeader: 85df1010
            HandleCount: 1        PointerCount: 1
            Directory Object: 00000000            Name: \Documents and
Settings\mason\My Documents\Files in Memory.doc {HarddiskVolume1}


This gives me the address of the file object (85df1028) which I then use to display information
about the specific file object:

lkd> dt nt!_FILE_OBJECT 85df1028
    +0x000 Type                      : 5
    +0x002 Size                      : 112
    +0x004 DeviceObject              : 0x86f22cc0
    +0x008 Vpb                       : 0x86f22c38
    +0x00c FsContext                 : 0xe3d92d90
    +0x010 FsContext2                : 0xe16f35b0
    +0x014 SectionObjectPointer : 0x85f50c6c
    +0x018 PrivateCacheMap           : (null)
    +0x01c FinalStatus               : 0
    +0x020 RelatedFileObject : (null)
    +0x024 LockOperation             : 0x1 ''
    +0x025 DeletePending             : 0 ''
    +0x026 ReadAccess                : 0x1 ''
    +0x027 WriteAccess               : 0x1 ''
    +0x028 DeleteAccess              : 0 ''
    +0x029 SharedRead                : 0x1 ''
    +0x02a SharedWrite               : 0x1 ''
    +0x02b SharedDelete              : 0 ''
    +0x02c Flags                     : 0x40042
    +0x030 FileName                  : _UNICODE_STRING "\Documents and
Settings\mason\My Documents\Files in Memory.doc"
    +0x038 CurrentByteOffset : _LARGE_INTEGER 0x0
    +0x040 Waiters                     : 0
    +0x044 Busy                        : 0
    +0x048 LastLock                    : 0x860c35a0
    +0x04c Lock                        : _KEVENT
    +0x05c Event                       : _KEVENT
    +0x06c CompletionContext : (null)


From this I can find the SECTION_OBJECT_POINTERS structure (85f50c6c) which once
again I can display using the debugger:

lkd> dt nt!_SECTION_OBJECT_POINTERS 0x85f50c6c
    +0x000 DataSectionObject : 0x869ed9d8
    +0x004 SharedCacheMap              : (null)
    +0x008 ImageSectionObject : (null)


This indicates to me that the file has a data section object but no corresponding shared
cache map (which means this file instance has not been set up for caching) nor image section
object (which means it is not an executable image). Now for the tricky part. The section object
actually points to a CONTROL_AREA. I can take the address listed as the address of the
DataSectionObject and see it in the debugger:

kd> dt nt!_CONTROL_AREA 0x869ed9d8
    +0x000 Segment                     : 0xe34fd698
    +0x004 DereferenceList             : _LIST_ENTRY [ 0x0 - 0x0 ]
    +0x00c NumberOfSectionReferences : 1
    +0x010 NumberOfPfnReferences : 7
    +0x014 NumberOfMappedViews : 1
    +0x018 NumberOfSubsections : 1
    +0x01a FlushInProgressCount : 0
    +0x01c NumberOfUserReferences : 2
    +0x020 u                           : __unnamed
    +0x024 FilePointer                 : 0x85f84478
    +0x028 WaitingForDeletion : (null)
    +0x02c ModifiedWriteCount : 0
    +0x02e NumberOfSystemCacheViews : 0


Notice that the FilePointer does not point back to the same file object. Recall that a
FILE_OBJECT is just an open instance, not the actual file. The Memory Manager, however,
must use one specific FILE_OBJECT in order to perform paging I/O - and this is the file object
it will use. If I display information about this object you will notice something interesting:

lkd> !object 0x85f84478
Object: 85f84478          Type: (86fdbe70) File
     ObjectHeader: 85f84460
     HandleCount: 0          PointerCount: 1
     Directory Object: 00000000                Name: \Documents and Settings\mason\My
Documents\~WRL2542.tmpory.doc {HarddiskVolume1}


Notice that the name of this file is different than the name of the original file. If I display this
using the dt command I can see the full file structure:

lkd> dt nt!_FILE_OBJECT 0x85f84478
    +0x000 Type                        : 5
    +0x002 Size                        : 112
    +0x004 DeviceObject                : 0x86f22cc0
    +0x008 Vpb                         : 0x86f22c38
    +0x00c FsContext                   : 0xe3d92d90
    +0x010 FsContext2                  : 0xe1a706e8
    +0x014 SectionObjectPointer : 0x85f50c6c
    +0x018 PrivateCacheMap             : (null)
    +0x01c FinalStatus                 : 0
    +0x020 RelatedFileObject : (null)
    +0x024 LockOperation               : 0 ''
    +0x025 DeletePending               : 0 ''
    +0x026 ReadAccess                  : 0x1 ''
    +0x027 WriteAccess                 : 0 ''
    +0x028 DeleteAccess                : 0 ''
    +0x029 SharedRead                  : 0x1 ''
    +0x02a SharedWrite                 : 0x1 ''
    +0x02b SharedDelete                : 0x1 ''
    +0x02c Flags                       : 0x144042
    +0x030 FileName                    : _UNICODE_STRING "\Documents and
Settings\mason\My Documents\~WRL2542.tmpory.doc"
    +0x038 CurrentByteOffset : _LARGE_INTEGER 0x200
    +0x040 Waiters                     : 0
    +0x044 Busy                        : 0
    +0x048 LastLock                    : (null)
    +0x04c Lock                        : _KEVENT
    +0x05c Event                       : _KEVENT
    +0x06c CompletionContext : (null)


Notice that the SectionObjectPointers field has the same address as the other file object -
these really are two instances of the same file, albeit opened using different names! Since we
do not have a cache map, this file is not stored in the Cache Manager at all (this indicates the
file was memory mapped). Thus, we cannot walk through the cache structures to find the
contents of this file because it is not in the file system data cache.
Let’s do this again, this time using a file from a web browser. I open up www.osronline.com
and look through the Internet Explorer process. From there I choose FILE_OBJECT
8606bea8:

lkd> dt nt!_FILE_OBJECT 8606bea8
   +0x000 Type                       : 5
   +0x002 Size                       : 112
   +0x004 DeviceObject               : 0x86f22cc0
   +0x008 Vpb                        : 0x86f22c38
   +0x00c FsContext                  : 0xe1d32710
   +0x010 FsContext2                 : 0xe18a25b8
   +0x014 SectionObjectPointer : 0x86cae074
   +0x018 PrivateCacheMap            : (null)
   +0x01c FinalStatus                : 0
   +0x020 RelatedFileObject : (null)
   +0x024 LockOperation              : 0 ''
   +0x025 DeletePending              : 0 ''
   +0x026 ReadAccess                 : 0x1 ''
   +0x027 WriteAccess                : 0x1 ''
   +0x028 DeleteAccess               : 0 ''
   +0x029 SharedRead                 : 0x1 ''
   +0x02a SharedWrite                : 0x1 ''
   +0x02b SharedDelete               : 0 ''
   +0x02c Flags                      : 0x140042
   +0x030 FileName                   : _UNICODE_STRING "\Documents and
Settings\mason\Cookies\index.dat"
   +0x038 CurrentByteOffset : _LARGE_INTEGER 0x0
   +0x040 Waiters                    : 0
   +0x044 Busy                       : 0
   +0x048 LastLock                   : (null)
   +0x04c Lock                       : _KEVENT
   +0x05c Event                      : _KEVENT
   +0x06c CompletionContext : (null)


And from this I look at the SectionObjectPointer field:

lkd> dt nt!_SECTION_OBJECT_POINTERS 0x86cae074
   +0x000 DataSectionObject : 0x86e3e1a8
   +0x004 SharedCacheMap             : 0x86413008
   +0x008 ImageSectionObject : (null)


Notice that this time I do have a shared cache map. Let’s look at the data section object again:

lkd> dt nt!_CONTROL_AREA 0x86e3e1a8
   +0x000 Segment                    : 0xe1bae688
   +0x004 DereferenceList           : _LIST_ENTRY [ 0x0 - 0x0 ]
   +0x00c NumberOfSectionReferences : 2
   +0x010 NumberOfPfnReferences : 0x2a
   +0x014 NumberOfMappedViews : 0xe
   +0x018 NumberOfSubsections : 2
   +0x01a FlushInProgressCount : 0
   +0x01c NumberOfUserReferences : 0xd
   +0x020 u                         : __unnamed
   +0x024 FilePointer               : 0x86cce290
   +0x028 WaitingForDeletion : (null)
   +0x02c ModifiedWriteCount : 0
   +0x02e NumberOfSystemCacheViews : 2


Once again, this does not point back to the same file object:

lkd> dt nt!_FILE_OBJECT 0x86cce290
   +0x000 Type                      : 5
   +0x002 Size                      : 112
   +0x004 DeviceObject              : 0x86f22cc0
   +0x008 Vpb                       : 0x86f22c38
   +0x00c FsContext                 : 0xe1d32710
   +0x010 FsContext2                : 0xe2ae4938
   +0x014 SectionObjectPointer : 0x86cae074
   +0x018 PrivateCacheMap           : (null)
   +0x01c FinalStatus               : 0
   +0x020 RelatedFileObject : (null)
   +0x024 LockOperation             : 0 ''
   +0x025 DeletePending             : 0 ''
   +0x026 ReadAccess                : 0x1 ''
   +0x027 WriteAccess               : 0 ''
   +0x028 DeleteAccess              : 0 ''
   +0x029 SharedRead                : 0x1 ''
   +0x02a SharedWrite               : 0x1 ''
   +0x02b SharedDelete              : 0x1 ''
   +0x02c Flags                     : 0x144042
   +0x030 FileName                  : _UNICODE_STRING "\Documents and
Settings\mason\Cookies\index.dat"
   +0x038 CurrentByteOffset : _LARGE_INTEGER 0x1000
   +0x040 Waiters                   : 0
   +0x044 Busy                      : 0
   +0x048 LastLock                  : (null)
   +0x04c Lock                      : _KEVENT
   +0x05c Event                     : _KEVENT
   +0x06c CompletionContext : (null)
This is the file object used for paging - notice that in this case the file name is the same for both
file objects. Similar to the last example, and yet different. This time we also have the shared
cache map, so we display that information:

lkd> dt nt!_SHARED_CACHE_MAP 0x86413008
    +0x000 NodeTypeCode               : 767
    +0x002 NodeByteSize               : 304
    +0x004 OpenCount                  : 1
    +0x008 FileSize                   : _LARGE_INTEGER 0x48000
    +0x010 BcbList                    : _LIST_ENTRY [ 0x86413018 - 0x86413018 ]
    +0x018 SectionSize                : _LARGE_INTEGER 0x100000
    +0x020 ValidDataLength            : _LARGE_INTEGER 0x48000
    +0x028 ValidDataGoal              : _LARGE_INTEGER 0x48000
    +0x030 InitialVacbs               : [4] 0x86face00
    +0x040 Vacbs                      : 0x86413038         -> 0x86face00
    +0x044 FileObject                 : 0x866068d8
    +0x048 ActiveVacb                 : 0x86face00
    +0x04c NeedToZero                 : (null)
    +0x050 ActivePage                 : 0
    +0x054 NeedToZeroPage             : 0
    +0x058 ActiveVacbSpinLock : 0
    +0x05c VacbActiveCount            : 1
    +0x060 DirtyPages                 : 0
    +0x064 SharedCacheMapLinks : _LIST_ENTRY [ 0x8640b06c - 0x862f706c ]
    +0x06c Flags                      : 0x1000
    +0x070 Status                     : 0
    +0x074 Mbcb                       : (null)
    +0x078 Section                    : 0xe2d329f0
    +0x07c CreateEvent                : (null)
    +0x080 WaitOnActiveCount : (null)
    +0x084 PagesToWrite               : 0
    +0x088 BeyondLastFlush            : 0
    +0x090 Callbacks                  : 0xf7479c2c
    +0x094 LazyWriteContext : 0xe1d32710
    +0x098 PrivateList                : _LIST_ENTRY [ 0x8641312c - 0x8641312c ]
    +0x0a0 LogHandle                  : (null)
    +0x0a4 FlushToLsnRoutine : (null)
    +0x0a8 DirtyPageThreshold : 0
    +0x0ac LazyWritePassCount : 0
    +0x0b0 UninitializeEvent : (null)
    +0x0b4 NeedToZeroVacb             : (null)
    +0x0b8 BcbSpinLock                : 0
    +0x0bc Reserved                   : (null)
    +0x0c0 Event                      : _KEVENT
    +0x0d0 VacbPushLock                  : _EX_PUSH_LOCK
    +0x0d8 PrivateCacheMap               : _PRIVATE_CACHE_MAP


Notice that there is a single VACB for this file: 0x86face00. We can use this to display the
memory region used in the cache to map this file:

lkd> dt nt!_VACB 0x86face00
    +0x000 BaseAddress                   : 0xd4680000
    +0x004 SharedCacheMap                : 0x86413008
    +0x008 Overlay                       : __unnamed
    +0x010 LruList                       : _LIST_ENTRY [ 0x86facfa8 - 0x86facfc0 ]


And this provides us with the memory location we can use to look at the actual data contents of
the file itself:

lkd> db d4680000
d4680000  43 6c 69 65 6e 74 20 55-72 6c 43 61 63 68 65 20  Client UrlCache 
d4680010  4d 4d 46 20 56 65 72 20-35 2e 32 00 00 80 04 00  MMF Ver 5.2.....
d4680020  00 40 00 00 80 08 00 00-22 03 00 00 00 00 00 00  .@......".......
d4680030  00 00 80 00 00 00 00 00-00 b0 14 00 00 00 00 00  ................
d4680040  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
d4680050  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
d4680060  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
d4680070  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................


Once you have used this technique a few times, finding the actual data contents in memory is
really rather straightforward. Of course, there are some caveats of which you should be
aware:

          A file need not be stored in the Cache Manager. This is often going to be the case for
          files that are memory mapped (e.g., the files accessed by Microsoft Word or Notepad).
          Of course, mixed access could easily cause the file to be stored in cache.

          A file need not be entirely mapped in the cache. In current versions of Windows the
          Cache Manager uses 256KB regions for mapping the file. Thus, if a file is larger than
          256KB, only some portions may be mapped into the cache.

          Virtual addresses need not be present in physical memory. Just because the cache
          region is defined there is no requirement that the region contain data. If there is data
          present, it is the file data (within the size constraints of the file, of course).

Happy Debugging!


User Comments


"the above comment"
Unfortunately the formatting got messed up, but you should still be able to understand it


02-Apr-04, Stephen Cole (xxxx@microsoft.com)




"!ca and finding the data pages"
In the mapped file case, you should be able to !ca on the control area object and get info on
the memory segment and subsection(s) which contain the page numbers where the data is

eg !ca 86e3e1a8
… Pfn Ref 8 Segment @ xxxxxxxx:
Subsection 1. @ 86e3e1d8   ControlArea: 86e3e1a8   Base Pte e1ac1580

dd the Base Pte to see a list of 'page numbers' in the prototype pte structure:

e1ac1580  02f118c0 181d38c0 199148c0 18f028c0  .....8...H...(..
e1ac1590  1a4238c0 1aa248c0 1a1458c0 036c68c0  .8B..H...X...hl.
e1ac15a0  922a9cd6 922a9cd6 922a9cd6 922a9cd6  ..*...*...*...*.

These entries are effectively 'page numbers' / physical pages (bottom 3 nibbles are flags)

Can look at the data in the physical pages with !dc 02f11000

NumberOfPfnReferences / Pfn Ref usually indicates the number of pages of data it is working
on (8 pages in my example above)
NB In this example, anything page related is for 32-bit x86 with 4K page size. Different page
sizes or architectures may be somewhat different.


01-Apr-04, Stephen Cole (xxxx@microsoft.com)
Debugging A Sound Driver
The NT Insider, Vol 11, Issue 1, Jan-Feb 2004 | Published: 12-Mar-04| Modified: 19-Mar-04



This month’s random crash dump came from a developer’s system here at OSR. He was,
innocently enough, programming along and listening to a CD (on his computer, of course)
when his system crashed. Of course, being kernel developers we are always looking for
decent crash dumps, so his system produced one (a kernel summary dump) for him. In this
article we’ll analyze that dump, talk about the bug we found, and describe our resolution of it.

Of course, by now we all know the first step in analyzing any crash dump is to use the
"!analyze" command. While it is not always right, it generally provides us with the right place to
start (See below).

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                            Bugcheck Analysis                                *
*                                                                             *
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
An exception code of 0x80000002 (STATUS_DATATYPE_MISALIGNMENT) indicates
that an unaligned data reference was encountered. The trap frame will
supply additional information.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: f6c32190, The address that the exception occurred at
Arg3: ba9be69c, Trap Frame
Arg4: 00000000

Debugging Details:
------------------

Debugger Dbgportaldb Connection::Open failed 80004005
Database Dbgportaldb not connected
ADO ERROR 80004005,11: [DBNETLIB][ConnectionOpen (Connect()).]SQL Server
does not exist or access denied.

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx"
referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
smwdm+b190
f6c32190 ff37                  push      dword ptr [edi]

TRAP_FRAME:  ba9be69c -- (.trap ffffffffba9be69c)
ErrCode = 00000000
eax=8675e2ec ebx=ba9be73f ecx=00000000 edx=00000000 esi=85d1b3c8 edi=00000000
eip=f6c32190 esp=ba9be710 ebp=ba9be740 iopl=0         nv up ei pl zr na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010246
smwdm+0xb190:
f6c32190 ff37             push    dword ptr [edi]    ds:0023:00000000=????????
Resetting default scope

DEFAULT_BUCKET_ID:      DRIVER_FAULT

BUGCHECK_STR:    0x8E

LAST_CONTROL_TRANSFER:      from f6c3164c to f6c32190

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
ba9be740 f6c3164c 00000002 8675e2f8 ba9be7a4 smwdm+0xb190
ba9be760 f6c0d844 85d1b3c8 00000002 00000002 smwdm+0xa64c
ba9be780 f6c21760 8675e2f8 00000002 00000003 portcls!CPortPinWavePci::SetDeviceState+0x41
ba9be7b8 f6c21e8e 85d93c08 00000003 86470e48 portcls!CPortPinWavePci::DistributeDeviceState+0x4d
ba9be7d4 f6cdf812 86c82f48 85c3d380 85c3d378 portcls!CPortPinWavePci::DeviceIoControl+0x27f
ba9be840 f6cdf7fd 86470e48 00000004 e197f05c ks!KspPropertyHandler+0x625
ba9be860 f6c14f20 86470e48 00000004 e197f048 ks!KsPropertyHandler+0x17
ba9be870 f6c21def 86470e48 00000004 e197f048 portcls!PcHandlePropertyWithTable+0x1a
ba9be8a8 f6c14ec8 8675e2e8 86c86030 86470e48 portcls!CPortPinWavePci::DeviceIoControl+0x1e0
ba9be8c0 f6ce5633 86c86030 86470e48 ba9be90c portcls!DispatchDeviceIoControl+0x44
ba9be8d0 f6c1c7c9 86c86030 86470e48 86432d38 ks!KsDispatchIrp+0xa3
ba9be8e4 f6c1ca2c 86c86030 86470e48 f6c88940 portcls!KsoDispatchIrp+0x40
ba9be8f0 f6c88940 86c86030 86470e48 86432d38 portcls!PcDispatchIrp+0x2a
ba9be90c 804eb3c1 86c86030 86470e48 86c86e90 smwdm+0x61940
ba9be91c f7b033f3 86432d38 86c86dd8 86c87c88 nt!IopfCallDriver+0x31
ba9be968 f77a27e7 86432d38 00000000 002f0003 aeaudio+0x3f3
ba9be9b0 f77a317e 86432d38 00000000 00000002 sysaudio!PinConnectionProperty+0x40
ba9be9dc f77a3d04 00000002 00000002 00000006 sysaudio!CPinNodeInstance::SetState+0xff
ba9be9f8 f77a3df7 00000002 00000003 00000006 sysaudio!CConnectNodeInstance::SetStateBottomUp+0x24
ba9bea18 f77a3e43 00000002 00000003 00000006 sysaudio!CStartNodeInstance::SetStateBottomUp+0x27
ba9bea38 f77a224f 00000002 00000006 8682b8d0 sysaudio!CStartNodeInstance::SetState+0x69
ba9bea4c f6cdf812 e2d6b4f0 8682b8d0 8682b8c8 sysaudio!CPinInstance::PinStateHandler+0x5b
ba9beab8 f6cdf7fd 85f7ee48 00000003 f77a0098 ks!KspPropertyHandler+0x625
ba9bead8 f77a21d8 85f7ee48 00000003 f77a0098 ks!KsPropertyHandler+0x17
ba9beb2c f6cdedfe 86a479c8 85f7ee48 804eb3c1 sysaudio!CPinInstance::PinDispatchIoControl+0x115
ba9beb38 804eb3c1 86a479c8 85f7ee48 85f7ee48 ks!DispatchDeviceIoControl+0x25
ba9beb48 f6cdf002 ba901350 ba9bebb4 00000000 nt!IopfCallDriver+0x31
ba9beb74 ba8f3fb5 86e53e70 00000000 002f0003 ks!KsSynchronousIoControlDevice+0xbb
ba9bebbc ba8f356b 86e53e70 ba901340 00000000 wdmaud!PinProperty+0x49
ba9bebd8 ba8fc56f 86e53e70 00000002 86cd0efc wdmaud!StatePin+0x19
ba9bebf8 ba8f40e9 85c3688c 86cd0ee8 00000002 wdmaud!StateWavePin+0x65
ba9bec14 ba8f4045 8608ae48 85c36000 00000000 wdmaud!Dispatch_State+0x152
ba9bec40 804eb3c1 00000000 85c36000 806bb2cc wdmaud!SoundDispatch+0x2e6
ba9bec50 805644d2 8608aedc 864b0728 8608ae48 nt!IopfCallDriver+0x31
ba9bec64 805651f6 86a55978 8608ae48 864b0728 nt!IopSynchronousServiceTail+0x5e
ba9bed00 8055e288 000001cc 000004bc 00000000 nt!IopXxxControlFile+0x5a6
ba9bed34 805306a4 000001cc 000004bc 00000000 nt!NtDeviceIoControlFile+0x28
ba9bed34 7ffe0304 000001cc 000004bc 00000000 nt!KiSystemService+0xc9
0106ff48 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0x4



FOLLOWUP_IP:
smwdm+b190
f6c32190 ff37                       push        dword ptr [edi]

SYMBOL_STACK_INDEX:          0

FOLLOWUP_NAME:        waddext

SYMBOL_NAME:       smwdm+b190

MODULE_NAME:       smwdm

IMAGE_NAME:       smwdm.sys

DEBUG_FLR_IMAGE_TIMESTAMP:              3cf3d814

STACK_COMMAND:        .trap ffffffffba9be69c ; kb

BUCKET_ID:      0x8E_smwdm+b190

Followup: waddext
---------


As is so often the case for those of us out in the real world, we don’t have symbols for this third
party driver. Thus, we have to work through this dump, doing a bit of careful analysis in order
to figure out what actually happened.

Generally, when I approach a dump like this I focus on the trap frame, since that is the point at
which the original exception arose. In this case the trap frame points to our faulting instruction
(ah, but we got that from the !analyze in the first place):

f6c32190 ff37             push    dword ptr [edi]    ds:0023:00000000=????????


Thus, the contents of EDI do not point to a valid memory location. By itself, of course, this
is not enough to tell us why this pointer was invalid in the first place. To do that, we want to see
the sequence of instructions leading up to this point. Without symbols, we don’t know where
this function even started, so we can either back up a random amount or we can figure out
where the function started.

To determine where the current function started, we can look at the address called by the
previous frame. Eventually, that will lead to a call into our current function - often this is the
direct call, though because of optimization we might have missed some call frames.

In this case the previous call frame indicates the call is from f6c31647:

f6c31647 e8f80b0000                   call       smwdm+0xb244 (f6c32244)
f6c3164c 83be1006000001               cmp        dword ptr [esi+0x610],0x1


We found this by using the Calls window in the debugger to set the focus to the previous call
frame. From this we can begin tracing from this point. Notice, however, that this address is
above the address of the fault - suggesting that in fact there is some subsequent call to the
appropriate function. To confirm this we need to look through the code flow in order to either
find a jump or call to the lower address.

For this particular case it takes a bit of time because there are several calls, each of which
must be further traced down. But we must start by disassembling from the address we
discovered:

1: kd> u f6c32244
smwdm+0xb244:
f6c32244 55                           push       ebp
f6c32245 8bec                         mov        ebp,esp
f6c32247 83ec0c                       sub        esp,0xc
f6c3224a 56                           push       esi
f6c3224b 57                           push       edi
f6c3224c 8bf1                         mov        esi,ecx
f6c3224e 33ff                         xor        edi,edi
f6c32250 6639bea4000000               cmp        [esi+0xa4],di
1: kd> u
smwdm+0xb257:
f6c32257 0f848a010000                 je         smwdm+0xb3e7 (f6c323e7)
f6c3225d 8b4618                     mov        eax,[esi+0x18]
f6c32260 83be1006000001             cmp        dword ptr [esi+0x610],0x1
f6c32267 8b4024                     mov        eax,[eax+0x24]
f6c3226a 8945f4                     mov        [ebp-0xc],eax
f6c3226d 740f                       jz         smwdm+0xb27e (f6c3227e)
f6c3226f 8d8ebc000000               lea        ecx,[esi+0xbc]
f6c32275 ff150cdbc6f6               call       dword ptr [smwdm+0x46b0c (f6c6db0c)]


The call here is a call into an OS function (calls through a table generally are using the import
table for the driver). This is easily seen using the debugger:

1: kd> dd f6c6db0c l1
f6c6db0c     806bb6b0
1: kd> u @$p
hal!KfAcquireSpinLock:


Thus, this is a spin lock acquisition. We resume disassembly of our original function:

1: kd> u f6c32275
smwdm+0xb275:
f6c32275 ff150cdbc6f6               call       dword ptr [smwdm+0x46b0c (f6c6db0c)]
f6c3227b 8845ff                     mov        [ebp-0x1],al
f6c3227e 53                         push       ebx
f6c3227f 0fb69ef8000000             movzx      ebx,byte ptr [esi+0xf8]
f6c32286 8b466c                     mov        eax,[esi+0x6c]
f6c32289 83c004                     add        eax,0x4
f6c3228c 50                         push       eax
f6c3228d ff75f4                     push       dword ptr [ebp-0xc]
1: kd> u
smwdm+0xb290:
f6c32290 e8c5deffff                 call       smwdm+0x915a (f6c3015a)
f6c32295 0fb6c0                     movzx      eax,al
f6c32298 3bd8                       cmp        ebx,eax
f6c3229a 8945f8                     mov        [ebp-0x8],eax
f6c3229d 0f8490000000               je         smwdm+0xb333 (f6c32333)
f6c322a3 6639bea4000000             cmp        [esi+0xa4],di
f6c322aa 767e                       jbe        smwdm+0xb32a (f6c3232a)
f6c322ac 80beb800000000             cmp        byte ptr [esi+0xb8],0x0


Notice the call to f6c3015a. This call should be further traced (which we actually did in our
analysis) but we’ll omit this for brevity. Thus, we continue our disassembly from this point:

1: kd> u f6c322ac
smwdm+0xb2ac:
f6c322ac 80beb800000000             cmp        byte ptr [esi+0xb8],0x0
f6c322b3 7445                     jz        smwdm+0xb2fa (f6c322fa)
f6c322b5 6a01                     push      0x1
f6c322b7 8bcb                     mov       ecx,ebx
f6c322b9 58                       pop       eax
f6c322ba d3e0                     shl       eax,cl
f6c322bc 8586b4000000             test      [esi+0xb4],eax
f6c322c2 7436                     jz        smwdm+0xb2fa (f6c322fa)
1: kd> u
smwdm+0xb2c4:
f6c322c4 66ff8ea6000000           dec       word ptr [esi+0xa6]
f6c322cb 8d8ee0000000             lea       ecx,[esi+0xe0]
f6c322d1 8b11                     mov       edx,[ecx]
f6c322d3 897cda04                 mov       [edx+ebx*8+0x4],edi
f6c322d7 8b09                     mov       ecx,[ecx]
f6c322d9 893cd9                   mov       [ecx+ebx*8],edi
f6c322dc f7d0                     not       eax
f6c322de 2186b4000000             and       [esi+0xb4],eax
1: kd> u
smwdm+0xb2e4:
f6c322e4 43                       inc       ebx
f6c322e5 83e31f                   and       ebx,0x1f
f6c322e8 6639bea4000000           cmp       [esi+0xa4],di
f6c322ef 7530                     jnz       smwdm+0xb321 (f6c32321)
f6c322f1 80a6b900000000           and       byte ptr [esi+0xb9],0x0
f6c322f8 eb27                     jmp       smwdm+0xb321 (f6c32321)
f6c322fa 83be1006000001           cmp       dword ptr [esi+0x610],0x1
f6c32301 7411                     jz        smwdm+0xb314 (f6c32314)
1: kd> u
smwdm+0xb303:
f6c32303 8d45ff                   lea       eax,[ebp-0x1]
f6c32306 8bce                     mov       ecx,esi
f6c32308 50                       push      eax
f6c32309 0fb7c3                   movzx     eax,bx
f6c3230c 50                       push      eax
f6c3230d e83afeffff               call      smwdm+0xb14c (f6c3214c)
f6c32312 eb0b                     jmp       smwdm+0xb31f (f6c3231f)
f6c32314 0fb7c3                   movzx     eax,bx


And now we see the call back to the address range we were seeking - in this case f6c3214c.
We could have also obtained this information by manually examining the stack:

1: kd> dd ba9be710
ba9be710     00000000 85d1b3c8 00000001 f6c32312
ba9be720     00000001 ba9be73f 00000000 00000000
ba9be730     85d1b3c8 86c16000 00000003 00d1b314
ba9be740     ba9be760 f6c3164c 00000002 8675e2f8
ba9be750     ba9be7a4 85d1b448 86c16000 f6c15264
ba9be760     ba9be780 f6c0d844 85d1b3c8 00000002
ba9be770     00000002 8675e2f8 00000000 00000000
ba9be780     ba9be7b8 f6c21760 8675e2f8 00000002


What we were looking for is that return address (f6c32312). From this we can back up in the
code stream to find the jump address. While not strictly required, we generally observe that
call instructions are generated as five byte opcodes (four bytes for the relative jump offset, one
byte for the opcode) and thus we can find the call by looking five bytes prior to the return:

1: kd> u f6c3230d
smwdm+0xb30d:
f6c3230d e83afeffff                 call       smwdm+0xb14c (f6c3214c)
f6c32312 eb0b                       jmp        smwdm+0xb31f (f6c3231f)


This latter approach is generally faster. Regardless of the specific approach, we can now
display the entire disassembly range:

1: kd> u f6c3214c f6c32192
smwdm+0xb14c:
f6c3214c 53                         push       ebx
f6c3214d 56                         push       esi
f6c3214e 57                         push       edi
f6c3214f 8bf1                       mov        esi,ecx
f6c32151 e8d4ffffff                 call       smwdm+0xb12a (f6c3212a)
f6c32156 8bf8                       mov        edi,eax
f6c32158 85ff                       test       edi,edi
f6c3215a 741f                       jz         smwdm+0xb17b (f6c3217b)
f6c3215c 8b4f10                     mov        ecx,[edi+0x10]
f6c3215f 8d86c8000000               lea        eax,[esi+0xc8]
f6c32165 33d2                       xor        edx,edx
f6c32167 8b18                       mov        ebx,[eax]
f6c32169 8b4004                     mov        eax,[eax+0x4]
f6c3216c 03cb                       add        ecx,ebx
f6c3216e 13d0                       adc        edx,eax
f6c32170 8d86c8000000               lea        eax,[esi+0xc8]
f6c32176 8908                       mov        [eax],ecx
f6c32178 895004                     mov        [eax+0x4],edx
f6c3217b 8b5c2414                   mov        ebx,[esp+0x14]
f6c3217f 8d8ebc000000               lea        ecx,[esi+0xbc]
f6c32185 8a13                       mov        dl,[ebx]
f6c32187 ff1510dbc6f6               call       dword ptr [smwdm+0x46b10 (f6c6db10)]
f6c3218d 8b461c                     mov        eax,[esi+0x1c]
f6c32190 ff37                       push       dword ptr [edi]

It was this final push that led to the invalid memory reference. Any time we see an invalid
memory reference like this, we try to find out where the address came from. In this case
we can see that EDI was loaded earlier (at f6c32156). Since this came from EAX immediately
following a call, we know that this is the return value from that function. The call is another call
into the driver:

smwdm+0xb12a:
f6c3212a 8b8104010000                 mov        eax,[ecx+0x104]
f6c32130 81c104010000                 add        ecx,0x104
f6c32136 3bc1                         cmp        eax,ecx
f6c32138 7503                         jnz        smwdm+0xb13d (f6c3213d)
f6c3213a 33c0                         xor        eax,eax
f6c3213c c3                           ret


From this code snippet, it would appear the driver is checking some sort of list. ECX contains
the address of a data structure. The code loads the contents of offset 0x104 and compares it
with the address of offset 0x104 itself; if they are equal, the xor is executed (setting EAX to
zero) and the function returns - the classic test for an empty doubly-linked list. Clearly, that
was the case here.
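
In C terms, this matches the standard empty-list check on a LIST_ENTRY embedded at offset
0x104 of the structure. Our reconstruction (the names are pure guesses) would read
something like:

typedef struct _SOME_CONTEXT {
    UCHAR Unknown[0x104];       // unknown fields up to offset 0x104
    LIST_ENTRY List;            // the list head at offset 0x104
} SOME_CONTEXT, *PSOME_CONTEXT;

PLIST_ENTRY GetFirstEntry(PSOME_CONTEXT Context)    // "Context" arrives in ECX
{
    if (Context->List.Flink == &Context->List) {
        return NULL;            // list empty - the path taken in our dump
    }
    return Context->List.Flink;
}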

Finding the actual data structure took a bit of digging because of the use of the fastcall
optimization here - ECX contains the address of the data structure. The function that called
this has damaged ECX prior to the trap frame being recorded, so we walked up to the previous
call - and this gave us a hint:

f6c0d83d 8b08                         mov        ecx,[eax]
f6c0d83f 57                           push       edi
f6c0d840 50                           push       eax
f6c0d841 ff5110                       call       dword ptr [ecx+0x10]
f6c0d844 85c0                         test       eax,eax


The call is into this driver (presumably through a function table within the driver). By looking at
the stack we can find the value of EAX at the point of the call (since it was pushed onto the
stack - assuming it has not been modified since then). The address we found on the stack (you
can see it in the earlier stack dump) was 85d1b3c8 (later, we found that this value was
preserved in ESI as well). Using the "!pool" command we can see the pool tag for this
allocation:

1: kd> !pool 85d1b3c8
Pool page 85d1b3c8 region is Nonpaged pool
 85d1b000 size:  350 previous size:    0  (Free)       Io
 85d1b350 size:   68 previous size:  350  (Allocated)  MmCa
 85d1b3b8 size:    8 previous size:   68  (Free)       Ntfr
*85d1b3c0 size:  620 previous size:    8  (Allocated) *PcNw
    Pooltag PcNw : WDM audio stuff
 85d1b9e0 size:   18 previous size:  620  (Allocated)  ReEv
 85d1b9f8 size:  608 previous size:   18  (Free)       Ntfr

Of course, this merely confirms our suspicion that this relates to the WDM audio stack
(someone more familiar with this stack would probably recognize this data structure). The
value loaded into ECX, however, was the contents of that memory address:

1: kd> dd 85d1b3c8 l1
85d1b3c8       f6c6e7ac


Using this with the pool command again leads us into never-never land:

1: kd> !pool f6c6e7ac
Pool page f6c6e7ac region is Nonpaged pool
f6c6e000 is not a valid small pool allocation, checking large pool...
f6c6e000 is freed (or corrupt) pool
Bad allocation size @f6c6e000, zero is invalid

***
*** An error (or corruption) in the pool was detected;
*** Attempting to diagnose the problem.
***
*** Use !poolval f6c6e000 for more details.
***

Pool page [ f6c6e000 ] is INVALID.

Analyzing linked list...
[ f6c6e000 ]: invalid previous size [ 0x195 ] should be [ 0x0 ]
[ f6c6e000 --> f6c6e288 (size = 0x288 bytes)]: Corrupt region
[ f6c6e4a0 --> f6c6e9e8 (size = 0x548 bytes)]: Corrupt region



Scanning for single bit errors...

None found


By using the lm (list modules) command we are able to find this memory block - it falls inside
of the driver itself:

f6c27000 f6c9b680           smwdm              (no symbols)


Thus, our suspicion is that this is a data structure within the driver itself (perhaps a global
structure). Unfortunately, without knowing more about the structure of these drivers we cannot
say what this data structure should (or would) represent. Further digging might be necessary,
but let’s return to our original analysis. The function returned 0x0 because the list was empty.
Thus, we skip ahead to the requisite code:
f6c32156 8bf8                        mov         edi,eax
f6c32158 85ff                        test        edi,edi
f6c3215a 741f                        jz          smwdm+0xb17b (f6c3217b)


This would correspond to an if block (hence if (return value != 0) { … }) without a
corresponding else block (otherwise we’d see a couple of immediate jmp instructions).
Once outside the if block, we see the cleanup code:

f6c3217b 8b5c2414                    mov         ebx,[esp+0x14]
f6c3217f 8d8ebc000000                lea         ecx,[esi+0xbc]
f6c32185 8a13                        mov         dl,[ebx]
f6c32187 ff1510dbc6f6                call       dword ptr [smwdm+0x46b10 (f6c6db10)]


In this code we load the first stack parameter (the third parameter to this function, since it
uses fastcall) into EBX, load the address ESI+0xbc into ECX, and then call the function
pointed to by f6c6db10, which is:

1: kd> dd f6c6db10 l1
f6c6db10          806bb780
1: kd> u @$p
hal!KfReleaseSpinLock:


This suggests that we’re releasing a spin lock. Recall that earlier we saw a spin lock
acquisition - checking back, we note that it was at the same byte offset (and hence likely to
be the same spin lock). This then leads us into the fatal push operation. Thus, even
without symbols for this driver, we can see the fundamental flaw in this driver - and presumably
in its testing.

The driver code in this case returns a NULL pointer (because the list is empty) and yet, after
releasing its own spin lock, dereferences this NULL pointer. With this level of information, a
problem of this type should be straightforward for the developers to resolve and fix.
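
Reduced to pseudo-C (reusing our guessed names from above), the failure path looks like
this:

PLIST_ENTRY entry;
KIRQL oldIrql;
ULONG value;

KeAcquireSpinLock(&Context->Lock, &oldIrql);
entry = GetFirstEntry(Context);             // NULL here - the list was empty
if (entry != NULL) {
    // ... bookkeeping performed only for a non-empty list ...
}
KeReleaseSpinLock(&Context->Lock, oldIrql);

value = *(PULONG) entry;                    // BUG: never re-checked for NULL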

Of course, we did report it to the developers of this audio driver. While they acknowledged that
this is a bug in their driver, we were told that they would not commit to fixing it. Guess it is time
to find a new sound card!


Caching in the Pentium 4 Processor
The NT Insider, Vol 11, Issue 1, Jan-Feb 2004 | Published: 12-Feb-04| Modified: 19-Mar-04



A common technique for improving performance in computer systems (both hardware and
software) is to utilize caching for frequently accessed information. This lowers the average
cost of accessing the information, providing greater performance for the overall system. This
applies to processor design as well: in the Intel Pentium 4 processor architecture, caching is a
critical component of the system's performance. In this article we will describe the basics of its
cache coherency mechanisms.

The Pentium 4 Processor Architecture includes multiple types and levels of caching:

          Level 3 Cache - this type of caching is only available on some versions of the Pentium
          4 Processor (notably the Pentium 4 Xeon processors). This provides a large
          on-processor tertiary memory storage area that the processor uses for keeping
          information nearby. Thus, the contents of the Level 3 cache are faster to access than
          main memory, but slower than other types of cached information.

          Level 2 Cache - this type of cache is available in all versions of the Pentium 4
          Processor. It is normally smaller than the Level 3 cache (if present) and is used for
          caching both data and code that is being used by the processor.

          Level 1 Cache - this type of cache is used only for caching data. It is smaller than the
          Level 2 Cache and generally is used for the most frequently accessed information for
          the processor.

          Trace Cache - this type of cache is used only for caching decoded instructions.
          Specifically, the processor has already broken down the normal processor instructions
          into micro operations and it is these "micro ops" that are cached by the P4 in the Trace
          Cache.

          Translation Lookaside Buffer (TLB) - this type of cache is used for storing
          virtual-to-physical memory translation information. It is an associative cache and
          consists of an instruction TLB and data TLB.

          Store Buffer - this type of cache is used for taking arbitrary write operations and
          caching them so they may be written back to memory without blocking the current
          processor operations. This decreases contention between the processor and other
          parts of the system that are accessing main memory. There are 24 entries in the
          Pentium 4.

          Write Combining Buffer - this is similar to the Store Buffer, except that it is specifically
          optimized for burst write operations to a memory region. Thus, multiple write
          operations can be combined into a single write back operation. There are 6 entries in
          the Pentium 4.
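
Incidentally, the cache line size these caches operate on can be checked at run time. Here is a
minimal sketch, assuming the Microsoft compiler's __cpuid intrinsic (older compilers would
need inline assembly instead):

#include <intrin.h>

//
// CPUID leaf 1 reports the CLFLUSH line size in bits 15:8 of EBX,
// in units of 8 bytes. On the Pentium 4 this yields 8 * 8 = 64 bytes.
// (Strictly, the field is valid only when the CLFLUSH feature bit,
// EDX bit 19, is also set.)
//
unsigned long GetCacheLineSize(void)
{
    int regs[4];    // EAX, EBX, ECX, EDX

    __cpuid(regs, 1);
    return ((regs[1] >> 8) & 0xFF) * 8;
}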

The disadvantage of caching is handling the situation when the original copy is modified, thus
making the cached information incorrect (or "stale"). A significant amount of the work done
within the processor is ensuring the consistency of the cache, both for physical memory as
well as for the TLBs.

In the Pentium 4, physical memory caching remains coherent because the processor uses the
MESI protocol. MESI defines the state of each unique cached piece of memory, called a
cache line. In the Pentium 4, a cache line is 64 bytes. Thus, with the MESI protocol, each
cache line is in one of four states:

        Modified - the cache line is owned by this processor and there are modifications to that
        cache line stored within the processor cache. No other part of the system may access
        the main memory for that cache line as this will obtain stale information.

        Exclusive - the cache line is owned by this processor. No other part of the system may
        access the main memory for that cache line.

        Shared - the cache line is owned by this processor. Other parts of the system may
        acquire shared access to the cache line and may read that particular cache line. None
        of the shared owners may modify the cache line, which ensures that all cached copies
        of the data remain valid.

        Invalid - the cache line is in an indeterminate state for this processor. Other parts of
        the system may own this cache line, or it is possible that no other part of the system
        owns the cache line. This processor may not access the memory and it is not cached.
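
One practical consequence of MESI for the driver writer is "false sharing": two unrelated
variables that happen to share a 64-byte cache line will bounce between processors in the
Modified state as each processor writes its own variable. Here is a sketch of the usual remedy -
padding per-processor data out to a cache line boundary - assuming the Microsoft compiler's
__declspec(align) extension (a reader comment below suggests 128 bytes may be better still
on the Pentium 4):

#define CACHE_LINE_SIZE 64      // Pentium 4 line size, per the text

//
// Each processor's counter gets its own cache line; without the
// alignment, two counters could share a line and every increment
// would force a MESI ownership transfer between processors.
//
typedef struct _PER_PROCESSOR_COUNTER {
    __declspec(align(CACHE_LINE_SIZE)) volatile LONG Count;
} PER_PROCESSOR_COUNTER;

PER_PROCESSOR_COUNTER Counters[MAXIMUM_PROCESSORS];   // MAXIMUM_PROCESSORS from ntddk.h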

So long as all parts of the system obey the MESI protocol, memory remains coherent and
works properly. Note that we've been careful not to say "multi-processor" here because the
MESI protocol is important in all environments, including a multi-processor environment. For
example, a DMA controller on a PCI device must be able to ensure that the changes it makes
to memory are visible to the processor. This either requires explicit support from the hardware
(cache coherency) or from the device driver and operating system. Note that Windows does
not assume that memory is cache coherent in this fashion. Thus the device driver uses HAL
functions in order to ensure the coherency of memory in the presence of DMA. Such functions
are essentially no-ops when hardware memory cache coherency is supported by the
underlying hardware.
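
For example, a Windows driver that is about to start a DMA transfer calls KeFlushIoBuffers on
the MDL describing the buffer; on hardware that keeps its caches coherent with DMA, the call
compiles to nothing. A minimal sketch (the Mdl variable and the transfer direction are
assumptions for illustration):

//
// Make the processor caches and main memory consistent for the buffer
// described by Mdl before the device's DMA engine touches it.
// ReadOperation == TRUE means the device will be writing to memory
// (the host "reads" from the device).
//
KeFlushIoBuffers( Mdl,
                  TRUE,     // ReadOperation: device-to-memory transfer
                  TRUE );   // DmaOperation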

In addition to the MESI protocol, each region in memory can be defined as having specific
caching characteristics. The processor provides a number of different mechanisms for
controlling cache policy:
        Cache Disable (CD) - this is bit 30 in the CR0 register of the processor. If this bit is set,
        caching is not allowed for any memory.

        Not Write-through (NW) - this is bit 29 in the CR0 register of the processor. If this bit
        is set, write-through caching and cache invalidation cycles are disabled.

        Page Cache Disable (PCD) - this bit is present in the CR3 register of the processor,
        as well as each Page Directory Entry and Page Table Entry in the system and
        controls the caching of the page tables.

        Page Write Through (PWT) - this bit is present in the CR3 register of the processor,
        as well as each Page Directory Entry and Page Table Entry in the system. It
        controls the write-through policy of updates to the page tables.

        Global (G) - this bit is present in the Page Directory Entry and Page Table Entry in
        order to determine if the particular entry is valid when the contents of CR3 change.
        Thus, if this bit is set, the page is "global" and hence valid even when the page table
        contents change.

        Page Global Enable (PGE) - this bit is present in the CR4 register of the processor
        and enables the interpretation of the G bit in the individual page directory and page
        table entries.

        Memory Type Range Registers (MTRRs) - each MTRR describes a region of
        addressable memory on the system and the specific caching characteristics of the
        memory. There are up to 96 memory regions. Normally this is set up by the BIOS.

        Page Attribute Table (PAT) - allows control of memory on a page-by-page basis.
        Each Page Directory and Page Table Entry contains a single PAT bit and when
        interpreted with the PCD and PWT bits, chooses a specific entry in the PAT table.

        Third Level Cache Disable - this is bit 6 of the IA32_MISC_ENABLE_MSR
        processor register. This exists only on processors with a third level cache and can be
        used to disable its use if present.
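
Some of these controls can be examined from kernel mode. As a sketch, assuming a compiler
that provides the __readcr0 intrinsic (older compilers would use a short _asm block instead):

#include <intrin.h>

//
// CR0 bit 30 is CD (Cache Disable); bit 29 is NW (Not Write-through).
// Reading CR0 is a privileged operation, so this only works at ring 0.
//
int IsCachingDisabled(void)
{
    unsigned long cr0 = (unsigned long)__readcr0();

    return (cr0 & (1ul << 30)) != 0;    // test the CD bit
}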

Thus, there is a broad range of options to consider when determining the caching policy of a
given page. It is possible (e.g., via the PAT entries of two different virtual pages that map the
same physical page) to come up with multiple inconsistent caching policies for a single page
of physical memory. In such cases the processor may function in unpredictable ways, and in
earlier versions of Windows this was actually a problem. In Windows XP the caching of a
single page must always be consistent, even when mapped by multiple page table entries;
this requirement is now enforced by the operating system.

These various options allow control of the specific type of caching for all memory in the system.
When talking about the caching characteristics of specific pieces of memory, we use some
additional terms:

        Memory is said to be cacheable when the processor may store the data from that
        memory region within one of the processor caches. Most memory is cacheable, but
        some memory (e.g., memory used to communicate with devices) is non-cacheable.

        Memory is said to allow write-back when the processor may store modifications in its
        cache rather than store the changes immediately back to memory. This allows (for
        example) write combining operations and may cause read and write operations to
        appear "out of order" when viewed from outside the processor.

        Memory is said to allow speculative reads when the processor may read memory
        that it might need but that, depending upon the ultimate result of conditional
        operations (for example), turns out not to be needed. Most memory falls into this
        category, but memory where a read operation changes the state of the memory
        (e.g., device memory) does not allow speculative reads.

        Write-combining is the case when multiple write operations are combined into a
        single write back to memory. This is a bit different than write-back because write
        combining does not require that read and write operations appear out of order. For
        example, if the processor allows write-combining but not write-back then a series of
        write operations may be combined into a single write; but then a read operation will
        block, the write will occur and then the read will proceed. This preserves the ordering
        of the read and write.

The ordering of read and write operations is particularly important in device memory, because
it is essential to ensure that writes to control registers be sent to the device before the
response is read from the device memory.

Normal computer memory allows caching, write-back, write-combining, and speculative reads.
The memory type range registers describe to the processor the caching policy for each region
of physical memory on the system. There are a number of different memory caching types:

        Write Protected - this type of memory can be cached in the processor, does not allow
        any write operations, allows speculative reads and does not require any ordering of
        the read operations.

        Write Back - this type of memory can be cached in the processor, allows write-back
        and write-combining, allows speculative reads and does not require any ordering of
        the read operations.

        Write Through - this type of memory can be cached in the processor, does not allow
        write-back, allows write-combining and speculative reads, and does not require any
        ordering.

        Write Combining - this type of memory cannot be cached in the processor, does not
        allow write-back, allows write-combining and speculative reads, and imposes weak
        ordering (that is, reads are ordered with respect to write operations).

        Uncacheable - this type of memory cannot be cached in the processor, does not allow
        write-back, speculative reads, or write-combining (by default). However,
        write-combining may be allowed by explicitly enabling it in the MTRR.

        Strong Uncacheable - this type of memory does not allow any type of caching,
        write-back, write-combining, or speculative reads. All access is strongly ordered (that
        is, reads occur in order, writes occur in order).

Of course, these types of memory caching are realized by combining the various control
mechanisms to yield the final type of cacheable memory. When determining the cache policy
of a given region of memory, the CD bit is first considered; if it is set, caching is disabled, and
no other attributes are considered. If the CD bit is clear, then the MTRR and page level cache
controls are considered, with the most restrictive policy being enforced. If the write-back and
write through policies conflict, then the write-through policy takes precedence.
Write-combining policy takes precedence over write-through or write-back policy as well.
Write-combining can only be established via the MTRR or PAT mechanisms in any case.
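
To make that precedence concrete, here is an illustrative sketch - a simplification, not an
exhaustive rendering of the tables in the Intel manuals - of how the effective policy for a page
is resolved:

typedef enum _CACHE_POLICY {
    PolicyWriteBack,
    PolicyWriteThrough,
    PolicyWriteCombining,
    PolicyUncacheable
} CACHE_POLICY;

//
// Simplified resolution: CD disables everything; otherwise the most
// restrictive of the MTRR and page-level (PCD/PWT/PAT) policies wins,
// with write-combining taking precedence over write-through, and
// write-through taking precedence over write-back.
//
CACHE_POLICY
EffectivePolicy(int Cr0CacheDisable,
                CACHE_POLICY MtrrPolicy,
                CACHE_POLICY PagePolicy)
{
    if (Cr0CacheDisable) {
        return PolicyUncacheable;
    }
    if (MtrrPolicy == PolicyUncacheable || PagePolicy == PolicyUncacheable) {
        return PolicyUncacheable;
    }
    if (MtrrPolicy == PolicyWriteCombining || PagePolicy == PolicyWriteCombining) {
        return PolicyWriteCombining;
    }
    if (MtrrPolicy == PolicyWriteThrough || PagePolicy == PolicyWriteThrough) {
        return PolicyWriteThrough;
    }
    return PolicyWriteBack;
}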




In addition to managing the cache policy on individual pages, there are also specific
instructions in the Pentium 4 that can be used to perform operations that directly affect the
cache itself. These include the cache invalidation operations (INVD and WBINVD), the
prefetch hint operations (PREFETCHh), the cache line flush (CLFLUSH), and the
non-cache-polluting move operations (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and
MOVNTPD).
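
Most of these instructions are reachable from C through compiler intrinsics, so no assembler is
needed. As a sketch (assuming a compiler with the SSE2 intrinsic headers; the function itself is
illustrative), filling a large buffer with MOVNTI avoids displacing useful data from the cache:

#include <intrin.h>

//
// Fill a buffer of 32-bit values using non-temporal stores (MOVNTI),
// which bypass the cache, then drain the write-combining buffers with
// an SFENCE so the data becomes globally visible.
//
void StreamFill(int *Buffer, unsigned int Count, int Value)
{
    unsigned int i;

    for (i = 0; i < Count; i++) {
        _mm_stream_si32(&Buffer[i], Value);    // MOVNTI
    }
    _mm_sfence();                              // flush write-combining buffers
}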

Even with all of this processor support for cache coherency, there are a number of cases when
it is the responsibility of the developer (generally the operating system) to handle some cache
coherency issues:

        Self-modifying code: because the trace cache consists of decoded micro ops, if the
        original code is changed, the trace cache must be invalidated by using a serializing
        operation. There are several, but the documentation for the Pentium 4 clearly favors
        CPUID.

        Dual-ported memory: changes to dual-ported memory can occur outside the ability
        of the cache coherency mechanism to detect them. Thus, the operating system must
        explicitly perform cache invalidation (using an appropriate instruction).

        Page Table Changes: because the Pentium 4 includes speculative execution, a
        change to a page table entry must be followed by a TLB invalidation to ensure that
        stale TLB information is not used. In other words, changes to the page table are not
        automatically reflected in the TLB, which caches information retrieved from the page
        table.
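
As a sketch of the first case, the serializing step after a code patch might look like the
following. The patching itself is elided, and the use of the __cpuid intrinsic is our assumption
(any serializing instruction would do, but the Pentium 4 documentation favors CPUID):

#include <intrin.h>

//
// After the code bytes have been modified, execute CPUID (a
// serializing instruction) so the trace cache discards any stale
// decoded micro-ops before the patched code runs.
//
void SerializeAfterCodePatch(void)
{
    int regs[4];

    __cpuid(regs, 0);
}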

With these various caching mechanisms in place, the overall performance of the Pentium 4 is
greatly improved. By including strong cache control within the processor itself, the operating
system is only minimally burdened. The architecture of the Pentium 4 also works quite well in
the presence of multiple processors. However, the operating system must deal with coherency
of the page table information, and ensure that cache policy is uniform across processors (e.g.,
the PAT must be identical on each processor).

While complicated, caching greatly improves the overall system performance. Of course, a
faster processor is just the first part of the performance puzzle!




Related Articles
Cache Me if You Can: Using the NT Cache Manager

Caching in Network File Systems
User Comments


"how to read L3 cache bit ?"
I would like to check if the L3 cache is enabled. How do I go for accessing the
IA32_MISC_ENABLE_MSR ??


Rating:
05-May-04, Martin Bauer (xxxx@siemens.com)




"Hyperthreading in P4"
There is no explicit "divide the cache evenly" style solution in the P4 - the cache is shared and
recycled as needed. If you have one logical processor that does not access much physical
memory and another that does, then the one that does will use most of the cache. However,
there is a big hit when it comes to the sharing of TLBs because they are context dependent on
the address space. Thus, the best approach here is to ensure the threads sharing the
processor are running in the same address space, in order to minimize the context switching.
Thus, hyperthreading will be most effective on multi-threaded applications, rather than on
disjoint process style applications. Of course, like any discussion of performance it will
ultimately depend entirely upon the benchmark you are using.

22-Mar-04, Tony Mason (xxxx@osr.com)




"Hyperthreading and Impact to Caching"
What insightcan be offered regarding the use of logical processors on hyper threading enabled
Pentium processos and the impact on processor cache? Is the level II cache split between the
two processors or is it handled as a single shared cache?


Rating:
22-Mar-04, Thomas Hennessy (xxxx@novadigm.com)




"Response"
Manfred's point is not that the cache line is not 64 bytes (it is) but rather that the processor can
(and frequently does) fetch more than 64 bytes at a time. Point taken.

As for write ordering, that doesn't read quite right, does it? Writes are ordered with respect to
other writes, but not with respect to reads. Sorry for the confusion.

22-Mar-04, Tony Mason (xxxx@osr.com)




"Cache line size for Pentium 4 cpus"
The article says that the cache line size is 64 bytes. This is misleading: usually the two
adjacent cache lines are transferred together. Therefore on SMP, hot objects (e.g., an array of
spinlocks, or something like that) should be aligned to a 128-byte boundary, not a 64-byte
boundary.

20-Mar-04, Spraul Manfred (xxxx@colorfullife.com)




"Write Through"
Is this a typo?

"Write Through - ..., and does not require any ordering."

i.e., doesn't Write Through require ordered writes? As well as restricting Write Combining such
that the combined writes follow the order of the writes (permitting burst writing provided the
order within the burst follows the order of the writes).


20-Mar-04, James Dempsey (xxxx@ameritech.net)
Emerging Issues in IoCancelFileOpen
The NT Insider, Vol 10, Issue 4, July-August 2003 | Published: 06-May-03| Modified: 27-Aug-03



At the April 2003 Microsoft IFS Plugfest an issue was identified that could arise when two file
system filter drivers interact in an unexpected fashion during the processing of
IRP_MJ_CREATE.

This situation arises when the lower filter performs an IRP_MJ_READ or IRP_MJ_WRITE
operation after the IRP_MJ_CREATE has been satisfied by the underlying file system and
then the higher filter calls IoCancelFileOpen because it has decided to disallow the file open.

What actually happens is that the lower filter's call causes the virtual memory system (cache
manager and memory manager) to increment the reference count of the file object passed in
the IRP_MJ_READ. If that is the file object from the original IRP_MJ_CREATE, the original
file object is now carrying an extra reference.

When the higher filter calls IoCancelFileOpen the I/O Manager issues an IRP_MJ_CLEANUP
and IRP_MJ_CLOSE down to the lower filter and then to the file system - as if the reference
count had in fact dropped to zero. The higher filter then returns an error to the I/O Manager
(or other OS component) and that component then discards the file object, even though there
is a dangling reference from the virtual memory system. Thus, later when the virtual memory
system releases its reference to the "file object" it is now acting on a potentially arbitrary
location in memory. This can trigger random failures.

In the initial analysis, Microsoft developers indicated that this appears to be a bug in
IoCancelFileOpen and the way it is used within Windows. On further investigation, there does
not appear to be a fix that can be made that would only affect IoCancelFileOpen to address
100% of the issues that can arise, but Microsoft developers are exploring this issue in search
of a good solution.

OSR developers have suggested that in fact this is a bug in the lower filter, because it is never
safe to perform a cached IRP_MJ_READ or IRP_MJ_WRITE call with the file object from the
IRP_MJ_CREATE. This is because some paths within the file system create fake (stack-based)
file objects in order to improve performance for operations that would not invoke the virtual
memory system. Note: with NTFS, a request for non-cached I/O is not honored in the case of
compressed files. Other (third-party) file systems may similarly ignore a request for
non-cached I/O.

At the present time our suggested work-around is for the filter to use IoCreateStreamFileObject,
IoCreateStreamFileObjectLite, or IoCreateStreamFileObjectEx to create its own stream file
object. This new stream file object may then be used when setting up the next I/O stack
location in the IRP. The filter should then forward the IRP to the next driver synchronously;
this is normally done by blocking and waiting on a notification event that is set in the
completion routine, as sketched below. The stream file object may then safely be used for
IRP_MJ_READ and IRP_MJ_WRITE operations. When the filter is done using the stream file
object, it may be dereferenced (potentially causing an IRP_MJ_CLEANUP and
IRP_MJ_CLOSE) and Windows will delete it when appropriate. If the original create is to be
allowed to proceed, the IRP can then be sent (with the original file object) to the underlying
driver for further processing. Initial reports from developers are promising regarding this
work-around.
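
A sketch of that synchronous forwarding step, using the standard completion-routine-and-event
pattern (the variable names, including StreamFileObject, are illustrative):

//
// Completion routine: signal the event and keep ownership of the IRP
// in the dispatch routine.
//
NTSTATUS
CreateCompletion(PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context)
{
    UNREFERENCED_PARAMETER(DeviceObject);
    UNREFERENCED_PARAMETER(Irp);

    KeSetEvent((PKEVENT)Context, IO_NO_INCREMENT, FALSE);
    return STATUS_MORE_PROCESSING_REQUIRED;
}

//
// In the dispatch routine: set up the next stack location with the
// stream file object, send the IRP down, and wait for it to complete.
//
KEVENT event;

KeInitializeEvent(&event, NotificationEvent, FALSE);
IoCopyCurrentIrpStackLocationToNext(Irp);
IoGetNextIrpStackLocation(Irp)->FileObject = StreamFileObject;
IoSetCompletionRoutine(Irp, CreateCompletion, &event, TRUE, TRUE, TRUE);

status = IoCallDriver(NextDeviceObject, Irp);
if (status == STATUS_PENDING) {
    KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);
    status = Irp->IoStatus.Status;
}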

Regardless, we know this is an emerging issue. Stay tuned for further information.

Additional Information (May 9, 2003)

Microsoft has indicated that the following calls may use stack-based file objects:

NtDeleteFile
NtQueryAttributesFile
NtQueryFullAttributesFile
IoFastQueryNetworkAttributes

And that these fake file objects may appear for the following operations:

IRP_MJ_CREATE
IRP_MJ_CLEANUP
IRP_MJ_CLOSE
IRP_MJ_QUERY_INFORMATION
FastIoQueryOpen
FastIoQueryBasicInfo
FastIoQueryStandardInfo
FastIoQueryNetworkOpenInfo

Further, the information from the Microsoft team indicates that this situation is unlikely to
change in the foreseeable future.

The previously described solution remains OSR's suggested solution.

Additional Information (July 10, 2003)

The Microsoft filter driver team has provided an alternative solution as well:

To work around this issue, the filter doing cached I/O to the file during IRP_MJ_CREATE
processing should do the following:


       Check to see if the file object address is within the current stack limits
        (IoGetStackLimits)
       If the file object address is within the current stack limits, the filter should allocate its
        own file object (IoCreateStreamFileObject or IoCreateStreamFileObjectLite), pass
        this stream file object down to the file system to be opened, then do the cached I/O
        against this file object. This is a well-formed file object that will be cleaned up safely
        if it is the file object referenced by either the cache manager or the memory
        manager.
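
A minimal sketch of that stack-limit test (IoGetStackLimits is the documented call; the wrapper
around it is ours):

//
// Returns TRUE if the file object lies on the current thread's kernel
// stack - i.e., it is one of the "fake" stack-based file objects
// described above.
//
BOOLEAN
FileObjectIsStackBased(PFILE_OBJECT FileObject)
{
    ULONG_PTR lowLimit;
    ULONG_PTR highLimit;

    IoGetStackLimits(&lowLimit, &highLimit);

    return ((ULONG_PTR)FileObject >= lowLimit &&
            (ULONG_PTR)FileObject < highLimit) ? TRUE : FALSE;
}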

Here is the pseudo code for a filter that cancels create operations. In your IRP_MJ_CREATE
dispatch routine, before the operation has been passed down to the file system, if this is a
create that you may wish to cancel, do the following:

//
// Pseudo code assumptions:
//
// Variables:
//
//   NameToOpen - A WCHAR buffer that contains the name the filter
//       wants to open; most likely the same name specified by the
//       user in the file object passed into the CREATE operation.
//       The filter's private file object must have its own copy of
//       the file name buffer. In the code below, NameToOpen is a
//       full path from the volume root.
//
//   NameToOpenLength - The length in bytes of NameToOpen
//
//   IrpSp - Pointer to the current IRP stack location
//
//   UsersFileObject - Local to store the file object passed in via
//       the CREATE IRP.
//

//
// This API will create a FileObject and then issue the IRP_MJ_CLEANUP
// on this file object. This API is only available on Windows 2000 and
// later. On earlier OS versions, you can use IoCreateStreamFileObject
// and issue your own IRP_MJ_CLEANUP irp.
//

SwapFileObject = IoCreateStreamFileObjectLite( IrpSp->FileObject, NULL );

if (SwapFileObject != NULL) {

    //
    // We now need to initialize the file object just created. In most
    // cases we can just do a full copy of the user's file object and
    // change the name fields.
    //

    RtlCopyMemory( SwapFileObject,
                   IrpSp->FileObject,
                   sizeof( FILE_OBJECT ) );

    //
    // This copy cleared the FO_STREAM_FILE flag that was set. It also
    // cleared the FO_CLEANUP_COMPLETE flag that was set by the
    // IoCreateStreamFileObjectLite processing (remember, this API will
    // send down the IRP_MJ_CLEANUP to the file system for this file
    // object), but we want to leave that one cleared so that FAT will
    // allow us to open this file object.
    //

    SetFlag( SwapFileObject->Flags, FO_STREAM_FILE );

    //
    // Now set up the new name buffer in the FileObject.
    //

    SwapFileObject->FileName.Length =
        SwapFileObject->FileName.MaximumLength = NameToOpenLength;
    SwapFileObject->FileName.Buffer = NameToOpen;
    SwapFileObject->RelatedFileObject = NULL;

    //
    // Now replace the user's file object with our new file object.
    //

    UsersFileObject = IrpSp->FileObject;
    IrpSp->FileObject = SwapFileObject;

    //
    // Now is also the time to save off any of the original IRP
    // parameters and set your desired IRP parameters.
    //

    //
    // < Insert code here to issue the create and synchronize back >
    // < to the dispatch routine to process completion. >
    //

    //
    // Post-create processing.
    //
    // Remember up above when we didn't restore this flag to the stream
    // file object we allocated? It is now safe to reset it.
    //

    SetFlag( SwapFileObject->Flags, FO_CLEANUP_COMPLETE );

    if (Status == STATUS_SUCCESS) {

        //
        // < Do whatever processing you need to on this opened file >
        // < object before canceling the create. >
        //

        if (SwapFileObject->PrivateCacheMap != NULL) {

            //
            // Caching was set up on this file object. Since the
            // CLEANUP has already been issued on this FileObject,
            // the file system will not uninitialize the cache map
            // for us. Do this now.
            //

            CcUninitializeCacheMap( SwapFileObject, NULL, NULL );
        }

        //
        // < Reinitialize the IRP. >
        //

        //
        // Now restore the user's file object into the IRP so that the
        // original create can be sent down.
        //

        IrpSp->FileObject = UsersFileObject;

        //
        // Finally, reissue the CREATE irp.
        //

        Status = IoCallDriver( TargetDeviceObject, Irp );

        //
        // < Insert the desired logic here to do the filter's desired >
        // < post-IoCallDriver processing of a CREATE. >
        //
    }

    //
    // We are now finished with our stream file object, so dereference
    // it. The IRP_MJ_CLOSE will be issued to the file system when the
    // last reference is released.
    //

    ObDereferenceObject( SwapFileObject );
}

								