Inside the Windows Runtime, Part 2

As I mentioned in the previous post, run-time libraries in Windows provide services for applications running in User mode. For historical reasons, this run-time layer in Windows was always known as the Win32 libraries, even when these services are requested in the 64-bit OS in 32-bit mode. A good example of a Win32 run-time service is any operation that involves opening and accessing a file somewhere in the file system (or the network, or the cloud). A more involved example is the set of Win32 services an application needs to access to play an audio file, including understanding the specific audio file compressed format, and checking authorization and security.

For Windows 8, a portion of the existing Win32 services in Windows was ported to the ARM hardware platform. The scope of the Win32 API is huge, and it was probably not feasible to convert all of it during the span of a single, time-constrained release cycle. Unfortunately, the fact that the new Windows 8 Runtime library encompasses considerably less than the full surface area of Win32 means that Windows 8 is not fully upwards compatible with previous versions of Windows. Existing Windows applications can run in the desktop mode provided in Windows 8, but that is available only for Intel-based computers, not ARM tablets and phones. The number of Windows 8-specific apps that take advantage of the new UI is growing, but it is still limited compared to what is available for tablets that run iOS or Android.

For customers using Intel-based machines, the confusion that arises from being forced to switch back and forth between the new Windows 8 Metro interface, which was designed with touch-screen based computers, tablets and phones in mind, and the legacy applications running on the desktop interface is palpable. Shortly after Christmas, I took my daughter to a local electronics store where she could pick out a new touch-based PC to use for school. (Who am I fooling? She uses her PC mainly to access Facebook and keep up with all her friends online.)

The first thing I noticed at the computer store, which featured more than 50 PCs from at least a dozen manufacturers, was the dearth of touch screen options. There weren’t very many vendors that were prepared to ship a Windows 8 machine with a touch screen in time for Christmas 2012. (In the consumer-oriented portion of the PC business, perhaps, back-to-school purchases are more important for sales than Christmas gift-giving.)

The 2nd thing I noticed was the dearth of options to purchase Windows 8 RT ARM-based tablets. Here, I think Microsoft is caught between a rock and a hard place in the realities of the PC manufacturing business. PC manufacturers like Lenovo and Toshiba have deep experience working with Intel technology in their current PC products. Switching to ARM-based processors means gaining experience with new chipsets. Graphics co-processors, NICs, hard drive interfaces, etc., on ARM-based computers all require ARM-compatible support chips that comply with the ARM architecture. These legacy PC manufacturers are naturally comfortable with the Intel-based electronics that they have used for many years and are reluctant to invest in ARM technology until they are more certain where it is going.

Meanwhile, companies like Samsung and HTC that have already made a major investment in ARM weren’t sitting around waiting for a new version of Windows from Microsoft to dive into the phone and tablet market before Apple swallowed it whole. They were putting the Android OS on ARM-based products and pushing them into the market. A manufacturer like HTC may keep one foot in the Windows camp (there is a new HTC-based Windows 8 phone), but they are continuing to deliver Android-based products on essentially the same hardware. Samsung, which is solidly number two in tablets and smartphones behind Apple, may well have decided not to dilute its Galaxy brand of Android-based tablets by introducing a Windows 8 RT variant.

But the 3rd thing I noticed was how awkward my daughter found the new Windows 8 UI. Frankly, when she encountered the new Win 8 Metro interface, she was baffled. Her initial reaction was to reject all the new PCs immediately, and, since she already has an iPad, she did not see the need for another tablet-based machine with limited capabilities. I counseled her that any new computer running the Windows 8 Metro interface without a touch screen did not make a lot of sense either, and eventually she was persuaded to go with a good-looking Acer machine. However, I can report she is much more comfortable, even happy, today with her new Win8 machine, after running it for a month or so.

You would think that among all those highly-intelligent individuals in Microsoft that were involved in planning and building Windows 8, some of them would have thought about these important concerns and done something about them. My experience with the Windows 8 planning process was that many of these concerns were known in advance, but were ignored, leaving the Windows 8 release deeply compromised. Those compromises are the subject of this series of blog postings. In this post, I intend to drill into the disconnect between the COM technology that provides the underpinning for much of the Win32 API and the .NET Framework component technology that is very popular among professional Windows developers. This disconnect is at the root of some of the compromises that the designers of Windows 8 decided to make.


Historically, the Win32 run-time libraries rely extensively on COM technology. By necessity, the new Windows Runtime is also COM-based, but most of the COM interfaces are hidden below the surface. This lends a much newer feel to the new Runtime, a very welcome change that software developers will appreciate.

Borrowing an important feature of Microsoft’s proprietary .NET Framework, the new Windows Runtime libraries use metadata to describe the runtime components they contain. The metadata embedded in a Windows Runtime library – similar to full-fledged .NET assemblies – describes the classes that are defined, the interfaces that are implemented, and all the members of those classes: their methods, fields, properties, events and types. The metadata serves during runtime to provide dynamic linkage, similar to the way COM components can use the IUnknown interface. The metadata also supports the type of static, compile-time type checking that .NET Framework components rely on. (There is a more complete discussion of the metadata that is defined for .NET Framework assemblies here.)

Since metadata is available to describe the binary components in the new Windows 8 Runtime libraries, you can use tools like the Object Browser and Intellisense in Visual Studio to examine and work with them during coding and debugging. During compilation and debugging of a program that is built using a .NET language such as C#, Win 8 Runtime components look and behave like .NET Framework managed code components. However, they should not be confused with managed code components – there are some very crucial differences.

One critical difference is that Windows Runtime components really are native code runtime modules that are ready to link to and run. In contrast, in .NET, modules are compiled first to a platform-independent Intermediate Language (IL) format. In a .NET assembly, the IL is run through a Just-In-Time compiler (the JIT) when it is first executed to create the executable native code dynamically on the fly. (There is also a .NET Framework utility called NGEN to generate native code from IL to build an executable that can be called directly at run-time.)

This is not the place for an extended discussion regarding the trade-offs between JIT and NGEN in the world of the .NET Framework. My sole intention here is to emphasize that while Windows 8 Runtime libraries appear similar to .NET components when you access them, they are directly executable, and they run native C++ code. This was probably the correct decision architecturally: you should be able to call any layer of OS-level services directly. Moreover, OS services should begin execution immediately, without any of the delays associated with generating code at runtime.

(On the other hand, the .NET Framework approach that generates code Just-In-Time during run-time eliminates the need to build different installation packages for each platform – x86, x64, and now ARM. This remains the right choice for any Windows application that I might build, including those designed to run on top of the new Windows Runtime.)

So, because they use the same metadata, the new WinRT libraries feel a lot more like .NET Framework components during design time. And the developers of Windows 8 have also taken steps to make them behave more like Framework components during run-time, providing an inter-operability layer that projects the native types that the Windows Runtime supports into their .NET equivalents. It is no longer necessary to use PInvoke to call one of the Runtime APIs from a .NET program. But there is still a fundamental disconnect between the two object-oriented runtimes that can easily lead to problems. Understanding how these problems can arise calls for a bit of an explanation.

In the CLR, all instances of an object are managed. This management basically entails memory management: the allocation of storage for the object is managed by the CLR. While the managed code application has a pointer to the address of that object’s storage location, the program is never permitted to access this address pointer directly. Management also means that the CLR keeps track of the object’s lifetime, and is responsible – and not the application program, directly – to clean up after the object has been discarded (or de-referenced). The CLR task that is responsible for keeping track of an object’s storage, its lifetime, and for clean-up is known as garbage collection.

Restricting a managed code application from directly accessing the memory addresses of any of the objects it is manipulating has major benefits. One benefit comes from eliminating a legion of logic errors that arise whenever address pointers are mishandled in your application. These are bugs that inevitably occur whenever address pointers get passed around. In any complex application, it is also quite easy to lose track of address pointers.


Figure 2. Pseudo-code illustrating how the bookkeeping associated with object references grows more complicated over time as code paths that deviate from the straight line entry and exit logic are added.

For example, you can try to enforce a coding standard where all code modules allocate working storage during entry and delete all its associated working storage at exit. Over time, even this structured code inevitably will start to spring leaks. New logic may add code paths that exit the module at earlier points. Objects allocated in the module may need to be persisted and operated on asynchronously by other modules and services. This growing code complexity is depicted in Figure 2, where a simple allocate-on-entry and delete-on-exit scheme is enforced. However, later additions to the code (indicated in blue) lead to code paths that exit the module without triggering the de-allocation on exit. Reportedly, the majority of memory leaks uncovered in native Windows code can be attributed to objects that are abandoned in this fashion.
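The early-exit leak described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `Buffer`, `process_manual`, `process_scoped` and the live-object counter are hypothetical names invented for the example, not code from any Windows component.

```cpp
#include <cassert>
#include <memory>

// Hypothetical instrumentation: counts Buffer objects currently alive and
// remembers the most recent one so the demo can clean up after itself.
struct Buffer;
static int g_live_buffers = 0;
static Buffer* g_last_buffer = nullptr;

struct Buffer {
    Buffer()  { ++g_live_buffers; g_last_buffer = this; }
    ~Buffer() { --g_live_buffers; }
};

// Original discipline: allocate on entry, delete on exit.
// The early return was added later and skips the delete.
bool process_manual(bool bad_input) {
    Buffer* work = new Buffer();   // allocate on entry
    if (bad_input) {
        return false;              // later addition: 'work' is abandoned
    }
    delete work;                   // delete on exit
    return true;
}

// Same logic with ownership tied to scope: every exit path releases the buffer.
bool process_scoped(bool bad_input) {
    std::unique_ptr<Buffer> work(new Buffer());
    if (bad_input) {
        return false;              // 'work' is destroyed automatically here
    }
    return true;
}

// Number of buffers left behind by one call on the early-exit path.
int leaked_by_manual() {
    int before = g_live_buffers;
    process_manual(true);
    int leaked = g_live_buffers - before;
    delete g_last_buffer;          // demo-only clean-up of the abandoned buffer
    return leaked;
}

int leaked_by_scoped() {
    int before = g_live_buffers;
    process_scoped(true);
    return g_live_buffers - before;
}
```

In this sketch the manual version leaves one buffer behind on the early-exit path, while the scoped version leaves none. Scope-bound ownership is the native-code remedy; in managed code the garbage collector takes over this bookkeeping.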

Another common pattern illustrated in Figure 2 that leads to memory leaks is when one function allocates the object, then passes a pointer to that instance to another function which then operates on it. As the pointer is passed from function to function, it is easy to lose track of the original owner, especially once any sort of event-oriented, asynchronous processing logic is introduced.

As a computer program grows more and more complex over time, the simple, per-module bookkeeping scheme tends to break down with that growing complexity, as illustrated in Figure 2. The chain of ownership of objects across modules becomes less and less clear or can break down entirely. This never happens in .NET because the CLR tracks object ownership across the entire process. Automated memory management performed by the CLR Garbage Collector (or GC, for short) eliminates this whole class of serious bugs, namely, memory leaks, when an application allocates memory for some instance of a class and then fails to release that memory when that instance is no longer active. (To be sure, other kinds of object ownership anomalies in .NET programs can still occur that do lead to memory leaks.)

Whenever a function at one level in your program passes an address pointer to a function at another level, it is effectively a grant of access to the underlying memory structures. There is another large class of bugs that result when the called function makes a change that has an unanticipated effect back inside the original calling function. These are nasty bugs, termed side effects because they often manifest themselves in devilishly obscure ways. Side effects are the bane of any developer working in a complex, many layered application, something even the most experienced developers dread. You fix a bug in one area of the code, but the change has a ripple effect in some entirely different area of the code. This is code that is very difficult to maintain.

In fact, one of the principal rationales of the object-oriented approach to software development – no external access to a function’s internal affairs – is to eliminate, or at least reduce, the risk of side effects. With its strict approach to memory management and type safety, the .NET Framework is actually very, very effective in reducing the impact of side effects.

Another benefit of automatic memory management – the CLR functions that allocate the memory for objects and also perform garbage collection automatically for objects that are no longer in use – is that it is not necessary to develop these memory management routines yourself. With its GC functions, the .NET run-time understands exactly what memory has been allocated for which object and which memory areas are eligible to be freed because they are no longer in use. Your application is essentially freed from this very error-prone task. And, whenever the application faces “pressure” because available memory is scarce, the GC will initiate garbage collection, releasing the memory allocated previously by discarded object instances and consolidating free space using relocation and compaction, as necessary. Furthermore, the GC is able to relocate any currently used portions of memory during a compaction because the application is restricted from accessing pointers to memory locations directly.

What the CLR’s GC cannot do, however, is manage the objects created by the new Windows 8 Runtime services, which rely on COM interfaces and so are subject to a separate reference counting process. Interactions across the boundary between the CLR’s GC and the COM-based reference counting system are prone to memory leaks if you are not extremely careful about managing the lifetime of the objects being created on either side of the managed code/native code boundary. (JavaScript code interacting with the Windows 8 Runtime is similarly affected.) When developing a Windows 8 App using C#, for example, your application is likely to create an event handler in C# for a Win 8 Runtime resource such as a UI window or panel. A persistent reference in a C# event handler to a Windows Runtime resource can keep the resource alive and the .NET GC at bay. This interaction across the inter-op boundary between an event handler written in C# that accesses a native Windows 8 Runtime object like a menu is illustrated in Figure 3.

Last summer, MSDN Magazine published a lengthy article by Steve Tepper, a program manager in Windows, entitled “Managing Memory in Windows Store Apps” that discussed this problem in some detail. (Part 2 of the two-part article was published in November 2012.) As Tepper describes the problem, the GC has no knowledge of the lifetime of the Windows 8 Runtime resource being referenced. Meanwhile, the Windows 8 Runtime has no intrinsic knowledge of the status of the C# event handler that has a reference to one of its objects. This sort of circular reference will keep objects alive on both sides of the inter-op boundary whenever there is an active reference that crosses that boundary.
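The shape of this event-handler cycle can be sketched in portable C++. This is a loose analogy with loudly stated assumptions: `Button` stands in for a native Runtime UI object, `Page` stands in for the app-side object, and both sides use `std::shared_ptr` reference counting, so the sketch does not model the CLR garbage collector or the real inter-op layer. All names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>

static int g_live = 0;  // hypothetical counter of live Page/Button objects

struct Button {                       // stand-in for a native UI object
    std::function<void()> on_click;   // the registered event handler
    Button()  { ++g_live; }
    ~Button() { --g_live; }
};

struct Page {                         // stand-in for the app-side object
    std::shared_ptr<Button> button;   // page -> button
    int clicks = 0;
    Page()  { ++g_live; }
    ~Page() { --g_live; }
};

// Wires a handler that keeps the page alive; the page keeps the button alive.
std::shared_ptr<Page> make_wired_page() {
    auto page = std::make_shared<Page>();
    page->button = std::make_shared<Button>();
    page->button->on_click = [page] { ++page->clicks; };  // button -> handler -> page
    return page;
}

// Dropping the last outside reference does not free anything: the cycle holds.
int live_without_unsubscribe() {
    int before = g_live;
    std::weak_ptr<Page> observer;
    {
        auto page = make_wired_page();
        observer = page;
        page->button->on_click();
    }
    int still_alive = g_live - before;
    if (auto p = observer.lock()) {
        p->button->on_click = nullptr;  // demo-only clean-up: break the cycle
    }
    return still_alive;
}

// Removing the handler before letting go breaks the cycle.
int live_with_unsubscribe() {
    int before = g_live;
    {
        auto page = make_wired_page();
        page->button->on_click();
        page->button->on_click = nullptr;  // unsubscribe
    }
    return g_live - before;
}
```

In the sketch, forgetting to unsubscribe leaves both objects alive after the last outside reference is gone, while clearing the handler first lets both be destroyed.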


Figure 3. When a C# event handler references a Windows 8 Runtime object, a circular reference that can keep both objects alive in a .NET Framework program is created, unless the status of all Windows 8 Object references is carefully maintained. After Tepper, “Managing Memory in Windows Store Apps” published in MSDN Magazine.


Another element of the Windows 8 Runtime that was borrowed from the proprietary .NET Framework is the use of XAML to describe the UI declaratively. XAML was originally defined and used with the Windows Presentation Foundation (WPF), an ambitious and long overdue overhaul of the legacy Windows Forms technology. XAML is the mark-up language that was used to define the graphical elements that a WPF-based application supports. XAML for WPF uses tags that look similar to html codes for defining the UI, but the tags all refer to Windows entities.

The UI for Windows 8 App store apps is based on the emerging HTML 5 standard, but instead of conventional html tags, the mark-up language is a new dialect of XAML. The decision to require using proprietary XAML with standard HTML 5 elements for the new touch screen-oriented App Store apps in Windows 8 is curious, to say the least. The HTML5 spec itself is very new and the specification itself is still fluid. (The most recent candidate specification, published in December 2012, is available here.) It was created with web-based applications in mind, the sort that are hosted inside a web browser like IE or Chrome.

Nevertheless, HTML5 may turn out to be the perfect choice for Metro-style apps, optimized for mobile computing. The new layout capabilities in HTML5, its support for high resolution graphics, and the new ability to handle audio and video content directly address a long laundry list of concerns in the web content community. It was never designed for conventional desktop computing, but Metro-style apps aren’t either.

Windows App Store apps can also be written using javascript, which is the standard across the industry for writing code to interact with HTML elements inside the web client. I imagine Microsoft is trying to suggest to phone and tablet software developers already working on iPhone and Android apps that porting them to Windows 8 is not only straightforward, but doesn’t require surrendering to Microsoft proprietary technology.

But then there is XAML, which is a Microsoft proprietary technology. Marrying the proprietary XAML language to HTML5 compromises Microsoft’s commitment to open standards for Windows 8 software development.

In the course of working on Windows 8, Microsoft adapted its Expression Blend graphical design tool, originally developed around WPF’s flavor of XAML, to produce the HTML5 UI that Windows Store Apps use. Blend itself is a very impressive design tool aimed at enabling collaboration among graphical designers and UI software developers. Leveraging Blend for Windows Store App development is very desirable, especially since HTML5-compliant graphical design tools are still emerging. It may not have been technically feasible to re-purpose Blend for Windows Store apps in time for Windows 8 to ship without adopting XAML. Blend now ships with Visual Studio 2012, while the rest of the Expression suite of web and graphical development tools appears dead.

The decision to use XAML also raises questions if you are an experienced developer of existing Windows desktop or phone applications and are currently heavily invested in WPF, Silverlight, or both. The future viability of both WPF and Silverlight is seriously compromised by Windows’ adoption of HTML5 for Windows 8.

What to do now if you are a Windows software developer? Stay on Windows 7? Remain in WPF, but run only in the designated desktop application sandbox in Windows 8? Or convert completely to HTML5 and the new Metro style UI? Or, perhaps, better start learning Objective C before your career as a software developer runs headlong into a Microsoft proprietary dead-end.


Inside the Windows 8 Runtime, Part 1

This is the next installment in a series of blog posts on the recent Windows 8 release that I began a few months back. In the last entry I expressed some reservations about the architectural decisions associated with the new Windows Runtime API layer. In this post and the ones that follow, I will provide more detail about my concerns as we look inside the new Windows Runtime layer. But, first, we will need some background on the native C language Win32 API, COM, and the Common Language Runtime (CLR) used in the .NET Framework. Collectively, these three facilities represent the run-time services available to Windows applications prior to Windows 8. As I mentioned in the earlier posts, the new Windows Runtime layer in Windows 8 is a port of a subset of the existing Windows Win32 run-time to run on the ARM hardware platform.

Windows Run-time Libraries

Run-time libraries in Windows provide an API layer that applications running in User mode can call to request services from the OS. For historical reasons, this runtime layer is known as Win32, even when a Win32 service is called on a 64-bit OS. A good example of a Win32 runtime service is any operation that involves opening and accessing a file stored somewhere in the file system (or the network, or the cloud). Application programs require a variety of OS services in order to access a file, including authentication and serialization, for example.

Today, the Win32 API layer spans hundreds of DLLs and contains hundreds of thousands of methods that Windows applications can call. One noteworthy aspect of the Win32 runtime libraries is how long they need to persist, due to the large number of Windows applications that depend on them. Software endures. The extremely broad Win32 surface area creates a continuing obligation to support those interfaces, or else risk introducing OS upgrades that are not upwardly compatible with earlier versions of the OS.

Historically in Windows, language-specific run-time libraries were provided to allow applications to interact with the OS to perform basic functions like accessing the keyboard, the display, the mouse, and the file system. By design, the standardized “C” language was kept compact to make it more portable, but was intended to be augmented by C runtime libraries for basic functions like string-handling. To support development in C++, for example, Microsoft provided the Microsoft Foundation Class library, or MFC, which included a set of object-oriented C++ wrappers around Win32 APIs, originally developed using the C language. When you consider that Win32 APIs provide common services associated with Windows user interface GUI elements like windows, menus, buttons and controls, dialog boxes, etc., you can imagine that the scope of MFC was and is quite broad.

MFC also incorporated many classes that were not that closely associated with specific OS services, but certainly made it easier to develop applications for Windows. Good examples of these ancillary classes include MFC classes for string handling, date handling, and data structures such as Arrays and Lists. Having access to generic C++ objects that reliably handle dates and other calendar functions or implement useful data structures, such as lists of (variable length) strings, greatly simplifies application development for Windows.

COM Objects

The set of technologies associated with the Microsoft Component Object Model (COM) brought an object-oriented way to build Windows applications. COM originally arose out of a need to support inter-process communication between Windows applications of the type where, for example, drag-and-drop is used to pull a file object from the Windows Explorer app and plop it into the active window of a desktop application. The most intriguing aspect of COM is that the programming model is designed explicitly to support late binding to objects during run time, something that is essential for a feature like drag-and-drop to work across unrelated processes. Late binding to COM objects allowed for the construction of components whose behavior was discoverable during run-time, something which was very innovative.

At the heart of COM is the IUnknown interface, which all COM classes must support. The IUnknown interface has three methods: QueryInterface, AddRef and Release. The AddRef and Release methods are used to manage the object’s lifetime. QueryInterface is used by the calling program to discover whether the COM object it is communicating with supports a contract that the caller understands. (See, for example, this documentation for a discussion of the QueryInterface mechanism for late binding.)
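The division of labor among the three methods can be sketched in portable C++. This is a simplified illustration, not the real Windows declaration: actual COM identifies interfaces with GUIDs and returns HRESULT and ULONG values, whereas the `InterfaceId` enum, `IUnknownLike`, `IGreeter` and `Greeter` below are hypothetical stand-ins.

```cpp
#include <cassert>

// Simplified stand-ins: real COM uses GUIDs (REFIID), HRESULT and ULONG.
enum InterfaceId { IID_Unknown, IID_Greeter, IID_Printer };

struct IUnknownLike {
    virtual bool QueryInterface(InterfaceId iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IUnknownLike() {}   // callers must use Release, never delete
};

struct IGreeter : IUnknownLike {
    virtual const char* Greet() = 0;
};

static int g_destroyed = 0;      // hypothetical counter for the demo

class Greeter : public IGreeter {
    unsigned long refs_;
    ~Greeter() { ++g_destroyed; }             // reachable only through Release
public:
    Greeter() : refs_(1) {}                   // creator holds the first reference

    bool QueryInterface(InterfaceId iid, void** out) override {
        if (iid == IID_Unknown) {
            *out = static_cast<IUnknownLike*>(this);
        } else if (iid == IID_Greeter) {
            *out = static_cast<IGreeter*>(this);
        } else {
            *out = nullptr;                   // contract not supported
            return false;
        }
        AddRef();                             // each handed-out pointer is counted
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long remaining = --refs_;
        if (remaining == 0) delete this;      // object manages its own lifetime
        return remaining;
    }
    const char* Greet() override { return "hello"; }
};

// Late binding in miniature: ask for contracts at run time, use what exists.
int count_supported_contracts() {
    IUnknownLike* obj = new Greeter();        // reference count 1
    void* p = nullptr;
    int supported = 0;
    if (obj->QueryInterface(IID_Greeter, &p)) {   // reference count 2
        IGreeter* greeter = static_cast<IGreeter*>(p);
        if (greeter->Greet()[0] == 'h') ++supported;
        greeter->Release();                   // reference count 1
    }
    if (obj->QueryInterface(IID_Printer, &p)) ++supported;  // not supported
    obj->Release();                           // reference count 0: self-destructs
    return supported;
}
```

The caller never learns the concrete class. It discovers at run time that one of the two contracts it asked about is available, and the object destroys itself when the last reference is released.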

Because COM objects are discoverable during run-time, COM development also enabled third party developers to add component libraries of their own design. Component libraries are packaged as DLLs and installed onto the Windows machine. Originally, COM components also required an entry in the Windows registry to store a CLSID guid, a globally unique identifier that is used in calls to instantiate the COM component during runtime. If you use the Registry Editor to look for CLSIDs stored under the HKEY_CLASSES_ROOT key, you will typically see thousands of COM objects that are installed on your system, as illustrated in Figure 1. (Typically, you will also see several versions of the same – or similar – COM objects installed. This is the only good way to deal with versioning in COM.)

Note that beginning in Windows XP, the requirement for COM objects to use the registry was relaxed somewhat; the CLSID and other activation metadata can now also be stored in an XML-based assembly manifest instead, but only in certain cases.


Figure 1. The CLSIDs of installed COM objects that are available at runtime are registered under the HKEY_CLASSES_ROOT key in the Windows Registry.

After years of object-oriented programming (OOP) proponents advocating the use of standardized components in application development, COM technology proved beyond a doubt the efficacy of this approach. COM changed the face of software development on Windows, leading to the development of a wide range of third party component libraries that extended the MFC classes and opening up Windows software development significantly. COM components packaged as ActiveX controls could easily be added to any application development project. For example, it became commonplace for Windows developers to use third party ActiveX controls to give their application a similar look-and-feel to the latest Microsoft version of Office. In my case, instead of trying to develop a charting tool for a performance data visualization app from the ground up, I licensed a very professional Chart control from a third party component library developer and plugged that into my application.

Limitations of the COM programming model

As innovative as the COM programming model was and how useful the technology proved to be in extending the Windows development platform, aspects of the late-binding approach used in COM came to be seen as having some decidedly less than desirable qualities. I will mention just three issues here:

  • complex performance considerations,
  • memory leaks, and
  • the reality of dynamically linking to a complex object-oriented interface.

Developing well-behaved COM objects turns out to be quite difficult, forcing developers to deal with potentially complex performance considerations such as threading, concurrency, and serialization alternatives. COM objects can live either in-process or out-of-process in separate COM Server address spaces. (In COM+, COM objects can even be distributed across the network.) The runtime infrastructure to support late binding to all types of COM objects at runtime is quite complex, but at this point, of course, it is deeply embedded into the Windows OS.

Software developers also discovered that applications built using persistent COM objects were prone to memory leaks in often devious, difficult-to-diagnose, non-obvious ways. Since COM objects could be shared across multiple processes, the COM object itself had responsibility for managing object lifetimes using reference counting – implemented using the AddRef and Release methods of the IUnknown interface. Keeping track of which ActiveX objects are active inside your program and making sure inactive ones are de-referenced in a timely fashion can be complicated enough. Nesting one ActiveX control inside another ActiveX control, for instance, can create a circular chain of reference to those objects that defeats reference counting. When that happens, the objects in the chain can never be destroyed, and they can accumulate inside the process address space until virtual memory is exhausted.
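How a circular chain defeats reference counting can be shown with `std::shared_ptr`, which counts references much as AddRef and Release do. This is a minimal sketch and not ActiveX code: `Control` and the live-object counter are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>

static int g_live_controls = 0;  // hypothetical counter of live Control objects

struct Control {
    std::shared_ptr<Control> strong_peer;  // counted reference
    std::weak_ptr<Control>   weak_peer;    // non-counted reference
    Control()  { ++g_live_controls; }
    ~Control() { --g_live_controls; }
};

// Parent and child hold counted references to each other.
int live_after_strong_cycle() {
    int before = g_live_controls;
    std::weak_ptr<Control> observer;
    {
        auto parent = std::make_shared<Control>();
        auto child  = std::make_shared<Control>();
        parent->strong_peer = child;
        child->strong_peer  = parent;      // closes the cycle
        observer = parent;
    }                                      // both counts drop to 1, never to 0
    int still_alive = g_live_controls - before;
    if (auto p = observer.lock()) {
        p->strong_peer.reset();            // demo-only clean-up: break the cycle
    }
    return still_alive;
}

// The back reference is not counted, so no cycle of ownership forms.
int live_after_weak_back_reference() {
    int before = g_live_controls;
    {
        auto parent = std::make_shared<Control>();
        auto child  = std::make_shared<Control>();
        parent->strong_peer = child;
        child->weak_peer    = parent;
    }
    return g_live_controls - before;
}
```

With two counted references, neither count can reach zero once the outside references go away, so both objects survive. Making the back reference non-counted lets both be destroyed.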

Finally, it turned out the idea that all the property settings and methods available for an object embedded in your application be discoverable during runtime, which sounded good on paper, wasn’t that useful a construct for dealing with a complex interface. As a practical matter, crafting code to deal with a complicated interface to a COM object that you could bind to dynamically creates just as many dependencies as any statically linked object. Another very pointed objection to late binding is that it can lead to a variety of logic errors that are only discoverable during runtime testing. On the other hand, many of these same type-mismatch (or invalid cast) errors could readily be detected during compile-time using static binding to strongly typed objects.
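The difference between an error found at run time and one found at compile time can be illustrated with a small C++ sketch. The `LateBoundChart` property bag and the `TypedChart` struct are hypothetical; they only mimic the name-based lookup of a late-bound interface and are not a COM automation API.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Late-bound style: members are looked up by name while the program runs.
class LateBoundChart {
    std::map<std::string, int> int_props_;
public:
    LateBoundChart() { int_props_["Width"] = 640; }

    int GetInt(const std::string& name) const {
        auto it = int_props_.find(name);
        if (it == int_props_.end()) {
            throw std::runtime_error("no such property: " + name);
        }
        return it->second;
    }
};

// Statically bound style: members are checked by the compiler.
struct TypedChart {
    int width = 640;
};

// A misspelled property name compiles cleanly and fails only when executed.
bool late_bound_typo_fails_at_run_time() {
    LateBoundChart chart;
    try {
        chart.GetInt("Widht");             // typo goes unnoticed until this line runs
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

int typed_width() {
    TypedChart chart;
    return chart.width;                    // writing chart.widht would not compile
}
```

The misspelled name in the late-bound call is only caught if a test happens to execute that line, whereas the same mistake against the typed struct stops the build.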

Building components in the .NET Framework

This criticism of the COM programming model was taken to heart by the architects of the .NET Framework. The Framework languages – mainly C# – were designed from the ground up using OOP principles like inheritance and polymorphism, and adopted the best practices associated with software engineering quality initiatives like the Ada programming language. Unlike the C++ language, which was grafted on top of the C programming language, the Framework languages dispense with the use of address pointers entirely. Pointers to object instances are maintained internally, of course, but cannot be referenced directly by user code – except for the express purpose of interoperability with COM objects and Win32 API calls.

Tracking of object references is internalized as well in the Framework languages. Rather than counting references the way COM does, the .NET Common Language Runtime (CLR) traces which objects are still reachable from the running program and periodically cleans up the unreachable ones automatically using garbage collection. Automatic garbage collection is one of the key features of the .NET managed code environment. Memory leaks of the type associated with the kinds of housekeeping logic errors that tend to plague programs written utilizing COM objects are eliminated in one fell swoop. (To be sure, memory leaks of other types can still occur, and dealing with an automatic garbage collection procedure can sometimes be tricky. See, for example, this MSDN article “Investigating Memory Issues” from CLR developer Maoni Stephens, or some of the online documentation that I wrote here that discusses typical memory management problems that .NET developers can encounter with suggestions on how to deal with them effectively.)

“Strongly-typed” objects

Furthermore, the architects of the .NET Framework adopted an approach diametrically opposed to COM for building component libraries. The .NET Framework relies on static binding to strongly typed .NET components. “Strongly typed” in this context means that the code references an explicit class, i.e., one derived from System.Object, which permits the compiler to detect type mismatches. C# is very strict about implicit type conversions – any conversion that might lose information is not permitted implicitly. Your code must use explicit casts or reference one of the Convert class library methods – .NET provides something to convert from any base type to any other base type.

To be sure, the .NET Framework does support dynamic binding to objects during runtime under specific circumstances. These include using Reflection and the is and as keywords. It is also often necessary for .NET applications to communicate with unmanaged code, which includes programs written in languages such as JavaScript that rely on dynamic binding. (The 4.0 version of the Framework added the dynamic keyword that instructs the compiler to bypass type-checking to help with JavaScript interop issues.)

Meanwhile, Windows itself and most Office apps still rely heavily on COM. The primary way the .NET Framework deals with interoperability with Win32 APIs and COM objects that pass pointers around is wrappers that create .NET classes around these APIs and COM objects. For example, instead of calling the QueryPerformanceCounter Win32 API to gather high-precision clock values in Windows, C# developers instantiate a Stopwatch object instead and call its methods. Structs are still permitted in C#, and they are unavoidable when you use platform invoke (PInvoke, for short), the .NET feature for calling Win32 APIs that don’t already have wrapper classes. If you need to call a Win32 API that is adorned with address pointers, you can often get help from the wiki, but, more often than not, as in the case of the Stopwatch class, there is likely to be a .NET wrapper already available.
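The wrapper idea is easy to sketch. The class below is a hypothetical Python analogue, loosely modeled on the idea behind the .NET Stopwatch class rather than on its actual implementation: it hides a raw high-resolution counter behind start, stop, and elapsed members. It relies on time.perf_counter_ns from the Python standard library (Python 3.7 or later), which, as far as I know, is itself built on QueryPerformanceCounter when running on Windows.

```python
import time


class Stopwatch:
    """Hypothetical wrapper that hides a raw high-resolution counter."""

    def __init__(self):
        self._started_at = None      # None means "not running"
        self._elapsed_ns = 0

    @property
    def is_running(self):
        return self._started_at is not None

    def start(self):
        if not self.is_running:
            self._started_at = time.perf_counter_ns()

    def stop(self):
        if self.is_running:
            self._elapsed_ns += time.perf_counter_ns() - self._started_at
            self._started_at = None

    @property
    def elapsed_ms(self):
        total = self._elapsed_ns
        if self.is_running:          # include the interval still in progress
            total += time.perf_counter_ns() - self._started_at
        return total / 1_000_000


sw = Stopwatch()
sw.start()
time.sleep(0.05)                     # stand-in for the work being timed
sw.stop()
print(f"{sw.elapsed_ms:.1f} ms")
```

The caller never sees counter values or tick frequencies, only elapsed time, which is what makes a wrapper like this pleasant to use.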

Memory management in a C# program that does a fair amount of interop with native code and COM objects is also complicated since the CLR garbage collector cannot automatically free up memory associated with COM objects that have been abandoned the way it can for object instances built using managed code. .NET applications that need to interact frequently with COM objects are prone to leak memory in often subtle, non-obvious ways. For instance, the CLR garbage collector cannot reclaim an instance of a .NET class that references a persistent COM object so long as the COM object itself remains referenced.

In summary, the .NET Framework addresses the key limitations associated with COM development, adopting a programming model that relies on strongly-typed objects. This approach was diametrically opposed to the one promulgated in COM, one based on binding to objects dynamically during run-time. Application programs written in one of the .NET Framework languages like C# or VB.NET, of course, still need to call Win32 services that pass pointers around and use COM objects that need reference counting. .NET classes that wrap frequently accessed COM objects or Win32 methods are very effective in hiding the fact that, under the covers, the Windows run-time still relies heavily on COM.

Plug-and-Play devices on Windows Tablets

In the previous post on Windows 8 and the new Windows Runtime libraries for Windows Store apps, I mentioned that the key deliverable in the new version of the Windows OS is the port to the ARM platform. In this post, I will discuss the implications of Windows running on ARM, emphasizing the impact of “plug-and-play” device driver technology. In porting the core of the OS to the ARM platform, Microsoft was careful to preserve the interfaces used by device driver developers, ensuring that there was a smooth transition. Microsoft wanted to allow customers to be able to attach most of the peripherals they use today on a Windows 7 machine to any ARM-based tablet running Windows 8.

What is ARM?

In discussing the Windows 8 port to the ARM platform with some folks, I noticed that not everyone is familiar with the underlying hardware, that it runs a different instruction set than Intel-based computers, that it is not Intel-compatible, etc. So, let’s start with a little bit about the ARM hardware itself.

ARM – the acronym originally stood for Acorn RISC Machine and later for Advanced RISC Machines – is a processor architecture specification that is designed by the ARM consortium and then licensed by its members, who then build processors based on it. Members of the consortium work together to devise the ARM standard and move it forward. By any measure, in the marketplace today ARM has a reach that is impressive. According to the ARM web site, at least 95% of all mobile phones – not just smartphones – are powered by ARM microprocessors. In 2010, six billion microprocessors based on ARM designs were built. If you own a recent model coffee maker that sports a programmable, electronic interface, you are probably talking to an ARM microprocessor.

So, ARM refers to the processor architecture, an “open” standard of sorts, open, at least, to any hardware manufacturing company willing to pay to license the ARM IP and designs from the consortium – which runs you several million dollars, plus royalties on every unit you build. The ARM processor specification, which is based on RISC principles, is distinct from the manufacturing of ARM chips. Overall, there are currently about 20 manufacturers that build ARM-based computers, with companies like Qualcomm and NVIDIA leading the charge.

Another term associated with devices like the NVIDIA Tegra that powers the Surface is System on a Chip (SoC). In the case of the NVIDIA chip, that entails embedding the ARM microprocessor on a single piece of silicon that contains pretty much everything a mobile computer might need – a graphics processor (NVIDIA’s specialty), audio, video, imaging, etc. Or, if you prefer an integrated SoC design optimized for telephony, you might decide to go with the Qualcomm version. The key is that the software you build for the phone can also run on an ARM tablet because the underlying processor instruction set is compatible.

I blogged last year that ARM technology and the consortium of manufacturers that have adopted ARM designs have emerged as the first credible challenge to the Wintel hegemony that has dominated mainstream computing for the last twenty years. A year later, that prediction looks better and better. From almost every perspective today, ARM looks like it is winning.

ARM’s recent success is reflected in the relative financial results of both Microsoft and Intel, compared to Apple and Qualcomm, for example. Microsoft recently reported revenues slipped by 8% in its latest quarter, while Intel sales were down about 5%. The forecast for PC sales is down, as I mentioned in an earlier post, as more people are opting to buy tablets instead. Meanwhile, Apple posted “disappointing” financial results for the quarter because sales of iPads “only” increased by 26%. Overall, revenue at Apple increased by 27% in its most recent quarterly earnings report. Sales of iPhones were up 58%, compared to last year, with Apple apparently having some difficulty keeping up with the demand.

All of which makes Windows 8 a very important release for Microsoft. Windows 8 needs to offer a credible alternative to Apple and Android phones and tablets, blunting their drive to dominate this market. It is an open issue whether Windows 8 is good enough to do that. My guess is “yes” for tablets, but “no” for phones. Windows OEMs like Lenovo, HP and Dell are rushing to bring machines that exploit the Windows 8 touch screen interface to market. Microsoft is hoping that Windows’ long-term policy of being open to all sorts of hardware peripherals – devices that “plug and play” in Windows – will provide a major advantage in the emerging market for tablets.

Plug and Play devices

As I discussed in the last blog entry, you can buy an ARM-based tablet like the new Microsoft Surface, but it is only capable of running applications built on top of Windows RT. Picture the architecture of Windows 8, for example, which is represented in the block diagram in Figure 1:


Figure 1. The Windows Runtime (aka WinRT) is a new API layer on top of existing Win32 OS interfaces that developers must target in order to build a Windows Store app that can run on Windows 8 ARM-based tablets, which are limited to supporting Windows Store apps. As illustrated, a Windows Store app can also call into a limited subset of existing Win32 interfaces that have not been fully converted in Windows 8.

The set of OS changes associated with Windows 8 is highlighted in the upper right corner of the block diagram in Figure 1: the new Windows Runtime API layer that was added, spanning a significant subset of the existing Win32 API that Windows applications call into to use OS functions. Examples of what Windows applications ordinarily call Win32 APIs to do include accessing the keyboard, mouse, display, and touch screen, operating the audio components of the machine, etc. Windows 8 Store apps that can run on ARM processors must limit themselves to calling into the Windows Runtime APIs, except for a small number of selected Win32 APIs, like the COM APIs, that are permitted.

Figure 1 is modeled on the diagrams used in chapter 2 of Mark Russinovich’s most recent Windows Internals book, which I have updated to reflect the new Windows Runtime layer. (Windows Internals is essential reading for anyone who is interested in developing a device driver for Windows, or who just wants to understand how this stuff works.) It is a conventional view of how the Windows OS is structured. It shows the core components of the OS, generally associated with the Windows Executive, the OS kernel, and the HAL. The OS kernel, for example, manages process address spaces, thread creation, and thread dispatching. The OS kernel is also responsible for managing system memory, both physical memory and the virtual memory address space built for each executing process. At the heart of the OS kernel is a set of synchronization primitives that are used to ensure that, for instance, the same block of physical memory is only allocated to one process address space at a time.

Kernel mode is associated with a hardware level that allows privileged mode instructions to be executed. An example of a privileged mode instruction is one that is reserved for the OS to use to switch the processor from executing code inside one thread to code in another. An essential core service of an OS is to function as a traffic cop, managing shared resources such as the machine’s CPUs and its memory on behalf of the consumers – threads and processes, respectively – of those resources.

Before moving on to the next set of OS components, I should mention the HAL, or Hardware Abstraction Layer, a unique feature of Windows designed to insulate the rest of the OS from specific processor architecture dependencies. It hides hardware-specific interfaces like the way the processor hardware implements processing of interrupts from attached devices, handles errors like a thread accessing a memory location in a page that doesn’t belong to it, or performs context switching. These are all functions that processors handle, but different hardware platforms tend to do them in a slightly different manner. Consolidating hardware dependent code that has to be written in the machine’s assembly language in the HAL makes it relatively easy to port Windows to a new processor architecture. To port Windows to the ARM processor, for example, Microsoft first needed to develop a version of the HAL specific to the ARM architecture, and then build a cross-compiler that knows how to translate native C code into valid ARM instructions to generate the rest of the OS. I am making the port to ARM sound a whole lot easier than I am sure it was, but over the years the HAL has enabled Windows to be ported relatively easily to run on a wide range of hardware, including the Digital Alpha, the PowerPC, Intel IA-64 (the Itanium), and the AMD64 (which Microsoft calls x64).
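The design idea behind the HAL can be sketched as an interface with one implementation per processor architecture. Everything in the example below is hypothetical and greatly simplified: the real HAL is native code, not Python, and the class names, method names, and strings here are invented placeholders. What the sketch does show is the property that matters for porting: the dispatch function, standing in for the portable part of the OS, is written once and does not change when a different HAL is plugged in.

```python
from abc import ABC, abstractmethod


class Hal(ABC):
    """Hypothetical, greatly simplified hardware abstraction interface."""

    @abstractmethod
    def architecture(self) -> str:
        ...

    @abstractmethod
    def switch_context(self, old_thread: str, new_thread: str) -> str:
        ...


class X64Hal(Hal):
    def architecture(self) -> str:
        return "x64"

    def switch_context(self, old_thread: str, new_thread: str) -> str:
        # Placeholder for the x64-specific register save/restore sequence.
        return f"[x64] {old_thread} -> {new_thread}"


class ArmHal(Hal):
    def architecture(self) -> str:
        return "arm"

    def switch_context(self, old_thread: str, new_thread: str) -> str:
        # Placeholder for the ARM-specific register save/restore sequence.
        return f"[arm] {old_thread} -> {new_thread}"


def dispatch(hal: Hal, running: str, ready: str) -> str:
    """Portable 'kernel' code: identical no matter which HAL is plugged in."""
    return hal.switch_context(running, ready)


for hal in (X64Hal(), ArmHal()):
    print(dispatch(hal, "thread A", "thread B"))
```

Porting to a new architecture, in this toy model, means writing one more subclass and leaving dispatch alone.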

Figure 1 also illustrates the device drivers in Windows. I mentioned that the Microsoft strategy for Windows 8 on tablets is designed to leverage an extensive ecosystem of hardware manufacturers that Microsoft has built over the years because of the ability for anyone to extend the OS by writing a device driver to support a new piece of hardware. Windows “Plug and Play” facilities for attaching devices have grown into a very sophisticated set of services, including ways for device driver software to tap into Windows Error Reporting, for example.

In general, device drivers are modules that also run in kernel mode and effectively serve as extensions to the OS. Their main purpose is managing hardware resources other than the CPU and memory. Windows device drivers are installed to manage any and all of the following devices:

  • disks and CD and DVD players/recorders that are attached using IDE, SCSI, SATA, or Fibre Channel adaptors,
  • network interface adaptors, both wired and wireless,
  • input devices such as the mouse, the keyboard, the touch screen, the video camera, and the microphone(s),
  • graphical output devices such as the video monitor,
  • audio devices for sound output,
  • memory cards and thumb drives,

as well as pretty much any device that plugs into a USB port on your machine. In Windows 8, the list of device drivers expands to include a GPS and an accelerometer.

 Windows currently provides an open “Plug-and-Play” model that permits virtually anyone to develop and install a device driver that extends the operating system. Figure 2 is a screen shot from a portable PC of mine showing the Device Manager applet in the Control Panel that tells you what Plug-and-Play hardware – and the device driver associated with that hardware – is installed. As you can see, it is quite a long list. This flexibility of the Windows platform is a major virtue.


Figure 2. The Device Manager applet in the Control Panel tells you what Plug-and-Play hardware is installed, along with information about the device driver software associated with that hardware.

For the sake of security, you want to ensure that any OS function that doesn’t absolutely need to run in kernel mode doesn’t. But, by their very nature, because they need to deal directly with hardware device dependencies, device drivers need to run in kernel mode. Device drivers in Windows don’t actually interface with the hardware directly – they use services from the HAL and the Windows IO Manager to do that. This mechanism allows device drivers to be written so that they can be portable across hardware platforms, too. The importance of this is that, once Windows is ported to ARM-based SoC machines, you ought to be able to plug in virtually any device that you could plug into an Intel-architecture PC and it will work.

 As a practical matter, Windows has a device driver certification process that the major manufacturers of peripheral hardware use. So, not every piece of hardware you can attach to a Windows 7 PC, like the one illustrated in Figure 2, will have immediate support for the Windows RT environment on ARM. Microsoft also wants hardware manufacturers to take the extra step of packaging their drivers into Windows Store apps.

The open, plug-and-play device driver model Windows uses permits an almost unlimited variety of device peripherals to be plugged in and extend your Windows machine. Consider printer drivers in Windows. Manufacturers like HP have developed very elaborate printer drivers that let you know when you’ve run out of ink and then try to nudge you into buying expensive ink cartridges from them online. In contrast, try to print a document using your iPad. Can’t do it, no device drivers.

This great virtue of the Windows OS can also be a curse. The disadvantage of the “open” model is that it is open to anyone to plug into and start running code with kernel mode privileges. Historically, whenever your program needed a function that required kernel mode privileges, you could develop a device driver module (a .sys module) and drop that into the OS, too.

Being open leads to problems with drivers that are of less than stellar quality and also creates a potential security exposure. The fact is 3rd party device driver code, running in kernel mode, is also a major source of the problems that all too frequently cause Windows to hang, crash, or blue screen. It is often not Microsoft code that fails, so there isn’t much Microsoft can do about this – other than take the steps they already have, like the certification program, to try to improve the quality of 3rd party driver software. The fact that my device driver can be deployed on machines configured with such a wide variety of other hardware that my software may need to interact with greatly complicates the development and testing process. The diversity leads to complexity and that directly impacts the quality of the software. Bugs inevitably arise whenever my software encounters some new and unexpected set of circumstances.

Both a blessing and a curse

A good way to illustrate the advantages and the disadvantages of the Windows open hardware policy is to look at graphics cards for video monitors. The lightweight portable PC I am typing on at the moment has a 14” display, powered by a graphics chip made by Intel that is integrated on the motherboard. When I use this portable PC at my desk, I slide it into a docking station where two additional video monitors are attached, powered by a separate, higher end NVIDIA graphics card. (The docking station actually supports up to four external monitors, but I am pretty much out of desk space the way things are at the moment, so I will have to get back to you on that.)

One of the external flat panel displays is 1920 x 1200, the other is only 1920 x 1080. I have one positioned on the left of the portable and the other on the right. In addition, I have a 3rd party port replicator plugged into a USB port on the back of the PC. This device has additional video ports that I am currently not using either. If you look at the Screen resolution applet in the Control Panel, my configuration looks like I have four video monitors available, not three.

See the screen shot in Figure 3.


Figure 3. The Screen Resolution applet on my portable PC when I plug into a docking station with additional video monitors attached. It shows four video monitors are attached, when physically, there are only three. The 4th is a phantom device that is detected on an additional port replicator (attached via a USB port) that supports additional video connections.

This desktop configuration has multiple external monitors augmenting the built-in portable display, which is “only” 1600 x 900. When you are doing software development, take my word for it, it helps to have as much screen real estate as possible. Visual Studio also has pretty good support for multiple monitors, and I have really come to rely on this feature. When coding or debugging, I can have multiple windows displaying code inside the VS Editor open and arrayed across these monitors at any one time. Having multiple monitors is a tremendous aid to developer productivity. One reason I purchased this portable PC was that it was lightweight for when I need to pack it and go. But, in fact, the primary reason I purchased this specific model was it came with the high end NVIDIA graphics adaptor so I could plug in two or more external monitors when I am using it at my desk.

I am very satisfied with the graphics configuration I have, but it is not exactly trouble-free, and I have had to learn to live with a few annoying glitches. For instance, when I swing the mouse across an arc from the monitor on the left to the monitor on the right, Windows will let the mouse go off the deep end and enter the “display” of the phantom 4th monitor where I can no longer see where it is. When I first drag a Visual Studio panel or window onto either one of the external monitors, there is evidently a bug in the graphics card adaptor code that stripes solid black rectangles across portions of the window. This bug is apparently WPF-related, because it doesn’t show up on any standard Windows applications, like Office or IE. (One of the features of Windows Presentation Foundation is that it provides direct access to high resolution rendering services on the graphics card, and this is supposed to be a good thing. For one thing, these higher end graphics cards are like high speed supercomputers when it comes to vector processing.) Fortunately, re-sizing the window immediately corrects the problem, so I have learned to live with that minor annoyance, too.

Occasionally, the graphics card has a hiccup, the screens all black out, and I have to wait a few seconds while the graphics card recovers and re-paints all the screens. Very infrequently, the graphics card does not recover; there is a blue screen of death that Windows 7 hides, and a re-boot.

Overall, as I said, I am pretty happy with this configuration, but it is certainly not free of minor glitches and occasionally succumbs to a major one. Understanding that my particular combination of PC, graphics adaptors, docking station, and external monitors is highly unusual, I am resigned to the fact that NVIDIA is unlikely to ever fix my peculiar set of problems.

Windows has a remarkable automated problem reporting system that will go out on the web following a graphics card meltdown and try to match the “signature” of my catastrophic error to the fixes NVIDIA has made available recently to its “latest and greatest” version of its driver code to see if there is a solution to my problem that I can download and install. But, realistically, I don’t expect to ever see a fix for this set of problems. They are associated with a combination of hardware and software (adding Visual Studio’s use of WPF to the mix) that, if not exactly unique, is still pretty rare. Inside NVIDIA, any developer working to fix this set of bug reports would have difficulty reproducing them because their configurations won’t match mine. That, and the fact that there aren’t too many other customers reporting similar problems – again, because of the unique environment – means the bug report will be consigned to a low priority, “No-Repro” bin where no one will ever work on it.

There is another way to go about this, which is Apple’s closed model. On Apple computers and devices, with few exceptions, the only peripherals that can be attached to a Mac are those branded by Apple and supported by device drivers that Apple itself supplies. To be fair, Apple is more open than it used to be. Since Apple switched over to Intel processors, the company has opened up the OS a little to 3rd party hardware, but it has not opened it up a whole lot. I can buy a MacBook Pro, for example, which is equipped with a middling NVIDIA graphics card and attach an external Apple Thunderbolt 27” display to it. The Thunderbolt is a beautiful video display, mind you, 2560 by 1440 pixels, but it costs $999. I can’t configure a 2nd external monitor without moving to one of the Apple desktop models.

However, and this is the key take-away from this rambling discussion, limiting the kind of monitors and the array of video configurations that the MacBook can support does lead to standardized configurations that Apple can ensure are rigorously tested. And, this leads to extremely high quality, which means customers running an iMac do not have to endure the kinds of glitches and hiccups that Windows customers grow accustomed to. On Windows, there is support for a significantly broader array of configuration options, but Microsoft cannot deliver quite the same level of uniformly high quality to that support. Using its open model that permits virtually any third party hardware manufacturer to plug their device into Windows effectively means that Microsoft has farmed out some of the most rigorous requirements for quality control in Windows to third parties.

Open vs. Closed hardware models

The flexibility of the open model used in Windows certainly has its virtues, as I have discussed. It makes good business sense for Microsoft executives to try to take advantage of the flexibility of the Windows platform and leverage the range and types of hardware that Windows can support, compared to an Apple PC or tablet in its Windows 8 challenge to the iPad.

The Windows organization in Microsoft is certainly aware that the high level of quality control that Apple maintains by restricting the options available to the consumer can be a significant, strategic advantage. Each release of Windows features improvements to the device driver development process to help 3rd party developers. The Windows organization performs extensive testing using popular 3rd party hardware and software in its own labs. Microsoft also provides most of the driver software you need in Windows when you first install it. However, a good deal of this responsibility for quality is farmed out to its OEM customers – the PC manufacturers – who need to ensure you have up-to-date video drivers and other drivers for the specific hardware they include in the box.

Microsoft has also made an enormous investment in automated error reporting and fix tracking associated with the Windows Update facility, which is very impressive. IT organizations often disable Windows Update because they fear the unknown, but its capabilities are actually quite remarkable. (There is a good description of Windows automated error reporting and the Windows Update facility in an article published last year in the Communications of the ACM.) Windows gives third parties access to its bug databases, and the Windows organization will proactively pursue getting a fix out to third party software, if it is affecting an appreciable number of customers. A staggering number of customers run Windows, however. Well over a billion licensed copies exist, so that still leaves customers like me with relatively minor glitches associated with relatively unusual configurations with little hope of relief. I am not saying it is impossible that I will ever see a version of the NVIDIA driver that fixes the problems I experience, but I am not holding my breath.

Battery life on portables is a good example where, despite considerable efforts from Microsoft to support the device driver community, Apple has a distinct technical advantage. Now that Macs are running the same Intel hardware as Windows PCs, Apple hardware has no inherent advantage when it comes to battery life. Yet, running on similar sets of hardware, Apple machines typically run about 25% longer on the same battery charge. Most of this advantage is due to the control that Apple exercises over all aspects of the quality of the OS, the hardware, and the hardware driver software that it delivers. (Some of it is due to shortcomings in Windows software, specifically system and driver routines that wake up periodically to look around for work. One of the culprits is the CPU accounting routine that wakes up 64 times a second to sample the state of the processor. Hopefully, this behavior has been removed in Windows 8, but I suspect it hasn’t.) In contrast, Microsoft has to periodically orchestrate battery life-saving initiatives across a broad range of 3rd party device driver developers, which is akin to herding cats.
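A little arithmetic shows why a routine that wakes up 64 times a second matters for battery life. The short calculation below takes the 64-per-second figure quoted above at face value; the variable names are just for illustration. It assumes, for the sake of the sketch, that nothing else wakes the processor in between.

```python
TICKS_PER_SECOND = 64                       # figure quoted in the text above

interval_ms = 1000 / TICKS_PER_SECOND       # time between wake-ups
wakeups_per_hour = TICKS_PER_SECOND * 60 * 60

print(interval_ms)         # 15.625
print(wakeups_per_hour)    # 230400
```

In other words, under that assumption the processor can never rest for longer than about 15.6 milliseconds at a stretch, and it is roused more than 230,000 times an hour even when there is no work to do.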

Microsoft’s decision to build and distribute its own branded tablet, the new Surface, does reflect an understanding at the highest levels of the company that the Apple products that Microsoft must compete with have a distinct edge in quality, compared to the products from many of its major Windows OEM suppliers. I have heard Steve Ballmer in department-level meetings discuss his reluctance to abandon the “open” and cooperative business model that has served the company so well for so long. It is a business model that definitely leads to more choice among products across the OEM suppliers and lower prices to consumers because of the competition among those suppliers.

It is also a business model that has forced Microsoft’s Windows OEM customers to live for years with meager profit margins in a cutthroat business: high volume, low margin, and capital-intensive, with little room for error. Meanwhile, Microsoft has consistently raked in most of the cream right off the top of that market in software license fees for Windows and Office that it collects directly from those same OEMs. Microsoft’s high-handed behavior led IBM to exit the PC hardware market long ago. HP, which has struggled for years to make a profit in the same line of business, would also like to exit the business, but its management still has the albatross of the Compaq acquisition around its neck, constraining its ability to shed an asset that cost the company dearly to acquire. The problems Microsoft’s OEM partners face are obvious – an Intel “Ultrabook” configured as a MacBook Air runs $1200 this Christmas, while identical hardware from HP that runs Windows retails for 40% less. The margins Apple is able to command for its hardware products are the envy of the tech industry.

By getting its support for tablets into consumer-oriented sales channels in time for the Christmas rush, Microsoft is hoping Win 8 can make a dent in the huge lead Apple has fashioned in the emerging market for tablets. Meanwhile, at least in the short term, sales of the new Microsoft Surface are going to be restricted to Microsoft’s direct sales outlets, currently numbering only about 60 stores. (Plus, you can order it direct from the Microsoft Store over the web. Currently, Microsoft is forecasting about a 3-week delay before it can ship you one.) With Windows OEMs primarily pushing a variety of Windows 8 machines running AMD and Intel processors, Christmas shoppers are bound to be confused with all the choices available: AMD vs. Intel, Intel Core vs. Intel Pentium, and the Microsoft Surface on ARM. It is all a little overwhelming to the average consumer who just wants something little Timmy can use for school.

Back to the future

All of which brings us full circle back to the Windows Runtime, because the new Surface tablet can only run applications that use it. In brief, the Windows Runtime (WinRT) is a new API layer in the OS that ships with every version of Windows 8, including Windows Server 2012. (“RT” stands for “run-time.”) If you buy one of the new ARM-based tablets (or phones when Windows 8 phones start to ship), these devices come with Windows RT installed, an edition of the OS that omits many of the older pieces of Windows that Microsoft figures you won’t ever need on a tablet or a phone.

As Figure 1 illustrates, this new API layer sits atop the existing Win32 APIs, which I have heard Windows developers say consist of some 300,000 different methods. As illustrated, the Windows Runtime does not come close to encompassing the full range of OS and related services that are available to the Windows developer. Microsoft understood that it could not attempt to re-write 300,000 methods in the scope of a single release, so the Windows Runtime should be considered a work in progress. What Microsoft tried to accomplish for Windows 8 was to provide enough coverage with the first release of the Windows Runtime that developers would be capable of quickly producing the kinds of apps that have proved popular on the iPhone and the iPad. As shown in the drawing, Windows Store apps also can make certain specific Win32 API calls that were not fully retrofitted into the new Windows Runtime.

Summing up

In general, I am certain that porting the Windows OS to the ARM platform for Windows 8 was an excellent decision that should breathe some new life into the Microsoft PC business. ARM processors have evolved into extremely powerful computing devices – quad-core is already here and 64-bit ARM is on the way, for example. Portable, touch-screen tablets are a very desirable form factor. I have never seen a happier bunch of computer users than iPhone stalwarts chatting up Siri. Windows needed to try to catch up and perhaps even leapfrog Apple before its lead in portable computing became insurmountable.

When Windows 8 was in the planning stages, the Windows Phone OS, which was adapted from Windows CE, was already running on ARM. At the time, there were at least two other major R&D efforts inside Microsoft that were also targeting the ARM platform. The Windows organization, led by Steve Sinofsky, effectively steamrollered those competing visions of the future of the OS when it started to build Windows 8 in earnest. And, for the record, I don’t have a problem with Sinofsky’s autocratic approach to crafting software. Design by committee slowly and inevitably takes its toll, weakening the power and scope of a truly visionary architect’s design breakthrough.

One of the crucial areas to watch as Windows 8 takes hold and Microsoft begins development of the next version of Windows is whether or not Windows on devices can keep up with rapidly evolving hardware. Microsoft needs to figure out how to rev Windows on devices much more frequently than it does the rest of the OS. That will be an interesting challenge for an extremely complicated piece of software that needs to support such a wide range of computers, from handhelds to rack-mounted, multi-core blade servers.

As delivered, I also believe the vision for Windows 8 suffers from serious flaws. The most noticeable one is the decision to make the new touch screen-oriented UI primary even on machines that don’t have touch-enabled screens. This “one size fits all” strategy condemns many, many Windows customers to struggle to adapt to an inappropriate user interface.

Moreover, from the standpoint of a Windows application developer, I am less than enamored with some of the architectural decisions associated with the new Windows Runtime API. These were based on a profound misunderstanding inside the Windows organization about why software developers chose to target Windows development in the first place (going back 20 years or so in the life of the company) and why these same developers are targeting Apple iPhones and iPads today.

I will defer the bulk of that discussion to the next blog entry on Windows 8.