Windows Error Reporting: Elementary, My Dear Watson

What data does Microsoft gather and how is it used?

Karen Forster

July 25, 2005


"It's like pressing the button for an elevator over and over: It doesn't do anything, but boy, it makes you feel that you have some input." Like the reader who made this comment, most respondents to our survey about Windows Error Reporting (WER—formerly "Dr. Watson") are skeptical about whether reporting errors to Microsoft has any effect on product quality. Readers also would like to know what data Microsoft collects and how it uses and secures that data. But most of all, readers just want Microsoft to let them know that pressing the button does make a difference.

Readers' responses to our survey gave rise to an interesting interview with Microsoft's Ben Canning (group program manager, Office Trustworthy Computing). Because I have so much information to share, I'm making this a two-part column. This month, I cover the WER process, explaining what data Microsoft collects and how the company protects it. Next month, I'll discuss Microsoft's Corporate Error Reporting (CER), a tool that lets Software Assurance customers manage how WER is used on their network and review collected data before sending it to Microsoft; Ben's examples of specific improvements to Microsoft Office as a result of WER; and how Microsoft is addressing your desire to know that reporting crashes pays off in better products. In addition, I'll give Ben's answers to your concerns about how end users interpret WER dialog boxes and how those perceptions affect IT support costs.

Familiarity Breeds Contempt?
Just about everyone (94 percent of the 472 survey respondents) is familiar with WER. However, nearly half of respondents rarely or never click Send when they see a WER dialog box. ("What's the point of the feedback when Microsoft doesn't listen to us?") Only 9 percent always send a response, and 16 percent respond half the time.

Sixty-four percent don't inform end users about WER, and 74 percent don't encourage users to submit data. Common reasons include, "My end users' time is better spent on their work," "Users are already mad that their application crashed and they have possible data loss," and "Error reporting confuses people." (One respondent elaborated on end-user confusion: "It's not worth their time, and they would think they were responding to my Help desk" instead of to Microsoft.)

If you don't know what happens to data submitted to Microsoft through WER, you're like 72 percent of survey respondents. Most respondents have never (42 percent) or rarely (41 percent) received a response to a crash from Microsoft. So it's not surprising that 55 percent said they don't know whether WER has helped Microsoft improve the functionality and efficiency of Office applications. Only 29 percent believe WER has helped, and 15 percent believe it hasn't.

One Crash, One Vote
Because so many people are skeptical about WER, I asked Ben to explain how it works and what benefit users get from reporting crashes. "One way to think about reporting a crash is that you're voting for Microsoft to fix the problem you just had. At a high level, the information sent to us is just a set of simple parameters that say this problem occurred at this point in this piece of software, this version, in this executable, at this line. That gives us a unique identifier of what crash just happened, so we can keep count. By keeping count, we can prioritize and start working down from the top to fix the issues customers see most often."

One typically frustrated reader said, "Since I've never seen a response to a crash, I started clicking Don't Send because I felt sending was worthless."

But Ben maintains that "The more annoying a crash is to you, the more I'd recommend you send us the data. The more votes it gets, the higher it gets on the list. We find that if we can fix the ones at the top of the list, we're wiping out vast swaths of problems."
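To make the "one crash, one vote" idea concrete, here is a minimal sketch of how reports could be grouped and ranked. It is not Microsoft's implementation: the CrashSignature type, the prioritize function, and the sample application names are hypothetical, and the only thing taken from Ben's description is that a handful of parameters (application, version, module, code location) identify a crash so that identical reports can be counted.

```python
from collections import Counter
from typing import NamedTuple


class CrashSignature(NamedTuple):
    """Hypothetical bucket key built from the parameters Ben describes."""
    application: str
    version: str
    module: str
    offset: int  # stand-in for "the line" that was executing


def prioritize(reports):
    """Count one vote per report and return buckets, most reported first."""
    votes = Counter(reports)
    return votes.most_common()


# Illustrative reports only; these are not real crash data.
reports = [
    CrashSignature("editor.exe", "1.0", "shared.dll", 0x1A2B),
    CrashSignature("sheets.exe", "1.0", "sheets.exe", 0x0F00),
    CrashSignature("editor.exe", "1.0", "shared.dll", 0x1A2B),
]

ranked = prioritize(reports)
top_signature, top_votes = ranked[0]
print(top_signature.application, top_votes)
```

Because the signature is an immutable tuple, two reports with the same parameters land in the same bucket, and each additional report simply moves that bucket higher on the list.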

Some readers believe that Microsoft should fix problems before users experience crashes: "It's too bad we even need the vehicle to report errors. A longer beta and less frequent versions would produce better code and happier end users."

Ben replied, "We work very hard to eliminate problems before we ship. We use error reporting inside Microsoft, and of course we do testing. (We typically have one tester for every developer, which shows we take testing very seriously.) We eliminate as many problems as we can find and get the product stable. Then we release it to customers."

I asked Ben to elaborate on how Microsoft developers use error reporting. He explained that WER "is integrated throughout our development cycle. Internal users send crash reports as we're developing the software. We log our crashes as bugs, and the developers fix them. As we move into beta, we look at customers' crash reports. We have a process in every beta—we call it a Watson push—where we set a goal for how far down the prioritized error list we can go to address the most frequently occurring crashes. We march down that list and fix the top problems reported.

"WER lets us have betas with more customers and still gain value. The more customers in the beta, the more data we get about problems and the more accurate we can be in determining what order to fix things in."

Data Collection and Protection
Readers asked, "What data do you really capture?" and "How can I be sure the information provided to WER is kept secure?"

"We can get three levels of information," Ben explained. "About 90 percent of the time, all we get is application name, the version, the module, and the specific line of code that was executing. That's enough for us to increment the counter that tracks how many people are seeing a particular crash."

The second level is a minidump. "For example, the first 10 or so times we see a specific error, we collect a minidump, which contains debugging information." In that case, the user who reported the error receives a WER message requesting permission for Microsoft to gather further information. "Very specifically," Ben continued, "a minidump contains the call stack, the code that was executing when the failure occurred, and some information about memory registers and so forth that's useful for the developer in debugging. It does not typically contain any personally identifiable information. In other words, we don't intentionally collect any information about who or where you are, the contents of your document, or anything like that."

Typically?

Ben emphasized, "I qualified that because we're grabbing a chunk of memory. It's possible that where you crashed, you were typing in something that is personally identifiable—for example, your email address—and that's in the chunk of memory. So I can't categorically say we never collect personally identifiable information. It's possible, but we don't typically. And we don't do anything in the way the data is stored to try to find that kind of information. If a customer is uncomfortable sending information, I'd much rather they not send it."

What's the third level of information? "With customers' permission, we can also collect such information as a full system memory dump, or particular registry keys, or the user's documents," Ben replied. "If we need that information, it's because we've looked into the problem and can't figure out how to fix it without additional information. Our server sends a dialog that says we need to collect more data, tells what specifically we need, and asks permission to get it. We rarely ask for that information because typically we don't need it. You can always say no."
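The three levels Ben describes can be sketched as a small decision rule. This is only an illustration under stated assumptions: the CollectionPolicy class, the Level names, and the exact threshold of 10 are hypothetical (Ben said "10 or so"), and falling back to the basic parameters when a user declines is my assumption about what "you can always say no" means in practice.

```python
from collections import Counter
from enum import Enum


class Level(Enum):
    PARAMETERS = 1  # application name, version, module, code location
    MINIDUMP = 2    # call stack, registers, a chunk of memory
    EXTENDED = 3    # full memory dump, registry keys, or documents


MINIDUMP_SAMPLE_LIMIT = 10  # assumed value for "the first 10 or so times"


class CollectionPolicy:
    """Hypothetical server-side rule for what to ask a reporter for."""

    def __init__(self):
        self.seen = Counter()         # reports received per crash bucket
        self.needs_more_data = set()  # buckets developers could not diagnose

    def requested_level(self, bucket):
        """Record one report and decide what the server would ask for."""
        self.seen[bucket] += 1
        if bucket in self.needs_more_data:
            return Level.EXTENDED
        if self.seen[bucket] <= MINIDUMP_SAMPLE_LIMIT:
            return Level.MINIDUMP
        return Level.PARAMETERS

    def collected_level(self, bucket, user_consents):
        """Anything beyond basic parameters requires the user's permission."""
        level = self.requested_level(bucket)
        if level is not Level.PARAMETERS and not user_consents:
            return Level.PARAMETERS  # assumption: a refusal keeps the basic report
        return level


policy = CollectionPolicy()
levels = [policy.collected_level("bucket-a", user_consents=True) for _ in range(12)]
print(levels[0].name, levels[-1].name)
```

In this sketch, the first ten reports for a bucket trigger a minidump request and later reports only increment the counter, which matches Ben's point that most submissions carry nothing more than the basic parameters.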

Survey respondents also worry about data security. "With all the leaks of private data from major companies, how can I be sure the information provided via WER is kept secure?"

Ben assured me that all data is transmitted securely over HTTP Secure (HTTPS). "All WER information is stored in a secure data center. To access the information, a Microsoft employee has to sign a data-use policy that details exactly what the information can and cannot be used for. Basically, that policy is that developers can use this information to identify and fix a problem in the software, and that's it. There are severe penalties for anyone who might try to contravene that policy." For a detailed explanation of Microsoft's data collection policy, what data the company receives in a WER transmission, and how that data is used, see the WER data collection policy at http://oca.microsoft.com/en/dcp20.asp.

Closing the Loop
So what about those frustrated customers who keep pushing the button but never hear from Microsoft? Ben said, "You're right—the data shows people aren't seeing a lot of response from us. That's definitely something that we need to work on."

Ben emphasized, "Error reporting has revolutionized development at Microsoft. It changed everything about the way we do our development process, sustained engineering, and service packs. The amount of improvement that we've made as a result is tremendous. The good news is we've done a lot. The bad news is we've done not a great job of showing customers the results."

I'll have more about that next month.
