Exchange 2010 Architecture: Microsoft's Ankur Kothari Talks About Personal Archives
Microsoft's Ankur Kothari discusses Personal Archives in Exchange Server 2010, along with related features such as email retention policies and mailbox search capabilities for e-discovery.
March 3, 2011
Next to database availability groups (DAGs), the Personal Archive feature is probably the most dramatic architectural addition to Microsoft Exchange Server 2010. On the Exchange Server 2010 Architecture Poster (a free, full-size, full-color copy of which comes with the March 2011 issue of Windows IT Pro, thanks to the Exchange team), you can find Personal Archives described in the Mailbox Server Role box in the lower left.
Continuing this series of interviews about Exchange 2010 architecture, I discussed the Personal Archive feature with Ankur Kothari, a senior product manager with Exchange. We also talked about related features, such as email retention policies and the mailbox search capabilities for e-discovery that are built in to Exchange 2010. And don't forget to check out the previous interviews about Exchange ActiveSync and Exchange Online.
BKW: How did the Personal Archive feature come about as part of the Exchange 2010 development?
Ankur: So, it really comes down to the volume of email that people have started to receive. Organizations that we talked to say that they have a huge demand to preserve and discover that information. Every day it's becoming more and more critical. So when we designed Exchange's archiving capabilities, in both Exchange Server 2010 as well as Exchange Online with Office 365, we set about creating email archiving, retention, and discovery capabilities that are built in to the product. We wanted to make sure that it was native in the product and that it would not change your user paradigm. It wouldn't change how your users work; it wouldn't change how your administrators work. If users are used to using Outlook or Outlook Web App, they would continue to be able to use it in a way that they were very familiar with. And so [the Personal Archive] shows up in Outlook, in Outlook 2010, in Outlook 2007, as well as Outlook Web App for Exchange 2010.
I also want to talk a little bit about retention policies. We have a lot of rich retention-management policies so that organizations can automate archival and deletion of email. Instead of having an individual user spending an hour every day managing their quota and moving items between folders, you can set an automated policy to manage all of that behavior and basically give that time back to your users. Finally, we also have a legal hold capability that allows some really rich compliance capabilities, including tracking when emails are edited or deleted. So if you place someone on a legal hold as an IT professional, or even as a compliance officer, all of those changes and updates that an email goes through (if someone tries to intentionally modify the headers or whatnot), all of that is tracked. If there ever is a discovery request, all of that information can be discovered.
Speaking of those compliance officers, we had a number of requests from organizations, when we started designing [these archiving and compliance features], to say that they don't want to teach legal officers and compliance officers new ways of doing tasks: "We don't want them logging on to servers and running administrative consoles." So when we designed the UI for the compliance officers, we said let's expand the Outlook Web App experience so a specialist user like that compliance officer can use their email platform to do the e-discovery request across all the different message types and search as needed across the primary and archive mailbox. That's kind of the background of how we designed it.
BKW: With the release of Exchange Server 2010 SP1, the archiving feature received quite a few changes. Can you talk a little about those changes and why they came about?
Ankur: Absolutely. The first feature that comes to mind is due almost directly to customer demand. When we introduced Exchange 2010, the initial released product, we forced organizations to have their primary mailbox and their archive mailbox in the same database. That was to preserve the same end-user experience. If an end user was clicking in their Inbox and then switching to their archive, they would have the same speed of response, they would have the same benefits—it would appear as a very uniform experience.
Our customers came back to us and told us that they like to have different tiers of experience for their archive and for their primary mailbox. We listened to them, and we basically expanded the architecture to support having the archive on different storage. So you can have your archive on cheap disks, and you can have your primary mailbox be on higher-speed storage—have a more enterprise-capable primary mailbox while your archive is on slightly slower storage. It's feedback we heard over and over again, and so we decided that, hey, if there is that demand, let's absolutely put it in. It's what customers want, so let's move forward with that.
The second thing we announced with Exchange 2010 SP1, we enabled the capability to have your archive in the cloud. So if you're on Exchange 2010 SP1, you want to have your primary mailbox on-premises, and you want to have your archives kept in the Office 365 data centers, it's now enabled with the SP1 release. That really gives customers some value in not managing that historical data, that massive beast of data that exists for most organizations. Just offload that data to Microsoft to handle in our data centers.
BKW: Is that a feature that can be used only with Office 365 or can it be used with other types of cloud storage or hosted services?
Ankur: Great question. It is a feature right now that you can use with Office 365, which is in beta. And we have a number of partners that are actually expanding their hosted services to include that as well. Unfortunately, I can't name anyone right now.
BKW: OK. What other SP1 changes did you make?
Ankur: Those are the large ones. We’ve done a lot around improving the behavior of our policies and having many more features surface from the web interfaces and the GUI interfaces, instead of forcing organizations to use PowerShell. But for the most part, those are the larger technologies that have opened up with SP1.
BKW: Most of the talk about SP1 centered around the ability to have tiered storage, since so many customers had been asking for it, but the cloud storage ability seems like a great new option as well.
Ankur: It's one that I've had a number of calls about with even our largest customers who say that the amount of data that they're storing is in the petabytes, and if there was any way of offloading that to... and to be honest, the customers, they just want to offload it to someone. But they have legal requirements: they need to store it, they can't just delete it. They need to keep it around. It made sense because we have the high economies of scale around our data centers. [For us,] storing a few petabytes isn't a big deal, but for an organization, storing a few petabytes is a pretty substantial cost burden.
BKW: With the combination of end-user control of archiving plus admin dictated policies, this could really be a best-of-both-worlds scenario. So, how is the end-user education going for using archives? What's the feedback from the field?
Ankur: When we had our MRM, or messaging records management—we had that in Exchange 2007—and that was a capability that allowed an organization to say, "I'm going to push these 5, 7, 10 folders down to every single user that's part of this policy." If you need to store something for an extra 5 years, 7 years, 9 years, you'd file it into one of these folders, or at least automate the filing of it into one of these folders. Almost universally, IT professionals around the world said it's a good idea having policies around email, but changing my user dynamic is something that's getting me in trouble. Their users, or the CEO, is coming to them and saying, "I don't manage my email like this." Or some junior executive is saying, "I manage my email by search as opposed to folders." We knew changing that user behavior wasn't something that was acceptable for organizations moving forward.
So in Exchange 2010, we implemented version 2 of that. You often hear it being called "MRM 2.0." The official name is retention policies and retention policy tags, and they allow an administrator to do two things: enforce policy per folder, and enforce policy across your mailbox. That's where the enforcement happens. Then the end user can, if the IT professional allows them, tag an item and say, "I'd like to keep this individual item at an item level, I'd like to keep this for 5 years, I'd like to keep this forever," if the organization permits it. By having that option, and surfacing that straight from the client, and exposing when the item will expire and things like that, you're making the user experience much better. Overwhelmingly, folks that have deployed 2010 SP1 and are using the archiving functionality have come back to me and said that this is exactly what they're looking for, and their users are happy with it.
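To make the tagging model concrete, here is a minimal Python sketch of how a mailbox-wide default, a per-folder policy, and a user's personal tag could combine into an expiry date for one item. It is an illustration only, not Exchange code: the `Message` class, the retention periods, and the precedence order (personal tag first, then folder policy, then mailbox default) are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Message:
    subject: str
    folder: str
    received: date
    personal_tag_days: Optional[int] = None  # set by the user; None means no personal tag
    keep_forever: bool = False               # user chose "keep this forever"


# Hypothetical administrator policy, with retention periods in days.
DEFAULT_TAG_DAYS = 365                    # enforced across the whole mailbox
FOLDER_TAG_DAYS = {"Deleted Items": 30}   # enforced per folder


def expiry_date(msg: Message) -> Optional[date]:
    """Return the date the message expires, or None if it never expires."""
    if msg.keep_forever:
        return None
    if msg.personal_tag_days is not None:
        # Assumed precedence: an item-level tag overrides folder and default policy.
        days = msg.personal_tag_days
    else:
        days = FOLDER_TAG_DAYS.get(msg.folder, DEFAULT_TAG_DAYS)
    return msg.received + timedelta(days=days)


contract = Message("Signed contract", "Inbox", date(2011, 3, 3), personal_tag_days=5 * 365)
newsletter = Message("Weekly newsletter", "Deleted Items", date(2011, 3, 3))
print(contract.subject, "expires", expiry_date(contract))
print(newsletter.subject, "expires", expiry_date(newsletter))
```

Computing the expiry date in one place is also what makes it possible to show the user when an item will expire, which is the part of the experience Ankur highlights.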
BKW: That's great to hear. Here in my company, we're using the managed folder system. I don't like it. Like you said, I have to change my behavior and remember what needs to get saved. Otherwise it goes under the default policy and gets deleted. What happens is, if I think something needs to be saved, I end up putting it in the folder that maintains everything and never gets deleted. That's probably wrong too.
Ankur: It saves you a few headaches, but it probably causes someone else a headache later on. One thing that our research has shown is that people essentially fall into some variation of two different types when it comes to managing their Inbox. Some people like the folder approach, and they have a folder for every one of their clients, and every one of the people that they talk to. And other people really like to just use search. They say, "I have my Inbox, and I have my sent items, and I have my deleted items, and I'm just going to search across there and it better find it for me." That was kind of a turning point for us to realize that, OK, people really do think about their emails very, very distinctly.
There are people in between those a little bit, that are mostly filing or mostly piling. When I think about that, I'm very much what I call a piler. I have massive items in my—let me see, if I click my Deleted Items right now, I have 52,000 in my Deleted Items folder. And in my archive, I have another 37,000 items in my Deleted Items folders. I just assume that search is going to work for me, and it does, and so I'm happy. But for some people that would scare the bejeesus out of them.
BKW: For a lot of IT departments, I'm sure those kinds of numbers are going to be scary.
Ankur: Yes, of course. It's something we know. With Exchange 2010, we've really spent a lot of effort in allowing organizations to have that really, really large mailbox on cheap storage. And what we're finding is that that's a big reason organizations are jumping to Exchange 2010 if they're using Exchange 2003 or even Lotus Notes. It's just a huge differentiator. With Hotmail, you get 10 gigs, and if an end user comes to work and they have a 250 megabyte quota, that doesn't seem to resonate with people. I've even talked to a customer in Miami who said that their CIO—so he's a technologically involved person—went to Best Buy and bought a 1.5 terabyte drive, gave it to the IT director, and said, "Give me a larger mailbox. 250 megs is not cutting it." Things like that just make you understand that small mailboxes are a burden for organizations as much as they help decrease storage requirements.
BKW: With larger mailboxes comes the potential problem of finding information when it comes to e-discovery requests. Exchange 2010 also features the ability to do multi-mailbox searches, which is shown on the architecture poster. Can you tell us a little bit about this feature and how it fits in with Role Based Access Control (RBAC)?
Ankur: In Exchange 2010, we've introduced Role Based Access Control, which is a concept of really delegating permissions to specialized task workers. If you're a Help desk worker, and your job is very simple—all you do is reset voicemail PINs—you can create that so that this individual can only do that one thing, and you don't have to teach them how to do all this other stuff.
Similarly, in archiving, we have a number of roles that are specific to discovery and to policies. The foremost role is what we call the Discovery Management role, which is issued to folks in your organization who need to do e-discovery requests—your legal officers, your compliance officers. They can run these requests and search across all the mailboxes or a subset of the mailboxes in an organization and find anything that matches typical words—whether it's insider trading, or if you're in a lawsuit and you type in Enron. You're able to find these words that have pushed across your mailboxes.
You have a really rich amount of granularity as far as how you want to search. Do you only want to search from an individual's mailbox? Do you want to search to another person's mailbox? Do you want to search for messages from Brian to Heather? You have very good control of search results, where you want to store them, as well as a lot of analytics before you even have to export the results. You can probably imagine if you're doing a search within Microsoft and you search for the word Windows, the number of hits in that request is going to be pretty massive. So before you export out (and I'm just going to estimate and say terabytes of emails have the word Windows in them), you get some statistics that say, "Hey, this search has found 700,000 items and 300 terabytes of information. Are you sure you want to export this out and store this off somewhere else? Or do you want to refine this further and say that you were actually just looking for 'Windows insider trading'?" And then you find 600 items, and from there you're able to manage it a little more efficiently.
You can probably imagine a legal officer going through 700,000 items one by one; it's going to be pretty cost-prohibitive. Being able to tweak that search a little bit and make sure we have an accurate representation, we're searching properly, before we export that data out [is a useful feature]. You don't even have to export data: you can provide that to your legal counsel through the Outlook Web App interface, something that really has a rich amount of value from a role-based function perspective.
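The estimate-then-refine workflow in this answer, gated by a discovery role, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Exchange search engine or its query syntax: the `Email` class, the `USER_ROLES` table, the role labels, and the simple substring matching are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Set, Tuple


@dataclass
class Email:
    mailbox: str
    sender: str
    recipient: str
    body: str
    size_bytes: int


# Hypothetical role assignments; the role names are illustrative labels only.
USER_ROLES: Dict[str, Set[str]] = {
    "legal_officer": {"discovery"},
    "helpdesk": {"reset_voicemail_pin"},
}


def search(
    user: str,
    emails: Sequence[Email],
    keywords: Sequence[str],
    mailboxes: Optional[Set[str]] = None,
    sender: Optional[str] = None,
    recipient: Optional[str] = None,
) -> List[Email]:
    """Return emails containing every keyword, if the user holds the discovery role."""
    if "discovery" not in USER_ROLES.get(user, set()):
        raise PermissionError(f"{user} is not allowed to search mailboxes")
    wanted = [k.lower() for k in keywords]
    hits = []
    for email in emails:
        if mailboxes is not None and email.mailbox not in mailboxes:
            continue
        if sender is not None and email.sender != sender:
            continue
        if recipient is not None and email.recipient != recipient:
            continue
        text = email.body.lower()
        if all(k in text for k in wanted):
            hits.append(email)
    return hits


def estimate(hits: Sequence[Email]) -> Tuple[int, int]:
    """Summarize a result set before any export: (item count, total size in bytes)."""
    return len(hits), sum(e.size_bytes for e in hits)


store = [
    Email("brian", "brian", "heather", "Windows build is ready", 2_000),
    Email("heather", "heather", "brian", "Windows insider trading memo", 5_000),
    Email("ankur", "ankur", "brian", "Lunch on Friday?", 1_000),
]

broad = search("legal_officer", store, ["windows"])
narrow = search("legal_officer", store, ["windows", "insider trading"])
print("broad:", estimate(broad))    # review the numbers before deciding to export
print("narrow:", estimate(narrow))  # refined query yields a smaller set for counsel
```

Separating `estimate` from any export step mirrors the point made above: the searcher sees the item count and size first, then decides whether to refine the query or hand the results on.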
BKW: One of the things I've heard talked about a lot is that the Personal Archive feature was implemented in Exchange 2010 as a way to get users and organizations to stop using PSTs. Was that really on your mind as you designed this feature, and do you think people will start using archives instead of PSTs?
Ankur: We absolutely want organizations to use the Personal Archives instead of a PST. A bit of a data point is that 80 percent of Exchange customers don't have an email archiving solution. That includes Exchange 5.5 through Exchange 2007. So we know that email archiving is something that they want. When we talk to them, they say that they have a significant pain around all that data that resides in those personal folders or PST files. It's spread among many, many different users, devices, file servers, shares, and it has all that corporate information—difficult to find and difficult to get visibility into if you ever need to search across it.
When we designed Exchange 2010, it was really about having that PST experience be native to Exchange. So for an individual who's used to using a PST, and dragging an item in from their Inbox into their PST, they're able to drag an item in from their Inbox into their archive. For an end user, they won't even know it's a different experience. The benefit is that it's kept on the server, so if you ever need to search it, it's searchable. And if you go to a different machine (because PSTs are usually tied to an individual machine) and use Outlook Web App or a different copy of Outlook, you're able to see all your archive functionality there as well. We really expect that decreasing the PST pain is a primary driver for getting organizations to use Personal Archives in Exchange 2010.
BKW: That ability to use the archives on different systems, different machines, is a great feature.
Ankur: I'm sure you've done it, and I've done it before, where you have a hard drive fail or you reformat a machine, and as soon as you hit that Enter button you think, Oh no. There's a PST in that special folder. I actually did this to my wife's email that she sent to me when we were doing our wedding planning seven years ago. I had a PST around there, and I formatted the machine and I lost the PST because it was in a place that I hadn't backed up. I came back to my wife and said, "Honey, I've got some bad news." It wasn't the end of the world, but I can imagine in other situations, someone may not have been so lucky.
BKW: How does the Exchange 2010 Personal Archive feature stand up against the third party archiving products that have been on the market for a while? And can you talk about how you plan to develop out this feature in Exchange in the future?
Ankur: I think the first thing is that a number of the archiving partners have developed their solutions for Exchange 2010. There are a number of scenarios where an organization wants to have Personal Archives for their data storage and perhaps have a partner solution to add value: to add tagging, to add some workflow capabilities around archiving that aren't explicitly native to the Exchange system. So we've been working a lot with our very, very close partners to make sure that there is a good story for organizations when they're using a partner solution in conjunction with Exchange 2010. Moving forward, with the Office 365 release, you should start seeing regular updates to the platform as well as regular updates to the on-premises system, with functionality and feature updates to coincide with some of our major and minor releases.
BKW: What's the feedback you've heard from the field? How are people reacting to Personal Archives and implementing them in their organizations?
Ankur: Let me give you an example. Twice a year I go to TechEd. I'll go to the North America one as well as the Europe one. Pretty much if I go and stand at the booth, which has the Microsoft Exchange sign, from morning to night, I will get questions around archiving. "Can I do this with this?" "We're a customer of this size, and how can we do this?" "We're looking at the Exchange archiving feature as a reason to get to Exchange 2010." So just in general, we've had so much excitement around the Personal Archives, and all of the archiving, retention, and discovery features in Exchange 2010—it's pretty substantial.
If you go to the microsoft.com/casestudies website, we've had a number of organizations do full-on case studies to say that they are using some really powerful functionality in doing Exchange 2010 archiving. It's saving them money, it's saving them time, it's saving them a whole bunch of effort as far as training when it comes to archiving.
BKW: That sounds great. Thanks for sharing all this with us, Ankur.