5 Principles for Effectively Implementing Exchange

Follow these principles to apply best operational practices to Exchange.

ITPro Today

July 31, 1997


When I speak about Exchange at seminars and other events, the topic of best operational practices often comes up. People want to know the steps they must take to operate an efficient and effective Exchange installation once the software moves from pilot status into production. Email is now a mission-critical application for many large companies, and these organizations want to minimize the company's risk in the investment they make to implement client/server-based messaging. In "Planning a Large-Scale Exchange Implementation," May 1997, I discussed how to plan for a successful implementation; now I'll consider day-to-day operations in an Exchange environment and explain the five guiding principles that will make your operations successful.

Microsoft designed Exchange to be scalable, robust, and reliable in distributed environments. Exchange manages reasonably large user populations on individual servers (one server at Digital has supported more than 2750 mailboxes) and will manage far larger populations as Windows NT and hardware evolve. Exchange is more akin to mainframe or minicomputer messaging systems, such as IBM PROFS or Digital ALL-IN-1, than to Microsoft Mail or Lotus cc:Mail.

Guiding Principles


Managing very large user communities is impossible if you don't follow disciplined systems management practices. I have several principles that guide efficient system management for a production-category Exchange server.

  1. Plan for success. Assume that users will increase the demand on the servers, the volume of mail traffic will increase, and you'll deploy new messaging applications (such as workflow). Make sure that system configurations incorporate room for growth and accommodate periods of increased demand.

  2. Use dedicated hardware for Exchange. Configure the hardware to provide a resilient and reliable service on a continuous basis for three years with a minimum number of interventions (and system downtime) required. After three years, replace the hardware.

  3. Keep downtime to a minimum. Never take an action that interferes with or removes the Exchange service from users. For any intervention that requires taking servers offline, plan in advance and clearly communicate your intentions to users. Also, be prepared for catastrophic hardware failure. Outline a recovery plan to handle emergencies.

  4. Track system statistics. Proactive system monitoring is a prerequisite for delivering a production-quality service. While you're monitoring the system, gather regular statistics on system use and analyze the data to help identify potential problems and protect the quality of service.

  5. Follow well-defined, regular housekeeping procedures.

Exchange needs disciplined management to achieve maximum potential. Anyone can take the Exchange CD-ROM, slap it into a drive, install the software, and have a server up and running with clients connected in 30 minutes. Such a system can handle a small user community. This approach is OK if that level of service is all you need. The strategy I outline here is geared to large, corporate deployments, but the logic that drives the strategy is valuable no matter what size shop you run. The five principles are generic, but they have proved to work over a large number of Exchange deployments in the past two years.

1. Plan for Success


Any configuration will come under increasing pressure as it ages. You experience the best performance immediately after you install the system, when disks are not fragmented, users put little demand on the computer, and application files are as small as they'll ever be.

As people get to know an application, the user-generated load increases. Users send more messages, and the messages are larger. Users find more reasons to use the underlying service: For example, you might install a fax connector for better communication with external agencies or deploy a full-text retrieval package to improve manageability of public folder contents. The disks fill up with user and application data. With Exchange, the information store swells to occupy as much space as you can devote to it. If you don't configure the system with success in mind and incorporate room for growth, you'll end up with a system that runs smoothly at the beginning only to suffer increasingly as time goes by.

I recommend overconfiguring the service at the start so that you don't become entangled in a cycle of constant upgrades. Install two CPUs rather than one, use 128MB of RAM rather than 96MB, have 20GB of disk instead of 16GB, and so on. Build server configurations that can handle at least some expected software developments over the next few years. For example, consider RAID controllers for system clustering. Look at the hardware that existing clustering solutions use and see whether you can include hardware with the same or superior capabilities. (For more information on clustering solutions, see Mark Smith, "Clusters for Everyone," and Joel Sloss, "Clustering Solutions for Windows NT," June 1997.) Because the upcoming release of 64-bit NT 5.0 will require a new version of Exchange before it can be used for messaging, it is probably at the outer range of consideration. But think about Alpha CPUs if you're interested in building high-end servers that you eventually want to run 64-bit NT on. Alpha CPUs are also appropriate for servers that must handle high levels of format translation work, such as those that host Internet connectors. Configure systems that will be successful over time rather than just today. Any other approach might require more hardware upgrades than you want in a production environment.

2. Use Dedicated Hardware for Exchange


You can install Exchange on just about any NT server that has the correct revision level of the operating system (for Exchange 5.0, the correct level is NT 4.0 with Service Pack 3--SP3) and a minimum of 32MB of RAM. The same server can run other BackOffice applications and some personal productivity applications such as Office 97. For good measure, the server can provide file and print sharing to a set of workstations, not to mention Domain Name System (DNS), Windows Internet Name Service (WINS), and Dynamic Host Configuration Protocol (DHCP), and act as a domain controller. The applications will install and run, but run slowly. And, with all those applications, think of the steps you'll have to take to get the server back online in case of hardware failure. I do not recommend this mix on a production system. Having dedicated hardware lets you tailor and tune the configuration to meet the needs of an application.

Most accountants are happy to depreciate servers over three years. Plan to run Exchange on dedicated boxes without interruption for three years and replace the servers at the end of that time.

We've already discussed configuring systems for success. Apart from the obvious need for a fast CPU and enough memory, the I/O subsystem and hardware for system backups require special attention in an Exchange environment.

With a database at its center (the information and directory stores), Exchange is sensitive to disk I/O. If you design systems to support hundreds of users, you must pay attention to the number of disks and the way you arrange the Exchange files across the disks. If you don't pay attention to I/O, your system will run into an I/O bottleneck long before it exhausts CPU or memory resources. The system often masks an I/O bottleneck as 100 percent CPU usage, largely because of the work that the CPU does in swapping processes around.

Classically, the major sources of I/O on an Exchange server are pub.edb and priv.edb, the public and private information stores; (to a lesser extent) dir.edb, the directory store; the transaction logs; and the Message Transfer Agent (MTA) work directory. Servers hosting the Internet Mail Server (IMS) have to cope with its work directory as well. Ideally, allocate a separate physical disk to each I/O source to give the system a separate channel for the I/O activity each source generates. Resilience is also important, and you need to protect Exchange against the effects of a disk failure: Place the stores in a RAID-5 array, and keep the stores separate from the transaction logs. If you have to restore a database, you'll want the transaction logs structured so that you can roll forward any outstanding transactions once you restart Exchange. If the stores and the logs are on the same drive and a problem occurs, you can recover the store from a backup, but all transactions since the backup will vanish.

Servers that do a lot of work involving connectors generate a large amount of traffic through the MTA work directory. (For a description of how Exchange uses connectors, see "Planning a Large-Scale Exchange Implementation," May 1997.) Unlike the databases for the stores, Exchange uses the NT file structure to hold information about messages as they go through the MTA. Exchange maintains a set of indexes and writes each message to disk as it is processed. Exchange takes some steps to minimize I/O, but generally this scenario is the way things happen. With servers hosting connectors to the Internet, other Exchange sites, or other messaging systems, thousands of messages pass through the MTA daily. In these cases, you must isolate the MTA work directory and prevent the I/O it generates from interfering with other processing. For example, do not put the MTA work directory on the same drive as the information store, or on the same drive as NT. Put this directory on a drive allocated to user directories or anywhere else where disk I/O is low.
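To make the separation concrete, here is one way a connector-hosting server's disks might be laid out. This is purely illustrative; the drive letters are arbitrary, and the directory names follow the default exchsrvr layout (dir.edb normally lives in dsadata, the stores and their logs in mdbdata), so adjust it to your own configuration:

```shell
rem Illustrative disk layout for a connector-hosting Exchange server
rem C:  NT and the Exchange binaries (exchsrvr\bin)
rem D:  transaction logs (exchsrvr\mdbdata\*.log) -- dedicated spindle, kept off the store array
rem E:  RAID-5 array -- priv.edb, pub.edb (exchsrvr\mdbdata) and dir.edb (exchsrvr\dsadata)
rem F:  MTA work directory (exchsrvr\mtadata) -- isolated from the stores and from NT
rem G:  IMS work directory (exchsrvr\imcdata), on servers that host the IMS
```

The point of the layout is that a failure or I/O surge on any one source (stores, logs, MTA, IMS) does not drag the others down with it.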

3. Keep Downtime to a Minimum


Managing a system also means anticipating downtime. Each time you take a server down for preventative maintenance or to upgrade hardware or software, you risk that the server might not come up again smoothly. Errors happen; software and hardware aren't perfect. Doesn't minimizing the number of times that you'll have to interfere with a server during its lifetime make sense?

You must do preventative maintenance, and you can't avoid software upgrades. Despite misgivings, you will probably install every service pack, at least for Exchange if not for NT. So all you can do to minimize system downtime is configure hardware so that it can comfortably last its predicted lifetime without requiring an upgrade. You compensate for the extra up-front expense of such a configuration with peace of mind for the systems administrator and a more predictable service for users.

But what about getting a system back online quickly if a disaster occurs, specifically if some catastrophic hardware failure happens? In a problem situation, you don't want to install half a dozen applications back onto new hardware just to get a mail server back online. I prefer a situation where I can follow four steps to get the server back online:

  1. Install and configure NT (including service packs).

  2. Install and configure Exchange (including service packs) using a "forklift install," meaning you install Exchange, but its services will not be started. You don't want services such as the directory to start immediately after you install the software because the directory will synchronize its brand new databases with other servers and sites, leading to possible data loss. Allow synchronization to proceed only after you've restored the information and directory stores.

  3. Restore the information and directory stores and restart the Exchange services.

  4. Check that everything has worked and that users can access their mail.

Have you ever noticed how usually logical people do the craziest things in pressure situations? If you keep things simple and have dedicated hardware for Exchange, you'll make a recovery exercise much easier. I assume Exchange will be around for at least three or four more years. How many hardware problems can you expect on a server in that time? Now multiply the chance of a hardware problem occurring across many servers, and you'll understand why it pays to run dedicated hardware.

Recently at a customer site, a server had gone down Friday evening and wasn't back online until Sunday afternoon. Such an outage is barely acceptable over the weekend when you have less user demand, but the same outage is unacceptable during peak working hours. No one knew how to get a replacement server online. The customer had no clear and simple steps outlined, and the staff went down many blind alleys before they restarted the server.

While we're discussing hardware backup and restore, let me make a couple of points. First, get the fastest backup devices you can afford. Moving away from the digital audio tape (DAT) device that is often automatically configured into every server will cost extra money, but you'll be glad you made the investment. The time for backups (and restores) will be shorter, and you'll be able to make full daily backups instead of incremental daily backups and a weekly full backup. Exchange stores have a maximum size of 16GB, but Microsoft will remove this restriction in the Exchange Osmium release, due by the end of 1997. Then you might have to back up stores as large as the disks you attach to a server, conceivably hundreds of gigabytes. The faster the backup device, the easier the task. Even on small servers, a digital linear tape (DLT) device is preferable to a DAT.

Second, don't assume that NTBACKUP scores 100 percent in the backup software desirability stakes. The best things about NTBACKUP are the price (it's free) and that it comes ready to work with Exchange. Screen 1 shows NTBACKUP ready to back up a server selected from an Exchange organization. NTBACKUP works, and you must make a conscious decision to purchase replacement backup software (and not just for one server; use the same software everywhere). Increased speed, a greater degree of control over backup operations, and a scheduling engine are among the justifications for these purchases. All these reasons are valid. Seagate's Backup Exec, Cheyenne's ARCserve, and Barratt Edwards International's UltraBac are good examples of third-party backup software that works with Exchange. If the extra expense is not for you, be sure that you are happy with NTBACKUP and take the time to create some batch files to help automate backup procedures. You can use the AT and WINAT utilities to schedule backups, but if you use these utilities, you'll need some handcrafted batch code to start off the backups with the proper command switches.
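Such a batch file can be very short. The sketch below performs an online backup of one server's directory and information stores; the server name MAILSRV1 and the file paths are illustrative, so verify the exact NTBACKUP switches against the documentation for your Exchange version before relying on it:

```shell
@echo off
rem exbackup.bat -- nightly online backup of an Exchange server (illustrative sketch)
rem DS = directory store, IS = information store; /t normal requests a full backup,
rem /v verifies the tape, /l names the backup log file.
ntbackup backup DS \\MAILSRV1 IS \\MAILSRV1 /v /d "Nightly Exchange backup" /t normal /l "c:\logs\exbackup.log"
```

You could then schedule the file with AT, for example `at 23:00 /every:M,T,W,Th,F,S,Su cmd /c c:\batch\exbackup.bat`, or set up the same job interactively with WINAT.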

4. Track System Statistics


You can say you know what's happening on a server, but proving it is another thing. Recording regular statistics about message throughput, growth in disk usage, number of supported users, volume of Help desk calls, average message transmission time, and so on provides the evidence of a system's workload. Good statistics can also give you the necessary background to help justify hardware upgrades or replacements when the time arrives.

Gather some statistics that don't directly relate to Exchange, such as the growth of disk space allocated to networked personal drives. You can use the Exchange message tracking logs to analyze a server's workload. Unfortunately, this measurement is relatively crude because it is based on the transactions recorded in the tracking logs as messages pass through Exchange. Each message generates a number of transactions depending on the number of components (the MTA and connectors) that handle the message. A message to a local recipient generates fewer transactions than a message that an external connector processes.

You must create tracking logs before you can use them for analysis. Select the Enable message tracking checkbox on the properties of the MTA Site Configuration object to create message tracking logs. Exchange will automatically create the logs and store them on a network share called \\server_name\tracking.log. The network share lets you track the path of a message from server to server as it makes its way to its final destination. The Message Tracking Center option in the administration program lets you track messages.

Logs are simple ASCII files. Each entry, such as a message being submitted and then delivered to a recipient or connector, contains a code (to identify the type of transaction, see Chapter 17, "Troubleshooting Tools and Resources," of the Exchange Administrator's Guide) and some information about the message, such as the recipient. Exchange creates a new log every day, and the log size varies from server to server, depending on the amount of message traffic.

Screen 2 shows the set of tracking logs on a server. In this case, the logs are reasonably small. Based on figures from some reasonably large servers at Digital and other customers, even on the largest server, you'll probably see no more than 40MB of logs generated daily. Of course, servers that deliver a high proportion of messages to local recipients will generate smaller logs than servers that route many messages to different connectors. Distribution list expansion also creates entries in the logs. Writing entries into the logs does not place a strain on the server, so you have no reason not to generate tracking logs.
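Because each daily log is plain ASCII with one transaction per line, even a one-line command gives you a rough daily workload figure. In this sketch, the server name and the date-stamped file name (the logs are named yyyymmdd.log) are illustrative:

```shell
rem Rough daily workload: count the transactions (one per line) in a day's tracking log.
rem Server name and log file name are illustrative examples.
type \\MAILSRV1\tracking.log\19970731.log | find /c /v ""
```

The count is crude for the reason given above (one message can generate several transactions), but tracked over weeks it shows the workload trend clearly enough to justify a hardware decision.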

You can analyze the log contents with Crystal Reports for Exchange, which is in the Microsoft Exchange Resource Kit. You can view data in report format or export the data into Excel for further manipulation. Screen 1 shows the result of analyzing the message traffic through one of Digital's large Exchange servers in the U.S. The timeline is based on Greenwich mean time, five hours ahead of eastern standard time. Thus, the peak load at 16:00 GMT is 11:00 EST.

You can also extract statistics from Exchange by examining properties of mailboxes and other objects through the Administration program. However, this manual process is difficult when you have a server hosting more than a hundred users.

5. Housekeeping


Regular housekeeping and systems monitoring are important. You must monitor servers regularly if you want to maintain a predictable quality of service. Exchange provides several tools for monitoring important system indicators, including counters, link monitors, and server monitors.

Exchange publishes more than 100 counters that NT's Performance Monitor can use. Exchange Server installs eight predefined workspaces automatically. You can use these workspaces or define your own.
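You load a workspace by passing its settings file to Performance Monitor. The file name and path below are examples only; check the actual .pmw file names that setup placed on your server:

```shell
rem Load a predefined Exchange workspace into Performance Monitor.
rem Path and file name are illustrative; look for the .pmw files under exchsrvr\bin.
start perfmon "c:\exchsrvr\bin\Server Health.pmw"
```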

Link monitors check whether the network links to other servers are available. The monitor sends probe messages to the Exchange System Attendant process on remote servers. If the System Attendant is active, it replies to the probe, and the monitor notes the reply.

Server monitors check whether important NT services (such as the Exchange MTA or Information Store) are active on remote servers. You can use server monitors only if you have administration permission for the servers you want to monitor.

You can run all the standard monitors on an NT server or workstation (you have to install the Exchange administration program to use them on a workstation). Link and server monitors run as windows inside the Exchange administration program. Screen 3 shows a server monitor keeping an eye on six servers in five sites. The monitor has detected problems on four servers, ranging from serious (the IMS is not active on one server) to inconsequential (the time on the server is off by 51 seconds). You can define actions to take if a server monitor detects a problem. For example, you can have the Exchange System Attendant send an email message to an administrator, attempt to restart a missing service, or display an NT alert. Compare the information available from the server monitor with the information from a link monitor, which Screen 4 shows. The link monitor shows only whether a network path to a remote server exists.

Many installations have Performance Monitor running constantly, checking on important Exchange indicators such as message queues and the number of users logged on. Screen 5 shows the server health workspace (a workspace is a set of Performance Monitor counters) monitoring a lightly loaded Exchange server. The four essential Exchange components (the store, directory, MTA, and System Attendant) are being monitored along with overall CPU usage and system paging. All the standard monitors are fine in small deployments, but they become less useful when you need to check more than a couple of servers on a regular basis. At this stage, consider other options such as NetIQ's AppManager Console, a command-center type utility.

If you're concerned about message delivery times, use pings to check how quickly messages get from one point of the network to another. A ping is a message that the system sends to a mailbox on a remote server, which then bounces it back to its originator. The system measures how long the roundtrip takes. If you don't want to write procedures to send and measure pings, consider solutions such as Baranof Software's MailCheck for Exchange. You can think of MailCheck as a highly developed version of the standard link monitor, complete with reporting facilities.

Automated monitoring is all very well, but you need some manual checks to back up the tools. The checklist, "Regular Maintenance Tasks for Exchange," lists everyday maintenance items for Exchange that fill this gap.

On a weekly basis, check the public folder hierarchy to ensure that unauthorized folders have not appeared and that users have not created unauthorized replicas on servers in the organization. You can perform this check less often in deployments where you use a small number of public folders. The aim here is to keep the public folder hierarchy well organized so that it doesn't degenerate into anarchy. Also, review the directory contents regularly to ensure that email addresses are as up to date and accurate as possible. This step is especially important when you synchronize the Exchange directory with information from other messaging systems.

Every three months or so, review the system configuration (hardware and software). A planned software upgrade might be available, or moving files to different disks might create a more efficient configuration, especially for controlling disk I/O.

Aside from a regular system review, the most important intervention you need to consider is database defragmentation. Exchange databases do not support online defragmentation. In other words, over time, the databases swell to occupy all available disk space, halting only when the disk is filled. Of course, the database won't be filled with messages and other items; instead, a great deal of white space will intersperse the useful material. You can remove the white space and defragment the database only if you take Exchange offline and run the EDBUTIL utility. Because mail messages have a shorter lifetime than items in public folders, you can recover more white space in the private information store.

You can run EDBUTIL only after you stop the Exchange services. Screen 6 shows a successful run. The time you need depends on the size of the database, the speed of the CPU and I/O subsystem, and whether the server is doing any other work at the same time. Expect to be able to process 1GB to 2GB an hour on small to medium servers (100MHz to 200MHz single Pentiums) and up to 4GB per hour on systems with dual CPUs or on Alpha processors. Your mileage may vary, so always depend on the results achieved in your environment rather than what anyone tells you.
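The basic sequence for the private store can be sketched as below. This is an illustrative outline only: confirm the switches with `edbutil /?` on your own server, remember that defragmentation writes a temporary copy of the database (so you need free space roughly equal to the store's size), and note that stopping the Information Store will also stop services that depend on it, such as the IMS:

```shell
rem Offline defragmentation of the private information store (illustrative sketch)
net stop MSExchangeIS
rem /d = defragment; /ispriv targets priv.edb (/ispub and /ds exist for the other databases)
edbutil /d /ispriv
net start MSExchangeIS
```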

My experience with many servers shows that if you run EDBUTIL every three months, you can recover substantial disk space. You might not see the same results as Digital, which recovered more than 5GB of space when we defragmented a 15.5GB store, but I'm sure that you'll recover between 10 percent and 20 percent. Because you alter the internal structure of the database during compaction, be sure to make a backup before and after any EDBUTIL run.

The Payoff


Systems won't deliver reliable performance if you leave them alone. A proactive approach pays big dividends when you configure and maintain systems. The suggestions in this article are generic, and you need to refine them for your installation. Use them as input to your plans, but always remember that you're the expert when it comes to the details of your site.
