Planning a Large-Scale Exchange Implementation
To deploy Exchange in a large enterprise, you need to consider the Windows NT infrastructure, network connections and bandwidth, the shape of the Exchange organization, and server connections.
April 30, 1997
How do you plan for an implementation of Microsoft Exchange Server? What steps can you take to ensure success, or at least avoid some pitfalls along the way? I can easily discuss this topic at length, certainly beyond what can fit in one magazine article. Instead, I'll just focus on the most important aspects of deployment planning for Microsoft Exchange Server, reflecting on my experience of the past 18 months working with the product in a variety of corporate situations. Some ideas I'll discuss here won't be new to anyone who has installed Exchange, but some ideas might surprise you. You'll know what makes sense for your environment. My comments are generic whereas your knowledge isn't, so your own experience will always win out in the end. The four most important aspects of any Exchange deployment are Windows NT infrastructure, network connections and bandwidth, the shape of the Exchange organization, and server connections.
Organizing Exchange in NT
Computers run Exchange within a somewhat inflexible hierarchical arrangement known as an organization. An organization is subdivided into sites, which are closely connected groups of server computers that communicate continually. Screen 1 shows an Exchange organization with 10 sites. An Exchange organization is mapped onto the NT infrastructure and the available network. The easiest Exchange implementation follows two simple principles:
1. All servers operate inside the same NT domain. In large implementations, this principle usually means that you create a separate resource domain for Exchange. You do not create user accounts in the resource domain, only an account for Exchange administration (the service account).
2. You install and operate Exchange on dedicated servers. In particular, no potentially contending database-type application such as Systems Management Server (SMS) or SQL Server runs on the same computer as Exchange. Ensure that the server is neither a Primary Domain Controller (PDC) nor (less important) a Backup Domain Controller (BDC). To refine this principle, you can configure some Exchange servers to handle messaging and some to run connectors.
The real skill in Exchange design comes in knowing when to compromise one principle to arrive at a pragmatic design that meets your needs and is flexible enough to permit evolution. For example, you can connect Exchange servers across multiple NT domains. Many people do so because their NT domain structure wasn't well planned or because they chose to separate users and computers into different domains.
Exchange's basic method of communication is to send messages between servers, and as long as a messaging connection is possible (for example, sending Simple Mail Transfer Protocol--SMTP--messages between servers), messages will flow. Messages include directory replication, public folder hierarchy and content replication, and the interpersonal notes that users send each other.
Having one unified security context is best (the result of placing all Exchange servers into the same domain) because the finer points of Exchange can operate without hindrance. These points include single-seat server administration, public folder affinity, and message tracking.
Public folder replication gets a lot of attention, largely because of the inevitable comparisons people make with the replication mechanism of Lotus Notes. However, public folder affinity, which is the ability to direct all access to public folder contents to one or more predefined (and costed) points within the Exchange organization, is valuable because it reduces the amount of duplicated data floating around the network. Public folder affinity also allows more control over document content that you don't want replicated.
Affinity depends on clients having access to the servers where the content is stored, which can be outside the client's NT domain. This approach requires a trust relationship. Better still, if all of the Exchange servers share a unified security context, affinity can proceed on automatic pilot because the client's security credentials are acceptable to all servers within the organization. Of course, you can establish a unified security context through two-way trust relationships between NT domains, but this approach is hardly elegant and not viable when more than three or four domains are involved.
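To see why the full-mesh approach breaks down, consider the arithmetic: every pair of domains needs a two-way trust, and each two-way trust is really two one-way trusts that someone has to create and maintain. The short Python sketch below shows how quickly the number grows; the domain counts are chosen purely for illustration.

```python
# Back-of-the-envelope: one-way NT trusts needed for a full mesh of domains.
# Each pair of domains needs a two-way trust, which is really two one-way trusts.
def one_way_trusts(domains):
    return domains * (domains - 1)

for n in (2, 3, 4, 6, 10):
    print(f"{n} domains -> {one_way_trusts(n)} one-way trusts to create and maintain")
# 4 domains already need 12 one-way trusts; 10 domains need 90, which is why a
# single resource domain for Exchange is so much simpler to manage.
```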
Operating Exchange on dedicated computers is the luxury approach, and it's valuable when things go wrong. Take email, which is now a mission-critical application for many companies. When email servers go down, users demand immediate reinstatement and give little credit to MIS departments that can't fulfill that demand.
When things go wrong, you want a simple checklist of what to do to restore service quickly, instead of having to mess around to rebuild a complex server. I know of instances where getting a server back online after hardware failures took two or more days. Such a delay is unacceptable when the CEO is waiting for email. Operate dedicated servers, and make sure you protect those servers with UPS and RAID devices.
Accidents happen, computers fail, and software has bugs. These three truths of computing mean that you need to be sure that you can get the company email system online quickly after catastrophic hardware failures, minor hardware failures, and botched software upgrades or other accidents of systems administration life.
The Fight for Bandwidth
No one has enough network bandwidth. Everywhere we turn, applications are absorbing network capacity, and Exchange is no exception. To plan for Exchange, keep a few factors in mind.
Exchange transports more messages than you might expect, including interpersonal mail, directory updates, configuration updates, and public folder content and hierarchy. Don't expect your experience with another messaging system to accurately reflect just how much data Exchange will move around. For example, Microsoft Mail and Lotus cc:Mail concentrate on interpersonal mail only, so statistics you extract for these systems won't tell you how many messages will travel between Exchange servers.
The default replication schedule often generates too much replication activity. To take control of replication activity, minimize the number of replicas of public folders that you maintain across an organization and define more appropriate replication schedules for the directory and public folders. For instance, if your directory entries don't change often, you don't need a replication schedule with a 15-minute interval between updates. Define a 2- or 3-hour schedule instead. But during migration periods when directory updates occur very frequently as new mailboxes are added, you'll need more regular updates to let people see the new users in the directory.
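To see what a schedule change buys you, the quick sketch below compares how many directory replication cycles a day a 15-minute interval triggers against 2- and 3-hour intervals. It's simple arithmetic rather than anything specific to Exchange.

```python
# Rough sketch: directory replication cycles per day at different schedules.
MINUTES_PER_DAY = 24 * 60

def cycles_per_day(interval_minutes):
    return MINUTES_PER_DAY // interval_minutes

for interval in (15, 120, 180):
    print(f"{interval:>3}-minute interval -> {cycles_per_day(interval)} replication cycles per day")
# A 15-minute schedule fires 96 times a day; a 3-hour schedule only 8 times,
# a twelvefold reduction in replication traffic when the directory rarely changes.
```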
In addition to transporting messages, Exchange servers use remote procedure calls (RPCs) to talk together inside sites. However, you can't determine just how much bandwidth servers will absorb as they chat. For example, any change made to a server's configuration or a change to the site configuration from an individual server (or workstation running the administration program) will be dispatched via specially coded messages to all the servers within the site. This mechanism ensures that all servers maintain a complete picture of the site configuration. Exchange servers use the same type of mechanism, albeit at a more leisurely pace, to exchange configuration data between the different sites in an organization, so that each server knows about the other sites and knows the configuration details of those sites.
Factors such as the percentage of messages that travel off a server, the frequency of directory updates, the number of servers within a site, the quality of the network links that connect the servers, the use of public folders, and the behavior of individual users affect network use. You might be surprised at how much Exchange servers communicate with their counterparts within a site, similar to the way that NT domain controllers synchronize each other with updates of the Security Accounts Manager (SAM) every 5 minutes (by default). For example, within a site, all Exchange servers automatically synchronize directory changes with each other. Thus, a directory entry made on one server will be replicated to all other servers within the same site after 5 minutes, or shortly afterwards if the network or servers are heavily loaded.
The effect that user behavior can have on network load is an interesting topic, especially if you're moving from a green-screen email system. The average message size continues to grow. Yesterday's simple 2KB message is today's 10KB message and tomorrow's 40KB message. People use the facilities available to them, and the Exchange and Outlook clients encourage users to embellish their email with fonts and colors and to attach any file that they care to send to their friends. I have known users to attach files larger than 20MB to messages and expect the server to faithfully deliver the message to a large distribution list. The auto-signature option lets you automatically append cute sign-off text to each message, and you can even append graphics. Many people insist on including company logos in their auto-signatures, driving up the average size of messages to more than 100KB. Clearly, user training can positively influence such behavior, but self-tutoring Windows applications eliminate many formal training opportunities for enforcing good habits and eliminating bad habits.
Given that a network might not be able to handle the load that new technology and bad user habits impose, what type of network links do you need to put in place? The classic answer is that servers within a site operate on the expectation that a permanent, LAN-quality link is in place. If you can't compare your connections to the bandwidth delivered by a LAN (or high-quality WAN), don't bother connecting servers into a widely distributed site. Make sure that every server in a site has access to at least 64Kbps of good-quality bandwidth to let it communicate with its peers. If you don't provide adequate bandwidth, servers can't transmit RPCs, messages won't get through, and message queues will build up rapidly.
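As a rough sanity check on that 64Kbps figure, the sketch below estimates the average bandwidth one server needs just to move its daily outbound mail. Every input (users per server, messages per user, average message size, the fraction of mail that leaves the server) is an assumption chosen for illustration; substitute your own measurements.

```python
# Back-of-the-envelope check of a server's outbound mail against a 64Kbps link.
# All inputs are illustrative assumptions; plug in your own measurements.
users_per_server  = 300
messages_per_user = 25          # messages sent per working day
avg_message_kb    = 40          # today's "average" message, per the text
off_server_ratio  = 0.5         # fraction of mail that leaves the server
working_seconds   = 8 * 3600    # assume traffic concentrated in an 8-hour day

daily_kbytes = users_per_server * messages_per_user * avg_message_kb * off_server_ratio
required_kbps = daily_kbytes * 8 / working_seconds   # kilobits per second, averaged

print(f"Average outbound load: {required_kbps:.1f} Kbps against a 64 Kbps link")
# About 42 Kbps on average: it fits, but peaks, replication, and RPC chatter
# will eat the headroom, which is why 64Kbps is a floor, not a target.
```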
Sizing Sites
What size should a site be and how many sites should you plan for in an organization? The answer depends on the quality of your network links: if, like Microsoft and Digital, you have a network based on T1 and T3 links rather than 64Kbps links, you can consider building a very large North American or European site. But when bandwidth becomes scarce, you must consider other options.
At the start, try to create as few sites as possible. However, two factors will influence this approach: first, no administration tools for cross-site operations are available today; second, cross-site operations (e.g., moving a user) are manual and time-consuming, so many designs combine servers into very large sites to minimize cross-site operations. The largest Exchange implementations today (Microsoft and Digital) both operate very large sites in North America.
In countries such as those in the former Soviet Union, you can't connect servers in one site because you can't get the necessary network links, even if you can afford to pay for them. The same restriction applies in some locations in South America and the Asia/Pacific region. In all these cases, you must create several sites, perhaps one for each location. All the global deployments I know of have large sites in North America, smaller (but still large) sites in Europe, and the smallest sites in the Asia/Pacific region. Exceptions occur where local conditions permit availability of cheap bandwidth.
Within a site, you can run from 1 to more than 100 servers. The issues involved in running more than 10 servers are chiefly operational, such as keeping track of what all the servers are doing. For example, Microsoft has a site with more than 160 servers. I don't expect you to have the same backup resources (the entire Exchange development team), so restrain your enthusiasm and limit yourself to smaller sites.
Connecting Exchange
You connect sites with connectors, predefined links that tell Exchange how messages flow from one site to another. You have four options: the direct, RPC-based site connector (usually called the site connector, a term that often confuses people new to Exchange because you can use all the connectors to link sites); the X.400 connector; the Internet or SMTP connector; and the Dynamic Remote Access Service (RAS) connector. Screen 2 shows connectors linking Exchange sites.
If you don't have very reliable, fast network connections between sites, the RPC connector is not a viable option. The connector uses RPCs between servers in the different sites to exchange messages. If the network is incapable of carrying the RPCs to the target servers, large message queues will build up. Many people start with site connectors because they have a reliable network link in place but find that the link proves troublesome under the strain of a production workload. In this case, the results of a pilot project might not be valid.
Many consultants, including Microsoft, recommend a minimum of 56Kbps or 64Kbps available bandwidth for a site connector. A recommendation to use a particular bandwidth is somewhat arbitrary because this number is a starting point only. You must increase or throttle back to reflect the load in your environment. Some companies find that they need 128Kbps or 256Kbps links for site connectors to perform reliably; some anecdotal evidence suggests that a site connector can run across a 9.6Kbps link. Of course, 9.6Kbps and 256Kbps links represent a radical difference in capabilities, and the former is viable only if a very small number of messages pass across the link each day. Sites that experience heavy network traffic, act as a central site for Internet or X.400 connectors, or serve as bridgehead sites for directory synchronization with other (external) directories all need large network pipes if they don't want large message queues to build up.
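One way to grasp the gulf between a 9.6Kbps and a 256Kbps link is to estimate how many average-sized messages each can carry in a day. The message size and utilization figures in the sketch below are assumptions for illustration, not measurements.

```python
# Rough capacity of a site-connector link in messages per day.
# Message size and usable-utilization figures are illustrative assumptions.
AVG_MESSAGE_KB = 40      # assumed average message size
UTILIZATION    = 0.5     # assume only half the raw bandwidth is usable for mail

def messages_per_day(link_kbps):
    usable_kbits_per_day = link_kbps * UTILIZATION * 86400
    return int(usable_kbits_per_day / (AVG_MESSAGE_KB * 8))

for link in (9.6, 56, 64, 128, 256):
    print(f"{link:>6} Kbps link -> roughly {messages_per_day(link):,} messages/day")
# A 9.6Kbps link tops out around 1,300 average-sized messages a day; a 256Kbps
# link carries around 35,000, which is the radical difference the text describes.
```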
Because of their direct server-to-server RPC-driven links, site connectors are the easiest type of connector to configure, and they let several servers in each site be points of contact. However, over a site connector, you cannot control the network traffic that passes between servers, and no tools exist to analyze what passes over the link when you're in production. The X.400 connector comes into its own when you're concerned about the capability of the network or you want to schedule connections.
Some people in the US don't seem to like the X.400 and X.500 standards. Europe is different, probably because Europeans have had to deal with international boundaries, multinational character sets, and other blocks to connectivity. Much of the internal working of Exchange stems from the concepts expressed in the X.400 and X.500 recommendations, and you can use these technologies for a major deployment of Exchange without anyone outside the implementation team detecting that X.400 plays an important part in the Exchange architecture. For example, the Exchange Mail Transfer Agent (MTA) is based completely on the X.400 recommendations. The X.400 connector is the connector of choice in low-capability networks. We see a lot of low-bandwidth connections in Europe and the Asia/Pacific region, and X.400 connectors are popular in deployments there. Screen 3 shows an X.400 connector, which is robust over extended links such as Dublin to Kuala Lumpur.
Some people argue that the Internet connector offers the same type of functionality as the X.400 connector and is easier to set up. The Internet connector has fewer property pages to complete when you create a new connector and is equally capable of connecting sites. However, the Internet connector's SMTP roots prevent it from offering scheduled connections. Messages sent over both the X.400 and Internet connectors must be converted from Exchange internal format to either P2/P22 (X.400) or SMTP/MIME (Internet) before they are dispatched, meaning that both connectors are slower than the site connector. The overhead of format translation has been measured at 20 percent to 25 percent, but your mileage will vary. In any case, if you don't have the network to support site connectors and you are unwilling to pay for an upgrade, you'll pay somewhere else--in this case, by accepting the overhead of format translation.
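The practical cost of that overhead is easy to sketch: if format conversion adds 20 to 25 percent to the work of delivering a message, delivery takes roughly that much longer. The baseline time below is an assumed figure for illustration, not a measurement.

```python
# Rough effect of a 20-25 percent format-conversion overhead on delivery time.
# The overhead range comes from the text; the baseline time is an assumption.
baseline_seconds = 10.0   # assumed time to deliver a message via the site connector
for overhead in (0.20, 0.25):
    slower = baseline_seconds * (1 + overhead)
    print(f"{overhead:.0%} overhead -> ~{slower:.1f} s via X.400/Internet connector "
          f"(vs {baseline_seconds:.0f} s via the site connector)")
```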
Dynamic RAS is the last port of call. By definition it is slow, and the speed of the modem connection at each end limits throughput. However, when you can't do anything else (e.g., you're waiting for a permanent network connection but want to get Exchange into production), you have no choice. Bear the following points in mind if you use Dynamic RAS:
If possible, install Exchange on the first server for the site where you have a permanent network connection. Allow directory replication and backfill--backfill describes how the directory is populated with data about users, servers, and the Exchange organization--to occur before detaching the server from the network and transporting the computer to its final destination. This approach avoids very large queues of messages (mostly containing directory entries) building up across the slow modem link when the server joins the organization.
Do not use public folder replication unless necessary. If you use public folder replication, make sure that the replication schedule is throttled back as far as possible (once or twice a day). Try to keep the available bandwidth for personal messages.
Encourage users to behave responsibly and not send messages with large attachments to users in the site served by the Dynamic RAS connector. One large message can occupy a 28.8Kbps modem for a long time (see the quick calculation after these points).
Monitor MTA queues for the site carefully because large queues can quickly build up if the modem link drops unexpectedly.
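To put the attachment point in perspective, the sketch below estimates how long one large attachment ties up a 28.8Kbps line. The attachment size and the effective throughput are assumptions chosen for illustration.

```python
# How long one large attachment occupies a 28.8Kbps Dynamic RAS link.
# Attachment size and effective throughput are illustrative assumptions.
attachment_mb  = 20           # the kind of attachment users have been known to send
effective_kbps = 28.8 * 0.8   # assume about 80% of the raw modem speed is achievable

bits_to_move = attachment_mb * 1024 * 1024 * 8
seconds = bits_to_move / (effective_kbps * 1000)
print(f"A {attachment_mb}MB attachment ties up the link for ~{seconds / 3600:.1f} hours")
# Roughly two hours for one message: ample reason to keep large attachments
# away from sites served by Dynamic RAS.
```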
Specialized Sites
Sometimes you need to dedicate a specialized site to a particular purpose or group of people, and you want to create a separate management environment. For example, suppose you want to operate a separate site for your company's executive staff and limit the number of people with administrative privileges over that site.
Another example of a specialized site is the connector site, a site dedicated to message exchange with other systems such as Microsoft Mail, Lotus cc:Mail, Fax, Internet, and X.400. A connector site has at least two servers (to provide some resilience). Ideally, each server in the connector site needs to be able to handle the total messaging load so that if a server is taken offline, normal service can continue. You can configure some connectors, such as those handling SMTP mail, to be incoming, outgoing, or both; so inside the site, you can configure one server to handle incoming mail from the Internet and the other to handle outgoing messages. The logic behind the connector site is simple: The connector site removes the relative complexity of connectors from the standard messaging servers. Administrators can more easily make changes, such as applying a Service Pack (SP) for either NT or Exchange to a server in the connector site, because they don't have to interrupt service to users. In addition, you can allocate systems management to people who really know Exchange and thus avoid the chance that someone who knows only the basics could change a connector and affect the whole organization. Of course, not everyone can afford separate servers just to run connectors, but if you're operating at the high end of the messaging scale, this idea deserves your consideration.
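A simple way to sanity-check a connector-site design is to confirm that either server alone can carry the whole connector load while its partner is offline. The figures in the sketch below are illustrative assumptions, not measurements from any real deployment.

```python
# Sanity check: can each connector-site server carry the full load alone?
# All figures are illustrative assumptions.
total_messages_per_hour = 6000   # assumed peak organization-wide connector traffic
per_server_capacity     = 8000   # assumed messages/hour one server can relay
servers                 = 2

print(f"Normal load per server: {total_messages_per_hour / servers:.0f} messages/hour")
if total_messages_per_hour <= per_server_capacity:
    print("One server can absorb the full load if the other is taken offline.")
else:
    print("Failover would overload the remaining server; add capacity.")
```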
Operating Multiple Exchange Organizations
You don't have to create a single Exchange organization. In fact, many large enterprises find agreeing to create a single organization difficult: Business units or divisions opt to exercise a degree of autonomy and run Exchange with no regard for what other units do. In theory, this situation is an unmitigated disaster, but in practice, it's not so bad. When two or more organizations are involved, you cannot implement some Exchange features (such as directory and public folder replication), but the basic messaging functionality works just fine across any number of Exchange organizations. Sure, you won't be able to use the site connector, but the X.400 or Internet connectors do a more than adequate job of linking servers into what appears to users as a seamless messaging environment.
As Exchange and NT evolve, I believe Microsoft will address two issues that multiple organizations pose: automated methods to share directory and public folder information across organizations, and tools to merge, split, and join organizational hierarchies to form new organizations. Exchange and NT will eventually share the same X.500-based directory (the Active Directory in NT 5.0), and at that point, we might be able to join, graft, and split entities (such as domains or organizations) from the directory. After all, corporations don't retain the same business shape all the time, so why should their messaging system assume that they will?
Evolving Exchange
Exchange hasn't been out long. I don't think we have yet found all the tricks and techniques that we can apply to extract the utmost performance from Exchange, but we are moving along that path quickly.