Convoy Cluster Software and NT's Directory Replication
Set up a Web cluster and synchronize your Web servers to serve the same information with Valence Research's Convoy Cluster and Windows NT's Directory Replication.
July 31, 1997
Clustering and mirroring your Web servers for maximum uptime
As Web master for Windows NT Magazine, I know that downtime is the absolute worst thing that can happen to a Web site. Several vendors have solutions to help prevent this problem. One such vendor, Valence Research, is developing a Web clustering solution, Convoy Cluster Software, that lets you balance your Web servers' load and make them fault tolerant. The product looked interesting and simple to implement, so I gave it a try.
In addition to setting up the Web cluster, I needed a way to make sure that both Web servers in our cluster were serving the same Web pages. Applications such as Octopus SASO can help you synchronize the information on both servers (for a review of SASO, see Carlos Bernal, "Octopus SASO 2.0," June 1997). However, I felt this product was overkill for data replication. After making a few inquiries, I decided to use Windows NT's directory replication.
Convoy Cluster
Convoy is simple to install and operate. If you follow the detailed directions, you can have a working cluster up and running in about 30 minutes. However, if you skip one vital step, as I accidentally did, your Web servers will start playing ping-pong with blue screens of death. In this situation, one server covers for the other while it's down. Unfortunately, when the server that was down comes back up, it causes the other server to go down. This cycle will repeat indefinitely. Valence Research's technical staff was helpful in pinpointing the problem in the configuration I had set up. When I reinstalled and reconfigured the machines the second time, everything worked.
You can set up Convoy on machines with only one NIC. However, if you want the machines to be able to talk to each other so that you can duplicate information, you need to install two NICs in each Web server. I configured my environment using two NICs so that I could use NT's directory replication. Convoy refers to the two NICs as the dedicated adapter card and the cluster adapter card.
Installing Convoy
Although I can give you a general sense of how to install Convoy, make sure you follow the installation directions to the letter. You install Convoy as a new adapter. The installer adds the Convoy Virtual adapter and a Convoy Driver protocol to your system. After the installation is complete, the Convoy Setup screen, which you see in Screen 1, automatically opens so you can enter your Convoy clustering variables. You use this screen to type in your cluster IP number, each server's dedicated IP number, the priority status of each server in the cluster (the lower the number, the higher the status), and how you want to distribute the cluster.
The next step is to view the network bindings for all protocols in the Network applet of the NT Control Panel. While you're at this screen, you need to configure the bindings so that the Convoy Driver protocol can talk to the Convoy Virtual adapter and cluster adapter, but not to the dedicated adapter. You also need to configure the bindings so that TCP/IP can talk to the Convoy Virtual adapter and dedicated adapter, but not to the cluster adapter. For information on how to configure these bindings, refer to the Convoy documentation. In essence, you are creating a firewall because only Convoy knows how to talk directly to your machine via the Convoy Virtual adapter and cluster adapter. The outside world can't see or use the IP for your dedicated adapter.
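The binding rules in the previous paragraph are easy to get backward, so it can help to write them down as a small table and check a proposed layout against it. The Python sketch below is only a model of those rules. The dictionary keys are labels chosen for this example, not Convoy configuration settings, and the check does not read or change anything in the Network applet.

```python
# Illustrative model of the binding layout described in the text.
# The strings are labels for this sketch, not Convoy configuration keys.
REQUIRED_BINDINGS = {
    "Convoy Driver": {"Convoy Virtual adapter", "cluster adapter"},
    "TCP/IP": {"Convoy Virtual adapter", "dedicated adapter"},
}


def is_bound(bindings, protocol, adapter):
    """Return True if `protocol` is bound to `adapter` in `bindings`."""
    return adapter in bindings.get(protocol, set())


def check_bindings(bindings):
    """Return a list of differences between `bindings` and the required layout."""
    problems = []
    for protocol, required in REQUIRED_BINDINGS.items():
        actual = bindings.get(protocol, set())
        for adapter in sorted(required - actual):
            problems.append(f"{protocol} should be bound to the {adapter}")
        for adapter in sorted(actual - required):
            problems.append(f"{protocol} should not be bound to the {adapter}")
    return problems
```

For example, a layout that leaves TCP/IP bound to the cluster adapter would be reported by `check_bindings` as one problem, while the layout the article describes returns an empty list.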
How Convoy Performs
To test Convoy, I simulated 50 simultaneous users requesting HTML pages from the cluster IP. Right off the bat, I could see the two machines sharing the load. When I made a page request from the Web cluster, Convoy built some of the page from one server and the rest from the other. I was able to verify this load sharing because my two development machines didn't have the same version of Web pages when I started the test. I then increased the number of simultaneous users to 75, and the machines just kept purring. For reference, the first server is an Intergraph Web-300, 200MHz Pentium Pro with 128MB of RAM. The second is an Intergraph Web-300, 150MHz Pentium Pro with 64MB of RAM. In my environment, I couldn't create enough client requests to slow down these machines.
To provide fault tolerance, Convoy redirects incoming traffic to another server in the cluster when the software detects that the first server is not responding. To determine which servers are active, the clustered machines periodically exchange broadcast messages with each other. This communication lets each machine know the status of the other members in the cluster. When the status changes, such as when a server fails or leaves the cluster, Convoy invokes a convergence. In Convoy terms, a convergence is when the cluster reestablishes itself so that it can redistribute the load. Convoy invokes a convergence every time you add or remove a server from the cluster.
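To make the idea concrete, here is a minimal Python sketch of how one machine might track its peers from periodic broadcast messages and decide that a convergence is needed. It is a simplified model written for this article, not Convoy's actual algorithm. The class name, method names, and host names are hypothetical, and time is passed in explicitly so the logic is easy to follow.

```python
class ClusterView:
    """One host's view of which cluster members are alive (hypothetical model)."""

    def __init__(self, members, interval=1.0, missed_limit=5):
        self.interval = interval          # seconds between broadcast messages
        self.missed_limit = missed_limit  # missed messages before a host counts as down
        self.last_seen = {name: 0.0 for name in members}
        self.active = set(members)

    def heartbeat(self, name, now):
        """Record a broadcast message received from `name` at time `now`."""
        self.last_seen[name] = now

    def check(self, now):
        """Recompute membership; return True if it changed (a convergence is needed)."""
        timeout = self.interval * self.missed_limit
        alive = {n for n, seen in self.last_seen.items() if now - seen < timeout}
        changed = alive != self.active
        self.active = alive
        return changed
```

In this model, a host that stops sending messages drops out of `active` once its last message is `interval * missed_limit` seconds old, and a host that starts sending again is added back. Both events make `check` return `True`, which mirrors the article's point that adding or removing a server triggers a convergence.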
By default, each server broadcasts a message every second to monitor the status of the cluster. The cluster waits five seconds (five missed messages) before it initiates the convergence. The software takes another five seconds to redistribute the load, so the average failover time is 10 seconds. You can adjust these parameters as needed, but the default values work well without making the process too slow or overburdening the network. When I tested the fault tolerance, it worked every time. I could stop the Web service or shut down one of the servers in the cluster, and the remaining machine took over the entire load. With the default settings, my measured failover times were closer to 15 seconds. During that time, a Web server will experience a few failed connections, but these losses beat having to reboot or fire up another machine. Overall, I was pleased with the way the cluster performed.
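The trade-off between failover time and network traffic comes down to simple arithmetic. The Python sketch below works through it with the default values quoted above. The function and parameter names are made up for this example and are not Convoy settings.

```python
def nominal_failover_seconds(interval=1.0, missed_limit=5, redistribute=5.0):
    """Detection time (missed messages) plus the time to redistribute the load."""
    detection = interval * missed_limit
    return detection + redistribute


def broadcasts_per_second(hosts, interval=1.0):
    """Total status messages per second sent by all hosts in the cluster."""
    return hosts / interval


print(nominal_failover_seconds())             # defaults from the article
print(nominal_failover_seconds(interval=0.5)) # faster detection
print(broadcasts_per_second(2, interval=0.5)) # at the cost of more messages
```

With the defaults, the nominal figure is 10 seconds. Halving the message interval in this model cuts the nominal failover to 7.5 seconds but doubles the status traffic for the two-server cluster from 2 to 4 messages per second.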
However, when I stopped requesting simple HTML documents and started requesting data-driven pages, the picture changed. The Cold Fusion pages on our Web site didn't cause a big bottleneck, but our forums area did. The forums package we use, Allaire Forums, does some fantastic things; but it comes at a cost. Allaire Forums is a resource hog. Granted, most users who visit our forums don't go click crazy like my test did, but what better place to see load balancing?
Allaire Forums consists of a lot of Cold Fusion pages that make calls to the SQL Server back end. During this portion of my tests, Convoy stopped balancing the load between the clustered machines. Our SQL Server is an Intergraph InterServe 660 Quad 200MHz Pentium Pro with 512MB of RAM, so I knew that the machine wasn't the problem. The problem began when I simulated 20 users attacking the forums area. The Cold Fusion service that runs the forums choked on one of the machines. This lackluster performance is unfortunate, but even more unfortunate is that the other machine didn't take up the load. When the first machine was pegged at 100 percent CPU, the second machine was just idling at 20 percent. In this scenario, I would rather have seen both machines cruising or both pegged; at least I would have known that the cluster was truly load balancing in all instances. Cold Fusion appears to be the culprit in this test, but Convoy should have been able to cover for it.
Convoy's fault tolerance worked as advertised and at a recovery rate I could more than live with. However, I would like to see the software load balance in every situation. The product is still in beta, and I'm hoping that Valence can address this issue of load balancing certain types of pages, such as the data-driven pages in our forums, before the final release.
Directory Replication
When I started testing Valence Research's clustering solution, my two Web machines didn't have the same version of content. This situation is never ideal in a clustering environment, so I had to remedy it. I could have just dragged the root Web directory from one machine to the other, but this fix wouldn't address how I'd keep both machines mirrored in a working environment. I didn't want to have to remember to put the same file on both machines each time I worked on one, so I needed a tool to automate this process of replicating the information.
NT's directory replication feature lets you maintain identical directories and files on different servers and workstations across domains. Maintaining identical data on separate machines is easy because only one master copy of the data exists, and all the computers synchronize their data from that master copy. The master copy is the export server, and all other computers are import servers. I have only two machines (one export server and one import server), although you can have multiple import servers. The export server can export only one directory tree, so I exported the entire Web root directory.
You can configure directory replication either to replicate changes to the import servers whenever you change any file or to wait for a two-minute stabilization period. I stuck with the default two-minute stabilization period.
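The stabilization period is essentially a quiet-time rule: nothing is replicated until the export directory has gone two minutes without a change. The Python sketch below models that decision. It illustrates the idea only and is not how the Directory Replicator service is implemented; the function name and arguments are hypothetical.

```python
def ready_to_replicate(change_times, now, stabilization_seconds=120):
    """Return True when pending changes exist and the tree has been quiet long enough.

    `change_times` holds the times (in seconds) of changes not yet replicated.
    """
    if not change_times:
        return False  # nothing has changed, so there is nothing to replicate
    quiet_for = now - max(change_times)
    return quiet_for >= stabilization_seconds


# Two files were edited at t=0 and t=30 seconds.
print(ready_to_replicate([0, 30], now=100))  # still inside the quiet window
print(ready_to_replicate([0, 30], now=150))  # 120 seconds after the last change
print(ready_to_replicate([0, 30], now=31, stabilization_seconds=0))  # replicate on any change
```

Setting `stabilization_seconds` to zero in this model corresponds to the other option the article mentions, replicating as soon as any file changes.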
To make everything work, you need to create a special user account for the Directory Replicator service to use. (Everything I've read about directory replication says that you must create this account. However, I wasn't able to get the Directory Replicator account to work without Domain Administrator privileges, so I simply used the Domain Administrator account to enable directory replication.) You can't use the name Replicator for the Directory Replicator account because NT already uses that name for a built-in domain group. To set up the special user account, you need to log on to the domain of the export server as a Domain Administrator. Next, you start the User Manager for Domains and choose Users, New User. When you see the dialog box for the new user, enter the values you see in Table 1. Choose Groups. The new account is already a member of Domain Users, but you need to add it to the Backup Operators and Replicator groups. Click Add to create the new Directory Replicator account, and click Close.
Next, go to Control Panel and select the Services applet. Highlight the Directory Replicator service and click Startup. Set the Startup Type to Automatic. Select Browse to the right of the This Account field to see the Add User dialog box. Highlight the Directory Replicator account you just created, click Add, and click OK. Enter the password information into the two password fields, and click OK to save. You'll see a message confirming that NT has set up your account to use directory replication at login. Start the Directory Replicator service by clicking Start from the Services applet in Control Panel.
Now that you've created the account, you need to set up directory replication on each Web machine. First you configure the export server. Select the Server applet from the Control Panel, choose Replication, and click Export Directories. You probably won't be replicating the default directory, so type in your Web machine's directory that contains the subdirectories and files you want to export. I specified c:\wwwroot, as you see in Screen 2. In the To List box, add the machines you want to export to (your import servers), and click OK. You perform these steps for each import server, except you enter the import information instead of the export information.
After you finish setting up your import servers, you can test your setup. Copy a file into the directory you want to replicate on your export server. If you don't see the same file in the same directory on your import server within minutes, something is wrong. As I mentioned before, I had to change the login of the Directory Replicator account to Domain Administrator before I started seeing files replicated to the import server.
When you can confirm that NT is replicating your files to your import servers, you're finished. Now, anytime you make a change to a file in the export directory on your export server, the changes will automatically appear on each of your import servers.
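Checking one file by eye works for a first test, but a short script can compare the whole tree. The Python sketch below is one way to do it, assuming both directories are reachable as paths from the machine that runs it (for example, through a share on the import server). The function names are made up for this example and the script is not part of NT.

```python
import hashlib
import os


def tree_digest(root):
    """Map each file's path relative to `root` to the SHA-256 of its contents."""
    digests = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def replication_differences(export_root, import_root):
    """Return (missing, stale): files absent from, or different on, the import side."""
    exported = tree_digest(export_root)
    imported = tree_digest(import_root)
    missing = sorted(p for p in exported if p not in imported)
    stale = sorted(p for p in exported
                   if p in imported and exported[p] != imported[p])
    return missing, stale
```

If both lists come back empty a few minutes after you change a file on the export server, the import copy matches the export copy. The sketch reads every file in full, which is fine for a typical Web root but would be slow on very large trees.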