RAID: Enhanced Disk Storage for Windows NT
With the flurry of activity surrounding Windows NT clusters and server scalability, RAID has gained new exposure in the NT market. Here's what RAID is all about.
July 31, 1997
RAID, or Redundant Array of Inexpensive (or Independent) Disks, can improve disk drive performance by spreading data across multiple disks that are treated as one logical drive. A RAID subsystem can enhance system performance, provide fault tolerance, simplify the process of adding disk capacity, and make building extremely large disk volumes possible.
RAID has been around on UNIX and mainframe systems for many years, and the technology was built into Windows NT from the start. With the flurry of activity surrounding NT clusters and server scalability, RAID has recently gained new exposure in the NT market (see the June 1997 issue for more information about NT-based cluster solutions). But what is RAID? And how can it help you improve your NT systems' performance and reliability?
Let's explore the answers to these questions with a detailed technical look at RAID options for NT systems. I discuss the best RAID levels to use for optimizing performance and fault tolerance, and provide some general guidelines for choosing a RAID system. For an introduction to RAID, see "RAID Levels."
Hardware and Software RAID
The two types of RAID are hardware RAID, in which the disk controller performs the RAID functions, and software RAID, in which the operating system performs RAID functions. NT 4.0 lets you use hardware- or software-based solutions or combine the two to achieve the best performance and fault tolerance.
Many vendors, including Adaptec, American Megatrends (AMI), Compaq, and Mylex, provide hardware RAID solutions (disk controllers and array chassis) that offer many of the RAID levels listed in "RAID Levels." RAID 0, 1, and 5 are the most common. As a rule, hardware-based RAID solutions are faster and more reliable than software-based ones. They also offer a greater range of configuration options. Of course, they're more expensive than using NT's built-in RAID, but if you want the best performance, strongly consider including hardware-based RAID in your overall system budget.
NT supports RAID functionality, offering software settings for RAID 0 in NT Workstation and 0, 1, and 5 in NT Server. The advantages of software RAID are the convenience of built-in software and cost. However, performing RAID functions through the operating system instead of offloading them to a separate controller can slow server performance.
As you can see in "RAID Levels," each level has different performance characteristics, fault-tolerance capabilities, and drive usages. Some levels offer excellent all-around performance, and others sacrifice this performance to gain fault tolerance (for more information about how the performance of RAID 0, 5, and 10 compares, see "Optimizing Exchange Server," November 1996). Because each level is suited to a particular environment, your main challenge when choosing a RAID subsystem is to decide which RAID level to use on your server under what conditions. Let's look at the tradeoffs of the RAID levels most commonly used in NT systems and some tips for choosing the best RAID for your system. (See "RAID Tips" for some RAID-optimization hints. And for information about other ways to improve disk subsystem performance besides RAID, see "Pumping Up Your Server.")
Optimizing for Performance: RAID 0
Disk performance is a critical factor in server performance. Disk access is much slower than memory access. Therefore, the faster your disk I/O, the faster your server's response time. As a rule, RAID 0 (i.e., plain disk striping) provides the fastest I/O and thus the best performance.
RAID 0, or normal striping, splits data blocks (chunks of data) across multiple disks simultaneously. The group of disk drives containing the split data is called a stripe set; the size of each data piece depends on how many disks are in the stripe set. Striping means all drives are active for every I/O transaction and that each drive in the stripe set does less work per transaction. Less work means faster performance.
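To make the idea of a stripe set concrete, the following minimal Python sketch distributes numbered blocks across drives. It assumes a simple round-robin placement; the function names are hypothetical, and the actual stripe unit size and on-disk layout depend on the controller or operating system.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes round-robin placement, which is an illustration only.
    """
    if num_disks < 1:
        raise ValueError("a stripe set needs at least one disk")
    if logical_block < 0:
        raise ValueError("logical block numbers start at 0")
    return logical_block % num_disks, logical_block // num_disks


def stripe_layout(num_blocks: int, num_disks: int) -> list[list[int]]:
    """List which logical blocks each disk in the stripe set would hold."""
    layout: list[list[int]] = [[] for _ in range(num_disks)]
    for block in range(num_blocks):
        disk, _offset = locate_block(block, num_disks)
        layout[disk].append(block)
    return layout


if __name__ == "__main__":
    # Six consecutive blocks on a three-disk stripe set: each disk holds two.
    for disk, blocks in enumerate(stripe_layout(6, 3)):
        print(f"disk {disk}: blocks {blocks}")
```

Because consecutive blocks land on different drives, a large sequential transfer keeps every drive in the set busy at once, which is where the performance gain comes from.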
You can immediately benefit from RAID 0 by using NT's Disk Administrator to create stripe sets. This approach lets you create larger disk volumes under NTFS (FAT has a 2GB partition limit) and improves disk I/O performance.
Software striping via Disk Administrator is useful for just about any application, but with some cautions. First, software striping causes some minimal additional CPU overhead because NT now has to calculate striping instead of just passing I/O requests to the disk controllers. However, with today's fast CPUs, this overhead is not a problem because the processing takes a very small percentage of the CPU's overall capacity and the performance benefit of using multiple drives is greater than the performance hit. Systems with old processors (386, 486, or even slow Pentiums) may have more difficulty, and you need to augment them with a hardware RAID controller, which offloads RAID calculations from the system's main CPU or CPUs.
Second, be careful of where disks are located in the system. If you stripe disks on two or more SCSI controllers (called controller multiplexing), you're asking NT to calculate which data goes where in addition to figuring out the striping, not to mention processing overhead, system bus traffic, and processor interrupts for handling multiple cards. Again, older systems may have trouble handling this processing.
Try to stripe disks only on the same controller for the best performance, unless the capacity simply isn't enough from one SCSI card. You can compensate for the above problems by using a hardware RAID controller that has specific circuitry for handling these calculations and multiple channels for enhancing performance and adding capacity (a multichannel card uses only one interrupt).
The big drawback of RAID 0 is that it offers no fault tolerance: If one drive in the stripe set dies, the entire volume is unrecoverable. Also, the number of drives you use in a stripe set has a point of diminishing returns.
For example, the results explained in "Microsoft SQL Server 6.5 Scaleability," (January 1997) showed that six drives were the effective limit for a Compaq ProLiant 5000 with a Smart 2/P Array Controller; more drives improved performance minimally. This minor improvement is because the mechanics of the situation catch up with you (as the number of drives in the stripe set goes up, the block size goes down; if the block size drops below the stripe width, the advantages of striping diminish). In addition, you saturate the SCSI channel with too many drives. New controllers with faster hardware, such as Wide SCSI-3 and Ultra-2, raise this limit because they can run at 40MBps or 80MBps and use wider (16-bit) data words. (See "RAID-Related Terms" for definitions of the SCSI standards and other terms.) Another issue is that more drives mean greater probability for failure.
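The point about failure probability can be put in rough numbers. The sketch below assumes that drives fail independently and that each has the same chance of failing over some period; both are simplifying assumptions, and the 3 percent figure is purely hypothetical. Because a RAID 0 volume is lost when any one member fails, the volume survives only if every drive survives.

```python
def stripe_set_failure_probability(per_drive_failure: float, num_drives: int) -> float:
    """Chance that at least one drive in a RAID 0 stripe set fails.

    Assumes independent failures with an identical per-drive probability.
    """
    if not 0.0 <= per_drive_failure <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if num_drives < 1:
        raise ValueError("a stripe set needs at least one drive")
    return 1.0 - (1.0 - per_drive_failure) ** num_drives


if __name__ == "__main__":
    hypothetical_rate = 0.03  # illustrative value only, not a measured figure
    for drives in (1, 2, 4, 6, 8):
        risk = stripe_set_failure_probability(hypothetical_rate, drives)
        print(f"{drives} drives: {risk:.1%} chance of losing the volume")
```

Whatever the real per-drive rate is, the risk to the volume grows with every drive added to the stripe set.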
Some experts recommend that you never use RAID 0 alone on a server. However, the question is one of cost vs. performance, so RAID 0 with an aggressive backup policy may be worthwhile.
Optimizing for Fault Tolerance: RAID 1 and 5
Optimizing your server's disk storage is a balancing act: You want the best possible performance, but you need to protect your data, too. RAID 1 and RAID 5 are two widely used methods for protecting data.
RAID 1, disk mirroring, is most often used for smaller critical data volumes. It gives you complete fault tolerance (either drive in the mirror set can fail without affecting system integrity or performance) and slightly better performance than no RAID. The tradeoff? Because both drives are exact copies of each other, you get only 50 percent of the disk capacity you purchased.
RAID 5 is the most commonly used option for fault-tolerant disk volumes in NT because most manufacturers implement and support this method, it is part of NT Server, and it offers a reasonable compromise between performance and disk capacity. RAID 5 offers enhanced performance, protection, and far less capacity loss than RAID 1. Because you can build a RAID 5 volume out of as few as three drives, the maximum capacity you lose is 33 percent; the more drives you add, the less total space you lose. RAID 5 offers better I/O read performance than no RAID at all and in some cases, is even better than RAID 0 (because of the striping algorithm used). The drawback of RAID 5 is that write performance suffers significantly because every write operation requires a parity calculation. This performance hit in software RAID 5 is high; you'll probably want to use a fast RAID controller to compensate for the overhead.
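The capacity tradeoffs described so far reduce to simple arithmetic. This minimal Python sketch assumes all drives in a set are the same size and that RAID 1 is a two-drive mirror; the function name is hypothetical.

```python
from fractions import Fraction


def usable_fraction(level: int, num_drives: int) -> Fraction:
    """Fraction of purchased capacity available for data (equal-size drives assumed)."""
    if level == 0:
        if num_drives < 2:
            raise ValueError("striping needs at least two drives")
        return Fraction(1)  # no redundancy, so no capacity is lost
    if level == 1:
        if num_drives != 2:
            raise ValueError("this sketch models a two-drive mirror only")
        return Fraction(1, 2)  # one drive holds the copy
    if level == 5:
        if num_drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return Fraction(num_drives - 1, num_drives)  # one drive's worth holds parity
    raise ValueError(f"RAID level {level} is not covered by this sketch")


if __name__ == "__main__":
    for drives in (3, 4, 6, 8):
        lost = 1 - usable_fraction(5, drives)
        print(f"RAID 5, {drives} drives: {float(lost):.0%} of capacity lost to parity")
```

With three drives the loss is one third, the worst case for RAID 5; with eight drives it falls to one eighth.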
The advantages to RAID 5 are that you can build very large fault-tolerant disk volumes, and any drive in the stripe set can fail without damaging data. However, fault tolerance doesn't mean you won't suffer a little if a drive fails. When one drive disappears from the stripe set, either your system CPU or the RAID controller must compensate on the fly by using the remaining data and parity information to reconstruct the data for every I/O request. Depending on your system and controller, this reconstruction could mean as much as a 50 percent performance hit on that volume--but at least you're still running!
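The reconstruction described above can be sketched in a few lines of Python. The example assumes the parity block is a bytewise XOR of the data blocks in a stripe, which is the conventional RAID 5 scheme; the function names and sample data are hypothetical, and a real implementation also rotates parity across the drives.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks: list[bytes]) -> bytes:
    """Parity for one stripe: the XOR of its data blocks."""
    return xor_blocks(data_blocks)


def reconstruct_missing(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the one lost data block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])


if __name__ == "__main__":
    stripe = [b"NT4", b"SQL", b"XCH"]  # one data block per drive
    parity = compute_parity(stripe)
    # Simulate losing the second drive and rebuilding its block on the fly.
    rebuilt = reconstruct_missing([stripe[0], stripe[2]], parity)
    print(rebuilt == stripe[1])
```

Every read that touches the missing drive requires reading all the surviving drives and repeating this calculation, which is why a degraded volume runs slower.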
In NT, this recovery process is automatic (as it is on hardware controllers). NT also automatically rebuilds the volume when you replace the faulty drive. As soon as the system gets a new drive, it begins the background process of reconstructing the data on the new drive in the same way it handles I/O requests on the fly (this process can take several hours, depending on the volume/disk size). The process slows performance (more with software RAID than on an accelerated controller), but as soon as reconstruction is finished, system operations return to normal.
Also note that in software RAID 5, you often cannot break the set to add a new drive. This limitation makes software RAID 5 on NT a less attractive option, and some experts never recommend it. In contrast, this issue does not arise with hardware RAID.
Other Fault-Tolerance Options
Two additional RAID fault-tolerance hardware options are RAID 3 and 4. Although they're less common on NT systems than other options (and NT does not support them), they offer fault tolerance through striping with parity data.
In addition to providing fault tolerance through RAID, some disk controllers have special features that ensure availability in the event of a disk crash. Some RAID arrays feature hot-swap drives: You can remove and insert disks without powering off the disk cage or even the specific slot.
A hot-swap-capable array should never go down due to a drive failure (barring component death of the backplane, faulty power supplies, or similar problems). Systems without hot-swap drives require you to power down the system to replace a bad drive. In systems with hot-swap bays, the controller/software detects the new drive coming online and begins repairing the volume.
Another option is a hot-spare--a drive in the array that waits in standby mode. If any other drive in the array fails, the system automatically switches over to the hot-spare and begins rebuilding, without administrator intervention. When you replace the faulty drive, it becomes the new hot-spare. You can enable hot-spares through the controller's BIOS or management software.
The Best of Both Worlds
A few combined RAID levels (e.g., RAID 10, 30, or 50) offer both performance and fault tolerance by using two forms of RAID on the same logical volume at the same time. As you might expect, you pay more to have both capabilities. This extra cost is because NT's Disk Administrator tool alone won't let you combine RAID levels; to do this, you must combine a hardware RAID controller with NT's RAID software functions.
One combined RAID level is RAID 10, also called mirrored stripe sets (i.e., a RAID 0 stripe set is mirrored to another stripe set). RAID 10 offers excellent gains in read and write performance in sequential and random transaction environments. In fact, it's the best overall performer of all RAID levels. The cost, as with mirroring, is that you lose 50 percent of your planned disk capacity. But, where simple mirroring (RAID 1) costs you only one drive per mirrored set, RAID 10 costs you as many drives as are in the RAID 0 stripe set (which can get expensive). Like RAID 1, RAID 10 makes a fault-tolerant volume with the performance advantages of striping and no performance hit in the event of a drive failure.
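As a rough illustration of a mirrored stripe set, the Python sketch below shows where each logical block would be written. It assumes round-robin striping within each set and labels the two stripe sets "A" and "B"; those labels and the function names are hypothetical.

```python
def raid10_targets(logical_block: int, drives_per_stripe_set: int) -> list[tuple[str, int, int]]:
    """Return (stripe set, drive index, offset) for both copies of a block.

    Assumes round-robin striping and two identical stripe sets, "A" and "B".
    """
    if drives_per_stripe_set < 2:
        raise ValueError("each stripe set needs at least two drives")
    if logical_block < 0:
        raise ValueError("logical block numbers start at 0")
    drive = logical_block % drives_per_stripe_set
    offset = logical_block // drives_per_stripe_set
    return [("A", drive, offset), ("B", drive, offset)]


def raid10_total_drives(drives_per_stripe_set: int) -> int:
    """Every drive in the stripe set needs a mirror partner."""
    return 2 * drives_per_stripe_set


if __name__ == "__main__":
    for block in range(4):
        print(block, raid10_targets(block, 3))
    print("drives to buy:", raid10_total_drives(3))
```

Each write goes to two drives, which is why the volume tolerates a failure but delivers only half of the purchased capacity.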
Another combination of RAID 0 and 1 is RAID 01, or striped mirror sets, which has similar characteristics to RAID 10. The main difference between RAID 10 and 01 is which RAID level the hardware controller handles and which the software handles. In RAID 10, for example, if the software handles the striping, the controller performs the mirroring; in RAID 01, vice versa.
Not all RAID controllers support level 10 or 01. You'll need to check which RAID levels a controller supports before you buy it. However, you can make combined RAID by using hardware for the first part (RAID 0 striping or RAID 1 mirroring) and software for the second (the alternative mirror or stripe, respectively). This solution does not perform as well as using a RAID hardware controller that can handle both at the same time. But you can still build high-performance, fault-tolerant disk volumes without replacing an existing RAID controller.
Other RAID levels, such as 30 and 50, can also enhance performance and fault tolerance, depending on your applications. With them, you can build very large disk volumes out of commodity drives. However, these RAID levels are of limited use in most low- to midrange NT server situations, unless your goal is to experiment or achieve new and interesting disk configurations. RAID 50 is a good option on an enterprise-scale server where you are trying to build a 500GB or even 1000GB disk volume.
The Right RAID
With the variety of available RAID options, you can choose the right balance of performance and fault tolerance for your site. Mixing hardware and software RAID lets you build disk subsystems specifically tailored to your needs, such as extremely large disk volumes or multiple-fault-tolerant arrays. Whatever RAID you consider, it's a disk technology you can't afford to be without.