SQL Server Storage in the Virtual World
Storage options in Hyper-V and vSphere
August 15, 2010
So you've decided that you want to virtualize some SQL Server machines, but you've got some storage concerns. Concerns are good when it comes to moving SQL Server machines from the physical world to the virtual world. I'll do my best to address any concerns that you might have about storage performance in the virtual world.
Whether you're using VMware's vSphere or Microsoft's Hyper-V, you need to take special care with your high-load database disks to ensure that they'll have the I/O resources available to meet the applications' demands today and in the future. You have three main storage options for your virtual machines (VMs):
Use traditional virtual disks. In the Hyper-V platform, they're called Virtual Hard Disks (.vhd). In the vSphere platform, they're called virtual machine disks (.vmdk).
Mount a volume (or LUN in a SAN environment) to a VM, passing it through the hypervisor. In the Hyper-V platform, this technique is referred to as using pass-through disks. In the vSphere world, it's known as a raw device mapping.
Use iSCSI to connect the storage array directly to the VM. To use this option, your storage array must support iSCSI, or you must have a device within your storage environment that can convert iSCSI traffic to and from the storage array's native protocol.
Using Virtual Disks
Virtual disks are what systems administrators typically set up by default. They're files created on the host OS's storage (either local storage or storage provided by a storage array) and mounted within the VM as physical disks. The host typically stores these files on a local RAID array (a set of disks attached directly to the server via a SCSI connection) or on a Fibre Channel storage array. The OS running within the VM sees these virtual disks as traditional SCSI disks. The performance of these virtual disks will depend on a few factors:
How many physical disks are hosting the data
How fast those physical disks are
How many other VMs are using the same physical RAID array
How much I/O the other VMs are using on the physical RAID array
For example, if your virtualization platform has five disks in a RAID 5 array and 40 VMs use that single RAID array to store their data, you probably don't have enough I/O for your SQL Server machine to use, unless it has only the smallest of I/O requirements. With that said, if those five disks are changed from standard Fibre Channel SAN disks to solid state disks (SSDs), you'll probably have enough I/O capacity available. SSDs support much faster data transfer rates than normal rotating disks because SSDs have no spinning platters or moving drive heads.
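To put rough numbers to this scenario, you can do the math yourself. The following Python sketch is a back-of-envelope estimate, not a measurement: the per-disk IOPS figures, the 30 percent write mix, and the RAID 5 write penalty of 4 are illustrative assumptions you'd swap out for your own array's numbers.

# Back-of-envelope IOPS estimate for a shared RAID 5 array.
# All figures below are illustrative assumptions, not vendor specs.

def usable_iops(disks, iops_per_disk, write_fraction, raid_write_penalty):
    """Estimate host-visible IOPS for a RAID array.

    RAID 5 turns each logical write into roughly 4 back-end I/Os
    (read data, read parity, write data, write parity), so writes
    are penalized while reads pass through at full speed.
    """
    raw = disks * iops_per_disk
    effective_cost = (1 - write_fraction) + write_fraction * raid_write_penalty
    return raw / effective_cost

# Five 15K spinning disks (~180 IOPS each, assumed) vs. five SSDs (~5,000 IOPS each, assumed)
spinning = usable_iops(disks=5, iops_per_disk=180, write_fraction=0.3, raid_write_penalty=4)
ssd = usable_iops(disks=5, iops_per_disk=5000, write_fraction=0.3, raid_write_penalty=4)

vms = 40
print(f"Spinning array: ~{spinning:.0f} IOPS total, ~{spinning / vms:.0f} per VM across {vms} VMs")
print(f"SSD array:      ~{ssd:.0f} IOPS total, ~{ssd / vms:.0f} per VM across {vms} VMs")

With those assumptions, the spinning array leaves only a handful of IOPS per VM, while the SSD array leaves real headroom, which is the point of the example above.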
There's a small performance penalty when using virtual disks. The engineers at VMware told me to expect about a 2 or 3 percent performance penalty. I expect the performance penalty is about the same in Hyper-V. If you have a RAID array that will store only SQL Server data files, using a virtual disk on a RAID array is a perfectly acceptable solution, provided that you're okay with the slight performance penalty.
Mounting a Volume
At a high level, using pass-through disks and using raw device mapping are basically the same. The volume (or LUN) is mounted to the host OS, as shown in Figure 1. The volume is then assigned directly to the guest OS via management tools so that the guest OS has full control over the volume.
With this setup, the guest OS can format the volume using its own file system, which means there's less overhead from the host OS. Although the virtual disk's overhead is already low (2 or 3 percent), using this setup will give you an even lower overhead. I don't have any specific numbers on the overhead for raw device mapping in the vSphere platform, but the white paper "High Performance SQL Server Workloads on Hyper-V" discusses Hyper-V's pass-through disk performance.
Another advantage of using pass-through disks or raw device mapping is that your VM effectively gets direct access to the SCSI array (or to the Fibre Channel network in a SAN environment) while still being a VM, so it doesn't lose any of the available Live Migration or vMotion (VMware's version of Live Migration) features. One disadvantage is that it requires a more advanced configuration, so some systems administrators might not be comfortable using this technique. The advanced configuration makes the virtual environment more complex and therefore harder to manage and explain to newly hired administrators.
Another disadvantage is that mounting a volume to a VM increases costs because you must have dedicated resources for the VM. In a direct-attached environment, you'll need dedicated disks and possibly a dedicated drive chassis and RAID card (or more than one, depending on the virtual server's needs). These physical resources are dedicated to the VM, so their cost can't be absorbed by multiple VMs. In a SAN environment, you'll need dedicated LUNs for the VM to use. Although these LUNs can be on disks that are being shared by other physical or virtual servers, that space needs to be allocated and managed by your storage administrator. When mounting a LUN directly to a VM via pass-through disks or raw device mapping, you need to understand the I/O traffic that's being used by other LUNs sharing the same physical disks. (For information about SAN configurations for SQL Server, see "Configuring Storage for Your SQL Server Environment".)
Using iSCSI
If you're working in a shop with a storage array, you can use iSCSI from the storage array directly to the VM. Before you can decide to take this route, you'll need to check with your storage administrator to make sure that your storage array supports iSCSI. Most newer storage arrays support iSCSI natively, as shown in Figure 2, but if you have an older storage array, it might not. If that's the case, you'll need a Fibre Channel switch that supports iSCSI. This switch will allow you to connect to the storage array via iSCSI without the storage array natively supporting iSCSI.
When a guest OS uses iSCSI to access a storage array, it's bypassing the host's storage driver. Although that lightens the storage driver's load, it puts an additional load on the host's Ethernet driver and the Ethernet infrastructure. As long as your Ethernet infrastructure isn't already saturated, this is a great way to provide direct storage access to your guest OS.
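A quick way to judge whether your Ethernet infrastructure has headroom for iSCSI is to compare the VM's expected storage throughput against what the host NIC dedicated to iSCSI can realistically carry. The following Python sketch is simple arithmetic; the 1Gb link speed, the 10 percent protocol-overhead allowance, and the sample workload are assumptions you'd replace with your own measurements.

# Rough check: will iSCSI traffic fit on the host NIC that serves it?
# The NIC speed, overhead allowance, and workload figures are assumptions.

GBIT = 1_000_000_000  # bits per second in a nominal gigabit link

def usable_mb_per_sec(link_gbits, protocol_overhead=0.10):
    """Approximate usable payload (MB/s) after TCP/IP and iSCSI framing overhead."""
    bits = link_gbits * GBIT * (1 - protocol_overhead)
    return bits / 8 / 1_000_000

# Hypothetical SQL Server workload: 1,500 IOPS at an average 64KB per I/O
workload_mb_per_sec = 1500 * 64 / 1024

nic_capacity = usable_mb_per_sec(link_gbits=1)   # single 1GbE port dedicated to iSCSI
headroom = nic_capacity - workload_mb_per_sec

print(f"Workload needs ~{workload_mb_per_sec:.0f} MB/s; 1GbE provides ~{nic_capacity:.0f} MB/s usable")
print("OK" if headroom > 0 else "NIC would saturate; add ports or move to 10GbE")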
If you decide to use iSCSI, you'll want to make some configuration changes to your guests. Typically, you need to isolate your iSCSI traffic on a separate network, either through physical network switches or through virtual LANs (VLANs) on your regular network switches. In either case, you need to connect at least one physical NIC on the host to this network so that your iSCSI traffic doesn't need to be routed between subnets. After doing this, you should add a virtual NIC (vNIC) to the VMs that need access to the iSCSI network. This way, the VMs' normal traffic is routed over the LAN, and their iSCSI traffic is routed through the dedicated iSCSI network, as Figure 3 shows.
Although a dedicated network isn't required, it's recommended because iSCSI tends to generate a lot of network traffic—and this network traffic could increase CPU utilization on your network routers. In addition, any other network devices that your iSCSI traffic needs to travel through on its way to the storage array will add latency to your storage commands.
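If you want a rough feel for how much latency the network path itself contributes, you can probe the iSCSI target portal, which listens on TCP port 3260. The following Python sketch only times TCP connection setup, so it reflects network round trips rather than true storage command latency, and the portal address is a placeholder for your own array.

# Measure TCP connect time to an iSCSI target portal (default port 3260).
# This approximates network path latency only, not end-to-end storage latency.
# The portal address below is a placeholder for your own array.

import socket
import time

PORTAL = ("192.168.50.10", 3260)   # hypothetical iSCSI portal on the dedicated storage VLAN
SAMPLES = 10

times_ms = []
for _ in range(SAMPLES):
    start = time.perf_counter()
    with socket.create_connection(PORTAL, timeout=2):
        pass                        # connection established; we only care about setup time
    times_ms.append((time.perf_counter() - start) * 1000)
    time.sleep(0.2)

print(f"min {min(times_ms):.2f} ms, avg {sum(times_ms)/len(times_ms):.2f} ms, max {max(times_ms):.2f} ms")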
By giving the iSCSI traffic a dedicated network connection from the host to the Ethernet network, you'll help reduce the chance that the host server's network ports will become saturated. This holds true even when the iSCSI network and the LAN network share the same physical network switch.
Physical HBA Mapping
If you're working in a vSphere 4.0 virtual environment, you have a fourth option available to you. However, it's a beta feature as of this writing, so use it with caution. This option lets you map a physical host bus adapter (HBA) to a single VM, which effectively puts your VM directly in the Fibre Channel network, as shown in Figure 4.
After mapping the HBA, you need to zone it so that it can access the storage array. (Zoning is the process of making a logical connection between two ports, allowing those ports to talk to each other.)
The upside to physical HBA mapping is that it provides direct high-speed access to the Fibre Channel network. However, there are two hefty downsides.
The first downside is that if you map the HBA to a VM, then no other VM or physical machine can use that HBA. So, if your server has two HBAs and you map one of them to a VM, you'll have only one HBA available to your physical server. And if you map both of your server's HBAs to VMs, you won't have any left for the host, which means the host won't be able to talk to the storage array.
The second downside is that the VM can no longer be moved from one host to another with vMotion. Because the VM is tied to a server's physical HBA and that HBA isn't in any other server, you can't move the VM to another server without powering down the VM, removing the old HBA mapping, and creating a new one. Afterward, you need to zone the VM's new HBA, reboot the VM, and make any necessary HBA driver configuration changes. Because the VM is down for a while, physical HBA mapping isn't a viable option for SQL Server VMs that need to provide high availability. But the option is there if your SQL Server VM needs high-speed storage access to a Fibre Channel network.
Bringing It All Together
When virtualizing your SQL Server environment, you don't want to simply use the higher-end storage solutions for every SQL Server VM you create. Most storage arrays have limits on how many LUNs can be created and how many physical or virtual servers can be connected to them. You need to find a solution that provides storage access at the right speed at the lowest possible cost with the least administrative overhead. So, you need to sit down with your storage administrator and evaluate each SQL Server instance that you want to virtualize. Look at the I/O operations per second in Performance Monitor to see what the current load requirements are, then compare those requirements with the I/O capabilities of the various storage options to see which option would work best. By having a solid understanding of your storage needs and your storage options, you can make informed decisions.
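One way to capture those numbers without sitting in front of the Performance Monitor GUI is to sample the Disk Transfers/sec counter (Performance Monitor's IOPS counter) from a script on the physical SQL Server machine. The following Python sketch shells out to Windows' typeperf utility and summarizes the samples; treat it as a rough sketch and adjust the counter instance, interval, and sample count for your environment.

# Sample the Performance Monitor IOPS counter on a physical SQL Server box
# by shelling out to Windows' typeperf utility, then summarize the result.
# A rough sketch: adjust the counter instance, interval, and sample count as needed.

import csv
import io
import subprocess

COUNTER = r"\PhysicalDisk(_Total)\Disk Transfers/sec"   # total IOPS across all disks
INTERVAL_SEC = 1
SAMPLES = 60

result = subprocess.run(
    ["typeperf", COUNTER, "-si", str(INTERVAL_SEC), "-sc", str(SAMPLES)],
    capture_output=True, text=True, check=True,
)

values = []
for row in csv.reader(io.StringIO(result.stdout)):
    if len(row) >= 2:
        try:
            values.append(float(row[1]))   # column 0 is the timestamp, column 1 the counter
        except ValueError:
            continue                       # skip the header row and any blank samples

if not values:
    raise SystemExit("No samples collected; check the counter name and permissions")

print(f"Samples: {len(values)}")
print(f"Average IOPS: {sum(values) / len(values):.0f}")
print(f"Peak IOPS:    {max(values):.0f}")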