Review: Stratus Avance

Stratus Technologies' Avance software is a midrange high-availability solution that can deliver availability at near-enterprise levels but without the million-dollar outlay.

Joel Sloss

July 21, 2012

Floods, fires, earthquakes, power outages, and software and hardware failures are reminders of why disaster readiness and recovery are so important. Maintaining business continuity in the face of this adversity could mean the difference between weathering the storm and going out with the lights.

Enterprise IT groups know and handle this challenge well, but it can be quite difficult for smaller organizations to meet 99.0 percent uptime requirements, let alone 99.999 percent. Cost and complexity barriers keep many businesses from trying high-availability solutions at all, forcing IT staff to use manual, administrator-intensive detection, remediation, and recovery processes.
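To put those percentages in perspective, the short Python sketch below converts an uptime target into a yearly downtime budget. It is plain arithmetic and not anything taken from Avance; the function name `allowed_downtime_minutes` is made up for illustration, and a 365-day year is assumed. The 99.99 percent figure is included because the review lists it later as a natural fit for Avance.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime leaves about {minutes:.1f} minutes of downtime per year")
```

At 99.0 percent the budget works out to roughly 5,256 minutes (about 87.6 hours) per year, while 99.999 percent leaves only a little over 5 minutes, which is why manual detection and recovery can't realistically reach the higher targets.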

Many forms of high-availability solutions exist today, ranging from software-based solutions to mission-critical solutions that offer hardware-level redundancy and failover. The trick is to pick the right one for your organization, thereby achieving the desired availability without breaking your IT budget. As with network security, the more you can afford the better off you'll be, but there is a tipping point at which you're throwing good money after bad. In other words, your particular business might not require extreme measures. I recently took a look at Stratus Technologies' Avance high-availability software, one of the midrange solutions that can deliver availability at near-enterprise levels, but without the million-dollar outlay.

Overview

CIOs often call on systems administrators to reduce costs but still boost IT reliability. Administrators in small-to-midsized businesses (SMBs) tend to feel this crunch more acutely, because delivering fault tolerance can more than double the cost of the existing infrastructure for backup servers, redundant networking, and so on. Although native technologies in Windows Server are capable of getting you part of the way there, they fall short of the instantaneous failover that's needed for demanding workloads -- and demanding CIOs.

Stratus aims to solve this conundrum through a hardware-agnostic, yet not entirely hardware-independent, software-based availability package for SMBs. Stratus has made its name in enterprise-class high-availability solutions for more than 30 years, keeping the lights on 24 x 7 for critical human services, such as 911 call centers, hospitals, utilities, and more.

Avance combines a software offering with proactive management (which can even be monitored by Stratus remotely) and hardware redundancy. An Avance high-availability cluster provides near-zero failover and recovery times, with near-zero client impact (including stateful applications) using real-time monitoring and data replication. If you're running a heterogeneous environment, you'll also appreciate Avance's support for Linux server platforms (e.g., Red Hat, CentOS) and applications. Avance uses CentOS 5.5 and Citrix Systems XenServer virtualization technologies to abstract hardware from software, providing a foundation for transparently migrating OS and application workloads between physical systems in the event of a failure.

You can use most off-the-shelf server, networking, and storage hardware as long as any two systems you cluster are similar enough that a hardware mismatch doesn't result in bad driver behaviors (and thus a crash). In addition, the same RAID configuration must be used on both machines. One benefit of this clustering approach is that you don't need to purchase a dedicated storage array for data because replication between servers occurs over the wire.

The downside is that you still need an equivalently configured second server as a hot standby. Note that you won't have an active-active performance cluster. For more information, see the sidebar "How Avance Works."

Setup

For expediency, I started with two white-box Intel servers, which were supplied by Stratus. Each server had an S5520UR motherboard, dual quad-core Xeon X5560 processors, 24GB of memory, and 2TB of disk space. You can gain additional hardware resiliency if you select a chassis with hot-swappable components (e.g., CPU, RAM), RAID controllers, redundant power supplies, failover NICs, and so forth; doing so will reduce the likelihood of a single-server failure. This isn't required, however, since the solution's real-time monitoring includes more than 150 different metrics and predictive analytics that will trigger a live migration if a fault is either detected or about to take place.

Your dual-server configuration doesn't need to be any different from your standard build, with the exception of a dedicated gigabit Ethernet port on each machine for management and data replication, which is referred to as the "Sync" link. The servers can also be completely headless (after initial setup), because all maintenance operations are performed through a web-based console. However, Stratus recommends redundant Sync links to improve performance and fault tolerance.

Avance installation is straightforward and uses a self-imaged DVD. It automates setup for both servers through a single process, but you should reformat the machine if you're repurposing older hardware. (You can't change out the hardware on an existing OS platform build or migrate it from another machine unless it's identical hardware and already virtualized.) Adding the second machine to form a cluster is achieved by a fast software install driven from the primary node. When you join the second server to the cluster, an automated synchronization process images and configures it.

Avance's instant data replication between nodes means each server is always up to date. When a hardware failure, predicted failure, or planned shutdown occurs, the second machine simply picks up where the first one left off. This lets you carry out whatever maintenance is required on the first node without a service interruption. When you're done, you can manually flip the workload back to the first node or leave the workload on the second node, letting it migrate back to the first node only if a failure is detected on the second one.

Operations

Avance features a web management console (Apache Tomcat based on HTML5 and JavaScript), which runs on only one cluster node at a time. The great thing about web-based management is that it's usually simplified and available from any client (even a mobile device). The downside is the nagging question about security. Although you benefit from a thin-client (no install) experience, you're increasing your potential attack surface by running, at the host level, a web application that is capable of full system control, even over SSL.

Each virtualized server workload can be locked down and protected with anti-malware solutions and the like, but host-level intrusions are bad news. XenServer isolates virtual machines (VMs) from each other, and Stratus has invested in hardening the host configuration.

For additional protection, you could deploy a full application-layer firewall and place your servers in a demilitarized zone (DMZ), which is a common topology. Alternatively, you could deploy a dedicated security VM through which all traffic gets routed. However, a bare-metal hypervisor with no native application operations and a separate management server would provide a better overall (albeit more costly) security posture. The Avance console has an inactivity timeout feature but lacks any token-based or multi-factor authentication capabilities.

Every task in Avance is possible through the web management console, saving you from having to sit in the wiring closet with the machines. It's also convenient if you employ a services management vendor to remotely maintain your IT infrastructure. Although native IP repudiation or filtering isn't supported for limiting administration through only certain IP addresses, access through a VPN and firewall will grant similar security.

Using an easy-to-understand layout, the UI gives you quick access to Avance's default dashboard, which provides alerts, configuration details, and drilldown pages for managing both physical and virtual cluster attributes. You also have quick access to pages in which you can manage physical machines, set up storage groups and volumes to dedicate resources to specific workloads, lay out virtual networks, manage users, build VMs, and more. Most operations are driven by easy-to-use wizards that automate the tasks.

In keeping with the fully virtualized nature of the solution, you can create virtual CD installation points accessible by specific VMs, as Figure 1 shows. They can be used as either direct copies of .iso software media or downloadable installs by both servers and virtualized desktops. Although it might not be advisable from a security point of view, you can make physical components such as USB storage available to individual workloads.


Figure 1: Creating virtual CD installation points 

Failover

When a failure occurs, Avance provides active monitoring across a variety of different categories, enabling a full range of fault detection, whether physical or virtual. As with some out-of-band (OOB) management solutions, predictive filters can help identify when something bad is about to happen, instead of just waiting for a failure. With this fair warning, you can get ahead of the problem before a catastrophic event occurs that even Avance can't handle. That said, if you're using the right combination of metrics, which are dependent on the specific hardware and OS, I'm not sure what such an event could be.
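The idea of a predictive filter can be illustrated with a minimal Python sketch. This is not Stratus's algorithm, which the review doesn't describe; it is one simple, assumed approach that fits a straight line to recent readings of a rising metric (a temperature, for example) and estimates how soon it will cross a failure threshold. The names `samples_until_threshold` and `should_migrate`, the sample values, and the thresholds are all hypothetical.

```python
from typing import Optional, Sequence


def samples_until_threshold(readings: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate how many sampling intervals remain before the metric reaches
    `threshold`, using a least-squares line through the readings.

    Returns 0.0 if the latest reading is already at or over the threshold,
    and None if there are too few readings or the trend is flat or falling.
    """
    n = len(readings)
    if n < 2:
        return None
    if readings[-1] >= threshold:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    slope_num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    slope_den = sum((i - mean_x) ** 2 for i in range(n))
    slope = slope_num / slope_den
    if slope <= 0:
        return None  # not trending toward the threshold
    intercept = mean_y - slope * mean_x
    crossing_x = (threshold - intercept) / slope
    return max(0.0, crossing_x - (n - 1))


def should_migrate(readings: Sequence[float], threshold: float, horizon: float) -> bool:
    """Flag a proactive migration if the threshold is predicted to be reached
    within `horizon` sampling intervals (a hypothetical policy)."""
    remaining = samples_until_threshold(readings, threshold)
    return remaining is not None and remaining <= horizon


if __name__ == "__main__":
    temps = [60, 62, 64, 66]  # made-up readings, one per sampling interval
    print(samples_until_threshold(temps, threshold=70))
    print(should_migrate(temps, threshold=70, horizon=3))
```

A real product would presumably combine many such signals and tune them per hardware platform; the point here is only that acting on a trend gives the cluster time to move workloads before the component actually fails.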

To test Avance, I did a number of disagreeable things to the servers. I removed network cables, unplugged the power cord, killed VMs, and so forth. I even went so far as to hard-crash both machines at the same time by yanking out all power cords, even to the redundant power supplies (causing them to emit a variety of plaintive beeps). Impressively, nothing bad ever seemed to happen. Killing one entire server produced a warning in the console, as Figure 2 shows, but neither the management application nor the workloads (such as the Remote Desktop Services session) seemed to notice.


Figure 2: Checking the warning in the default dashboard 

The VMs seamlessly kept going. When I brought the failed primary server back online, it quietly rejoined the cluster, resynchronized its data, and took its place as the new secondary node. I had difficulty thinking of anything else I could break without physically damaging the hardware.

Given these capabilities, what could you use Avance for, beyond the obvious uptime enhancements? As I previously mentioned, there are other forms of fault tolerance and clustering available, some of which might be better suited to certain workloads or situations. Areas in which Avance would be a natural fit include:

  • 99.99 percent application availability

  • Remote-site redundancy

  • Small or branch-office resiliency

  • Small- to average-size workloads (e.g., Microsoft Exchange Server, Microsoft SharePoint, customer relationship management (CRM) software, limited-scale database environments)

  • Private cloud

Areas in which a different approach (or perhaps the more advanced enterprise-class V Series offering from Stratus) would be best include:

  • High-throughput transaction processing

  • Data warehousing

  • Real-time computing

  • High-capacity distributed applications or enterprise-scale deployments (e.g., multi-server email or database environments)

  • Public cloud

Note that there isn't a facility for managing multiple Avance deployments through a single console. Thus, building one large cluster of powerful machines would be better than using several smaller clusters in a demanding environment.

Avance Lives Up To Stratus' Reputation

Avance lives up to the reputation established by Stratus' more advanced availability solutions. Avance also provides capabilities you'd normally expect in much higher-priced packages. With its focus on failover and ease of use, smaller IT shops with limited resources or training will be able to up-level their service offerings and greatly enhance disaster readiness.

But perhaps a more important question might be, "Would I install this in my data center?" The answer is yes, I would.
