HADR Brings Resiliency to Virtualization
Virtualization is becoming increasingly important in the datacenter as a way to respond quickly to varying server demands. Depending on time of day and day of the week, as well as events in progress and many other factors, loads on any given machine may vary by factors of 100 or 1,000 or more. Giving a server more or less computing power, running multiple instances of the same server for load balancing purposes, or allowing failover from one virtual instance to another are increasingly important capabilities.
That's where HADR (high availability disaster recovery) tools come into play. High availability tools ensure that applications running in a virtual server session remain available even during failures of hardware, server OS, or application software. Disaster recovery tools enable the quick recovery of functionality after loss of hardware and are oriented toward restarting services at new locations.
HADR is complex enough when working with one OS per server. When you add in the multiple virtualization platforms out there, the numerous OSes, as well as all the storage and network settings and the additional complexities of boot images used by the hypervisors, HADR for virtual servers becomes even more complex. Just as there is no single solution for HADR in general, there is no single solution within virtualization: Different products address different areas, including backups, failover, deployment, and storage virtualization.
These HADR products range from software that is installed on either the virtualization server or on a separate server to hardware-software combinations that are installed separately. There are even specialized platforms that enable just one aspect of HADR for virtualization, such as DataCore's SANmelody, which enables storage systems to respond to the changing requirements of virtual servers as they are moved from one instance to another.
HADR Can Be H-A-R-D
HADR is fraught with complexity, due to the intricacies of virtualization itself. For example, moving a server instance from one physical server to another can be complicated by differing subnets, differing hardware from system to system, differing access to storage (the logical units or LUNs on a storage system are typically mapped to a specific piece of hardware), and other factors. Because there is generally no single overarching tool for this, management of the overall system is exceedingly complex.
There's an additional complexity: the boot image. This image is the file stored on disk that encompasses the file system, boot sectors, boot files, operating system files, application files, and so forth required by an operating system. VMware can save the entire thing as a single image file on the local disk, but moving the instance from the local disk to a SAN requires converting it to a block device. This conversion process is not a problem, but an image converted to a block device cannot always be converted back to a local image. There are similar issues with other virtualization products.
Notably, some of the virtualization vendors are already baking HADR functionality into their wares to address these types of issues. VMware, for example, has unveiled several such features. There's Virtual Machine File System (VMFS), which supports storing OS images on shared volumes. Additionally, there's VMotion, which supports moving instances from one VMware server to another without having to bring the instance down first. Moreover, Site Recovery Manager provides central management of instances across multiple ESX servers. VMware HA (High Availability) can restart instances that stop responding, or restart them on other ESX servers if necessary. Finally, VMware Distributed Resource Scheduler (DRS) allows dynamic reallocation of resources to servers when loads increase or decrease on a given instance.
In this round-up, however, we'll be looking at third-party offerings intended to supplement virtualization products such as VMware, Microsoft's Hyper-V, Virtual Iron, and Parallels; XenSource and other products based on Xen; and KVM, VServer, and other open-source virtualization platforms. The products include DataCore's SANmelody, Marathon Technologies' everRun VM for Xen, Scalent Systems' Scalent software, Stratus Technologies' Avance, and Vizioncore's vRanger Pro. Each addresses a different aspect of HADR for virtualization.
Tale of the HADR Tape
SANmelody virtualizes storage and works with emBoot to enable VMware server instances to boot from the SAN rather than the local hard drive, making it easier to move instances from one system to another. The boot instances stored on the SAN can also be backed up using the snapshot functionality of the system, as well as being replicated on a second system.
everRun VM for Xen enables failover modes for Citrix's XenServer so that instances that fail on one system continue to be available on the second. This provides true continuous availability, with both the primary and secondary instance having the same IP address and even the same MAC address. There is no detectable interruption in service if one of the two instances fails.
Scalent offers an infrastructure virtualization system that provides an integrated platform for deployment, migration, and failover of virtual instances, allowing instances to be converted from local to SAN-based. It also automates changes in network settings, SAN settings, and more as instances are moved. It offers a uniquely flexible, quick and easy system for deploying, moving, or re-deploying server instances.
Avance integrates with Citrix's XenServer and provides automatic failover from one instance to another if a server fails. It uses a dedicated, hardened version of XenServer modified to provide rapid failover and high security. Each instance is separate, and there can be slight delays in responses to client systems during the switchover.
vRanger Pro is a backup utility for VMware that allows both full and incremental backups of server instances, allowing for greater flexibility than the simple creation of an image allowed with the core VMware functionality. It can also back up a physical server and then restore it to a virtual instance, allowing for a disaster recovery strategy that uses far fewer servers at an alternate datacenter.
None of these products directly competes with each other; rather, they all help fill in some part of the large puzzle that is HADR for virtualization.
DataCore SANmelody 2.0
SANmelody from DataCore is not simply virtualization HADR software. Rather, it's storage virtualization software that encompasses many types of storage features. A couple of those features, however, make the product extremely useful in an HADR environment for virtualization.
The SANmelody software, which installs on Windows 2000 Server or Windows Server 2003, essentially turns a commodity Windows server plus storage into a SAN storage platform, with high-end features such as thin provisioning, support for boot from SAN, snapshot, and replication functionality. It works with internal storage and direct attached storage, as well as iSCSI or Fibre Channel SAN storage -- anything that Windows supports.
The boot-from-SAN feature is of particular note in the context of HADR: It allows administrators to easily create a flexible and resilient virtualization environment. The feature makes SANmelody a nice complement to versions of VMware earlier than ESX 3.5, which don't support booting directly from SAN volumes: SANmelody integrates with emBoot's netBoot/i to enable boot from SAN with iSCSI as well as FC.
Because SANmelody works with emBoot, it should also work well with any open-source hypervisors that support emBoot. Moreover, DataCore has worked with other manufacturers to ensure that SANmelody works with XenServer, Microsoft Virtual Server, and Virtual Iron.
Additionally, SANmelody simplifies the deployment of multiple instances of the same OS. Admins can create one boot volume, install an OS instance to it, then create snapshots and copy them to additional volumes very quickly.
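To illustrate why snapshot-based cloning of a boot volume can be so quick, here is a minimal Python sketch of the general copy-on-write idea. It is not a description of SANmelody's internals: the `Volume` and `Snapshot` classes are hypothetical, and the assumption that clones share unchanged blocks with the original image is mine.

```python
class Volume:
    """Toy block volume: maps a block number to that block's bytes."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})


class Snapshot:
    """Copy-on-write view of a base volume (hypothetical, for illustration)."""

    def __init__(self, base):
        self.base = base   # shared image, never modified through the snapshot
        self.delta = {}    # only the blocks this clone has overwritten

    def read(self, n):
        # Prefer the clone's own block; fall back to the shared base image.
        return self.delta.get(n, self.base.blocks.get(n))

    def write(self, n, data):
        self.delta[n] = data


# One installed "golden" boot volume, three clones created without copying data.
golden = Volume({0: b"bootloader", 1: b"os-files"})
clones = [Snapshot(golden) for _ in range(3)]
clones[0].write(1, b"os-files+hostname-a")  # per-instance customization
```

In this sketch, creating a clone costs almost nothing, and each clone stores only the blocks that differ from the golden image.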
Installation of SANmelody is no more difficult than any other Windows application, and because hardware support is based on Windows support, any hardware that runs Windows will work. After the software is installed, management of the storage system can be accomplished locally on the Windows server or via browser. The interface is clean and easy to navigate, and creating boot volumes for VMware or XenServer is simple.
For the purposes of this review, I did not test all the functionality of SANmelody. Instead, I looked at creating boot images for both VMware ESX 3.5 and XenServer on the storage attached to the SANmelody system. I easily created boot images and then booted the instances from the SAN storage rather than local storage. I also used VMware's VMotion to migrate an instance from one VMware server to another, easily and without copying files or doing anything other than changing which LUN was presented to which server.
The capabilities you get with SANmelody are not unique. Many storage systems from companies such as Compellent, Xiotech, NetApp, EMC, 3PAR, and EqualLogic support boot from SAN. Moreover, dedicated (and much more expensive) storage platforms from the likes of Compellent and EMC are custom tuned and optimized; thus, you shouldn't expect to get similar levels of performance from a SANmelody system. SANmelody's strength, however, lies in its ease of use and its unique capability of adding high-level storage features -- such as boot from SAN -- to inexpensive commodity storage.
Marathon Technologies everRun VM 4
Marathon's everRun has been well regarded as a server failover add-on for quite a while now, and the company has a good reputation. everRun VM adds similar functionality to Citrix's XenServer virtualization platform, making it competitive with VMware HA.
everRun has a key advantage over VMware HA: When VMware HA fails over, there's a short delay with interruption of service to clients. everRun VM, by contrast, creates two systems that share a single IP address, so that there are no interruptions if the primary server fails.
everRun VM offers some other benefits over VMware HA, including a completely automated install that includes Citrix's XenServer. This feature makes it substantially easier to get up and running compared to VMware with the HA option. It can check for changes to the target servers to ensure there's sufficient processing power and other resources before failover. It also monitors all hardware components, rather than simply a heartbeat connection. Finally, it works with a SAN environment -- but doesn't require one as VMware HA does.
After the XenServer hosts and the XenCenter management workstation are installed, getting everRun VM installed onto each XenServer host is straightforward: You run the installer from the XenServer console, just like any Linux application. The installer bears a striking resemblance to Novell's old C-Worthy interface, which is to say text-based graphics circa 1993 -- but it works. After everRun VM is installed, you will also need to install XenServer tools on each instance of Windows 2003 that you want to protect. (Notably, whereas XenServer supports both Linux and Windows guest operating systems, everRun supports only Windows 2003 instances.)
In addition to the XenCenter console, you will also need to run the everRun Availability Center, which runs in a Flash-enabled browser, to create protected server instances. You can get XenServer installation images, documentation, and licenses direct from Marathon if you're not already using XenServer.
The recommended physical setup between the two servers running XenServer and everRun VM requires four NIC connections: one for the production network, one for the management network, and two for heartbeat connections, using crossover cables. (Crossover cabling is not required but is strongly recommended, because connecting the heartbeat links through a switch or router introduces a single point of failure.) Storage can be local, direct attached, or SAN, either iSCSI or FC.
After everything is configured, using the everRun Availability Center to create protected server instances is simple. You just choose an instance of Windows 2003 on either XenServer system and tell everRun to protect it. everRun VM will clone the instance to the other XenServer system. Both instances have the same IP address, even the same MAC address, and are kept synchronized over the heartbeat cables. If the primary server instance fails, or if storage, network adapters, or other hardware fails, the secondary server takes over in a way that is completely imperceptible to the user.
The process of managing a protected copy of Windows remains the same. Any action taken on the primary copy is duplicated automatically on the second copy. It is possible to protect server instances on both XenServer systems: You can protect a primary instance or multiple instances on the first, which are duplicated on the second, and one (or more) instances on the second, which are duplicated on the first. Of course, care should be taken to ensure that if one physical server fails, the other won't be overloaded. After a failure is repaired, everRun automatically resynchronizes the two instances; there is no need for administrator intervention.
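The reason for two heartbeat links can be sketched in a few lines of Python. This is only an illustration of the general idea, not everRun's actual logic: the function name, the link names, and the one-second timeout are all assumptions.

```python
def partner_alive(last_heartbeat, now, timeout=1.0):
    """Return True if the partner was heard on ANY heartbeat link recently.

    last_heartbeat maps a link name to the time (in seconds) of the last
    heartbeat received on that link. The timeout value is hypothetical.
    """
    return any(now - seen <= timeout for seen in last_heartbeat.values())


# One link has gone quiet (say, a pulled cable), but the other still answers.
one_link_down = partner_alive({"hb0": 10.0, "hb1": 4.0}, now=10.5)

# Both links are silent, so the partner would be treated as failed.
both_links_down = partner_alive({"hb0": 8.0, "hb1": 9.0}, now=10.5)
```

With a single link, a bad cable would be indistinguishable from a dead server; with two independent links, one cable fault does not trigger a false failover.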
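The overload caution above amounts to a simple capacity check. The following Python sketch assumes a two-host pair and a single resource figure per host (for example, GB of RAM); the host names and numbers are made up for illustration.

```python
def overloaded_after_failure(capacity, load):
    """For each host, report whether it would be overloaded running everything.

    capacity: {host: available resource units}
    load:     {host: units used by instances whose primary copy lives there}
    If one host of the pair fails, the survivor must carry the combined load.
    """
    total = sum(load.values())
    return {host: total > cap for host, cap in capacity.items()}


# Hypothetical pair: xen1 has 16 units, xen2 only 8; combined load is 10.
result = overloaded_after_failure(
    capacity={"xen1": 16, "xen2": 8},
    load={"xen1": 6, "xen2": 4},
)
```

Here `xen1` could carry the full load alone, but `xen2` could not, so protecting instances in both directions on this pair would be risky.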
everRun VM is limited in scope, currently supporting virtualization of Windows 2003, and only on XenServer. However, what it does, it does very well, providing the highest level of fault tolerance and great ease of use. At $2,000 per physical XenServer, or $4,500 for a license for both XenServer and everRun VM, it is reasonably priced for a continuous high availability solution.
Scalent Systems Scalent
Scalent Systems' software is not storage virtualization software, nor network virtualization software, nor deployment software: It's an integrated platform that enables admins to quickly and easily deploy, move, repurpose, and clone server instances, automatically changing network settings, storage and LUN settings, and more to reflect the needs of each system. By integrating control of your network switches, storage switches, storage hardware, and virtual environment into a single console, it provides flexibility and capabilities that are hard to imagine if you're used to a standard hardware-based environment.
Scalent uses the capabilities of VMware 3.5 to create instances that boot from SAN, then makes those instances completely portable. As such, you can move a server OS from a physical instance to a virtual instance, to a virtual instance on another VMware server, and even back to a physical instance, all without ever having to copy files or change any settings manually.
Suppose, for example, you have a Windows 2000 instance running in the test network. With the development finished, you might have a requirement to move it to the production network. In a typical hardware-based scenario, this would involve changing many settings internally on the server -- and possibly moving the server hardware, or at least changing the physical network connections to a different switch.
With Scalent, you would simply drag the icon from one group to the other. The system would then automatically change the VLAN settings on the appropriate switch, the network settings on the server instance, the LUN masking and other storage settings, the VMware partitioning, and virtual name of the HBA for the server instance, and the server would be in its new role. (You do need to change the HBA that the boot-from-SAN image is linked to.) No files actually have to be copied anywhere. Each server needs to have a lightweight agent installed, but it has minimal impact on the system.
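The kind of bookkeeping such a drag-and-drop move automates can be sketched in Python. Everything here is hypothetical (the `GroupProfile` type, the VLAN numbers, the LUN names, and the wording of the steps); it does not reflect Scalent's actual data model or API, only the idea that a move reduces to a list of switch and storage changes with no file copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupProfile:
    """Hypothetical description of what a server group requires."""
    vlan: int
    luns: frozenset  # LUNs that should be presented to servers in the group


def plan_move(server, src, dst):
    """List the infrastructure changes needed to move a server between groups."""
    steps = []
    if src.vlan != dst.vlan:
        steps.append(f"switch: move {server} from VLAN {src.vlan} to VLAN {dst.vlan}")
    for lun in sorted(src.luns - dst.luns):
        steps.append(f"storage: stop presenting {lun} to {server}")
    for lun in sorted(dst.luns - src.luns):
        steps.append(f"storage: present {lun} to {server}")
    return steps


test_group = GroupProfile(vlan=110, luns=frozenset({"boot-web01", "data-test"}))
production = GroupProfile(vlan=20, luns=frozenset({"boot-web01", "data-prod"}))
steps = plan_move("web01", test_group, production)
```

Note that the boot LUN appears in both groups, so it is left alone: the instance keeps booting from the same SAN image and nothing is copied.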
Because storage replication features can be used to keep the boot-from-SAN images up to date in a backup datacenter, switchover time is very quick, limited to the time it takes the OS to boot from the new SAN image in the new location (typically faster than booting from a local disk).
The level of flexibility that Scalent offers, in terms of being able to run an OS instance either on hardware or any VMware server available, is especially noteworthy. Often with other failover systems, the two servers need to be identical to ensure that VMware drivers for CPU type, motherboard type, VMware partitioning, network settings, and capacity all match between the two. With Scalent, the backup server doesn't have to match the original. Scalent does a full install of the OS instance with all drivers, so images will work on any hardware. Some Linux display drivers may need to be re-configured, but that is easily done.
Using Scalent, high-availability environments are also easy to create and maintain, since it's simple to clone a boot image and create multiple instances, either on the same or separate VMware servers. Scalent can even work with load balancers such as F5's BIG-IP to add new instances to a load balancing cluster.
Installation is a complex business. Scalent's software will integrate with your existing storage and network hardware, but the company may have to do on-site installation and configuration to get everything working, depending on what hardware you have. For the review, the company provided a complete rack of preconfigured equipment including an Ethernet switch, FC switch, storage system, and five servers, in addition to the server running the Scalent software.
In very short order, I was then able to add a new server (an HP ML370G5) from my lab to the provided pod. Using the Scalent software, I created a VLAN, connected the pod to my network, logged in to the HP server, added the lightweight agent, connected the HP to the Scalent controller, and added the server to a managed group in about 15 minutes total. Then the Scalent appliance deployed OS instances to the VMware ESX 3.5 server in less than a minute.
Scalent also supports iSCSI boot from SAN using emBoot, which means that iSCSI boot from SAN doesn't require a specialized (and expensive) TOE network interface card. Using iSCSI rather than Fibre Channel is transparent: Management occurs in the same fashion in either case. You could even mix both in the same environment.
Scalent is not inexpensive. The company prices in packs of managed physical CPU sockets (regardless of the number of cores); for example six dual-socket systems or three quad-socket systems. Pricing is about $1,000 per physical socket. This applies only to managed systems, and you can run any number of VMs on each system. However, given the flexibility and control provided, the cost is definitely worth it. Any large datacenter, or networks with requirements for scalability or failover, or environments with rapidly changing requirements that make regular re-purposing of systems a necessity should consider Scalent.
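As a quick worked example of the per-socket pricing, the Python sketch below uses the approximate $1,000-per-socket figure quoted above. The function name and input format are my own, and real quotes may differ.

```python
PRICE_PER_SOCKET = 1_000  # approximate figure from this review, in US dollars


def managed_cost(systems):
    """Return (total sockets, approximate cost) for the managed systems.

    systems: iterable of (number_of_systems, sockets_per_system) pairs.
    Core counts and the number of VMs per system do not affect the price.
    """
    sockets = sum(count * per_system for count, per_system in systems)
    return sockets, sockets * PRICE_PER_SOCKET


six_dual = managed_cost([(6, 2)])    # six dual-socket systems
three_quad = managed_cost([(3, 4)])  # three quad-socket systems
```

Both example configurations come to 12 managed sockets, or roughly $12,000.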
Stratus Technologies Avance 1.3
Stratus Technologies Avance 1.3 is a rival to Marathon's everRun VM in that it provides automatic failover. Like everRun VM, Avance runs on Citrix XenServer. Stratus sells a customized version of XenServer as part of the product, however, hardened and modified to support Avance's failover services. Unlike Marathon, Stratus supports Linux as well as Windows instances.
Perhaps the most notable difference between the two offerings is that Avance does not provide continuous failover like everRun VM. In the event of a failure, there's a short interval while server instances are transferred from the primary to secondary system. In most cases, this results in a momentary lapse in service, but the client remains connected to the server. Thus, most clients will not require rebooting after the switchover. Transfer times were very short: under a second with Linux, and four to five seconds with Windows 2003 Server running Exchange.
Stratus offers two versions of the system: You can purchase software and install it on your own commodity server hardware, or you can opt for the ftServer, which is the Avance software pre-installed on dedicated Stratus hardware. For my test, Stratus provided the Avance software pre-installed on two Dell PowerEdge 1950 servers. However, the installation process is not onerous. It's a simple matter of booting from the installation CD and following the prompts to install the first node, then repeating on the second system. After the two systems have been installed, you log into the Avance management console to configure and register the Avance software.
With the basic setup complete, the next step is installing virtual instances of server operating systems. This is the same procedure followed for XenServer installations in general, including the requirement to install para-virtualization drivers for Windows Server installations, and the method of creating Linux repositories to hold various versions of Linux operating systems.
When you're finished with basic installation and configuration, protecting a server instance is automatic. Any VM running on either node has all of its virtual storage replicated on both physical servers. If a fault occurs, the Avance system migrates the instance to the other physical node without interrupting operation. This process will work if the server's connection to the production network is brought down, if a non-fatal hardware fault (such as one of the two power supplies failing) occurs, or if the server instance itself experiences some kind of fault. The system will not gracefully fail over in the event of a catastrophic hardware failure.
Because Avance does not rely on shared storage, it's substantially less expensive to implement than other solutions. To its credit, it supports the same range of guest operating systems that XenServer does, plus it provides quick recovery from most hardware problems and software faults, without interrupting services to clients for more than a couple of seconds, and without requiring client reboots.
Although Avance does not provide the utterly seamless failover that everRun VM does, it supports a much greater array of guest OSes, has commendable ease of use, boasts an excellent price, and will satisfy most administrators looking for a way to make their virtual applications fault tolerant -- as long as they are willing to run on XenServer.
Vizioncore vRanger Pro 3.2.4
Vizioncore offers a suite of products providing disaster recovery, management, provisioning, and migration solutions that work with VMware, Microsoft Hyper-V and Virtual PC 2004, and Virtual Iron. The Vizioncore offerings that fall under the HADR umbrella include vRanger Pro, vReplicator, and P2V-DR vRanger Pro -- and they currently only work with VMware.
vRanger Pro is analogous to a standard backup program on a physical server: It creates full or incremental backups of a guest operating system while the VM is running. These backups can be restored to a different VMware server if desired. P2V-DR vRanger Pro (which requires vRanger Pro) allows backups of physical servers to a VMware image, allowing a server to be re-created as a VMware guest OS. This would allow for one VMware server at a remote location to temporarily take the place of many separate physical production servers.
Finally, vReplicator -- which is a stand-alone product -- supports replicating server instances to a backup VMware server. All changes are synchronized and the replicas are ready to go, but failover is not automated.
I installed vRanger Pro 3.2.4 on a Windows 2003 Server system connected to a Fibre Channel array. It can be installed on any version of Windows from Windows 2000 SP1 onward with the .NET Framework. It supports versions of VMware from 2.12 on.
The initial configuration entails entering information such as the version of VMware console that you use, along with a console login and the administrative login (root, or equivalent) on each of the ESX servers you'll be backing up. To perform a LAN-free backup directly from the ESX server to a FC-connected storage device, you'll also need to install a VMware Consolidated Backup (VCB) plug-in.
Running backups or restores using the VCB functionality bypasses the LAN and moves data directly via Fibre Channel. This tack is much faster, so if you are using a version of ESX server that supports it and you have an FC infrastructure, it's really the way to go.
When vRanger is configured and ready to go, using it is just like using any other Windows application. You select a target ESX server to back up, then select LAN-free backup or VSS (volume shadow copy). You can back up running server applications such as Exchange or MS SQL Server; you may also opt to encrypt the data transfer, then run the backup. You can schedule backups to run as desired, run full backups weekly and incremental backups in between, automate e-mails when jobs complete or if there are problems -- all the sorts of things you'd expect from a backup program.
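The weekly-full, daily-incremental rotation mentioned above can be expressed in a few lines of Python. This is a generic sketch of the policy, not vRanger's scheduler; the choice of Sunday for the full backup and the sample dates are arbitrary assumptions.

```python
from datetime import date, timedelta

# date.weekday() numbers Monday as 0, so 6 is Sunday (an arbitrary choice here).
FULL_BACKUP_WEEKDAY = 6


def backup_kind(day, full_weekday=FULL_BACKUP_WEEKDAY):
    """Return 'full' on the chosen weekday and 'incremental' on all others."""
    return "full" if day.weekday() == full_weekday else "incremental"


# Plan one week of jobs starting from a sample date.
week = [date(2008, 6, 1) + timedelta(days=i) for i in range(7)]
plan = [(day.isoformat(), backup_kind(day)) for day in week]
```

Any seven consecutive days yield exactly one full backup and six incrementals.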
The process of backing up a physical server is similar to that of backing up a virtual instance, except that each physical server has to be added separately. Virtual instances are auto-discovered.
After a virtual server or physical server is backed up, restoring is a simple matter of selecting an ESX host to which to restore. You can make multiple .VMDK file backups, or restore the same image with different .VMDK files on different hosts. When restoring, you can select an incremental backup to add to the restore as well. When the restore is completed, vRanger will restore the VM configuration and can register the VM if it isn't already registered (on an ESX 2.0 server).
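To show how a full backup and its incrementals combine at restore time, here is a small Python sketch. It assumes each incremental depends on the ones before it, back to the most recent full backup; that dependency model, the function name, and the integer timestamps are illustrative assumptions rather than documented vRanger behavior.

```python
def restore_chain(backups, target):
    """Return the backups needed to restore to the given point in time.

    backups: list of (timestamp, kind) pairs, kind is 'full' or 'incremental'.
    The chain is the latest full backup at or before target, followed by
    every later incremental up to and including target.
    """
    chain = []
    for stamp, kind in sorted(backups):
        if stamp > target:
            break
        if kind == "full":
            chain = [(stamp, kind)]  # a newer full supersedes the older chain
        elif chain:
            chain.append((stamp, kind))
    return chain


history = [
    (1, "full"), (2, "incremental"), (3, "incremental"),
    (8, "full"), (9, "incremental"),
]
midweek = restore_chain(history, target=3)
latest = restore_chain(history, target=9)
```

Restoring to day 3 needs the first full backup plus two incrementals, while restoring to day 9 needs only the second full backup and one incremental.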
As with backups, restores can also be scheduled to run later if desired. Reporting has all the features you'd expect from a backup program: tracking completed backups, failures, errors, times and so forth.
Vizioncore has also partnered with Data Domain to ensure that vRanger Pro is compatible with the Data Domain data deduplication appliances. If multiple backups of different instances of the same guest OS are made, using deduplication would result in vastly reduced storage requirements.
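The storage savings from deduplication can be demonstrated with a toy Python example. It uses fixed-size blocks and SHA-256 hashes purely for illustration; this is not how any particular appliance works, and the tiny block size and sample data are assumptions chosen to keep the example readable.

```python
import hashlib


def dedup_sizes(backups, block_size=4):
    """Return (raw bytes, stored bytes) when identical blocks are kept once."""
    store = {}  # block hash -> block contents, each unique block stored once
    raw = 0
    for data in backups:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            raw += len(block)
            store.setdefault(hashlib.sha256(block).hexdigest(), block)
    stored = sum(len(block) for block in store.values())
    return raw, stored


# Two backups of the "same guest OS": shared blocks AAAA and BBBB, one unique each.
raw, stored = dedup_sizes([b"AAAABBBBCCCC", b"AAAABBBBDDDD"])
```

The two backups total 24 bytes, but only 16 bytes need to be stored, because the shared blocks are kept once. The more instances share the same OS files, the larger the saving.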
vRanger offers a strong disaster recovery system for VMware users. It offers all the features administrators would expect to find in a backup program, but it's tailored for server instances running on VMware. The P2V option is especially attractive in a disaster recovery scenario, as production servers can be backed up and then brought up in an emergency as VMware instances.
Narrowing Your Choices
None of these products directly competes with one another; rather, they all complement each other and the virtualization platforms they support. It's possible that an enterprise might implement all five of these products in different applications. This review is intended to show the kinds of functionality available rather than serving as an all-inclusive guide -- there are many competitors for each of the products reviewed.
Even the two products closest in functionality, everRun and Avance, have differing approaches to high availability, and some organizations might use both in different applications. everRun's true continuous availability provides the level of availability necessary for the most critical applications, whereas Avance offers an easier configuration and lower pricing.
SANmelody (and other storage systems with the same kinds of functionality) allows for a simpler environment and easier management of VMs distributed across multiple virtualization servers. Scalent takes this simplification a step further and essentially virtualizes the entire underlying infrastructure that supports virtualization -- network, storage, and VMware management, all in a single integrated platform. vRanger Pro offers recovery tools for disaster recovery rather than high availability, enabling backups of VMs that can be restored locally or at a remote datacenter to continue operations after a disaster.