Guide to Storage Virtualization

Seven things IT personnel should know about storage

Avoid performance problems with these new tricks

By Beth Schultz, Network World, 06/06/07

Thanks to virtualization and a host of other technologies, storage has left its silo. Its performance affects the whole computing shebang. Fortunately, new technologies that cross the boundaries of storage, management and compliance are smoothing over performance issues and easing the pain (and expense).

But you've got to be in the know to make use of them. Here are seven storage truths that every IT person should understand.

Optimizing storage isn't about buying new stuff, says Mark Diamond, CEO at storage-consulting firm Contoural. It's about determining whether the data you've created is stored in the right place. This discussion goes beyond the basic concept of using inexpensive disk to store data and delves into how the disk is configured, especially when it comes to replication and mirroring.

"We typically see that 60% of the data is overprotected and overspent, while 10% of the data is underprotected -- and therefore not in compliance with SLAs [service-level agreements]," Diamond says. "Often, we can dramatically change the cost structure of how customers store data and their SLAs, using the same disk but just configuring it differently for each class of data."

Users who get great performance out of their storage-area networks (SANs) have discovered application-centered monitoring for storage performance.

For instance, the Affinion Group is testing a combination of Onaro's Application Insight and SANscreen Foundation monitoring tools. "We could be alerted in real time of any performance spikes and hopefully be informed of any issues that could cause an outage, before someone calls from the business line," says storage specialist Raul Robledo. "We wouldn't need to get inquiries or notification from individuals. We would be getting those right from a product that's monitoring our environment."

A host of other products have entered the category of storage optimization, too.

Storage isn't the biggest energy hog in the data center, but new technologies can still help cut back on its power consumption by as much as 20%, users say. Even using storage space more efficiently can cut down on wasted capacity, experts say. This means spending less on storage in the long run.

At San Diego Supercomputer Center, Don Thorp, manager of operations, looked to Copan Systems, one of a handful of relatively new, smaller green storage vendors. He reports that storage consumption is down by 10% to 20% since switching to Copan Systems last July.

Many more such vendors are entering the market.

Over the last several years, numerous vendors have taken backups from boring to remarkable by rolling out fancy backup-management tools.

Spun off from the broader storage-resource management market, these tools, of course, monitor and report on backups of products from multiple vendors. But they also give IT administrators an at-a-glance picture from a single console, in real time and historically. They can ease the auditing process, help create chargeback programs and verify internal service-level agreements for backups.

Heterogeneous backup-management tools are available from various niche vendors and the mainstream storage biggies.

Just ask the University of Florida College of Veterinary Medicine (UFCVM). Over the last six months, the college has been putting its 7TB storage area network through its paces, using it for nearline backup and primary storage.

UFCVM relies on Storage Virtualization Manager (SVM), a virtualization appliance from StoreAge Networking Technologies, now owned by LSI. The SAN setup reduced backup times by half, and the project came in under budget, says Sommer Sharp, systems programmer for the college in Gainesville, Fla.

Provisioning is a painless matter of moving volumes to any server that needs it, so live data can be managed as easily as backups.

Recent surveys show that, on average, U.S. companies face 305 lawsuits at any one time. With each lawsuit comes the obligation for discovery -- production of evidence for presentation to the other side in a legal dispute. With 95% of all business communications created and stored electronically, that puts a heavy burden on IT to perform e-discovery, finding electronically stored information.

In the U.S. court system, the onus of e-discovery took on new weight on Dec. 1, 2006, when amendments to the Federal Rules of Civil Procedure (FRCP) took effect. "With the amendments to the FRCP, the courts are saying, 'We know the technology exists to do this stuff. We want to see you take some reasonable steps to put processes and technologies together to do e-discovery. And if you don't, we're really going to hold you accountable for it,'" says Barry Murphy, principal analyst at Forrester Research.

He cites the recent case of Morgan Stanley vs. Ronald Perelman, in which Morgan Stanley was hit with a $1.57 billion jury verdict, which hinged primarily on the company's lax e-discovery procedures.

The Open Grid Forum, a standards organization focused on Grid Computing, is working on a variety of standards for the compute, network and storage infrastructure, all the way from describing jobs to being able to move and manage data, says Mark Linesch, who heads the organization.

Work is progressing around defining a grid file system and naming schemes, and developing a storage resource manager for grids. The group is collaborating with other standards bodies like the Distributed Management Task Force and the Storage Networking Industry Association.

The ultimate goal is to enable proprietary storage vendors to make their gear interoperable.


Five tips for buying storage virtualization products

By Deni Connor, Network World 10/1/07

Storage virtualization -- the logical abstraction of data from its physical devices -- is a platform that enables a lot of different storage services. By virtualizing storage into a common resource pool, companies can gain better management capability, higher utilization of storage and improved migration, replication and storage provisioning capabilities. Despite these advantages, however, storage virtualization products can be costly to buy and complex to implement.

Once you've decided that storage virtualization is indeed right for your network, consider these tips for buying virtualization gear and software.

While storage virtualization can bring better utilization of storage resources and the ability to migrate data between different tiers of storage, you need to look at those capabilities and decide whether or not you need to deploy storage virtualization to achieve them. If, for instance, you are only buying virtualization to determine the utilization of your storage arrays, there is other software that can do that.

Management of a single homogeneous environment may be easier with software supplied by the vendor for its storage platforms. Management of diverse storage platforms such as those from IBM, EMC and Hitachi may be simplified by storage virtualization since all data from the different devices would be aggregated into a single pool that can be managed from one management console.

Storage virtualization comes in three forms: host-based, network- or fabric-based, and array-based.

Host-based virtualization has been around for a number of years and is relatively inexpensive compared with network- or fabric-based and array-based virtualization. Host-based virtualization is characterized by Symantec's Veritas Storage Foundation or Brocade's Tapestry StorageX. It is, however, often plagued by a lack of scalability. As a storage virtualization environment grows, more servers are needed to host virtualization, and each server requires its own operating system and host virtualization license, plus maintenance and software overhead.

In-network or fabric-based virtualization is getting a lot of interest lately because it allows the attachment of any variety of host computer and almost any vendor's storage array. Network or fabric-based virtualization is represented by IBM's SAN Volume Controller and EMC's Invista. In network or fabric-based virtualization you need to decide whether to adopt an in-band or out-of-band configuration.

Controller or array-based virtualization is characterized by Hitachi Data Systems' TagmaStore Universal Storage Platform. Controller-based virtualization does not require you to insert another appliance into the mix, but does assume extra work for the controller.

Storage virtualization requires that both additional hardware and software be added to the network. Host-based virtualization often requires software drivers on all host computers that attach to various storage devices. Fabric-based or in-network virtualization requires an appliance that attaches to the Fibre Channel switches. Array-based virtualization may require the addition of a separate storage array to virtualize the storage resources.

You also need to look at any changes to your network that virtualization requires. For example, that may come in the form of another brand of Fibre Channel switch or a pair of servers you ordinarily wouldn't need to have.

You need to decide whether you are going to be virtualizing storage devices from one vendor or those from two or more storage vendors. Some forms of virtualization are limited to only those storage devices of their own manufacture. Others, like Hitachi's TagmaStore USP, can virtualize data from a variety of vendors' storage arrays.

If, for instance, you want to migrate data between different storage platforms from a single vendor, you may not need storage virtualization at all, as migration software from that vendor might be a cheaper means to the same end. If you want to migrate data from several platforms to a single one or migrate data stored on storage arrays from diverse vendors to a single system, then pooling that data with storage virtualization may make migration an easier and much less complex process.

In choosing the type of virtualization to deploy, you need to watch out for vendor lock-in. Some array-based virtualization, such as that from Hitachi Data Systems, requires you to add a Hitachi storage array to the network in order to virtualize any data. You also need to be sure that the functions you need to perform -- such as initial data migration -- do not require you to switch to the vendor's own migration product. Locking yourself into a vendor's replication, migration or management capabilities can be a costly affair.

How to do storage virtualization right

By Galen Gruman, CIO, 09/11/07

When Roland Etcheverry joined chemical company Champion Technologies two years ago, he looked around and realized he needed to remake the company's storage environment. He had done this twice before at other companies, so he knew he wanted a storage-area network (SAN) to tie the various locations to the corporate data center, as well as to a separate disaster recovery site, each with about 7TB of capacity. He also knew he wanted to utilize storage virtualization.

At its most basic, storage virtualization makes scores of separate hard drives look to be one big storage pool. IT staffers spend less time managing storage devices, since some chores can be centralized. Virtualization also increases the efficiency of storage, letting files be stored wherever there is room, rather than have some drives go underutilized. And IT can add or replace drives without requiring downtime to reconfigure the network and affected servers: The virtualization software does that for you. Backup and mirroring are also much faster because only changed data needs to be copied; this eliminates the need for scheduled storage management downtime, Etcheverry notes.
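The "one big storage pool" idea can be sketched in a few lines of Python. This is an illustrative model only, not how any product named in this article works: the class name StoragePool, the drive names and the simple end-to-end concatenation of drives are all assumptions made for the example.

```python
from bisect import bisect_right
from itertools import accumulate


class StoragePool:
    """Presents several drives as one contiguous range of block addresses.

    Hypothetical model: drives are simply laid end to end.
    """

    def __init__(self, drives):
        # drives: list of (name, capacity_in_blocks) pairs; names are made up
        self.drives = list(drives)
        # Running totals mark where each drive's range ends inside the pool.
        self.ends = list(accumulate(cap for _, cap in self.drives))

    @property
    def capacity(self):
        return self.ends[-1] if self.ends else 0

    def locate(self, virtual_block):
        """Translate a pool-wide block number to (drive name, block on that drive)."""
        if not 0 <= virtual_block < self.capacity:
            raise IndexError("block outside the pool")
        i = bisect_right(self.ends, virtual_block)
        start = self.ends[i - 1] if i else 0
        return self.drives[i][0], virtual_block - start

    def add_drive(self, name, capacity):
        """Grow the pool; existing virtual addresses keep their mapping."""
        self.drives.append((name, capacity))
        self.ends.append(self.capacity + capacity)


pool = StoragePool([("drive-a", 100), ("drive-b", 50)])
print(pool.locate(120))      # falls on the second drive
pool.add_drive("drive-c", 25)
print(pool.capacity)
```

Servers see only pool-wide block numbers, so adding a drive changes the pool's capacity without changing any address a server already uses. That is the property behind adding or replacing drives without reconfiguring the affected servers.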

Better yet, he will save money on future storage needs, because his FalconStor storage management software combines drives from multiple vendors as if they were one virtual drive, letting Etcheverry avoid getting locked in to the expensive, proprietary drives that array-based storage systems often require.

Although storage virtualization technology is fairly new, it's quickly gaining traction in the enterprise. In 2006, 20% of 1,017 companies surveyed by Forrester Research had adopted storage virtualization. By 2009, 50% of those enterprises expect to. And the percentages are even higher for companies with 20,000 or more employees, the survey notes: 34% of such firms had deployed storage virtualization in 2006, and that will climb to 67% by 2009.

But storage virtualization requires a clear strategy, Etcheverry says. "A lot of people don't think much about storage, so they don't do the planning that can save costs," he says. Because storage virtualization is a very different approach to managing data, those who don't think it through may miss several of the technology's key productivity and cost-savings advantages, concurs Nik Simpson, a storage analyst at the Burton Group.

Strategically, storage virtualization brings the most value to resource-intensive storage management chores meant to protect data and keep it available in demanding environments. These chores include the following: replication to keep distributed databases synchronized; mirroring to keep a redundant copy of data available for use in case the primary copy becomes unavailable; backup to keep both current and historical data available in case it gets deleted but is needed later; and snapshots to copy the original portions of changed data and make it easier to go back to the original version. All these activities have become harder to accomplish using traditional storage management techniques as data volumes surge and time for backup chores decreases.
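The snapshot behavior described above, copying the original portions of changed data so the earlier version can be recovered, is commonly implemented as copy-on-write. The following Python sketch shows the idea under simplifying assumptions: the Volume class, its in-memory block list and its method names are hypothetical and do not represent any vendor's implementation.

```python
class Volume:
    """A toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # A new snapshot starts empty: nothing is copied until a block changes.
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, save the original block into every snapshot
        # that has not already saved its own copy of that block.
        for snap in self.snapshots:
            snap.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap, index):
        # Unchanged blocks are read from the live volume; changed ones
        # come from the copy preserved at first overwrite.
        return snap.get(index, self.blocks[index])


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
print(vol.blocks)                                      # live data
print([vol.read_snapshot(snap, i) for i in range(3)])  # data as of the snapshot
```

The snapshot holds only the blocks that changed after it was taken, which is why it is cheap to create and small to store.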

Because storage virtualization technology used for these purposes copies just the individual parts of changed data, not entire files or even drive volumes as in traditional host-based storage architectures, these data-protection activities are faster and tax the network less. "You end up transferring 40% or 50% less, depending on the data you have," says Ashish Nadkarni, a principal consultant at the storage consultancy GlassHouse Technologies.
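A rough sketch of copying only the changed parts of the data follows. It assumes fixed-size blocks compared by SHA-256 digest, which is one common way to detect changes and not necessarily what the products discussed here do; the function names and the tiny 4-byte block size are chosen purely for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, to keep the example readable


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old, new, size=BLOCK_SIZE):
    """Return {block index: new contents} for blocks that differ from old."""
    old_digests = [hashlib.sha256(b).digest() for b in split_blocks(old, size)]
    delta = {}
    for i, block in enumerate(split_blocks(new, size)):
        if i >= len(old_digests) or hashlib.sha256(block).digest() != old_digests[i]:
            delta[i] = block
    return delta


def apply_delta(replica, delta, new_length, size=BLOCK_SIZE):
    """Bring a replica up to date using only the changed blocks."""
    blocks = split_blocks(replica, size)
    for i, block in sorted(delta.items()):
        if i < len(blocks):
            blocks[i] = block
        else:
            blocks.append(block)
    # Trim in case the new version is shorter than the old one.
    return b"".join(blocks)[:new_length]


old = b"aaaabbbbcccc"
new = b"aaaaBBBBcccc"
delta = changed_blocks(old, new)
print(delta)                                   # only the middle block travels
print(apply_delta(old, delta, len(new)) == new)
```

In this example one block out of three crosses the network. How much is saved in practice depends on how the data changes, as Nadkarni's estimate suggests.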

This efficiency lets a CIO contemplate continuous backup and replication, and enables quick moves to new equipment in case of hardware failure. "We can add new storage as needed and have data transferred in the background, without the users even knowing," says Ryan Engh, IT infrastructure manager at the investment firm Wasatch Advisors, which uses DataCore's virtualization software.

Another advantage: "This prevents the states of the disaster recovery site and the production site from pulling apart," he says, a common problem in a traditional environment where the two data sets are usually out of synch because of the long replication times needed.

Moreover, the distributed nature of the data storage gives IT great flexibility in how data is stored, says Chris Walls, president of IT services at the healthcare data management firm PHNS, which uses IBM's virtualization controller. "That control layer gives you the flexibility to put your data in a remote site, or even in multiple sites," he says, all invisible to users.

Understanding these capabilities, a CIO could thus introduce 24/7 availability and disaster recovery, perhaps as part of a global expansion strategy. That is precisely what Etcheverry is doing at Champion. "We now have a zero-window backup, and I can rebuild a drive image in almost real-time," he says.

Some enterprises have gained additional advantage from storage virtualization by combining it with an older technology called thin provisioning that fools a drive into thinking it has more capacity than it has; this is done typically to create one standard user volume configuration across all drives, so when you replace drives with larger ones, IT staff does not have to change the user-facing storage structure. By adding storage virtualization, these standardized, thin-provisioned volumes can exceed the physical limit of any drive; the excess is simply stored on another drive, without the user knowing. "This really eases configuration," says Wasatch's Engh. That also reduces IT's need to monitor individual drive usage; the virtualization software or appliance just gets more capacity where it can find it.

For example, Epilepsy Project, a research group at the University of California at San Francisco, uses thin provisioning, coupled with Network Appliance's storage virtualization appliance. The project's analysis applications generate hundreds of gigabytes of temporary data while crunching the numbers. Rather than give every researcher the Windows maximum of 2TB of storage capacity for this occasional use, CIO Michael Williams gives each one about a quarter of that physical space, then uses thin provisioning. The appliance allocates the extra space for the analysis applications' temporary data only when it's really needed, essentially juggling the storage space among the researchers.
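The juggling described above can be modeled in a short Python sketch. It assumes a shared pool that hands out physical blocks only when a volume is first written to; the ThinPool and ThinVolume classes and the block counts are hypothetical and are not a description of the Network Appliance product.

```python
class ThinPool:
    """Shared physical capacity behind several thin-provisioned volumes."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.used = 0

    def create_volume(self, advertised_blocks):
        # The advertised size is a promise, not a reservation.
        return ThinVolume(self, advertised_blocks)


class ThinVolume:
    def __init__(self, pool, advertised_blocks):
        self.pool = pool
        self.advertised_blocks = advertised_blocks
        self.data = {}  # only blocks that have been written occupy space

    def write(self, index, payload):
        if not 0 <= index < self.advertised_blocks:
            raise IndexError("block outside the advertised volume size")
        if index not in self.data:
            # First write to this block: claim physical space now.
            if self.pool.used >= self.pool.physical_blocks:
                raise RuntimeError("pool out of physical space")
            self.pool.used += 1
        self.data[index] = payload

    def discard(self, index):
        """Release a block, for example when temporary data is deleted."""
        if index in self.data:
            del self.data[index]
            self.pool.used -= 1


pool = ThinPool(physical_blocks=4)
alice = pool.create_volume(advertised_blocks=8)
bob = pool.create_volume(advertised_blocks=8)
alice.write(0, b"scan")
bob.write(7, b"temp")
print(pool.used, "of", pool.physical_blocks, "physical blocks in use")
```

The two volumes together advertise 16 blocks against 4 physical ones. The arrangement works as long as the users do not all fill their volumes at once, which is why the pool still has to be monitored even though individual drives do not.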

Storage virtualization comes in several forms, starting with the most established, array-based virtualization. Here, a vendor provides an expandable array, to which that vendor's drives can be added; management software virtualizes the drives so they appear as a common pool of data. You're typically locked in to one vendor's hardware but don't have to worry about finger-pointing among vendors if something goes wrong, says Forrester Research analyst Andrew Reichman.

Providers of such arrays include Compellent, EMC, Hewlett-Packard, Hitachi Data Systems, Network Appliance (NetApp), Sun and Xiotech. Reichman notes that several such array-based virtualization products, including those from Hitachi (also sold by HP and Sun) and NetApp, also support third-party storage arrays. The Hitachi array is "the only option for the high end," he says, while the others are designed for relatively small storage systems of less than 75TB.

The newer option, network-based storage virtualization, uses software or a network appliance to manage a variety of disk drives and other storage media. The media can come from multiple vendors, typically allowing for the purchase of lower-cost drives than the all-from-one-vendor options. This lets you use cheaper drives for non-mission-critical storage needs and allows you to reuse at least some storage you've accumulated over the years through mergers and acquisitions, says Ashish Nadkarni, a principal consultant at the IT infrastructure consulting and services company GlassHouse Technologies.

Providers of such network-based storage virtualization (often as a component of the SAN offering) include BlueArc, DataCore Software, EqualLogic, FalconStor Software, IBM, Incipient, iQstor and LSI. Current offerings tend to be for medium-size environments of less than 150TB, notes Forrester's Reichman.

Storage virtualization's newfound flexibility and control do have risks. "The flexibility can be your worst enemy. It's like giving razor blades to a child," says Wasatch's Engh. The issue that storage virtualization introduces is complexity.

Although the tools keep track of where the files' various bits really are, IT staff not used to having the data distributed over various media might manage the disks the old-fashioned way, copying volumes with partial files rather than copying the files themselves for backup. Or when setting up virtualized storage networks, they might accidentally mix lower-performance drives into high-performance virtual servers, hindering overall performance in mission-critical applications, notes GlassHouse's Nadkarni.

Virtualization tools aren't hard to use, but it's hard for storage engineers to stop thinking about data from a physical point of view, says PHNS's Walls. "Everything you thought you knew about storage management you need to not bring to the party," he adds.

Another issue is choosing the right form of storage virtualization, network-based or array-based. The network-based virtualization technology is delivered via server-based software, a network appliance, or an intelligent Fibre Channel switch, and it comes in two flavors: block-level and file-level. Array-based virtualization is typically provided as part of the storage management software that comes with an array.

Array-based virtualization is mature, says Burton Group's Simpson. But it's limited to storage attached directly to the array or allocated just to that array via a SAN; IT usually must buy array storage from the array vendor, creating expensive vendor lock-in.

Network-based storage virtualization has been in existence just a few years and so has largely been offered by startups. It's the most flexible form of storage virtualization, says Forrester's Andrew Reichman, and lets you manage almost all your storage resources, even offsite, as long as they are available via the SAN. Although these tools can theoretically act as a choke point on your SAN, in practice the vendors are good at preventing that problem, he notes.

Most network-based storage virtualization products work at the block level, meaning they deal with groups of bits rather than whole files. While block-level network-based storage virtualization is the most flexible option, the technology typically requires that an enterprise change its storage network switches and other network devices to ones that are compatible, Nadkarni notes. "But no one wants to shut down their SAN to do so," he says. Although you can add the technology incrementally, that just raises the complexity, since you now have some virtualized storage and some nonvirtualized storage, all of which need to be managed in parallel. 

Thus, most organizations should consider adopting network-based storage virtualization as part of a greater storage reengineering effort, he advises.

That's exactly what both Champion's Etcheverry and PHNS's Walls did. Etcheverry brought virtualization in as part of an enterprisewide storage redesign, while Walls brought it in as part of adding a new data center and disaster recovery site.

In both cases, all the setup work happened in a nonproduction environment and could be tested thoroughly without affecting users. Once the two IT leaders were happy with their new systems, they then transferred the data over and brought them online. That meant there was only a single disruption to the storage environment that users noticed. "This was a one-time event," Walls notes.


Vendors find many ways to make storage virtualization work

By Deni Connor, Network World 10/1/07

This approach enables controller-based virtualization products to attach to standard Fibre Channel storage arrays and look at their data in the same manner as their own internal hard disks.


In array-based virtualization such as that with the HP Enterprise Virtual Array, EMC Symmetrix DMX or 3Par, virtualization is enabled by the creation of virtual LUNs (logical attachments between server and storage). While they do not provide the virtualization nirvana -- that of working with heterogeneous storage devices -- they do offer flexibility in managing LUNs.

