Files already compressed by Microsoft Office applications, or stored in popular image formats such as JPEG, often can't be reduced further by common compression techniques and may even grow in size. Neuxpower Solutions Ltd. claims that its software can shrink Office and JPEG files by as much as 95% without loss of image quality by removing unnecessary information, such as metadata or details that can't be seen unless the image is enlarged. Ocarina, which is being acquired by Dell, says its products offer similar capabilities because they use multiple optimization algorithms tuned for different types of content and can test and choose among various compression methods for the best runtime efficiency.
Deduplication and compression are complementary. "Use compression when the primary focus is on speed, performance, transfer rates. Use deduplication where there is a high degree of redundant data and you want higher space savings," says Schulz.
3. Policy-Based Tiering
Policy-based tiering is the process of moving data to different classes of storage based on criteria such as its age, how often it is accessed or the speed at which it must be available (see "The Politics of Storage"). Unless the policy calls for the outright deletion of unneeded data, this technique won't reduce your overall storage needs, but it can trim costs by moving some data to less expensive, but slower, media.
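In its simplest form, an age-based tiering policy is just a periodic sweep that relocates cold files to cheaper media. A minimal sketch, in which the 90-day threshold and the directory layout are purely illustrative assumptions, not any vendor's implementation:

```python
import shutil
import time
from pathlib import Path

AGE_DAYS = 90  # policy threshold (assumption for illustration)

def tier_old_files(primary: Path, archive: Path, age_days: int = AGE_DAYS):
    """Move files not modified within `age_days` from fast primary
    storage to a cheaper archive tier, preserving relative paths."""
    cutoff = time.time() - age_days * 86400
    moved = []
    for f in primary.rglob("*"):
        if f.is_file() and f.stat().st_mtime < cutoff:
            dest = archive / f.relative_to(primary)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(f), str(dest))
            moved.append(dest)
    return moved
```

Commercial products layer more criteria on top (access frequency, required retrieval speed, outright deletion), but the mechanism is the same: classify, then relocate.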
Vendors in this market include Hewlett-Packard Co., which offers built-in policy management and automated file migration in its StorageWorks X9000, and DataGlobal GmbH, which says that its unified storage and information management software enables customers to analyze and manage unstructured files and other information and thereby reduce their storage needs by 60% to 70% for e-mail and about 20% for file servers.
Other products with tiering capabilities include Storage Center 5 from Compellent Technologies, HotZone and SafeCache from FalconStor, Policy Advisor from 3Par, EMC's FAST and F5 Networks' ARX series of file virtualization appliances.
4. Storage Virtualization
As is the case with server virtualization, storage virtualization involves "abstracting" multiple storage devices into a single pool of storage, allowing administrators to move data among tiers as needed. Many experts view it as an enabling technology rather than a data reducer, per se, but others see a more direct connection to data reduction.
Actifio Inc.'s data management systems use virtualization to eliminate the need for multiple applications for functions such as backups and disaster recovery. Its appliances let customers choose service-level agreements governing the management of various data sets from a series of templates.
With this method, the proper management policies are then applied to a single copy of the data, defining where, for example, it is stored and how it is deduplicated during functions such as backup and replication. Company co-founder and CEO Ash Ashutosh claims that Actifio can cut storage needs 75% to 90%.
5. Thin Provisioning
Thin provisioning means setting up an application server to use a certain amount of space on a drive without consuming that space until it is actually needed. As with policy-based tiering, this technique doesn't cut the total data footprint, but it delays the need to buy more drives until absolutely necessary.
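The same "promise now, allocate later" idea can be seen at the filesystem level with a sparse file: its logical size is set up front, but physical blocks are allocated only when data is written. A minimal sketch; how little physical space is actually consumed depends on the filesystem (ext4, XFS and tmpfs all keep it near zero):

```python
import os
import tempfile

def thin_provision(path: str, logical_bytes: int) -> None:
    """Create a file with a large logical size but no data written,
    so the filesystem allocates almost no physical blocks for it."""
    with open(path, "wb") as f:
        f.truncate(logical_bytes)  # extends the size without writing blocks

path = os.path.join(tempfile.mkdtemp(), "volume.img")
thin_provision(path, 10 * 1024**3)  # a 10 GiB logical "volume"

st = os.stat(path)
print("logical bytes: ", st.st_size)
print("physical bytes:", st.st_blocks * 512)  # near zero on most filesystems
```

Storage arrays do the equivalent at the LUN level, which is why monitoring tools are essential: the logical commitments can far exceed the physical capacity behind them.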
If storage needs increase rapidly, you must "react very, very quickly" to ensure that you have enough physical storage, says Allen. The more unpredictable your needs, the better measurement and management tools you need if you adopt thin provisioning. Schulz advises looking for products that identify both the data and applications users need to track, and that monitor not only space usage but read/write operations to prevent bottlenecks.
One of the vendors in this market is IBM, which has extended thin provisioning "into all our storage controllers," says Balog. HP, which provides thin provisioning on its P4000 SANs, is set to acquire 3Par, which guarantees that its Utility Storage product will reduce customers' storage needs by 50%. Nexsan provides thin provisioning with its SATABeast arrays.
Before choosing a data reduction strategy, set policies to help make tough choices about when to pay for performance and when to save money by cutting your data footprint. Don't focus only on reduction ratios, Schulz says, but remember that you might get more savings with a lower reduction rate on a larger data set.
And don't be confused by vendor terminology. Compression, data deduplication, "change-only" backups and single instancing are all different ways of reducing redundant data. When in doubt, choose your storage reduction tools based on their business benefits and a detailed analysis of your data.
Which Dedupe Is Right for You?
There are deduplication systems to meet many different needs, depending on the organization's reduction goals and system setup. Here's a sampling:
* Nexsan provides postprocessing deduplication for primary and archive data with its Assureon system, and for backup data with its DeDupe SG offering. DeDupe SG is based on FalconStor's deduplication software engine File-interface Deduplication System, or FDS. Combined with single instancing of data, this provides typical reduction ratios of 5:1 to 15:1, says Randy Chalfant, vice president of strategy at Nexsan.
* EMC Data Domain deduplication storage systems are for customers who want to keep their existing backup software but move from tape to disk for backup, says Shane Jackson, senior director of product marketing for EMC's backup recovery systems division. Data Domain supports both structured and unstructured data, with variable-length block deduplication, achieving reductions of 10:1 to 30:1, he says. EMC's Avamar provides source-based backup software with global deduplication, providing 30:1 to 40:1 reductions, says Philip Fote, marketing manager for the backup recovery systems division.
* Ocarina provides sub-file-level deduplication and compression of unstructured data. Its storage optimizers read data from network-attached storage, deduplicate it, compress it and write the optimized files on either the original NAS or a different storage tier. It optimizes the layout based on characteristics such as block sizes, caching strategies and metadata layout for each storage platform, says Greg Schulz, senior analyst at The Server and StorageIO Group. Ocarina is well suited for unstructured data that may not be "handled as efficiently by dedupe alone," says Schulz. Ocarina also resells its technology to vendors such as BlueArc Corp.
* HP's StoreOnce deduplication software currently runs on HP StorageWorks D2D Backup Systems and compresses data before deduplication, for reductions of up to 20:1. Deploying it across more platforms in the future will avoid the problems caused by using multiple deduplication products, says Lee Johns, marketing director for unified storage products in HP's StorageWorks division. He says HP also plans to use StoreOnce to reduce primary storage in high-availability server clusters.
* Symantec Corp.'s forthcoming VirtualStore is designed to reduce storage requirements for virtual machines and the data associated with them by 80% -- especially for virtual desktop implementations. Among other things, it updates only the changes between the "parent" virtual machine and any clones and provides thin provisioning and tiering. VirtualStore will be available in November; future releases will have deduplication capabilities, according to Symantec.
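Under all the product variations above, block-level deduplication comes down to the same mechanism: fingerprint each block, store each unique block once, and keep a per-file "recipe" of fingerprints for reconstruction. A simplified fixed-block sketch (real products, as noted, often use variable-length blocks and much more elaborate indexes):

```python
import hashlib

BLOCK = 4096  # fixed block size; a simplification for illustration

def dedupe(data: bytes, store: dict) -> list:
    """Split data into fixed-size blocks, keep each unique block once in
    `store` keyed by its SHA-256 digest, and return the recipe of digests."""
    recipe = []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # unique blocks are stored only once
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its recipe of block digests."""
    return b"".join(store[d] for d in recipe)

store = {}
data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096  # three identical "A" blocks
recipe = dedupe(data, store)
print(len(data), sum(len(b) for b in store.values()))  # 16384 vs 8192
```

Here 16 KB of input with three identical blocks is stored as two unique blocks, a 2:1 reduction; the reduction ratios the vendors quote are this same effect measured across far more redundant real-world data sets.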
-- Robert L. Scheier
Scheier is a freelance writer in Swampscott, Mass. Contact him at firstname.lastname@example.org.
This story, "5 Ways to Cut Your Storage Footprint" was originally published by Computerworld.