A Crazy Data Backup Scheme That Works

I can think of a lot of words to describe small-business data protection -- "painful" and "expensive," to name two. But "cool" and "sexy"? Generally not.

Those last terms apply, however, to Symform's distributed peer-to-peer online backup architecture. This wildly innovative approach has the potential to reduce the pain of small-business backup, provided it can win enough hearts and minds to stand the test of time.

Online backup in general beats the usual DIY slog, which is why more and more small businesses are taking the online route. For a business with no dedicated IT staff, even the simple act of ensuring that backup tapes have been changed is an error-prone hassle. Unfortunately, the online option has a big downside: If you're pushing more than a few gigabytes, it can get very expensive very quickly.

Online data backup and its discontents
Most online backup schemes ship your data off to the backup provider's data center, where it is stored and hopefully mirrored to a secondary data center. Given that the backup provider has to store and mirror all of your data (as well as all data from other customers), the biggest cost in managing your backups is in providing rock-solid back-end storage. And that's reflected in the dreaded per-gigabyte fee.

These recurring costs start at 50 cents per gigabyte per month and go up from there. If you're trying to protect just a few small mission-critical data sets, that's not so bad. But if you intend to use online backup as a means to protect your entire infrastructure, even a relatively modest Microsoft Small Business Server environment could cost upward of $300 a month to protect. Compared to the long-term bill of maintaining your own local backup, this is not exactly a cost-saving option.
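To put rough numbers on that, here is a minimal Python sketch of the per-gigabyte arithmetic. The 50-cent rate is the figure quoted above; the 600GB data size is an assumption, chosen only because it is what a $300 monthly bill implies at that rate, and the function name is made up for illustration.

```python
def monthly_backup_cost(data_gb, rate_per_gb=0.50):
    """Monthly bill in dollars under simple per-gigabyte pricing."""
    return data_gb * rate_per_gb


# Assumed data size: roughly what a $300/month bill implies at 50 cents/GB.
assumed_data_gb = 600
monthly = monthly_backup_cost(assumed_data_gb)
yearly = 12 * monthly
print(f"{assumed_data_gb}GB costs ${monthly:,.2f} a month, ${yearly:,.2f} a year")
```

Real providers may add tiers, minimums, or restore fees, so treat this as a floor rather than a quote.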

Symform's answer to storage backup
Here's where Symform's backup architecture changes the game. Instead of storing your data, Symform uses storage capacity provided by you and other customers like you. You pay a flat fee per month for each data source you want to back up -- and you can back up as much data as your Internet pipe can handle -- provided you offer an equal amount of storage to other Symform users. Removing the per-gigabyte fee from the mix suddenly makes online backup seem like a deal.

If you're like most people, you just said to yourself, "Wait, I'm supposed to ship my data off to a bunch of people I don't know and feel good about it? Get real." In fact, just about everyone I've talked to about Symform's backup methodology has had that sort of reaction. How can using a bunch of random companies spread across the world possibly result in a highly available and secure home for my backups?

So here's the cool and sexy part: Symform protects your data through a redundancy mechanism it calls RAID-96. First, your backup data is hacked into 64MB blocks. These blocks are then encrypted using the industry-standard AES-256 encryption algorithm. Each encrypted block is further chopped into 64 blocks of 1MB each. From those 64 blocks, 32 more parity blocks are computed, allowing you to reconstitute the original 64 1MB blocks from any 64 of the 96 resulting blocks. From there, the 96 1MB blocks are shipped out to 96 other nodes in Symform's Cooperative Storage Cloud.
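The "any 64 of 96" property is easier to believe once you see it run. The sketch below is not Symform's code, and the article does not say which parity scheme RAID-96 uses; it assumes a Reed-Solomon-style polynomial code over a toy prime field, treats each fragment as a single number rather than a 1MB block, and skips the AES step entirely. The names encode, decode, and _interpolate are hypothetical.

```python
P = 257  # assumed toy field: a prime just above 255, so any byte value fits


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through the given (xi, yi) points, mod P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn k data symbols into n fragments: the k originals plus n - k parity."""
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [_interpolate(points, x) for x in range(k, n)]


def decode(survivors, k):
    """Rebuild the k data symbols from any k surviving (index, value) pairs."""
    if len(survivors) < k:
        raise ValueError("too few fragments survive to rebuild the block")
    points = survivors[:k]
    return [_interpolate(points, x) for x in range(k)]


K, N = 64, 96  # 64 data fragments, 96 total, as in the article
block = [(7 * i + 3) % 256 for i in range(K)]  # stand-in for encrypted data
fragments = encode(block, N)

# Simulate 32 nodes vanishing: drop every third fragment, leaving exactly 64.
survivors = [(i, v) for i, v in enumerate(fragments) if i % 3 != 0]
assert decode(survivors, K) == block
```

The trade-off is storage overhead: 96 fragments for every 64 of data means the cloud as a whole holds 1.5 times what you backed up.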

Data availability everywhere
From a data availability perspective, this means that at least 33 of the 96 cloud-based nodes for that original 64MB block of data would have to become unavailable simultaneously for you to actually lose access to your data. It's certainly conceivable that some nodes will go offline occasionally or stop using Symform's service, but having more than a third become unavailable simultaneously is extremely unlikely. When a storage cloud node goes offline, Symform can immediately reconstitute the data blocks it had been storing and move that data onto other nodes, which keeps the redundancy from eroding over time.
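How unlikely is "extremely unlikely"? A back-of-the-envelope Python sketch follows. It assumes every node is down independently with the same probability, which is a simplification of my own (real outages cluster by region or ISP), and the per-node rates tried at the bottom are made-up inputs, not Symform figures. The function name is hypothetical.

```python
from math import comb


def block_loss_probability(p_down, total=96, needed=64):
    """Chance that more than total - needed nodes are down at the same moment.

    Assumes each node fails independently with probability p_down.
    """
    max_tolerated = total - needed  # 32 for RAID-96 as described
    return sum(
        comb(total, j) * p_down**j * (1 - p_down) ** (total - j)
        for j in range(max_tolerated + 1, total + 1)
    )


# Hypothetical per-node downtime rates, for illustration only.
for p in (0.05, 0.10, 0.20):
    print(f"node down {p:.0%} of the time -> block unreachable {block_loss_probability(p):.3e}")
```

Under that independence assumption the odds stay tiny until individual nodes become quite flaky, which is why quickly repairing lost fragments matters more than any single node's uptime.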

Having your data so heavily distributed also means that getting it back when you need it is much faster. If you've ever used BitTorrent, you're already familiar with the concept: Utilizing the bandwidth of hundreds or thousands of small Internet connections spread throughout the world can crush the performance and reliability of a single high-performance Internet connection.

Symform data security
From a data security perspective, each of the cooperative cloud storage nodes that houses your data has only a 1/64 chunk of any given block of your encrypted data. If someone wanted to see your data, first they'd need to find the other 63 nodes in the cloud cooperative with that particular block of info, break into each of them to steal that block, and reassemble them. Given that each node has no idea whose data it's storing, someone would have to gain total access to Symform's own centralized databases to know which block was where.

After that, they'd still have to break the AES encryption. Barring a serious flaw in Symform's encryption implementation, this is better than most commercial encryption methodologies I've seen. If you're incredibly paranoid, you can even use your backup software to encrypt the data Symform is storing before the service ever sees it.

Online data backup won't change overnight
As cool as I think Symform's idea is on paper, they have an uphill fight on their hands to gain market acceptance. The idea is new enough that many will be turned off just by the fact that it's so different. As one of my colleagues said recently: "Still creeps me out. Call me old school." You can bet that sentiment will be shared by many potential customers. Worse, the functionality and integrity of the product depend very heavily on having a large number of customers.

So far, Symform has handled this by attempting to build a robust partner channel rather than going directly to end users as most online backup providers do. Winning over a limited set of partners and relying on them to make the pitch seems like a good strategy. If Symform survives long enough to be considered trustworthy, I can see this inventive distributed approach taking the online backup world by storm. No matter how it turns out, you have to hand it to these guys for thinking outside the box.

This article, "A crazy data backup scheme that works," originally appeared at InfoWorld.com. Read more of Matt Prigge's Information Overload blog and follow the latest developments in storage at InfoWorld.com.
