Amazon's EC2 cloud went dark last week--knocking sites like Foursquare, Reddit, and Quora offline, and affecting hundreds of Amazon cloud customers. The outage is a black eye for the young cloud services industry and gives businesses a reason to think twice about trusting servers or data storage in the cloud.
The sales pitch for the cloud is like a travel brochure, or a military recruiting speech. Vendors tell you all of the features and benefits--cost-effective, scalable, resilient--but fail to mention the downsides, like the fact that if the cloud data center is offline, so is your business.
So, does that mean that the cloud is just too risky, and that you should avoid using cloud servers or storage? No. Not at all. There are still benefits to using cloud-based servers and storage, but the cloud has to be treated like any other technology your company relies on.
Box.net CEO Aaron Levie commented, "At Box, we run our site from multiple data centers, so in the event of an outage we're still able to successfully serve the application and data to our customers without interruption."
Obviously, Amazon has to determine the root cause and has some explaining to do to the affected customers. But, rather than simply embracing the cloud and entrusting your business-critical processing and data to it, you need to have a plan in place for situations like the Amazon outage. Don't use cloud services unless you can adequately answer the question "what happens to my business if the cloud service is unavailable?"
SmugMug managed to avoid being affected by the Amazon outage, and CEO Don MacAskill explains in a blog post, "Any of our instances, or any group of instances in an AZ (Availability Zone), can be 'shot in the head' and our system will recover (with some caveats - but they're known, understood, and tested)."
While Amazon's AZs do offer some level of redundancy for your cloud services, the Amazon outage proved that it is possible for multiple AZs to be impacted at once, making that solution inadequate in some situations. An alternative approach would be to contract cloud services from multiple vendors and implement your own redundancy to protect your business from any one cloud service outage.
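For readers who want to picture what "implement your own redundancy" might look like in practice, here is a minimal sketch in Python. It assumes each cloud vendor is wrapped in a simple fetch function; the names `fetch_with_failover`, `primary_cloud`, and `secondary_cloud` are hypothetical stand-ins for illustration, not any vendor's actual API. The idea is simply to try each provider in order and fall through to the next one when a call fails.

```python
from typing import Callable, List, Sequence, Tuple

# A provider is a (name, fetch function) pair. Both are hypothetical
# placeholders for whatever client code talks to each cloud vendor.
Provider = Tuple[str, Callable[[str], bytes]]


class AllProvidersFailed(Exception):
    """Raised when every configured provider fails for a request."""


def fetch_with_failover(key: str, providers: Sequence[Provider]) -> Tuple[str, bytes]:
    """Try each provider in order; return (provider name, data) from the first success."""
    errors: List[str] = []
    for name, fetch in providers:
        try:
            return name, fetch(key)
        except Exception as exc:  # any failure means "try the next vendor"
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Simulated vendors: the primary is down, the secondary is healthy.
def primary_cloud(key: str) -> bytes:
    raise ConnectionError("primary data center unavailable")


def secondary_cloud(key: str) -> bytes:
    return f"data for {key}".encode()


if __name__ == "__main__":
    served_by, data = fetch_with_failover(
        "report", [("primary", primary_cloud), ("secondary", secondary_cloud)]
    )
    print(served_by, data)
```

A real deployment would also need to keep data synchronized between vendors and to test the failover path regularly, which is the hard part this sketch leaves out.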
Levie summed up, though, with a reminder for those who might be gun-shy about the cloud as a result of the Amazon outage. "It's also important to keep in mind that the overall uptime of cloud services greatly outperforms on-premise infrastructure in the case of vendors like Google, Salesforce and Box, freeing up IT departments to focus on strategic initiatives rather than maintenance."
So, the cloud is imperfect, but if you plan for failure and treat the cloud as you would any other server or data storage on your network, there is no reason your business can't safely enjoy the benefits the cloud has to offer.