Bandwidth Bottlenecks Loom Large in the Cloud
InterContinental Hotels Group (IHG) CIO Tom Conophy has no reservations when it comes to the cloud.
The hospitality giant, which manages, franchises or leases 4,500 hotels in 100 countries, has been able to improve the customer experience and reduce costs by moving storage and in-house applications for mobile phones to multiple data centers in the cloud. It's been such a success overall that the team is now rebuilding its room-reservations system, which processes more than 345 million transactions daily, for a move to the cloud.
But Conophy says all will be for naught if the IHG team doesn't focus squarely on one often-overlooked area: bandwidth.
"If your employees and your users can't access data fast enough, then the cloud will be nothing more than a pipe dream," Conophy says. In IHG's case, that meant re-architecting the network to distribute databases so data is quickly reachable and data centers remain in sync.
With all the talk about cloud, it can be easy to forget that there are risks that go beyond security. Users, by now accustomed to LAN-like speed and quality, could rebel if they experience performance or latency issues. Many of today's applications are interdependent, and if they have to communicate across long distances, such as from data center to data center, slowdowns or even outages are possible. Also, storage and backup traffic that has to cross too many hops could stall and fail.
Despite these potentially catastrophic outcomes, many businesses do not include bandwidth considerations in their cloud strategies, according to Theresa Lanowitz, founder of independent analyst firm Voke, Inc. in Portland, Ore.
Testing Cloud Apps Is Key
"Most companies are testing their infrastructure in a silo, not in an integrated environment," she says. Therefore, they have no way of making sure applications, backups and storage will meet a defined quality of service, she adds.
Internet pipes are filled with diverse traffic, including streaming video and audio, which could negatively impact, say, a database's performance. Also, many applications haven't been cloud-hardened -- meaning the code's not been tightened up to reduce the back-and-forth, among other steps -- and they may start to break down when off the LAN.
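To see why the "back-and-forth" matters once an application leaves the LAN, a rough back-of-the-envelope model helps. The Python sketch below is illustrative only: the function name and every number in it (200 round trips, 0.5 ms LAN latency, 80 ms WAN latency) are assumptions, not figures from Lanowitz or IHG.

```python
def response_time_ms(round_trips: int, rtt_ms: float, server_ms: float = 0.0) -> float:
    """Estimate user-visible response time for a request that needs
    `round_trips` sequential network exchanges, each costing `rtt_ms`."""
    return round_trips * rtt_ms + server_ms


# Hypothetical chatty application: 200 sequential round trips per screen.
lan = response_time_ms(200, rtt_ms=0.5)      # on the LAN
wan = response_time_ms(200, rtt_ms=80)       # same code, across the WAN
batched = response_time_ms(10, rtt_ms=80)    # after batching requests

print(f"LAN: {lan:.0f} ms, WAN: {wan:.0f} ms, WAN after batching: {batched:.0f} ms")
```

Under these assumed numbers, the same screen that takes a tenth of a second on the LAN takes 16 seconds over the WAN, and cutting the number of round trips helps far more than a modest bandwidth upgrade would.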
Lanowitz recommends using emulation tools -- such as those from Spirent Communications and Ixia -- to discover potential bandwidth bottlenecks before permanently putting applications and data into the cloud. A hospitality company like IHG could emulate typical peak scenarios such as morning checkout through the cloud-based application.
"It's no longer about delivering an application that is great; it's about whether that application can survive in the wild. You have to examine the maximum use the cloud-based application and network will sustain," Lanowitz says.
Get the Right People Involved
Jim Frey, managing research director at consultancy Enterprise Management Associates, agrees with Lanowitz. Complicating matters, his research has shown, is that IT groups don't always have the right people responsible for predicting and resolving bandwidth bottlenecks. Often, the people who know most about the network and can take steps to resolve problems before they occur aren't involved with cloud storage and applications.
Frey's February 2011 report "Network Management and the Responsible, Virtualized Cloud" found that 62% of the 151 IT professionals surveyed are using some form of cloud services. A majority of the total -- 66% -- rely on an in-house cloud or virtualization support team for service performance and quality monitoring and assurance. Other major players in cloud oversight in many shops work in storage or data management, data center/server operations and security.
But only 54% of those surveyed said they involve network engineering/operations personnel, down from 62% in 2009. Sadly, the move away from network engineering has caused traditional network best practices to fall by the wayside, according to Frey.
Cloud services and deployment of virtual server technology often result in reduced visibility and control in the enterprise, making it difficult to manage the network aspects, he contends. "There are virtual network elements that ... should be accorded the same best practices for monitoring and management as the other elements in the network connectivity path," he writes in the report.
Chief among virtual network attributes in need of attention, he later said, is bandwidth.
What's lacking at many IT shops, in his opinion, is attention to the health of overall traffic delivery. For instance, only 28% of survey respondents believe collecting packet traces between virtual machines for monitoring and troubleshooting is absolutely required. And only 32% feel that collecting data about traffic, i.e., NetFlow information, from virtual switches for monitoring and troubleshooting is absolutely required. Both tasks give IT insight into how the network and its pipes are performing.
With this knowledge, businesses could discover that they need some type of extra help, such as WAN optimization controllers (WOC) or application delivery controllers, to alleviate bottlenecks and improve the end-user experience. To prevent multiple copies of the same data from clogging pipes, IT could use de-duplication in physical and virtual WOCs deployed in-house and in the cloud. Or IT groups could cache data locally to shrink the amount of back-and-forth traffic.
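The de-duplication idea can be illustrated with a minimal Python sketch. This is not how any particular WOC works: it assumes fixed-size chunks and SHA-256 fingerprints, and the function name, chunk size and sample payload are all hypothetical. Commercial appliances generally use more elaborate chunking and also send a short reference for each skipped chunk.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, for illustration only


def dedup_bytes_to_send(payload: bytes, seen: set) -> int:
    """Return how many payload bytes must cross the WAN.

    Chunks whose fingerprint is already in `seen` are treated as already
    held by the far side, so they are not counted again.
    """
    sent = 0
    for start in range(0, len(payload), CHUNK_SIZE):
        chunk = payload[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        if digest not in seen:
            seen.add(digest)
            sent += len(chunk)
    return sent


seen_chunks = set()
report = b"A" * 8192 + b"B" * 4096  # 12 KB with one repeated 4 KB chunk

first = dedup_bytes_to_send(report, seen_chunks)   # first transfer
second = dedup_bytes_to_send(report, seen_chunks)  # same file sent again
print(first, second)
```

In this toy case the first transfer sends 8,192 of the 12,288 bytes, and resending the identical file sends nothing, which is the effect that keeps repeated copies of the same data from clogging the pipe.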
Optimizing the Network for Data Backup
John Lax, vice president of information systems for Washington, D.C.-based International Justice Mission (IJM), credits WOCs for enabling the bandwidth-challenged global nonprofit's move to the cloud.
The IJM, a human rights agency that rescues children from sex trafficking and slavery, has 500 employees and 14 field offices in 10 countries around the world. Lax says many employees endure the triple challenge of incredibly low bandwidth (e.g., 512Kbps), frail connections that frequently drop and expensive fees (a 256Kbps link in Uganda costs $1,200 per month).
Introducing the cloud to remote areas required a carefully constructed plan that took these issues into account. The organization wanted to maximize the length of time the link stays active without interruption, he explains.
Lax decided the best use of the cloud for the farthest-flung workers would be for backups. "We no longer wanted manual intervention of changing and tracking tapes," he says. Each field office has installed a Riverbed Whitewater cloud storage appliance that connects to another Whitewater appliance in IJM's Richmond, Va., data center.
Data, such as case workers' sensitive documentation about children, is encrypted, de-duplicated and compressed to speed transfers. The data center's Whitewater appliance is also used with a Whitewater virtual appliance to back up and archive data on Amazon's S3 Cloud Service.
Lax says the appliances have resulted in a six-fold reduction of traffic, reducing bandwidth costs and ensuring shorter, more accurate backup windows. Also, if users accidentally delete a directory, they can retrieve it from the built-in buffer in 12 seconds vs. the previous 36 hours necessary to recover from tape. In total, the IJM has been able to back up 5.5 terabytes of data to the cloud, ensuring the security and integrity of the group's work.
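A quick calculation shows what a six-fold traffic reduction means on a link as thin as the 256Kbps connection in Uganda. The Python sketch below is a rough estimate under stated assumptions: the 1 GB nightly change set is hypothetical (IJM's actual per-office volumes are not given), the link is assumed to be fully available for the backup, and protocol overhead is ignored.

```python
def transfer_hours(size_bytes: float, link_kbps: float, reduction: float = 1.0) -> float:
    """Hours needed to push `size_bytes` over a link of `link_kbps`
    (1 Kbps = 1,000 bits per second), after dividing traffic by `reduction`."""
    bits_on_wire = size_bytes * 8 / reduction
    seconds = bits_on_wire / (link_kbps * 1000)
    return seconds / 3600


nightly_change = 1e9  # hypothetical: 1 GB of new or changed data per night

raw = transfer_hours(nightly_change, link_kbps=256)
optimized = transfer_hours(nightly_change, link_kbps=256, reduction=6)

print(f"Unoptimized: {raw:.1f} h, with six-fold reduction: {optimized:.1f} h")
```

With these assumed inputs, the backup shrinks from roughly 8.7 hours to about 1.4 hours, which is the difference between a job that barely fits overnight on a link that frequently drops and one that has room to retry.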
Syncing Across Data Centers
While optimization appliances can go a long way toward combatting bandwidth bottlenecks, IHG's Conophy took a different tack. Like Lax, Conophy has had to architect his cloud network to support users from the far reaches of the globe. The company has three primary data centers in Georgia, Virginia and California. Secondary data centers are located in Dubai, Shanghai, Singapore and Sydney. Conophy says they are strategically situated near users for an optimal and speedy user experience.
Although keeping data completely synchronized across all data centers would be impossible without a major investment, Conophy wanted to get close. Guests relying on a variety of sources, including smartphones, tablets and websites, are expected to conduct 50 billion transactions annually within the next decade. "Our guests connect to us via multiple channels and devices, and our challenge is to maintain data synchronization of their reservations and guest profiles while growing to meet the transaction challenge," Conophy explains.
IHG uses the Terracotta Enterprise Suite to keep Java Virtual Machines in sync quickly and efficiently, with caches distributed across data centers. "It's basically a repository that lets us do data shifting from a primary database across multiple nodes," he explains. The result, he says, is access that is 50 to 100 times faster than with traditional methods, along with good indexing and data integrity from one data center to the next.
Sometimes, Conophy says, "You create your own data storm." This can happen if companies put an application in the cloud that has to frequently access an internal database. The back-and-forth can quickly overburden pipes and cause performance problems.
To avoid this, Enterprise Management Associates' Frey recommends using tools to map application interdependencies and devising cloud strategies to accommodate them. "Get some measure of what applications are drawing off each other and then you can move them closer together vs. taking a hit on latency," he says.
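Frey's advice can be sketched in a few lines of Python. This is only an illustration of the idea, not a dependency-mapping product: the application names, byte counts, site labels and the 100 MB threshold are all hypothetical, and a real tool would build the flow list from collected traffic data rather than a hard-coded table.

```python
from collections import Counter

# Hypothetical flow records: (source app, destination app, bytes transferred)
flows = [
    ("booking-web", "reservations-db", 900_000_000),
    ("booking-web", "profile-cache", 50_000_000),
    ("reservations-db", "booking-web", 300_000_000),
    ("reporting", "reservations-db", 20_000_000),
]

# Hypothetical placement of each application
location = {
    "booking-web": "cloud",
    "reservations-db": "on-premises",
    "profile-cache": "cloud",
    "reporting": "on-premises",
}


def chatty_cross_site_pairs(flows, location, threshold_bytes):
    """Total the traffic between each pair of applications (both directions)
    and return the pairs that sit in different sites and exceed the threshold."""
    totals = Counter()
    for src, dst, nbytes in flows:
        pair = tuple(sorted((src, dst)))
        totals[pair] += nbytes
    return [
        (pair, total)
        for pair, total in totals.most_common()
        if location[pair[0]] != location[pair[1]] and total >= threshold_bytes
    ]


candidates = chatty_cross_site_pairs(flows, location, threshold_bytes=100_000_000)
for (app_a, app_b), total in candidates:
    print(f"{app_a} <-> {app_b}: {total / 1e6:.0f} MB across sites")
```

In this made-up data set, the only pair flagged is the web front end in the cloud talking to the reservations database on premises, which is exactly the kind of pairing that would be a candidate for moving closer together.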
As with an internal network, more bandwidth is sometimes the only solution to congestion. If you're suddenly pushing all of your users out to cloud-based services such as Google Apps, then you're going to need fatter pipes from your building and remote offices. This reality has to be weighed when deciding to head to the cloud.
Although bandwidth has mostly taken a backseat to other cloud-related considerations, analyst Lanowitz says now is the time to bring it to the fore. "The risk for failure is growing because the company brand is now inextricably linked to the technology running," she says. That said, companies can't hand over bandwidth quality control to external providers -- it's something, she says, that must remain in-house.