Over-Provisioning Via Cloud
For a lot of smaller online retailers, it's hard to justify the return on investment for increasing the capacity they need to handle 12 hours of peak usage on one day of the year, says Girard. "That's where cloud comes into play, and we're seeing some retailers adopt cloud strategies. That's really going to progress going forward." Retailers will be able to get additional peak capacity at an incremental cost by moving to the cloud, he says.
Zappos' Ongbongan says they handle all network functions internally and do not use cloud providers. "We have instrumentation around every transaction point on the website, from search pages to product detail pages to checkout," he says, "so we can look at each individually to see if there's any slowness or problems in any of those areas."
But no matter how prepared you are, problems can still occur, especially when you outsource to third-party vendors. "Nothing is fully bulletproof, so really what [online retailers] need to try and achieve is fault tolerance," says Mike Gualtieri, a principal at Forrester Research. He recalls a retailer he worked with that uses an external credit card service that went down one year on Cyber Monday, so the company's orders couldn't be processed.
"Their e-commerce system is in-house, so they had planned for volumes -- searching and shopping the site -- but they have a service level agreement with a credit card service processing service that said, 'We can handle that volume.' So they did all the right things for their own systems and planned for the [increased] volume on Cyber Monday, but were held hostage by this particular provider," Gualtieri says.
He says he recommended that the retailer re-architect its site so that, if the payment processor were to go down again, the company could still collect the order and payment information and process the payments later. That's particularly useful for small retailers, he says, which may not be able to invest in technologies like an online shopping cart and have to rely on third parties for the functionality.
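The approach Gualtieri describes, accepting the order and deferring the charge when the payment provider is unreachable, might look something like the following minimal sketch. Every name here (`Checkout`, `Order`, `ProcessorUnavailable`, the `charge` callable) is hypothetical and used only for illustration; it is not the retailer's actual design, and a production system would keep the pending queue in durable storage and store tokenized rather than raw card data.

```python
from collections import deque
from dataclasses import dataclass


class ProcessorUnavailable(Exception):
    """Raised by the (hypothetical) payment client when the provider is down."""


@dataclass
class Order:
    order_id: str
    amount_cents: int
    payment_token: str  # assumed to be tokenized card data, not a raw card number
    status: str = "new"


class Checkout:
    def __init__(self, charge):
        # `charge` is any callable that takes an Order and raises
        # ProcessorUnavailable when the provider cannot be reached.
        self.charge = charge
        self.pending = deque()  # in-memory only; a real system needs durable storage

    def place_order(self, order):
        """Accept the order even if the payment provider is down."""
        try:
            self.charge(order)
            order.status = "paid"
        except ProcessorUnavailable:
            # Keep the sale: record the order and charge it later.
            order.status = "payment_pending"
            self.pending.append(order)
        return order

    def retry_pending(self):
        """Retry queued payments; return how many are still waiting."""
        still_pending = deque()
        while self.pending:
            order = self.pending.popleft()
            try:
                self.charge(order)
                order.status = "paid"
            except ProcessorUnavailable:
                still_pending.append(order)
        self.pending = still_pending
        return len(self.pending)
```

The design choice is that a provider outage changes an order's status rather than rejecting the customer; `retry_pending` would be run periodically until the queue drains.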
Regardless of their size, Gualtieri says, retailers need to examine every component of their systems and assign each one a confidence level between one and five. "Every online retailer should look at their entire ecommerce architecture and all the components they use: shopping cart, products search, account registration--whatever they have--and rate their confidence level.
"Don't assume that everything will go right," Gualtieri says. "Assign a confidence level and don't fret too much, but have a mitigation strategy and backup plan."
Optimize for Traffic
Among the lessons Karmaloop learned during the 2010 holiday season was that its content delivery network (CDN) configuration was not optimized for the traffic it was going to experience on Cyber Monday, says Joseph Finsterwald, CTO at the online retailer of alternative street fashion for men and women. "We worked with our CDN vendor Akamai to come up with a configuration that was a better fit for us," he says. The firm also discovered problems with parallel processes on the network and synchronization issues when serving up Web pages, which were corrected by rewriting code.
Revenues are growing 50% to 70% year over year, Finsterwald says, so Karmaloop is using Keynote's LoadPro Web load-testing services to ensure its site is not strained. Because its CDN was not optimized to handle this level of traffic in past years, the site experienced "frequent" network outages, he says, although he declined to provide specifics.
"It gives you peace of mind that we can come up with a reasonable facsimile under peak load," Finsterwald says. "Load testing is an inelegant science; you're trying to simulate user traffic, but you're integrating a lot of third-party components." If a test is done on a quiet day, a third party may be able to scale to handle that load, but all bets might be off when it is handling multiple clients at once.
This year, when conducting load testing, Karmaloop scaled its systems to a high enough load to trigger a problem for the vendors to address proactively. "We saw performance degradation with some of our vendors," says Finsterwald, "so we're following up with them to make sure they're doing what they need to do."
Keynote's Karow concurs. "Load testing done right has to be a very close representation of what real users are going to do, so it takes real thinking about what people do and the various systems involved and are you stressing those systems?"
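Karow's point that a load test should mirror what real users do can be illustrated with a small sketch that drives a weighted mix of user journeys concurrently and records latency for each. This is not how Keynote's LoadPro works; the journey names, the traffic shares in `JOURNEY_MIX`, and the `pick_journeys` and `run_load` helpers are assumptions, and the actions passed in are placeholders for real requests against a test environment.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical traffic mix: share of simulated users following each journey.
JOURNEY_MIX = {"browse": 0.6, "search": 0.3, "checkout": 0.1}


def pick_journeys(n_users, mix=JOURNEY_MIX, seed=None):
    """Choose a journey for each simulated user according to the mix."""
    rng = random.Random(seed)
    names = list(mix)
    return rng.choices(names, weights=[mix[n] for n in names], k=n_users)


def run_load(journeys, actions, workers=20):
    """Run each user's journey on a thread pool.

    `actions` maps a journey name to a zero-argument callable that performs
    it. Returns a dict of journey name -> list of latencies in seconds.
    """
    def timed(name):
        start = time.perf_counter()
        actions[name]()
        return name, time.perf_counter() - start

    results = {name: [] for name in actions}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for name, elapsed in pool.map(timed, journeys):
            results[name].append(elapsed)
    return results


# Stand-in actions; a real test would issue requests to the site under test.
stub_actions = {name: (lambda: time.sleep(0.001)) for name in JOURNEY_MIX}
latencies = run_load(pick_journeys(50, seed=1), stub_actions, workers=5)
```

Raising `n_users` and `workers` until latencies degrade is one way to find the load at which a component, including a third-party one, starts to struggle.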
Talk to Your Stakeholders
Also critical to the success of keeping systems up and highly available is making sure everyone is on the same page. "Everybody needs to be involved in the planning and predictive process," says Zappos' Ongbongan. At Zappos, that means everyone from brand marketing to financial planning to warehouse staff is involved in planning for peaks in site traffic.
One thing his group learned from talking with other departments was that their peak traffic typically occurs in mid-December, as opposed to right after Thanksgiving or right before Christmas.
Forrester's Gualtieri says it's definitely a problem when a marketing group doesn't let IT know what it's doing that might cause site traffic to spike. He says he worked with a large Midwestern insurance company that spent a couple of million dollars on its first TV ad during a football game. When the ad aired, the company's site went down "almost instantly," because the company's marketing department didn't tell IT it was running the ad. "So IT had no idea they were going to expect 500 times the normal amount of traffic," he says, and the company ended up wasting its money on the ad.
Despite all the proactive measures retailers may be taking, Gualtieri predicts there will still be "some high-profile outages" this holiday season. "One, two or several will happen. I also think a lot will happen that you'll never hear about ... I don't think this problem is going to go away."
Although companies are becoming savvier about bulletproofing their sites, crashes will inevitably occur due to continuous changes made to enhance the online shopping experience, he says. "You can't just put a site up and have it be static; there are lots of moving parts and it creates complexity, and there's fallout."
Esther Shein is a freelance writer and editor. She can be reached at firstname.lastname@example.org.
This story, "How to Bulletproof Your Website," was originally published by Computerworld.