Like the unfortunate person who continually diets but only seems to gain more weight, power-hungry data centers — despite adopting virtualization and power management techniques — only seem to be consuming more energy than ever, to judge from some of the talks at the Uptime Symposium 2010, held this week in New York.
“There is a freight train coming that most people do not see, and it is that you are going to run out of power and you will not be able to keep your data center cool enough,” Rob Bernard, the chief environmental strategist for Microsoft, told attendees at the conference.
Power usage is not a new issue, of course. In 2006, the U.S. Department of Energy predicted that data center energy consumption would double by 2011 to more than 120 billion kilowatt-hours (kWh). This prediction seems to be playing out: An ongoing survey from the Uptime Institute found that, from 2005 to 2008, the electricity usage of its members’ data centers grew at an average of about 11 percent a year.
But despite all the talk of green computing, data centers don’t seem to be getting more power-efficient. In fact, they seem to be getting worse.
“We haven’t fundamentally changed the way we do things. We’ve done a lot of great stuff at the infrastructure level, but we haven’t changed our behavior,” Bernard said.
Speakers at the conference pointed to a number of different power-sucking culprits, including energy-indifferent application programming, siloed organizational structures, and, ironically, better hardware.
One part of the problem is the way applications are developed. “Applications are architected in the old paradigm,” Bernard said. Developers routinely build programs that allocate too much memory and hold on to the processor for too long. A single program that isn’t written to let the machine sleep when it’s not in use will drive up power consumption for the entire server.
“[If] the application isn’t energy-aware, it doesn’t matter that every other application on the client is,” he said. That one application will prevent the computer from entering a power-saving sleep mode.
The relentless pace of processor improvement is another culprit, at least when data center managers don’t handle it correctly. Thanks to Moore’s Law, under which the number of transistors on new chips doubles every two years or so, each new generation of processors can roughly double the performance of its predecessor.
In terms of power efficiency, this is problematic, even if the new chips don’t consume more power than the old ones, Bernard said. Swapping out old processors for new ones may get the application to run faster, but the application takes up correspondingly less of the more powerful CPU’s resources. Meanwhile, the unused cores idle, still consuming a large amount of power. This means more capacity is wasted, unless more applications are folded onto fewer servers.
“As soon as you replace your hardware with something more efficient, your CPU usage, by definition, will go down,” Bernard said.
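Bernard’s point is simple arithmetic: if the workload stays fixed while capacity doubles, utilization halves. A back-of-the-envelope sketch (the figures below are hypothetical, not from the talk):

```python
# Why a hardware refresh lowers CPU utilization when the workload is fixed.
# All numbers are hypothetical.

workload = 100.0       # fixed demand, in arbitrary "units of work" per second
old_capacity = 400.0   # units/s the old server can deliver
new_capacity = 800.0   # a new generation roughly doubles throughput

old_util = workload / old_capacity   # fraction of cycles doing real work
new_util = workload / new_capacity   # same work, twice the capacity

print(f"old utilization: {old_util:.1%}")   # 25.0%
print(f"new utilization: {new_util:.1%}")   # 12.5%
```

Unless more applications are consolidated onto the upgraded machine, the extra capacity simply idles.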
Speakers at the conference estimated that average CPU utilization (the percentage of processor cycles actually tasked with doing work) hovered somewhere between 5 percent and 25 percent. Despite virtualization efforts, the percentage seems to be falling over time.
Organizations are not thinking enough about how to consolidate workloads, Bernard charged. Each new application added by an organization tends to get its own silo, and very little work is done in sharing resources.
Bernard used Microsoft as an example. He noted that while Microsoft online services such as Hotmail and Bing have very high CPU utilization rates, the company also has many other projects, both internal and external, that use only a small portion of the capacity of the servers devoted to them. For each new project, a manager may provision too many servers for the task. And when the hardware is upgraded, the CPU utilization rate drops even further.
Bernard said Microsoft, like many large organizations, has “hundreds and hundreds of small applications that aren’t mission-critical, but they need to be serviced, and they all overprovision and have massive headroom.”
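The cost of those silos follows from the fact that an idle server still draws substantial power. A rough sketch, using a simple linear power model with hypothetical wattage figures (not from the talks), shows why consolidating two under-used servers onto one saves energy:

```python
# Why consolidating siloed apps saves power: two servers each 10% busy
# draw far more than one server doing the combined work at 20% load.
# Wattage figures are hypothetical.

P_IDLE = 200.0   # watts drawn at zero load (hypothetical)
P_MAX = 300.0    # watts drawn at full load (hypothetical)

def power(util: float) -> float:
    """Linear interpolation between idle and peak power draw."""
    return P_IDLE + (P_MAX - P_IDLE) * util

siloed = 2 * power(0.10)       # two under-used boxes
consolidated = power(0.20)     # one box doing both jobs

print(f"siloed: {siloed:.0f} W, consolidated: {consolidated:.0f} W")
```

The same work gets done either way; the difference is one server’s worth of idle power.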
Server makers and other component manufacturers have gone a long way toward building power-saving features into their equipment. But thanks, again, to low CPU utilization and ingrained organizational habits, the actual savings have proved minimal.
John Stanley, an analyst at the research firm The 451 Group, which purchased the Uptime Institute last year, surveyed power usage across industry members of Uptime. In a panel discussion, he previewed some of his early findings.
He had found that fluctuations in server traffic do not correspond with fluctuations in the amount of power that servers, as a group, draw from the power supply. “Even though you may have big variations with [different] boxes, overall, the variation in the average is very small,” he said. Stanley plans to publish his findings in a research note later this month.
Servers may have power-savings features, but given how the workloads are spread out across the servers, such features don’t seem to do much good in reducing the overall amount of energy consumed.
Even when idle, a server can draw hundreds of watts, yet few users want to turn servers off, given the time it would take to get them running again, Andrew Fanara said in the same panel discussion. Fanara, the former Energy Star manager for data center specifications, is now with infrastructure-management software provider OSISoft.
What is needed is a more dynamic way for the data center to scale its power usage with the amount of work that needs to be done, speakers said. “As an industry, what we’d truly like to see is truly linear scaling where you’d use zero watts when doing zero work to drawing a lot of power [only] when you are doing more work,” Stanley said.
This idea was echoed by eBay’s data center chief, Dean Nelson, during his talk.
“What I believe will be coming is applications that tune the frequency of the server CPU, [so the application] can dynamically overclock or shrink the [CPU] frequency by demand. The physical infrastructure will dynamically match to the load,” Nelson said. Moreover, the application requirements could also control the amount of cooling needed. “That is a truly dynamic data center, and that’s where I want to get to,” he said.
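The physics behind Nelson’s idea is that lowering a CPU’s clock frequency usually also permits a lower supply voltage, and CMOS switching power scales roughly as P = C·V²·f, so power falls faster than linearly. A hypothetical illustration (the capacitance and voltage figures are invented, not tied to any real chip):

```python
# Why dynamic voltage and frequency scaling saves power: switching power
# in CMOS logic is approximately P = C * V^2 * f. Halving the clock while
# also dropping the voltage cuts power by much more than half.
# All figures below are hypothetical.

def dynamic_power(c: float, v: float, f: float) -> float:
    """Approximate CMOS switching power in watts: C * V^2 * f."""
    return c * v**2 * f

C = 1e-9                                 # effective switched capacitance (farads)
full = dynamic_power(C, 1.2, 3.0e9)      # 1.2 V at 3.0 GHz
half = dynamic_power(C, 0.9, 1.5e9)      # 0.9 V at 1.5 GHz

print(f"full speed: {full:.2f} W")
print(f"half speed: {half:.2f} W ({half / full:.0%} of full)")
```

Here halving the frequency delivers half the work per second but well under half the power, which is why demand-driven frequency scaling is attractive.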
As it happens, this sort of scalable computing is what Intel is trying to achieve with its successive generations of processors.
“Computers seldom work at full workload,” admitted Winston Saunders, Intel’s director of power initiatives, in a separate talk.
Intel’s goal is to develop chips that use “only the amount of energy necessary to scale to the load,” Saunders said. Some of this has already been built into the company’s processors: the Xeon 5600, for example, can power down individual cores and flush the cache when the processor is only lightly used.
Saunders promised that each new generation of processors will feature gains in energy efficiency and that the company is aiming toward “energy-proportional computing,” in which the power usage scales smoothly with the workload.
The CPU, though, accounts for only about half of the power a server uses. For dynamic power scaling to truly work, all of a server’s components, including fans, memory and disk drives, must scale with the application workload, Stanley said.
Such coordination will need to go beyond server component makers and extend across all aspects of data center operations, Bernard said. “If you look at any one slice, which is what people tend to think about, all you do is push the problem up or down the stack,” Bernard said. The application managers must work more closely with the data center operators and even the facilities managers, to work out the most efficient operations overall.
“The idea is to not think about more transactions per watt, but to think about fewer watts per transaction,” Bernard said.