EMC's Gelsinger Shares Storage Federation Vision

Pat Gelsinger made headlines in September 2009 when he left Intel to join EMC as president and COO of information infrastructure products, a group that includes the company's information storage and information security businesses. Now, Gelsinger -- who was Intel's first chief technology officer and led both the desktop products group and the digital enterprise group during his career at the chip maker -- is making waves again.

During a presentation last month, Gelsinger offered industry analysts a first look at EMC's vision of global storage federation, which will allow companies to overcome latency and bandwidth problems to keep their data in sync over large distances. EMC believes this technology promises to alter the storage landscape in ways similar to how virtualization has changed the way companies deploy and use servers.

During his first trip to Asia on EMC's behalf, Gelsinger sat down for a telephone interview with IDG News Service to discuss the transition from Intel to his new job at EMC, and to offer some insight into EMC's vision of global storage federation and how it plans to bring the technology to market.

What follows is an edited transcript of that conversation:

IDGNS: You've been at EMC now for about six months. After so much time spent at Intel, how is the adjustment to working at a new company?

Pat Gelsinger: Obviously, after 30 years at Intel it was quite a significant transition, from silicon to systems, from West Coast to East Coast -- even though I'm living more bicoastally. I'm very excited about the opportunity at EMC, otherwise I wouldn't have moved. I see the disruptive nature of what's going on in the IT industry around virtualization, around cloud computing, and around the new model of consumption for IT in the future.

EMC is well positioned to be a beneficiary of these changes as well as a disruptor of the overall IT industry. In that regard, I saw it as a good opportunity. So far, it's like a major organ transplant; the body hasn't rejected me yet. So far it's going pretty well.

IDGNS: What have you found to be the biggest difference between Intel and EMC?

Gelsinger: Intel is almost all OEM (original equipment manufacturer) as well as indirect sales. EMC is a very direct-sales-oriented company and, as a result, it's extremely customer focused. If there's an issue for a storage customer of EMC, it's a five-alarm fire and you're guilty until proven innocent. It doesn't matter what the source of the problem is: fix the customer's problem and really build that very strong, long-term relationship with the customer.

The projects I worked on at Intel were four or five years long. Everything was built around Moore's Law. At EMC, being a systems company, you have much shorter development cycles because you have to respond much more quickly to different customers and competitive inputs, so I'm learning a different pulse rate for the business.

EMC has also been more acquisitive in terms of integrating different businesses. We bought Data Domain last year; that works for me. We just did the Archer Technologies acquisition at the beginning of this year; that's also part of my group. I'm learning in new areas. I'd previously spent about two seconds thinking about governance, risk and compliance; now I have the lead business in that area in the industry, so I'm coming up to speed with a whole new technology and customer space.

There's a very rapid learning curve, and I think anytime you change jobs and companies it's one of the fastest learning curves you can be on.

IDGNS: You caught a lot of people's attention with your description of EMC's vision of data federation. Can you tell us more about that?

Gelsinger: By analogy, if you think about what we've done with virtualization for servers, first you're able to consolidate multiple servers on a single piece of hardware, and that brought a lot of efficiency. Secondly, you're able to group servers together with things like VMotion and distributed resource scheduling (DRS), and you have these pools of resources. But to move virtual machines (VMs) over long distances you've been bound by physical storage, because you can't shove a terabyte across the wire and have this flexibility.

What we're trying to do in our overall vision for virtual storage is to mimic what we've done with servers for storage. The first part is to collapse multiple frames and tiers of storage into a single device. The second part is to create federation, which means being able to take pools of storage frames and treat them like one, so we create more agility for the storage subsystem free from the physical storage environment. But the big concept that we laid out is what we call geofederation, this ability to cache across distance large frames of storage. As long as storage follows the 80-20 rule -- a little bit gets used a lot -- we have the core technology that allows us to share data at great distances. We believe this will be a fundamental enabling technology for entirely new models and usage of computing.

For example, imagine that I truly wanted to do teleportation of VMs, to move a large number of VMs across distance. Today I couldn't do it because I can't access the data. This will allow you to essentially have follow-the-sun, or follow-the-moon, where there's a balancing of your workloads. Another example might be active-active data centers. Maybe I have a big Oracle installation that I want to be operating as a shared database over distance. You could have that kind of workload. An active-passive example would be a hot disaster recovery site, where I could truly fail over to the backup site in seconds, rather than the hours it takes today.

Some of these things that we think get enabled by this have really generated a lot of enthusiasm, both from some of our earlier customers as well as from the analyst pitch that we gave.

IDGNS: Some observers have pointed to the technical challenges of keeping data in sync over large distances. Has EMC solved that problem?

Gelsinger: We point to two bodies of work that we believe have enabled us to come up with this technology to solve the problem. One is that we have been doing deep analysis, particularly as part of our Symmetrix product, where we have built the world's largest cache environment in the embedded product for many, many years. We've gained a lot of insights into workloads and what caches well, what doesn't, and how to do that kind of work. We also acquired a piece of technology, originally from YottaYotta, that was a distributed cache protocol: technology to be able to do coherence algorithms for distributed data.

By bringing those two bodies of expertise together, we think we've actually been able to solve this idea of global federation, of being able to cache data over distance.

Now, having been a cache designer in my own career, starting with the 486 processor cache that I personally designed, you realize that not everything works in a cache. There are all these sub-workloads that don't work well. However, you find that most things actually work pretty well. As we say, most things follow the 80-20 rule: a little bit of data gets used a lot. As long as you live within the boundaries of how much bandwidth you acquire and make available, we're quite convinced this is a very significant new technology and maybe a breakthrough. As such, it's a new capability that quite differentiates this storage vision from anything that we've seen before.
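Editor's illustration: the 80-20 argument can be made concrete with a small simulation. The sketch below is not EMC's algorithm; it is a generic least-recently-used (LRU) cache model in Python, and every name and number in it (simulate_hit_rate, effective_latency_ms, block counts, latencies) is a hypothetical chosen for illustration. It compares a skewed workload, in which 20 percent of the blocks receive 80 percent of the accesses, with a uniform one, and then estimates average access latency when cache misses must cross a long-distance link.

```python
import random
from collections import OrderedDict


def simulate_hit_rate(num_blocks=10_000, cache_blocks=2_000, accesses=200_000,
                      hot_fraction=0.2, hot_share=0.8, seed=0):
    """Return the hit rate of an LRU cache under a hot/cold access pattern.

    hot_fraction of the blocks receive hot_share of the accesses.
    Setting hot_share equal to hot_fraction yields a uniform workload.
    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    hot_count = int(num_blocks * hot_fraction)
    cache = OrderedDict()  # keys are block ids, ordered oldest to newest
    hits = 0
    for _ in range(accesses):
        if rng.random() < hot_share:
            block = rng.randrange(hot_count)              # hot block
        else:
            block = rng.randrange(hot_count, num_blocks)  # cold block
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / accesses


def effective_latency_ms(hit_rate, local_ms, remote_ms):
    """Average access time when hits are served locally and misses remotely."""
    return hit_rate * local_ms + (1.0 - hit_rate) * remote_ms


if __name__ == "__main__":
    skewed = simulate_hit_rate()                # 80-20 workload
    uniform = simulate_hit_rate(hot_share=0.2)  # no locality
    for name, rate in (("80-20", skewed), ("uniform", uniform)):
        latency = effective_latency_ms(rate, local_ms=1.0, remote_ms=100.0)
        print(f"{name}: hit rate {rate:.2f}, average latency {latency:.1f} ms")
```

With the same cache size, the skewed workload hits the local cache far more often than the uniform one, so far fewer requests pay the long-distance penalty. That is the sense in which caching over distance depends on a little bit of data being used a lot, and why workloads without that locality are the ones that do not cache well.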

IDGNS: How close is EMC to making a public demonstration of this technology?

Gelsinger: We wouldn't have talked about it if it wasn't well underway. Watch this space and hopefully you'll see more of the specifics at EMC World in May. As I indicated at the briefing that we gave, we're already engaged with customers who are experimenting with this and doing a proof of concept with some of the early appliances that we're building in this space.

IDGNS: The technology will first hit the market as an appliance rather than a capability built into an existing product?

Gelsinger: To get it right, the first version is an appliance because we're doing this new algorithm with caches, memories, traffic and bandwidth. So, the first version is an appliance. However, the core technology is a software protocol that, as I indicated in the briefing, we'll just embed in some of our arrays, and we'll potentially have other software versions of the product available over time also. Generations one and, probably, two will be an appliance. Over the roadmap, you'll see us build this into some of our other storage arrays and perhaps productize it in other ways.
