Storage Informer

Storage And Cloud

Posted on Jun.26, 2009, under Storage


One of the more popular questions that gets directed at me by journalists and others these days is around the above topic.

I guess since EMC does storage — and is very active in things cloud-like — they expect us to have some nice sound bites.

Well, I have my pre-packaged answer that’s suitable for the press, but — if you have the time and the interest — the deeper answer is much more engaging.

Let’s Get Started

Since there are so many definitions of cloud floating around in the sky these days, it’s probably helpful to frame what I mean when I say “cloud”:

  • aggregated and abstracted resource pools of compute, network and storage

  • a dynamic consumption model where applications acquire and release resources as their needs change

  • an implied oversubscription model that presumes uncorrelated resource demands

  • some notion of geographical distribution across time zones

  • an optimized operational model that is designed around service delivery, rather than traditional resource pools and processes.

Your definition may be different — but, since it’s my blog — I thought I’d start with mine!

Is It All About Capacity?

Go to any industry cloud discussion, and the storage conversation will inevitably shift to “cheap” and “big”. 

Web 2.0 companies (e.g. Facebook) are frequently used as the prime example.

If you’re a web 2.0 company, “cheap” is really important, since so many web 2.0 companies have business models that demand very inexpensive storage. 

And “big” is important since there are many examples of web 2.0 companies who have gotten surprisingly big, and thus need extremely large storage farms.

I do have to point out that a surprisingly small fraction of IT spend goes to the newer web 2.0 companies compared to the more boring (but much larger!) traditional enterprise IT consumers. Using these folks as your prime example can lead you astray in some cases.

However, if you’re close to storage technologies, you’ll realize that cheap is really a function of service level delivered. 

Take any disk drive and make it bigger — it’ll become cheaper — and slower. 

Spin it down — it’ll become even cheaper — and even slower. 

Start deduplicating or compressing data on it — again, even cheaper and even slower. 

Go from disk to tape — potentially even cheaper and slower again!

Conversely, make multiple copies on multiple disk drives, and the picture reverses — things get faster, more available — and more expensive.
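The ladder above can be sketched as a toy tier table. All the dollar-per-GB and latency figures below are illustrative assumptions of mine, not vendor data — the point is the shape of the tradeoff, not the numbers:

```python
# Toy model of the storage cost/service-level ladder: each step down
# gets cheaper per GB and slower to deliver data.
# All figures are illustrative assumptions, not vendor data.
tiers = [
    # (tier, dollars_per_gb, typical_access_latency_ms)
    ("replicated fast disk", 5.00, 5),
    ("big slow disk", 1.00, 15),
    ("deduplicated disk", 0.25, 50),
    ("spun-down disk", 0.50, 5000),   # spin-up delay dominates
    ("tape", 0.05, 60000),            # robot mount plus seek
]

for name, cost, latency in tiers:
    print(f"{name:22s} ${cost:5.2f}/GB  ~{latency} ms to first byte")
```

Notice there's no single "best" row — every tier is the right answer for some service level and the wrong answer for others.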

Or Is It About Service Levels?

I would argue that cheap storage is relatively easy — and big storage is relatively easy — but what is *not* easy is getting the right data at the right service level at the right time.

Indeed, at one of the panels I attended at the recent GigaOm conference, a few of the web 2.0 IT architects pointed to solid state disk as the “single most important technology” to them going forward. 

You may be surprised by this statement — I’m not.

Now, start throwing global network latencies into the picture (also a determinant of perceived service levels, as well as cost), and you get a much more interesting picture, don’t you?

Extending The Definition of Storage and Clouds

If you go back to the definition I outlined above, we can get even more precise:

  • Aggregated and abstracted resource pools of compute, network and storage

This implies that not only storage capacity is aggregated and pooled, but storage bandwidth and response times are also aggregated and pooled. 

It also implies that storage is conveniently abstracted, ideally in such a way that complements the abstraction models being used by servers and networks.  Hint: think virtual machines.

  • A dynamic consumption model where applications acquire and release resources as their needs change

Well, we know that when it comes to storage capacity, the meter goes in only one direction — more storage. 

But when it comes to storage performance (and perhaps availability) a different picture emerges.

It implies the ability to have pools of information go from very slow/cheap to very fast/expensive (and back again) dynamically.  Hint: think technologies such as FAST.
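A minimal sketch of that kind of promote/demote behavior looks something like the following. This is a toy stand-in for what a product like FAST automates — the tier names and thresholds are my own illustrative assumptions, not how FAST is actually configured:

```python
# Toy promote/demote policy: a chunk of data moves between tiers as its
# access rate changes. Tier names and thresholds are illustrative
# assumptions only.
def place(accesses_per_hour: int) -> str:
    """Pick a tier for a data chunk based on how hot it currently is."""
    if accesses_per_hour > 100:
        return "flash"       # hot: pay for speed
    if accesses_per_hour > 1:
        return "fast disk"   # warm: middle of the ladder
    return "cheap disk"      # cold: pay as little as possible

# The same chunk climbs and descends the ladder as its workload changes:
print(place(500))   # busy period -> flash
print(place(0))     # gone cold   -> cheap disk
```

The interesting part isn't the thresholds — it's that placement is a continuous, automated decision rather than a one-time provisioning choice.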

  • An implied oversubscription model that presumes uncorrelated resource demands

This sort of performance profile changes the way you think about storage array design, as well as storage network design.

If you think about it, this is a very different aggregate performance profile for storage, isn’t it?  Traditional measurements and benchmarks (think about our old friend the SPC for example) are utterly useless in this world.
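The "uncorrelated demands" assumption is what makes oversubscription safe: independent bursts rarely coincide, so the observed aggregate peak sits far below the sum of individual peaks. A quick simulation makes the point (the workload shape is an assumption for illustration):

```python
import random

# 100 apps, each bursting to its peak 10% of the time, independently.
# Demand figures (IOPS) are illustrative assumptions.
random.seed(1)
n_apps, trials, peak, idle = 100, 10_000, 100, 10

def app_demand():
    return peak if random.random() < 0.1 else idle

# What you'd provision with no oversubscription: everyone peaks at once.
sum_of_peaks = n_apps * peak

# What actually happens when bursts are uncorrelated.
observed_peak = max(sum(app_demand() for _ in range(n_apps))
                    for _ in range(trials))

print(sum_of_peaks, observed_peak)
```

Across ten thousand trials the aggregate never gets anywhere near the sum of peaks — which is exactly the headroom an oversubscription model spends. Correlate the demands (everyone backing up at midnight, say) and the safety margin evaporates.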

  • Some notion of geographical distribution across time zones

Having the right information in the right place at the right time dramatically improves end user application performance and can dramatically reduce associated network costs.

Indeed, you’ve seen a healthy dose of that thinking with the current EMC Atmos product.  It solves an interesting use case for geographically distributed storage models, but not every use case — which implies that you’ll probably see more along these lines from EMC and other vendors before too long.

  • An optimized operational model that is designed around service delivery, rather than traditional resource pools and processes.

If you parse this statement, you’ll realize the implication is that storage in the cloud isn’t managed the way we traditionally manage storage today — it becomes simply an extension of the service being delivered.

Those of you who are doing long-term career planning as storage architects and administrators might take note of this thought.

Should We Be Talking About Cloud Storage vs. Cloud Compute?

I do have to share one of my personal biases — the whole category of ‘cloud computing’ is probably misguided, at least in my book.

Computing in the cloud seems relatively straightforward. Lots of different ways to do it.  My belief is that private clouds — based on virtualization — will be the dominant model for most enterprise IT shops.

However, getting the right information to that application, in the right location, at the right cost, at the right service level, at the right protection level, while keeping everything secured — well, that just seems so much more challenging, doesn’t it?

We’ll see where the discussion goes in the future, won’t we?

URL: http://emcfeeds.emc.com/rsrc/link/_/storage_and_cloud__76746568?f=84f8d580-01de-11de-22d1-00001a1a9134
