We talk quite a bit about Storage Efficiency at NetApp, but what exactly does it mean to the end user? Is Storage Efficiency just data deduplication and nothing more? I highly doubt it. Consulting the all-knowing, all-seeing wizard at Wikipedia, I was shocked to find there is no entry for Storage Efficiency. Why not? We know it's taboo for corporations to create or edit their own Wikipedia pages, so we won't go there.
If there were an entry for Storage Efficiency, what should it say? Here's my proposed definition (short and simple):
Storage Efficiency: The ability to store and manage your data while consuming the least amount of space on disk with little to no impact on performance, resulting in a lower overall cost.
In reality, Storage Efficiency goes beyond just deduplication of data - what about technologies that don't duplicate data to begin with? To me, that's even more efficient than removing duplicate data after the fact. What about adding in compression? If your data is replicated, you also want efficiency in your replication to avoid saturating your available bandwidth, correct? What about using the larger-capacity SATA drives that are available while getting the performance equivalent of FC disks?
To me, a good story around Storage Efficiency has to take all the various components of a data center (and remote office) into account. In my next couple of posts, I plan to focus on how NetApp's data protection solutions fit with our Storage Efficiency story.
That's a start. What do you think? Let us know by leaving a comment.
Jeremy Merrill, Data Protection Solutions Technical Marketing Engineer
From where I work, Storage Efficiency is also about how users and administrators interact with the storage system. Making storage efficient is not just about making the most of the physical blocks on disk; it's about being able to easily provision flexible storage, and about how the end user can interact with it seamlessly.
For administrators, we provide the tools to make this easy for each department. The Exchange administrator gets a familiar interface they already recognise from administering Exchange, so the storage element feels native to them. Provisioning, growing and protecting that storage is simple and easy. Products like the much-awaited Systems Manager make this even more flexible and transparent for these administrators: an MMC interface they are used to, right at their fingertips!
For the Unix teams, we give a very similar experience: the command line has close parity with many commonly used Unix commands, and the syntax is easy to follow and easy to pick up. It's also easily scriptable, which makes it a very powerful appliance. This ticks all the right boxes for the Unix teams and gives them efficient use of the storage appliance. They hit the ground running, with no need to relearn a new system before they start working.
All administrators can control their own storage (including BURA and BC/DR), which makes the process and workflow of the entire enterprise more efficient. There is no need for lengthy retraining or waiting for people to understand an entirely new system. The storage simply slots into the workflow, and people get useful new features that fit their existing way of working - not another system to manage and another headache to deal with.
For the end user, one of my favourite features is the VSS client integration. Having the most powerful and efficient snapshot technology around is fantastic, but giving the end user the ability to restore anything they need in two clicks is priceless when it comes to efficiency. End users are now self-sufficient! Each SMAI product has ways of giving end users easy ways to interact with the storage system.
So for me, although Storage Efficiency is great at the block level, that is only half the story. Making storage efficiency work at the human level is priceless, and this is where NetApp solutions fit so well.
Absolutely brilliant post, Chris. I personally tend to get wrapped up in the bit-twiddling on the back end and at times forget that we were once called Network Appliance for a reason. I hope this perspective becomes part of a Wikipedia definition: human efficiency and hardware efficiency. If you ever get a chance to watch "Dirty Jobs" on the Discovery Channel, you'll see a funny mixture of the two!
We might talk about this in terms of reducing capital and operational expenses, but reframing it as hardware efficiency and human efficiency personalizes it. The more clearly we can see the impact on an individual, the easier it is to understand. I don't mean this in some diabolical marketing sense; I think it's a good point that puts you in a different frame of mind.
I'll give Chris the credit but I'm so ripping this off for my presentations.
To me, it's all about measurement. To steal a phrase from a pretty good book (Purple Cow): "if you measure it, it will improve." Otherwise, how do you know how efficient something is unless you measure it and compare it to something that is inefficient?
The question becomes: how exactly do you measure storage efficiency? To me, it's simple. Just compare the storage you bought from your friendly storage vendor (aka Raw Capacity) to the storage you can actually use to run your business (aka Allocated Capacity). Most people call this utilization, and I've seen vendors fight over whose utilization is higher - hey, we are at 70% and you are only at 60%! Thinking more about this, I began to wonder if this is an old way of thinking and whether a new standard is in order - out with the old and in with the new!
Anyway, there are lots of new technologies out there that make your storage "appear" larger than it actually is. The ones that come to mind are compression, deduplication and thin provisioning, but I'm sure there are others. So why not take all these smart features and use them to drive utilization to 100% and beyond? What a concept: buy 10TB and actually use 10TB.
So here's the simple formula for storage efficiency I'd recommend:
Storage Efficiency = Allocated Capacity / Raw Capacity
Allocated Capacity is the capacity apparent to applications and users, while Raw Capacity is the manufacturer's stated capacity, including system reserves.
How high can you go? 100%, 150%, even 200% storage efficiency? Let the measuring begin!
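To make the formula concrete, here is a minimal sketch in Python. The function simply implements the ratio proposed above; the capacity numbers are hypothetical, purely for illustration, not measurements from any real system.

```python
# Sketch of the proposed metric: Storage Efficiency = Allocated / Raw.
# All figures below are made-up examples, not real-world results.

def storage_efficiency(allocated_tb: float, raw_tb: float) -> float:
    """Return storage efficiency as a fraction.

    allocated_tb: capacity apparent to applications and users
    raw_tb: manufacturer-stated capacity, including system reserves
    """
    return allocated_tb / raw_tb

# A traditional array: 10 TB raw, but only 6 TB usable by applications.
print(f"{storage_efficiency(6, 10):.0%}")   # prints 60%

# With thin provisioning and deduplication, the apparent (allocated)
# capacity can exceed the raw capacity you actually purchased.
print(f"{storage_efficiency(15, 10):.0%}")  # prints 150%
```

The second case is the point of the post: once features like deduplication and thin provisioning are in play, the metric can legitimately exceed 100%.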
These are impressive results, but one of the points we try to emphasize is assessment. We don't know what is possible in a particular environment until we take a look at it. (We have a number of tools we can use to assess an environment and make some educated estimates.) Secondly, all of the customers above treated storage efficiency as a strategy rather than focusing on one particular technology feature. While deduplication was extremely powerful for these customers, they also made use of thin provisioning, cloning, RAID-DP, Snapshots, and thin replication. These customers had different levers to pull to achieve their results, and we think it's critical for customers to have an array of features to use in their storage efficiency strategy. What's more, we designed these features to be practical to use individually and collectively. Customers have to have confidence that we can still meet their business needs with these features enabled.
Thought you'd never ask. Follow this link and look under "Related Content" to see some examples of organizations that had game-changing experiences as a result of storage efficiency. New case studies are appearing all the time, so check back often. It appears that storage efficiency is really catching on!