The great thing about blogging and independence is that we can post things that add value, as long as we give proper recognition. One of my colleagues, Mike Dutch from the CTO office of SSG and a long-time SNIA member, had some insightful thoughts on storage tiering, so together we decided to share them in this post. I hope you enjoy it.

I'm guessing that many people define a storage tier by its particular storage technology (like SATA). While this may be a useful working definition it obscures the essential notion of what a storage tier really is and leads to confusion when a new technology like data deduplication comes around. A precise definition may also lead to some interesting innovations if we were to take a slightly different path.

Should deduplicated storage be considered a storage tier? I would say "no," and here's why: a technology such as deduplication can span, and optimize across, all tiers.

A storage tier is storage space whose availability, performance, and cost characteristics differ enough from those of other storage tiers to economically justify moving data between them based on the importance (value, performance needs, etc.) of that data. While storage tiers are often thought of as being tied to a particular type of hardware, e.g., Flash, FC, SAS, SATA, VTL, PTL, COM (Computer Output Microfiche), or even paper, this is not necessarily the case. For example, highly available cloud or network-based virtual disks could leverage multiple technologies within a single tier. Since a variety of technologies can be used to provide a particular storage service level, you should not think of a specific technology as a specific storage tier; instead, evaluate which technology, or combination of technologies, delivers the availability-performance-cost point you need for a given tier. "SATA" is not a storage tier; it just happens to be one "technology set" that can deliver a particular storage tier.
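To make the definition concrete, here is a minimal sketch of tiering as an availability-performance-cost decision rather than a hardware label. The tier names, availability figures, latencies, and prices below are illustrative assumptions, not vendor specifications:

```python
# Hypothetical tier catalog: every number here is an assumed example value.
# Each tier is (name, availability, latency in ms, cost per GB in $).
TIERS = [
    ("flash", 0.99999, 0.5, 10.00),
    ("fc",    0.9999,  5.0,  3.00),
    ("sata",  0.999,  10.0,  0.80),
    ("vtl",   0.99, 5000.0,  0.25),
]

def cheapest_tier(min_availability, max_latency_ms):
    """Pick the lowest-cost tier whose availability and latency
    characteristics satisfy the data's requirements."""
    candidates = [t for t in TIERS
                  if t[1] >= min_availability and t[2] <= max_latency_ms]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t[3])

# Mission-critical data with strict requirements lands on the fast tier.
print(cheapest_tier(0.99999, 1.0)[0])    # flash
# Archive data with modest requirements lands on the cheap tier.
print(cheapest_tier(0.99, 10000.0)[0])   # vtl
```

The point of the sketch is that the placement decision keys off the service-level characteristics, not the technology name: any hardware mix that hits the same availability-performance-cost point would serve the same tier.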

Note that storage tiers are not defined by their capacity, per se, but there is usually less capacity in more expensive tiers precisely because they cost more. Deduplication is "simply" a method of saving and accessing data on a storage medium, which is why capacity optimization techniques are best considered features of storage platforms rather than standalone products. (Of course, deduplication can also be used as part of a WAN optimization solution, but here we're talking about deduplication in relation to storage tiers, and dedupe engines without storage aren't very interesting storage tiers.)

In other words, deduplication lets you lower the cost/GB associated with a particular storage tier, but it isn't a storage tier in and of itself. The same rationale explains why other space-efficient storage technologies (e.g., compression) are not tiers unto themselves. It's the mixing and matching of both old and new technologies to create a new availability-performance-cost point that makes up a new storage tier.
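The cost effect is easy to see with a little arithmetic. In this sketch (the raw cost and dedupe ratio are assumed example values), dedupe divides the cost per logical GB while leaving the tier's availability and performance profile untouched:

```python
# Illustrative only: deduplication moves a tier's cost point, but the
# tier itself (its availability and performance profile) is unchanged.

def effective_cost_per_gb(raw_cost_per_gb, dedupe_ratio):
    """Cost per logical GB stored, given a dedupe ratio such as 10 (10:1)."""
    return raw_cost_per_gb / dedupe_ratio

# An assumed $10.00/GB tier with 5:1 dedupe stores logical data at $2.00/GB.
print(effective_cost_per_gb(10.00, 5))   # 2.0
```

Note that the function returns a new cost point for the same tier; nothing about the tier's availability or latency appears in the calculation, which is the whole argument for treating dedupe as a feature rather than a tier.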

So who cares what a storage tier is anyway? On one hand, as long as you can help your customer affordably satisfy their business requirements, it doesn't matter. At another level, though, it profoundly matters. If you don't have the knowledge to think about a subject precisely, you may not only be unable to solve problems related to it; you may not even recognize that there is a problem. Having the right knowledge lets us understand our challenges and, more importantly, find alternative solutions to them. After all, isn't storage tiering really about helping to deliver on a "no more tears" promise?

The efficiencies that data deduplication and storage tiering bring to data protection enable businesses to reduce risks as well as costs. Information that was previously protected on an ad hoc basis, if at all, can now affordably be brought under the ILM umbrella as a full-fledged corporate citizen. The Storage Networking Industry Association defines Information Lifecycle Management (ILM) as "The policies, processes, practices, services, and tools used to align the business value of information with the most appropriate and cost-effective infrastructure from the time information is created through its final disposition." Data deduplication and storage tiering are two arrows in the ILM quiver that can be used pervasively within the enterprise to score a bull's eye in backup... and beyond. Limiting our thoughts about how any technology can be used, whether it be data deduplication, Flash, or whatever the Next Big Thing is, simply limits the solutions we can find.

Should deduplicated storage be considered a storage tier? No.

Should deduplicated storage be used as a storage tier? Pervasively.

Thus endeth the sermon for the day.

Tags:

Backup, Business Continuity, disk, EMC, Process