I often joke with customers that when my friends were growing up, they dreamed of becoming professional baseball players or rock stars, while I dreamed of becoming a data protection technologist. Recently I read something profound in Chuck Hollis’s internal EMC blog. Chuck said, "Decide what you're passionate about ...and write about it... it is hard to write about stuff you don't care about." I am passionate about data protection. Not because data protection is "cool" or anything, but because it is one of the most important practices in the data center. It is also one of the most challenging, and it involves not just technology but people and process as well. I had an old boss once who said, "Where there is chaos, there is cash," and given that data protection is a $10B market, I would say he was correct. I have started this blog along with my colleagues because we truly believe in what we do, who we work for, the challenges we solve and the benefits we bring to our customers' challenging world of data protection. We write because we are passionate about data protection, not because we are being paid to.

Something I read a while ago in Tony Assaro’s blog, Leaders Dilemma as well as Setting the Record Straight, really got me charged up, but I wasn’t sure how I wanted to comment. Tony, you see, writes for money (not passion), which means he has to write ‘for’ the company that is paying him and, at the same time, spend time ‘Manufacturing Confusion’ in the market. (Sorry Tony, I liked you better as an analyst, when you heard all the vendors' product messages and would form an opinion about what was really going on in the market.) What I am referring to specifically is the comment that "EMC is the one big player going after this market in earnest with three different products (which will confuse the market and themselves)". Quite frankly, EMC’s philosophy and message to its customers regarding data deduplication isn’t confusing at all. In fact, when I speak with our customers, they tell me we have one of the more thoughtful and consistent messages around this topic. So in an effort to educate, let me share EMC’s data deduplication philosophy and how EMC will take backup, beyond. EMC will:

  1. Provide deduplication as a pervasive & architecturally consistent service
  2. Coordinate deduplication throughout combinations of data storage and data movement
  3. Deduplicate at the highest level of abstraction
  4. Deduplicate as close to the source as practical

When these values are leveraged, the entire spectrum of data protection morphs into methods that will be used to protect data well into the future.

Back to the subject of the blog. Data Domain will continue to sell good products to customers. Data Domain will continue to refine their existing technology to meet customers’ demands. But they will do so at the expense of real innovation. Remember, the hardest thing to change in IT is process, not technology. Backing data up to disk targets is nothing new, and backing data up to disk devices that perform deduplication is not innovative either. However, the paradigm of using traditional backup software to move full files across an expensive network is beginning to evolve. It MUST evolve, and when it does, what happens to the companies that have interesting features that are just one small morsel in the food chain? If you don’t own any significant IP in the extended process that is data protection, then you will be left out of the backup buffet. And as Maslow would say, "If all you have is a hammer, then everything looks like a nail."

EMC has taken a leadership position in the data deduplication space not because we offer multiple products but because of the way we look at technology. Data deduplication is made up of different components:

  1. Data 'chunking'
  2. Compression / Encryption
  3. Assign Content ID
  4. Store
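For readers who like to see things in code, here is a minimal sketch of how those four components fit together. It is a toy illustration, not how any EMC product is implemented: the fixed 4 KB chunk size, the use of SHA-256 as the content ID, the in-memory dictionary and the `DedupStore` name are all assumptions of mine, and the encryption step is left out entirely.

```python
import hashlib
import zlib

# Assumed fixed chunk size; real products often use variable-size chunking.
CHUNK_SIZE = 4096


class DedupStore:
    """Toy in-memory store keyed by content ID (hypothetical, for illustration)."""

    def __init__(self):
        self.chunks = {}  # content ID -> compressed chunk

    def put(self, data: bytes) -> list:
        """Chunk, compress, assign a content ID, and store.

        Returns the ordered list of content IDs needed to rebuild the data.
        """
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]          # 1. data 'chunking'
            packed = zlib.compress(chunk)                     # 2. compression (encryption omitted)
            content_id = hashlib.sha256(packed).hexdigest()   # 3. assign content ID
            self.chunks.setdefault(content_id, packed)        # 4. store, only if not already held
            recipe.append(content_id)
        return recipe

    def get(self, recipe) -> bytes:
        """Rebuild the original data from its list of content IDs."""
        return b"".join(zlib.decompress(self.chunks[cid]) for cid in recipe)


store = DedupStore()
monday = b"A" * 8192 + b"B" * 4096   # first "backup"
tuesday = b"A" * 8192 + b"C" * 4096  # second "backup", mostly unchanged
r1 = store.put(monday)
r2 = store.put(tuesday)
print(len(r1) + len(r2), "chunk references,", len(store.chunks), "unique chunks stored")
# prints: 6 chunk references, 3 unique chunks stored
```

The two backups reference six chunks between them, but only three unique chunks are ever kept. Where this pipeline runs (on the client, in the backup application, or in the storage target) is a separate design choice from the pipeline itself.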

The goal is to leverage these components across multiple storage platforms, providing deduplication at the highest level of abstraction possible and as close to the source as practical, based on the requirements of the application. Preserve the content by deduplicating content instead of data. The objective, over time, is to provide deduplication as a pervasive and architecturally consistent service across EMC's entire storage portfolio. When you do this, the entire paradigm of protecting information evolves, and this is why EMC is the leader in data deduplication. Not because we have 3 (or however many) products, but because of the way in which we look at data deduplication.

At the end of the day, EMC has over 2000PB of deduplicated data under protection, utilizing both source- and target-based deduplication solutions. And I would venture to estimate that if you include NetWorker, RecoverPoint etc… EMC has exabytes of data under protection. EMC has a long history of changing with the times, listening to its customers, investing in new technologies and protecting customers' data the way they want and need it to be protected. That is taking backup, beyond.

Posted by Steve Kenniston

Tags:

Avamar, Data Domain, Dedupe, Deduplication