The 'Fat Middle'

In the 'fat middle' of the triangle, as I stated last week, there are a number of ways to protect information. I have chosen to break the middle apart into two categories. The reality is, this is meant to be a tool to help you lay out a strategy; your boxes could be based on capacity and could end up in different areas of the triangle depending upon your business needs. The thing to keep in mind is that it's not about your environment matching these boxes exactly; it's about making sure that all of the critical data that requires backup with a 24-hour RPO is protected. You then align the data value in the box with the most appropriate technology to 1) solve the challenge and 2) fit best in your environment.

SMB / ROBO

First, let me clarify my terminology. ROBO is remote office/branch office and SMB is small to medium business. If we think about the business needs that are most important in this arena, they are:

1) Low cost
2) Simplicity (one tool)
3) A 24-hour RPO is adequate

Small and medium businesses, as well as remote offices, need a robust data protection solution that allows them to meet their backup windows and recover data that is no more than 24 hours old (the RPO). The RTO drives whether the backup target is disk or tape; faster recoveries come from disk. Another thing to keep in mind is that there usually isn't a lot of technical expertise at these sites, so the backup application needs to be very simple to manage.
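To make the RPO/RTO point a bit more concrete, here is a minimal Python sketch. The function names, the 4-hour disk/tape cutoff, and the scenario values are my own assumptions for illustration; they are not tied to any particular backup product.

from datetime import datetime, timedelta

RPO = timedelta(hours=24)                 # maximum tolerable data loss
DISK_RTO_THRESHOLD = timedelta(hours=4)   # assumed cutoff: tighter RTOs favor disk

def rpo_met(last_backup_completed, now=None):
    """True if the newest recoverable copy is no older than the 24-hour RPO."""
    now = now or datetime.utcnow()
    return (now - last_backup_completed) <= RPO

def suggest_target(required_rto):
    """Faster recoveries come from disk; longer RTOs can tolerate tape."""
    return "disk" if required_rto <= DISK_RTO_THRESHOLD else "tape"

if __name__ == "__main__":
    last_run = datetime.utcnow() - timedelta(hours=20)
    print("24-hour RPO met:", rpo_met(last_run))                      # True
    print("Suggested target:", suggest_target(timedelta(hours=2)))    # disk

The takeaway is simply that the RPO is a question of backup frequency and the RTO is a question of recovery media and process; the two decisions can be made separately.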

Backup appliances or appliance-like backup technologies tend to work very well in these environments. A self-contained, disk-based backup appliance with the ability to replicate efficiently to another site for disaster protection is a great solution for sites like these.

In the case of SMBs, they can take advantage of a single application with integrated disk that can replicate to the cloud for very little cost and management while meeting their data protection objectives. If cost is a driving factor and the customer just wants better backup and recovery performance, moving to an appliance-based, capacity-optimized disk solution that can replicate is a viable option. If the customer has no desire to replace their existing backup solution because it is working fine for them, then moving to disk-based backup can help with most performance requirements. (This is also true for the data center.) And when customers really want tape as their backup medium for getting data off-site, management will be a bit more complex but still easily achievable.

For remote offices in large corporations, again, an appliance that IT can remotely manage and replicate efficiently back to a data center gives users at the remote site local recovery time objectives measured in hours, as well as a DR strategy in the event of a site-level issue.

Along these lines, I have spoken to a number of customers lately who are utilizing virtual machines. In a number of these cases, a virtual backup appliance is a great way to reduce the complexity added to a customer's environment while still meeting the business requirements.

The Data Center

Next in the ‘fat middle’ is the data center. There are many different backup challenges here. One challenge follows the 80/20 rule: eighty percent of the data is usually unstructured (file system) and 20% is structured (database and email). As a general rule, the 80% that is file system data is a great fit for next-generation data protection solutions such as source-based data deduplication. There are exceptions to this rule, but the majority of the time source-based deduplication is the perfect fit.

A source-based deduplication solution could require that old backup agents be removed and new ones deployed. It may also mean that media servers are removed or repurposed. The tradeoffs for the extra work required to implement a source-based solution, however, are:

1) Faster backups

2) Less capacity stored on backup media (disk)

3) More time freed up for the 20% of the backup environment that needs more resources

The third item in this list is very important. As we discussed, there is no longer a ‘one size fits all’ solution for data protection. I also mentioned that source-based deduplication is a great fit for the unstructured data in your environment. However, for structured data, information that has a high change rate, traditional backup applications typically back up this data faster than source-based deduplication. Keep in mind: if you are in a larger data center, you have probably architected your backup infrastructure to meet the demands of your more important applications. These applications probably back up over the SAN, may be server-less, and have likely required a good deal of IT time to ensure that there are no issues with protecting them. However, with the data growth in the environment, backups across the enterprise are running more slowly, so they are having an impact on the critical business applications. By implementing a source-based backup solution for the 80% of the data in the environment that it fits well, you offload the traditional backup application so that it can focus on the 20% of the data that may have a greater business need.
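For readers who have not worked with source-based deduplication before, here is a minimal sketch of the underlying idea: the client hashes its own data and only ships chunks the backup index has not seen before, which is where the faster backups and reduced disk capacity come from. This is an illustration only; it assumes simple fixed-size chunks and an in-memory index, whereas real products use variable-length chunking, persistent indexes, and their own protocols.

import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed 4 MB fixed-size chunks for illustration

def backup_file(path, seen_hashes, send_chunk):
    """Source-side dedup sketch: hash each chunk locally and only ship chunks
    the backup target has not seen before."""
    sent = skipped = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in seen_hashes:
                skipped += 1                  # redundant data never leaves the client
            else:
                send_chunk(digest, chunk)     # placeholder for the network transfer
                seen_hashes.add(digest)
                sent += 1
    return sent, skipped

if __name__ == "__main__":
    # Tiny demonstration: two backups of the same file; the second sends nothing new.
    with open("demo.bin", "wb") as f:
        f.write(b"A" * CHUNK_SIZE * 3)
    index = set()
    print(backup_file("demo.bin", index, lambda d, c: None))  # (1, 2): dedup even within one file
    print(backup_file("demo.bin", index, lambda d, c: None))  # (0, 3): nothing re-sent

The point of the sketch is the placement of the work: because the hashing happens at the source, unchanged data never consumes backup-window time or network bandwidth in the first place.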

Another good fit for source-based deduplication is in virtual server environments. The benefits of server virtualization are soon forgotten when it comes time to back those servers up. The reality is, virtual servers are not designed for high I/O, and backup is the application in your environment with the highest I/O. By leveraging source-based deduplication and removing all the redundant data at the source, before it needs to be sent through the virtual server's physical resources, you can dramatically reduce your backup bottleneck and lower the TCO of your virtual servers.

When it comes to source-based deduplication, one thing to consider is that some customers may not want to go through the process of changing over to a new data protection technology. If this is the case, or if there is an area where source-based deduplication isn't a good fit, disk targets such as a VTL or target-based deduplication are a good way to increase backup performance over tape and reduce the capacity of data that you are storing on a daily basis.

Also, when I speak with customers these days, they want to reduce their reliance on tape more and more. Deduplication solutions allow for very efficient appliance-to-appliance replication. This enables customers to get data off-site efficiently and store data on disk at the same cost as storing data on tape, while improving operational recovery and ensuring you are on the Road to Recovery.
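As a back-of-the-envelope illustration of why deduplicated replication makes off-site copies practical over a WAN, here is a short calculation in Python. The protected capacity, daily change rate, dedup ratio, and link speed are assumed numbers for the sake of the example, not product claims.

# Assumed scenario for illustration only
protected_tb = 50.0        # full backup data set, in TB
daily_change_rate = 0.05   # 5% of data changes per day
dedup_ratio = 10.0         # 10:1 reduction on the changed data
link_mbps = 100            # WAN link between appliances, in Mb/s

changed_tb = protected_tb * daily_change_rate
replicated_tb = changed_tb / dedup_ratio

replicated_bits = replicated_tb * 1e12 * 8
hours = replicated_bits / (link_mbps * 1e6) / 3600

print(f"Changed data per day:   {changed_tb:.2f} TB")   # 2.50 TB
print(f"Sent after dedup:       {replicated_tb:.2f} TB")  # 0.25 TB
print(f"Time on a 100 Mb/s WAN: {hours:.1f} hours")       # ~5.6 hours

With these assumed numbers, the nightly replication fits comfortably inside a backup window; without deduplication the same change set would take roughly ten times as long over the same link, which is exactly the gap that keeps many customers tied to trucking tapes off-site.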

Tags:

Backup, Data Deduplication, Data Protection, Recovery, Restore