Syncsort spun out its ‘data protection’ technology in January 2014, and Catalogic Software was born. But why the name Catalogic Software, and what does it have to do with ‘data protection’? I have been answering that question a lot lately while briefing press and analysts on the state of Catalogic as we gear up for the NetApp Insight events in Las Vegas and Berlin.

So how did Catalogic Software get its name? In 1997 the company, then Syncsort, built one of the industry's first block-level incremental backup-to-disk data protection technologies. Having done so, and having a deep, rich history in data sorting and metadata, the team really understood the value of having a ‘catalog’ of your data. In the backup days, you recovered data by browsing your backup catalog to find what you wanted to restore.

Around the same time, as backup to disk became more and more popular, NetApp was telling its customers that they no longer needed complex backup infrastructure. Customers could recover from Snapshots and could perform disaster recovery leveraging SnapMirror. The reality was a little more difficult than anyone realized. While it was true that you could recover from your snaps and mirrors, in large environments finding the data you wanted to restore was very difficult. As data has grown over the years, and filers have multiplied like rabbits, it has become even more difficult. This is why 90+% of recoveries still come from traditional backup software: you can leverage the catalog to find what you need to restore fairly quickly and then perform the recovery.

Fast forward to today. IT is changing constantly. The business can no longer wait a month, or even a week, for IT to respond to its requests. At the same time, IT still needs to perform many of the same tasks it always has, such as backup. So the question is: how can IT stop using backup as just an insurance policy and start leveraging that data to increase operational efficiency while giving the business the agility it needs?


The answer: it starts with a catalog! At the heart of Catalogic Software is one of the most robust, scalable and intuitive catalogs on the market. Wrapped around that catalog are two products, DPX and ECX.

The catalog provides visibility and insight into your physical, virtual and public/private/hybrid cloud environments, and helps users locate their data quickly. The catalog captures metadata about every file that lives in any NetApp repository in your environment. It captures data on files, snapshots, SnapVaults, and SnapMirrors, and does so in your infrastructure, your converged infrastructure (FlexPod), and your public, private and hybrid cloud (Cloud ONTAP).

If you peel back the onion, you will see that this very robust database is built on MongoDB and is highly scalable, so whether you have 1 filer or 10,000 filers, the Catalogic catalog can capture it all. Additionally, it installs agentlessly as a virtual machine and can scan your entire environment in minutes to capture things like the number of files and files per filer. Capturing further metadata on every file, such as owner and permissions, may take a bit longer.
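To make the idea of a catalog concrete, here is a minimal sketch in plain Python. It is not Catalogic's schema or API: the `CatalogEntry` fields, the `find_copies` and `files_per_filer` helpers, and the sample filer and volume names are all hypothetical, and an in-memory list stands in for the real database. It only illustrates why one searchable index of file metadata makes it easy to find every copy of a file across filers, snapshots and mirrors.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CatalogEntry:
    """One cataloged copy of a file (hypothetical fields, for illustration only)."""
    filer: str
    volume: str
    snapshot: str
    path: str
    owner: str
    size_bytes: int


def find_copies(catalog: List[CatalogEntry], pattern: str,
                owner: Optional[str] = None) -> List[CatalogEntry]:
    """Return every entry whose path matches the glob pattern, optionally filtered by owner."""
    return [
        entry for entry in catalog
        if fnmatchcase(entry.path, pattern)
        and (owner is None or entry.owner == owner)
    ]


def files_per_filer(catalog: List[CatalogEntry]) -> Dict[str, int]:
    """Count distinct (volume, path) pairs on each filer, ignoring repeated snapshots."""
    seen: Dict[str, set] = {}
    for entry in catalog:
        seen.setdefault(entry.filer, set()).add((entry.volume, entry.path))
    return {filer: len(files) for filer, files in seen.items()}


# Made-up sample data: the same spreadsheet lives in two snapshots and one mirror.
CATALOG = [
    CatalogEntry("filer01", "vol_fin", "nightly.0", "/finance/budget.xlsx", "alice", 48_000),
    CatalogEntry("filer01", "vol_fin", "nightly.1", "/finance/budget.xlsx", "alice", 47_500),
    CatalogEntry("filer02", "vol_fin_mirror", "nightly.0", "/finance/budget.xlsx", "alice", 48_000),
    CatalogEntry("filer02", "vol_eng", "hourly.3", "/eng/build.log", "bob", 1_200),
]

if __name__ == "__main__":
    for copy in find_copies(CATALOG, "*/budget.xlsx"):
        print(copy.filer, copy.volume, copy.snapshot, copy.path)
    print(files_per_filer(CATALOG))
```

With a single query you see every place the spreadsheet exists, which is exactly the lookup that is painful when you have to browse snapshots filer by filer.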

Now that you have a catalog for your environment, what can you do? You can open up the power of your NetApp snapshots and have full control over your environment. You can begin to leverage your data for more than just recovery and disaster recovery. You can put these data copies to work for different lines of business, such as Engineering for Test/Development or Marketing for real-time analytics, by providing instantly mountable access to the most recent data sets.

With Catalogic’s ECX product, you can build a level of control in your environment that you have never had before. You can develop Closed Loop Automated Workflows (CLAW) that set up workflows for different data sets and different business operations. You can automate the testing of your DR environment to take place every day if you like, without spending sleepless weekends at the office. CLAW can provide a fully automated DR test, followed by a tear down or a promotion to production, consistently, on a nightly basis. And if you can leverage your data for this, you can also use CLAW to set up Test/Dev environments every morning, so that when engineers come in they are working with the latest data set. You can do the same thing for Marketing so they can run their analytics on the most recent data sets in real time. Additionally, they can run their analysis against all the data in a database, rather than using the traditional ETL method, which is time consuming, only grabs parts of a database, and leaves little room for flexibility.
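The shape of such a closed-loop workflow can be sketched in a few lines of Python. This is not the ECX API: `run_dr_test`, `WorkflowResult` and the four callables passed in are hypothetical names invented for illustration, and the real mount, validation and cleanup steps are left to whatever your environment provides. The point is only the loop itself: mount a copy, test it, and always finish by either tearing it down or promoting it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class WorkflowResult:
    """Record of what one workflow run did (hypothetical structure)."""
    steps: List[str] = field(default_factory=list)
    passed: bool = False
    outcome: str = ""


def run_dr_test(mount_copy: Callable[[], Any],
                validate: Callable[[Any], bool],
                tear_down: Callable[[Any], None],
                promote: Callable[[Any], None],
                promote_on_success: bool = False) -> WorkflowResult:
    """Mount a data copy, validate it, then close the loop with tear down or promote."""
    result = WorkflowResult()

    handle = mount_copy()
    result.steps.append("mount")

    try:
        result.passed = bool(validate(handle))
        result.steps.append("validate")
    except Exception:
        # A crashing check counts as a failed test; the copy is still cleaned up below.
        result.passed = False
        result.steps.append("validate-error")

    if result.passed and promote_on_success:
        promote(handle)
        result.steps.append("promote")
        result.outcome = "promoted"
    else:
        tear_down(handle)
        result.steps.append("tear_down")
        result.outcome = "torn_down"
    return result


if __name__ == "__main__":
    # Stand-in callables; a scheduler would run this nightly with real ones.
    report = run_dr_test(
        mount_copy=lambda: "copy-of-latest-snapshot",
        validate=lambda handle: True,
        tear_down=lambda handle: None,
        promote=lambda handle: None,
    )
    print(report)
```

Because every run ends in an explicit tear down or promotion, nothing is left mounted by accident, which is what makes a nightly schedule practical.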

At the end of the day, with business moving at the speed of technology and placing extreme demands on IT, a catalog of your data gives you visibility, insight and control over the data in your environment. It provides not only operational efficiency and cost savings but also business agility, allowing IT to respond quickly to line-of-business demands and still work on the next big thing…

Tags:

analytics, Berlin, big data, business agility, catalog, claw, Cloud, data, Data Domain, dpx, ecx, EMC, ETL, filers, files, hybrid cloud, Insight, Las Vegas, meta-data, MongoDB, NTAP, private cloud, public cloud, Storage, syncsort, workflow