Disaster Recovery - What is it and why should we care?

Disaster recovery is about being able to recover from major outages hitting your data center. As more and more critical business applications move into Hadoop, you must have a solution that gets you back in business immediately.

Jack Norris talks with Donnie Berkholz, PhD, an analyst at RedMonk, about disaster recovery in the context of Hadoop in the What Are The Facts video series. RedMonk is the first and only developer-focused industry analyst firm.

Jack Norris: Welcome to WATF, where we examine various issues and claims about Hadoop and ask WATF. In this episode, we look at disaster recovery. There has been a lot of conflicting information and claims about DR, and to help us cut to the core of this, we have Donnie Berkholz from RedMonk. RedMonk is the first and only developer-focused analyst firm. First of all, what is disaster recovery? Why do people care?

Jack Norris: Great – with respect to Disaster Recovery and Hadoop, what are the facts?

Donnie Berkholz: There are a few different approaches to DR in the Hadoop community today. The first one is no DR at all, which is surprisingly common, because it turns out that cloning a Hadoop cluster is fairly expensive. But for those who are willing to make the commitment, there are a couple of different approaches.

The first one is called BDR, which is a graphical interface that controls DR workflows. It's more of a file copy approach, which means you're transferring entire files across the network every time you do a DR backup.

Jack Norris: What's the implication of that approach?

Donnie Berkholz: The implication is a lot more network traffic. You're sending entire files over the network instead of only the small differences that have changed over time. This takes a lot of time and it can be a lot more expensive. If you have business operations where you're trying to recover from a disaster, it means you're taking a lot longer to get up and running, and losing more money as a result.

Jack Norris: Is a file copy approach common for DR? Is it used in other systems?

Donnie Berkholz: The much more common approach is to go with differential changes over time. This is an approach that goes back to standard Linux technologies like Rsync.
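To make the difference between a full file copy and a differential transfer concrete, here is a minimal Python sketch. It is an illustration only, not how BDR or rsync is implemented: it compares fixed-size blocks by hash, whereas rsync uses rolling checksums so that it can also detect shifted data. The function names (`block_hashes`, `changed_blocks`, `apply_delta`) and the tiny block size are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash every fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return only the blocks of `new` that differ from the same block in `old`."""
    old_hashes = block_hashes(old, block_size)
    delta = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        is_new_block = index >= len(old_hashes)
        if is_new_block or hashlib.sha256(block).hexdigest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta: dict[int, bytes], new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file on the backup side from the old copy plus the delta."""
    blocks = [old[i:i + block_size] for i in range(0, new_length, block_size)]
    for index, block in delta.items():
        blocks[index] = block
    return b"".join(blocks)[:new_length]


old_file = b"AAAABBBBCCCCDDDD"
new_file = b"AAAABBBBXXXXDDDD"  # only the third block changed

delta = changed_blocks(old_file, new_file)
full_copy_bytes = len(new_file)
delta_bytes = sum(len(block) for block in delta.values())

rebuilt = apply_delta(old_file, delta, len(new_file))
print(f"full copy: {full_copy_bytes} bytes, differential: {delta_bytes} bytes")
```

In this toy case the full copy moves all 16 bytes while the differential moves only the one changed 4-byte block, and the backup can still be rebuilt exactly. On large files with small changes, the gap in network traffic grows accordingly.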

Jack Norris: Great – let's dive into a little more detail on this third approach. What is that third approach? How do you describe it? Tell us some more details.

Donnie Berkholz: The third approach, which is pretty common with a number of databases and other DR solutions, is mirroring. With mirroring, you're keeping a live backup that's in sync all the time, so you're not worried about a regular copy of files over a slow network.

Jack Norris: How does recovery work in that environment?

Donnie Berkholz: In that kind of environment, you're ready to get up and running almost immediately, because you're constantly sending the latest changes across the network to make sure that backup is ready to go.
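The mirroring idea described here can be sketched in a few lines of Python. This is a simplified in-memory model under stated assumptions, not MapR's implementation: the `Primary` and `Replica` classes, the sequence numbers, and the synchronous forwarding of each write are all hypothetical choices made to show why a mirror is ready for recovery right away.

```python
class Replica:
    """Hypothetical backup site that applies changes as they arrive."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}
        self.applied = 0  # sequence number of the last change applied

    def apply(self, sequence: int, path: str, data: bytes) -> None:
        # Changes must arrive in order so the mirror never skips an update.
        if sequence != self.applied + 1:
            raise ValueError(f"expected change {self.applied + 1}, got {sequence}")
        self.files[path] = data
        self.applied = sequence


class Primary:
    """Hypothetical primary site that forwards every write to its mirror."""

    def __init__(self, replica: Replica) -> None:
        self.files: dict[str, bytes] = {}
        self.sequence = 0
        self.replica = replica

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data
        self.sequence += 1
        # Only this change crosses the network, not the whole data set.
        self.replica.apply(self.sequence, path, data)


replica = Replica()
primary = Primary(replica)

primary.write("/data/orders.csv", b"order-1\n")
primary.write("/data/orders.csv", b"order-1\norder-2\n")
primary.write("/data/users.csv", b"alice\n")

# If the primary site were lost now, the replica already holds the latest state.
in_sync = replica.files == primary.files
print(f"replica in sync: {in_sync}, changes applied: {replica.applied}")
```

Because each change is forwarded as it happens, there is no bulk copy to wait for at recovery time. A real system would also have to handle network failures, batching, and consistency across many files, which this sketch leaves out.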

Jack Norris: This is probably a good time to point out that that's how MapR works. It supports mirroring, not a file copy approach. The next time that you're talking about DR and Hadoop and the issue comes up, make sure you ask, "What are the facts?"

This blog post was published August 18, 2014.