Using NFS and VIPs for High Availability


Michael Lewis

Sr. Director, Global Training Delivery, MapR

Using NFS and virtual IP pools (VIPs) with the MapR File System allows you to do more with your data, and also helps ensure users are consistently able to access cluster data from client machines. In this webinar you will learn key differences between MapR-FS and HDFS. You will also learn how you can leverage MapR virtual IP addresses and Direct Access NFS for data placement and accessibility.

In this Webinar you will learn:

  • Data ingestion with and without NFS;
  • How to implement NFS for accessing your data;
  • What VIPs are and how you can use them in your environment.


Michael: Good morning, everyone. So today we'll be talking about using NFS and VIPs for high availability. There are two learning goals that we'll be discussing today. The first is accessing cluster data, and then I'll go through some use cases and teach you how to configure virtual IP addresses using the MapR platform. So we'll begin with accessing cluster data and what I want to do to start is I want to go over how you would typically access data and ingest data into a typical HDFS cluster, Hadoop Distributed File System.

Michael: So with HDFS there are many tools that you can use. Some of the more common would be hadoop fs -put. This gives you the ability to copy files from your local system into your Hadoop cluster. There's also hadoop fs -copyFromLocal, which likewise gives you the ability to copy a file from your local file system into your Hadoop cluster. Next is something called Hadoop DistCp. This is short for distributed copy. It's a map-only MapReduce job that leverages the power of mappers to copy data within your cluster or to another cluster. It's often used for backups in a typical HDFS environment. There are also third-party tools or ecosystem tools. A couple of examples would be Apache Flume and Apache Sqoop for ingesting data directly into your cluster. These are all common tools when discussing HDFS.

Michael: The opposite side of copying data into your cluster is, well, of course you also need to know how to copy data out of your cluster. So similar to hadoop fs -copyFromLocal, there are tools such as hadoop fs -copyToLocal, in which you would take data out of your HDFS cluster and copy it into your local file system, and of course hadoop fs -get, similar to -put.

Michael: What I'm going to do here is I'm going to walk you through an example in which we're copying a text file into an HDFS cluster and then cover what you would need to do if you had to modify that data. So, first, we would copy our file, File 1 in this example. Text file as I've said. We copy it into a cluster using our standard tools, hadoop fs -put for example. We copy this in, some time passes, and then we realize that we need to modify the data. HDFS is read-only. There is an append function but that does not mean you can modify.

Michael: So if we actually had to modify the data or overwrite the data, what we would have to do is we'd have to take that file that's stored in our HDFS cluster and then copy it out to our local file system and make the modifications that we need. Now it doesn't stop there. Because this is read-only, it also means that you cannot overwrite a file within HDFS. You would have two options. You could either remove the source file from your HDFS cluster and then copy your new file in or simply copy this file into another location within HDFS.
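The round trip described above can be sketched as the sequence of commands it requires. This is a minimal illustration, not part of the webinar: the function name and the example paths are hypothetical, and the commands are only assembled and printed, not executed.

```python
import shlex


def hdfs_modify_round_trip(hdfs_path, local_path):
    """Return the commands needed to change a file that already lives in HDFS."""
    return [
        # 1. Copy the file out of HDFS to the local file system.
        ["hadoop", "fs", "-get", hdfs_path, local_path],
        # (Edit local_path here with any local tool.)
        # 2. The stored file cannot be overwritten in place, so remove it.
        ["hadoop", "fs", "-rm", hdfs_path],
        # 3. Copy the edited file back in.
        ["hadoop", "fs", "-put", local_path, hdfs_path],
    ]


# Hypothetical paths, for illustration only.
for command in hdfs_modify_round_trip("/data/file1.txt", "/tmp/file1.txt"):
    print(shlex.join(command))
```

The alternative mentioned in the talk, copying the edited file to a different HDFS location, would skip step 2 and use a new destination path in step 3.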

Michael: Now MapR does not use HDFS. MapR has the MapR file system. The MapR file system is fully read-write. This gives us many options. For one, we now have the ability to read data as it is being written to your cluster. This is beneficial for many, many reasons. One of which is log data. If you're analyzing log data, you're able to analyze it the moment it hits your cluster. It also gives us now the ability to overwrite and modify files directly in place. That's accompanied very well with MapR's Direct Access NFS. That's either through the NFS POSIX-compliant client or the NFS Gateway that you'll see in a moment. This gives you the ability to directly query and access your data using all your standard Linux commands. Because we're read and write, this also means we're fully executable. So now your cluster is opened up to other options rather than straight standard Hadoop applications.

Michael: We are now able to load third-party applications such as Apache or whichever you see fit, MySQL, etc., and run them directly on your cluster. So, again, we now give you the ability to use your cluster for other utilities that you may not have considered when you first went down this route. There are many ways to set up NFS through your MapR cluster. I'm going to talk about a few in a couple of slides but I want to get down some language before we go down that route. So as you can see on this slide right here, it says mount your cluster file system locally. The default is /mapr/ and if you have installed the cluster using the GUI this will be done for you.

Michael: Now what does this mean, mount your file system locally? Well, if you're running the MapR NFS POSIX client or the NFS Gateway and you want to access the data on that same node, what you would do is you would mount your cluster through the loopback address, not the IP address. This gives you an incredible performance increase because now you're not operating over your network. You're operating over a memory channel. It's much, much faster. To be clear, you're certainly still able to mount over your IP address, and you'll see that in a moment, but mounting locally is always your best approach if you can. Doing this now gives you the ability to use all your standard Linux commands: sed, awk, find, cat, whatever it may be, you're now able to use directly on your MapR platform.
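As a rough sketch of the difference between a loopback mount and a mount over the network, the snippet below only builds the two mount commands and prints them. The helper name, the `hard,nolock` options, and the example address 192.0.2.10 are assumptions made for illustration; check them against the documentation for your release before using them.

```python
import shlex


def nfs_mount_command(server="localhost", export="/mapr", mount_point="/mapr"):
    """Build (but do not run) an NFS mount command for the cluster file system."""
    # "hard,nolock" is an assumed option set, not something stated in the talk.
    return ["mount", "-o", "hard,nolock", f"{server}:{export}", mount_point]


# Loopback mount: the client runs on the same node as the NFS service.
print(shlex.join(nfs_mount_command()))

# Mount over the network: 192.0.2.10 is a placeholder documentation address.
print(shlex.join(nfs_mount_command(server="192.0.2.10")))
```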

Michael: Using that same example that we were talking about earlier in which we've copied a text file into HDFS and we now need to modify it, well, let's go through that same process if you were using MapR. So we have a file on a client machine or maybe our ingest server. We've copied this text file into our cluster. We're now able to do that through cp or mv, rsync, su, whatever you prefer and this is going to give you much, much faster ingestion rates. If we have to modify that file, we can now modify it in place. We do not need to copy that file out of the cluster, back to the local file system, to copy it back in. If you're able, you can open that file right up in vi, make the changes using whichever favorite tool you have and then save it and then be done and it's in the same location. No need to move it at all. No different than your standard Linux file system.
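Because the mounted cluster behaves like an ordinary file system, the in-place edit described above needs nothing beyond standard file I/O. The following is a minimal sketch: a temporary directory stands in for a path under the NFS mount, and the file name and contents are made up for illustration.

```python
import os
import tempfile


def replace_in_place(path, old, new):
    """Rewrite a file where it lives, using ordinary read/write file I/O."""
    with open(path, "r+", encoding="utf-8") as handle:
        text = handle.read()
        handle.seek(0)                      # go back to the start of the file
        handle.write(text.replace(old, new))
        handle.truncate()                   # drop leftover bytes if the text shrank


# A temporary directory stands in for a directory under the NFS mount point.
mount_dir = tempfile.mkdtemp()
file_path = os.path.join(mount_dir, "file1.txt")

with open(file_path, "w", encoding="utf-8") as handle:
    handle.write("status=draft\n")

replace_in_place(file_path, "draft", "final")

with open(file_path, encoding="utf-8") as handle:
    print(handle.read(), end="")
```

The file never leaves its directory, which is the contrast with the copy-out, remove, copy-in sequence needed for HDFS.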

Michael: So now what I'm going to do is I'm going to talk to you about how you would actually set this up and cover some use cases and go through why you would want to use virtual IP addresses in conjunction with accessing data over NFS. To be clear, this is not mandatory. You can access your cluster and mount your cluster using your normal IP address or, as you'll see, you may want to consider using virtual IP addresses to do this.

Michael: So, first, what is a VIP? Well, as I said, a VIP is short for virtual IP address. Typically, a VIP is a pool of static IP addresses. In this example, we can see our virtual IP is connected to a pool of real IP addresses, or hardware. The virtual IP address is going to connect to one of these IPs and that will be its primary connection, and I'll talk more about that in a moment. So what's the benefit here? Well, in this example we have a client or ingest server connecting to a single server. Well, if that server were to fail, or maybe there was a network connection drop, power supply, whatever the issue was, that client would now have to re-mount to another IP address in this pool manually. It's possible that the user doing this wouldn't even know how to do that and they would have to get an administrator involved.

Michael: A better approach would have that client connect to the virtual IP address, which is tied to that pool of 1 through 5 as I said. It would maintain its connection to the first network and, in the event it failed, it would automatically mount to another IP, completely transparent to the user. As far as the user is aware, they are still connected to the same address. Less administration, more uptime.

Michael: Now in the next couple of slides I'm going to show you several ways that you can accomplish this. There are many ways of doing this. So if this doesn't fit for you, that's okay. There's very likely one that would fit for you. So I'm going to explain this slide here as if it was a very small cluster that relies heavily on high availability. So we have three virtual IP addresses. Each has a primary host. In this example for 25.1, its primary IP will be 25.37. Its backup will be 25.59, which coincidentally is the virtual IP 25.3's primary host, and its backup host is 25.62, which is 25.2's primary. The benefit of this is now we have a small cluster. So we have the luxury of having primary and backup nodes dedicated to these VIPs. We're leveraging the power of these three nodes and using each other as a primary and a backup for the VIPs.

Michael: Now this is a very common scenario that you're looking at even in very large clusters. I know we started this talk off for using a smaller cluster but I said that only to show you the benefit of doing this through a large or small cluster. Now in this example Client 1 is connecting to 25.1, which in turn is rerouting its traffic out to 25.37. If that were to fail, the VIP would automatically be rewritten to 25.59 or redirected, apologies, redirected out to 25.59 and, again, the Client 1 would be unaware.
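The primary and backup arrangement from the slide can be written down as a small lookup table. This is an illustrative sketch only: the addresses are the shorthand used in the talk, the backup for 25.2 is not stated and is assumed here to be 25.37, and the function name is hypothetical.

```python
# Shorthand addresses as spoken in the talk; full addresses are not given.
VIP_HOSTS = {
    "25.1": {"primary": "25.37", "backup": "25.59"},
    "25.3": {"primary": "25.59", "backup": "25.62"},
    # The talk only gives the primary for 25.2; this backup is an assumption.
    "25.2": {"primary": "25.62", "backup": "25.37"},
}


def serving_host(vip, down=()):
    """Return the host that answers for a VIP, given a set of failed hosts."""
    entry = VIP_HOSTS[vip]
    for host in (entry["primary"], entry["backup"]):
        if host not in down:
            return host
    return None  # neither the primary nor the backup is available


print(serving_host("25.1"))                  # primary is up
print(serving_host("25.1", down={"25.37"}))  # primary failed, backup takes over
```

Client 1 keeps using 25.1 in both calls; only the host behind the address changes.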

Michael: Now this port's really busy. We're showing this to show how granular you can actually get doing this. In this example each server has two network interface cards, eth0 and eth1, which of course have separate IP addresses. We can have virtual IP addresses for Subnet A and virtual IP addresses for Subnet B. We can have each be a primary with two backups and we can control the two subnets to ensure they're talking to two different NICs. You might want to do this to meter your load. You don't want all of your ingest traffic going through the same Ethernet card. So this is just one way of limiting the amount of ingest going to each server or just giving high availability to two different subnets.

Michael: When a VIP is created within a pool, it will randomly select one of the IP addresses. As I've said, it will maintain the primary connection to that IP address until the connection fails. That failure can be from anything: network failure, hard drive failure, host goes down due to power. Whatever it is, the system is intelligent enough to know that there was an outage and redirect to a host that is running an NFS Gateway and is still up and, again, completely transparent to the user in question.
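The behavior described above (random initial placement, staying put until a failure, then moving to a surviving NFS Gateway node) can be modeled in a few lines. This is a toy simulation, not MapR's implementation; the class name, method names, and node names are all hypothetical.

```python
import random


class VipPool:
    """Toy model of one VIP that sticks to a node until that node fails."""

    def __init__(self, nodes, rng=None):
        self.healthy = set(nodes)
        self.rng = rng if rng is not None else random.Random()
        self.current = self._pick()  # random initial assignment

    def _pick(self):
        # Choose any healthy node, or None when the whole pool is down.
        return self.rng.choice(sorted(self.healthy)) if self.healthy else None

    def mark_failed(self, node):
        self.healthy.discard(node)
        if node == self.current:
            self.current = self._pick()  # fail over to a surviving node

    def mark_recovered(self, node):
        # A recovered node rejoins the pool, but the VIP stays where it is.
        self.healthy.add(node)


pool = VipPool(["node-a", "node-b", "node-c"])
first_owner = pool.current
pool.mark_failed(first_owner)
print(f"VIP moved from {first_owner} to {pool.current}")
```

The client-facing address never changes in this model; only `current`, the node behind it, does.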

Michael: I'm going to walk you through how quick and easy this is to set up. In this demonstration I'm using the MapR Control System, or the MCS. Simply click on Services, click on NFS, and then click add virtual IP. From there you're presented with several options. First, you have your starting virtual IP and your ending virtual IP address. This, as you can see, is how you would set up a pool of VIPs. Now if you would prefer to have just one virtual IP address, you can do that. Just ensure that the starting and ending is the same IP and of course that the appropriate network is matched. There are two radio buttons at the bottom. The first is use all network interfaces on all nodes that are running the NFS Gateway service. What this means is every single node within your cluster that is running the NFS Gateway will be part of this pool. You can also select specific network interface cards, which I'll show you in the next slide. Anything you can do in the MCS, you can do through the CLI using maprcli virtualip add.
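For the command-line route, a sketch of how the same choices (start address, end address, netmask, optional NIC selection) might map onto a `maprcli virtualip add` call is shown below. The parameter names `-virtualip`, `-virtualipend`, `-netmask`, and `-macs`, and the space-separated MAC list, are written from memory and should be treated as assumptions to verify against the maprcli reference. The addresses are placeholder documentation values, and the command is only assembled, never run.

```python
import shlex


def virtualip_add_command(start, netmask, end=None, macs=None):
    """Assemble (but do not run) a maprcli command that adds a VIP or VIP range."""
    command = ["maprcli", "virtualip", "add", "-virtualip", start, "-netmask", netmask]
    if end and end != start:
        # A range of VIPs; omit this for a single virtual IP address.
        command += ["-virtualipend", end]
    if macs:
        # Restrict the pool to specific NICs (assumed space-separated format).
        command += ["-macs", " ".join(macs)]
    return command


# Placeholder values, for illustration only.
print(shlex.join(virtualip_add_command("192.0.2.1", "255.255.255.0", end="192.0.2.5")))
```

Leaving `macs` unset corresponds to the first radio button in the MCS dialog, where every NFS Gateway node is eligible.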

Michael: Now in this example we see all the nodes that have NFS on them supported. What we would do is we would select the NIC that we want. Use the right arrow to select that into the pool. You can also go in here if you need to remove IP addresses that you no longer want associated with the IP pool. Just use the left arrow to remove it. Once that's done you're all set. From here you'll see information on the VIP pool. You can see the range. You can see the node that it's currently connected to, the physical IP and of course the MAC address.

Michael: Something important to note, and that you'll see, is that this is not strictly a MapR tool. I mean this comes with MapR, however, straight through the Linux environment you'll see your virtual network. You'll see eth0, which is your physical network card on this node. You can see its IP address, and this is the virtual network interface card with the VIP, and, lastly, you'll see that they are sharing the same hardware address. This is very typical in networking when you're setting up VIPs. It's a very common approach. This is how the quick spillover happens.

Michael: Okay, lastly, some characteristics when dealing with VIPs. Remember, each virtual IP will send connections to one single node and it will stay on that node forever unless there is a failure. Virtual IP addresses are not load-balancing. If you still need to use load-balancing with this environment, your best approach is to put a load balancer in front of this. Still use virtual IP addresses in conjunction with the load balancer but, again, if you need true load-balancing that's your best approach. You can also use DNS round robin if that works for you. If you're not familiar with that, that simply means you have one fully qualified domain name with multiple IP addresses and each time it is resolved, it will resolve to a different IP address. And remember, you can select which NICs are assigned to what VIP. Okay, that was the example I showed you earlier.
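DNS round robin, as described above, amounts to cycling through a fixed list of addresses for one name. The snippet below is a simplified model rather than a real resolver; the class name and the addresses are hypothetical, and real clients may cache answers, which this model ignores.

```python
from itertools import cycle


class RoundRobinName:
    """Toy model of one domain name that resolves to several addresses in turn."""

    def __init__(self, addresses):
        self._addresses = cycle(addresses)

    def resolve(self):
        # Each lookup hands back the next address in the rotation.
        return next(self._addresses)


# Placeholder VIPs behind a single hypothetical name.
name = RoundRobinName(["192.0.2.11", "192.0.2.12", "192.0.2.13"])
for _ in range(4):
    print(name.resolve())
```

This spreads new connections across the VIPs, but it is not aware of load or of failures, which is why a load balancer remains the recommendation when true load-balancing is needed.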

Michael: Okay, that's it for my talk. This was a segment that we pulled from MapR Academy. You can continue with your learning by signing up for MapR Academy and going through ADM 203, which would be cluster maintenance. I'll spend some time if there are any questions.

David 2: Thanks, Michael. So just a reminder if you have any questions to enter them now into the chat window in the bottom-left corner of your browser. Just a few that have come in, Michael. Let's start with since the file system is read and write, does that mean it is also executable?

Michael: Yes, as I said earlier that's a huge benefit to using MapR. Because it's read and write, yes, you can fully execute files on there, which means you could run scripts within MapR. This is not something you can do with an HDFS.

David 2: Okay and then does MapR use /etc/exports?

Michael: Good question. No, but MapR does have its own exports file within /opt/mapr/conf. If you're familiar with /etc/exports, it's the same structure. So if you're not aware, this is a way of limiting what you can and can't mount. So MapR is very security conscious. So we will allow mounting of specific network ranges, IPs and CAPs using this.

David 2: Great. Thanks, Michael. One more. Does the virtual IP need to be on the same network address as the host IP?

Michael: It does not. However, keep in mind if you are using a different network IP address then ... I'm sorry ... different network range then there would be some [inaudible 00:20:04] network side to support that but, no, there's no requirement on our side.

David 2: Okay. Another one that came in. If I'm in the process of reading a large file and I switch nodes during a failure, do I have to start reading the file over again?

Michael: You shouldn't have to. Questions like this I always am very delicate with how I answer but, no, nothing should drop as it happens. Of course, depending on what type of failure it was and depending on how long you were out, well, I mean of course it's possible, but more commonly, no, you should not have to restart anything over again. It would be completely transparent to the user.

David 2: Okay, so it looks like that's all the questions that have come in today. So thank you, Michael, that is all the time we have and thank you, everyone, for joining us. For more information on complete training courses, please visit or for other useful information please visit Thank you again and have a great rest of the day.