Kubernetes Basics - Whiteboard Walkthrough

Editor’s Note: For applications that run in Docker-style containers and are orchestrated by Kubernetes, you need a way to store and access state outside the container. In this Whiteboard Walkthrough, Skyler Thomas, Principal Engineer at MapR, shows step by step how simple it is to access data stored in the MapR Data Platform from a Kubernetes pod you create or one you’ve downloaded from Docker Hub.

Hi, this is Skyler Thomas. I'm an engineer here at MapR. Today we're going to talk about Kubernetes basics and how you access MapR data from a pod that you created or a pod you downloaded from something like Docker Hub.

In the Kubernetes world, applications that you build run inside of containers, and you can build those containers yourself, or you can download a standard container from Docker Hub, like PostgreSQL. Those containers have no way on their own to store state information, and if you want to access state, or store state, you need MapR or something like it to store your data.

The way we do that is that we specify where we mount data into the pod within the YAML files that you ship with your pods. We specify a place where we can see that data externally, so here I have a link to data in the /mapr/c1/data/db17 directory. And I want to have that visible inside of my pod in the /db directory. So depending on whether I'm external or internal, I will see the same data at a different location on the file system.

If I have a container for something like Postgres, when I download that container, several directories are already baked into it. Things like your usr directory, your bin directory, your home directory. But we want this /db directory to contain the data that's in MapR at this external location. We do that by specifying it in a set of YAML files.

Now the way this works physically is that you have a YAML file that references your Postgres container out in Docker Hub. Also, within that YAML, you specify an internal mount point, and that internal mount point is this /db directory. That's where the mount is going to appear if you're inside of this pod.
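As a rough sketch of what that pod YAML might look like: the pod name, volume name, claim name, and password value below are made up for illustration and are not taken from the talk. Only the Postgres image and the /db mount point come from the walkthrough.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgres-example            # hypothetical pod name
spec:
  containers:
    - name: postgres
      image: postgres               # standard PostgreSQL image from Docker Hub
      env:
        - name: POSTGRES_PASSWORD   # recent official images refuse to start without it
          value: "change-me"        # placeholder; use a Secret in practice
      volumeMounts:
        - name: mapr-data           # must match a volume name under spec.volumes
          mountPath: /db            # internal mount point: where the data appears inside the pod
  volumes:
    - name: mapr-data
      persistentVolumeClaim:
        claimName: mapr-db-claim    # hypothetical claim name
```

Note that mounting data at /db only makes it visible inside the container. Postgres itself would still need to be configured to use that directory, for example through the image's PGDATA environment variable.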

Now, you will also specify an external persistent volume claim that will provide that data to that internal mount point. This is a Kubernetes concept; it's something that you reference within your pod YAML. Then at run time, that persistent volume claim will bind to a persistent volume, and that persistent volume will call out to a volume plugin. So before any of this can work, you need to install the MapR volume plugin, and you do that just by installing the YAML file for that volume plugin.
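A persistent volume claim of this kind might look roughly like the following. This is a minimal sketch using standard Kubernetes fields; the claim name, access mode, and requested size are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mapr-db-claim        # hypothetical name; the pod YAML refers to the claim by this name
spec:
  accessModes:
    - ReadWriteMany          # assumed; pick the mode your workload needs
  storageClassName: ""       # empty string: bind only to a pre-created persistent volume
  resources:
    requests:
      storage: 5Gi           # assumed size; it must not exceed the persistent volume's capacity
```

The claim itself says nothing about MapR. It only asks for storage, and Kubernetes binds it to a matching persistent volume at run time.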

At run time, when the persistent volume calls out to this volume plugin, the volume plugin will call out via a FUSE driver and create a POSIX mount to a real MapR volume that's specified here. Now we specify this location along with some other parameters here in our persistent volume, and that's where we specify the YAML for the external mount point.

Now let's take a look at the YAML for the persistent volume to look at the different parameters that are there. You'll see that we point to the name of our plugin, and you'll see that we point to a ticket location. This ticket location will determine what you can see on the file system when the FUSE POSIX driver creates the mount.
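The YAML shown on the whiteboard is not reproduced in this post, so the following is only a sketch of what such a persistent volume could look like. It assumes the MapR plugin is exposed as a Kubernetes FlexVolume driver. The driver name and every key under options are assumptions about the plugin's parameters and should be checked against the MapR documentation. The object names, the CLDB host, and the secret name are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mapr-db-pv                       # hypothetical name
spec:
  capacity:
    storage: 5Gi                         # assumed size
  accessModes:
    - ReadWriteMany                      # assumed; must be compatible with the claim
  claimRef:                              # optional: reserve this volume for one specific claim
    namespace: default
    name: mapr-db-claim                  # hypothetical claim name
  flexVolume:
    driver: "mapr.com/maprfs"            # assumed name of the MapR volume plugin
    options:                             # assumed option keys; verify against the plugin docs
      cluster: "c1"                      # cluster portion of /mapr/c1/data/db17
      cldbHosts: "cldb1.example.com"     # hypothetical CLDB host
      volumePath: "/data/db17"           # external location inside the cluster
      securityType: "secure"
      ticketSecretName: "mapr-ticket-secret"   # hypothetical Secret holding the MapR ticket
      ticketSecretNamespace: "default"
```

In this sketch the ticket location is expressed as a reference to a Kubernetes Secret, so whoever owns the ticket in that Secret determines what the mounted file system exposes to the pod.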

So that's it – it's simple. That's how simple it is to mount MapR into Kubernetes. Thank you very much. I'm Skyler Thomas.

This blog post was published January 28, 2019.
