MapR Data Platform Installation On Rancher Kubernetes

Introduction

Most projects these days require a Kubernetes cluster; it is a proven tool for managing and deploying Docker-based containers in production environments. Containers in a Kubernetes environment are short-lived and stateless: when a container is stopped or moved, its state is lost. The MapR Data Platform provides persistent storage for pods and their containers and allows containers running in Kubernetes to use the MapR File System for all their storage needs.

In this blog post, we will look at how the MapR FlexVolume Driver can be installed in a Rancher Kubernetes environment and how pods can use MapR-FS for their storage.

About Rancher Kubernetes

Kubernetes is a powerful engine for container orchestration and is becoming a standard for managing hundreds or even thousands of containers in production environments. Rancher includes a full Kubernetes distribution that can be deployed on different cloud vendors. Rancher adds value by unifying all of them as a single Kubernetes cloud and providing single authentication and access control.

MapR Volume Driver Plugin for Kubernetes

The MapR Data Fabric for Kubernetes consists of a set of Docker containers and their respective .yaml configuration files for installation on Kubernetes. Once installed, the Kubernetes FlexVolume Driver for MapR-FS and the Kubernetes Dynamic Volume Provisioner are available for both static and dynamic provisioning of MapR volumes.
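
For orientation, a statically provisioned MapR volume is referenced directly from a pod spec through the FlexVolume driver. The sketch below is illustrative rather than taken verbatim from the example files: the option values match those configured later in this post, while the mapr.com/maprfs driver name and the volumePath option are assumptions based on the plugin's conventions:

    volumes:
      - name: maprfs-vol
        flexVolume:
          driver: "mapr.com/maprfs"
          options:
            volumePath: "/tmp"
            cluster: "my.cluster.com"
            cldbHosts: "psdemo6282.mapr.com:7222"
            securityType: "secure"
            ticketSecretName: "mapr-ticket-secret"
            ticketSecretNamespace: "mapr-samples"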

Steps for MapR Volume Driver Plugin Installation

The MapR Volume Driver Plugin for Kubernetes allows running any Docker container from Docker Hub on a Kubernetes cluster where MapR is the persistent data store for the container. The MapR Volume Driver Plugin consists of various YAML files to configure and deploy pods and containers. The YAML files for the MapR Volume Driver Plugin can be found on the public package.mapr.com repository.

  1. Locate and check the latest version of the MapR Volume Driver Plugin: https://package.mapr.com/tools/KubernetesDataFabric/
  2. Download the version v1.1.0 files. For the plugin, download the Ubuntu version: even though the host is running CentOS (Red Hat), the kubelet in Rancher runs in an Ubuntu-based container, so use the Ubuntu plugin.
    wget https://package.mapr.com/tools/KubernetesDataFabric/v1.1.0/kdf-namespace.yaml
    wget https://package.mapr.com/tools/KubernetesDataFabric/v1.1.0/kdf-rbac.yaml
    wget https://package.mapr.com/tools/KubernetesDataFabric/v1.1.0/kdf-plugin-ubuntu.yaml
    wget https://package.mapr.com/tools/KubernetesDataFabric/v1.1.0/kdf-provisioner.yaml
    
  3. After the YAML files are downloaded, configure the IP address of the Kubernetes master in the volume driver plugin YAML file (kdf-plugin-ubuntu.yaml).
    Set the value to match the Kubernetes Service location. In the Rancher environment, it is set to the following (a quick way to confirm the address is shown below):
    - name: KUBERNETES_SERVICE_LOCATION
      value: "10.43.0.1:443"
    
  4. Configure the FLEXVOLUME_PLUGIN_PATH:
    - name: FLEXVOLUME_PLUGIN_PATH
      value: "/var/lib/kubelet/volumeplugins"
    
  5. Configure the hostPath:
    volumes:
      - name: plugindir
        hostPath:
          path: /var/lib/kubelet/volumeplugins
    
  6. In the Rancher environment, modify the kubelet container configuration to allow the following volume mounts (a way to verify them afterward is shown after these sub-steps):
    (Note: This step will upgrade the kubelet service and restart the kubelet container. Please proceed with caution.)
    1. Go to “Kubernetes” -> “Infrastructure Stack” -> Click on “kubelet.”
    2. On the right side, click the edit icon and choose “Upgrade.”
    3. Click on Volumes tab and add the following:
      /opt/mapr:/opt/mapr:shared,z
      /etc/kubernetes:/etc/kubernetes:shared,z
    4. Click on Upgrade.
    5. Wait for the new kubelet container to start up.
    6. (Optional) Once the new kubelet comes up fine, click on "Finish upgrading." It should clean up the old kubelet completely.
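    After the upgrade, you can verify that the new mounts took effect from the node itself (a quick check; it assumes the kubelet container is named kubelet, as it is in the Rancher infrastructure stack):
    $ docker inspect kubelet --format '{{ range .Mounts }}{{ .Source }}:{{ .Destination }} {{ end }}' | tr ' ' '\n' | grep -E 'mapr|kubernetes'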
  7. Deploy the plugin and provisioner YAML files. Use the kubectl command-line tool to load the YAML files into Kubernetes:
    $ kubectl create -f kdf-namespace.yaml
    $ kubectl create -f kdf-rbac.yaml
    $ kubectl create -f kdf-plugin-ubuntu.yaml
    $ kubectl create -f kdf-provisioner.yaml
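
    Once the plugin pods are running, the driver is copied onto each node (via the copy2host command in the plugin container) under the FLEXVOLUME_PLUGIN_PATH configured earlier. FlexVolume drivers live in a <vendor>~<driver> subdirectory, so expect something like mapr.com~maprfs to appear:
    $ ls /var/lib/kubelet/volumeplugins/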
    
  8. Check that the MapR Volume Driver Plugin is deployed in the Kubernetes cluster by navigating to the overview of the “mapr-system” namespace or from the command line as shown below:
    [user01@psdemo6279 examples]$ kubectl get daemonset mapr-kdfplugin -n mapr-system
    NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    mapr-kdfplugin   3         3         3       3            3           <none>          5d
    [user01@psdemo6279 examples]$ kubectl describe daemonset mapr-kdfplugin -n mapr-system
    Name:           mapr-kdfplugin
    Selector:       name=mapr-kdfplugin
    Node-Selector:  <none>
    Labels:         k8s-app=mapr-kdfplugin
    Annotations:    deprecated.daemonset.template.generation: 1
    Desired Number of Nodes Scheduled: 3
    Current Number of Nodes Scheduled: 3
    Number of Nodes Scheduled with Up-to-Date Pods: 3
    Number of Nodes Scheduled with Available Pods: 3
    Number of Nodes Misscheduled: 0
    Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:           name=mapr-kdfplugin
      Service Account:  maprkdf
      Containers:
       mapr-kdfplugin:
        Image:      maprtech/kdf-plugin:1.1.0_001_ubuntu
        Port:       <none>
        Host Port:  <none>
        Command:
          bash
          -c
          /opt/mapr/plugin/copy2host
        Requests:
          cpu:     500m
          memory:  2Gi
        Environment:
          KUBERNETES_SERVICE_LOCATION:  10.43.0.1:443
          FLEXVOLUME_PLUGIN_PATH:       /var/lib/kubelet/volumeplugins
        Mounts:
          /etc/localtime from timezone (ro)
          /host from host (rw)
          /hostetc from hostetc (rw)
          /plugin from plugindir (rw)
      Volumes:
       timezone:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/localtime
        HostPathType:
       plugindir:
        Type:          HostPath (bare host directory volume)
        Path:          /var/lib/kubelet/volumeplugins
        HostPathType:
       host:
        Type:          HostPath (bare host directory volume)
        Path:          /opt
        HostPathType:
       hostetc:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes
        HostPathType:
    Events:            <none>
    [user01@psdemo6279 examples]$
    

Deploy Test Applications to Verify Plugin Installation

Statically Provisioning a MapR Volume

Now that the MapR Data Fabric is installed, we will launch a test application on the Kubernetes cluster that leverages the MapR Data Platform as the persistent data store for its containers.

  1. Clone the KubernetesDataFabric repository and change into its examples directory (git cannot clone a subdirectory, so clone the whole repository):
    $ git clone https://github.com/mapr/KubernetesDataFabric.git
    $ cd KubernetesDataFabric/examples
    
  2. Start by creating a namespace: edit testnamespace.yaml and set the namespace name (e.g., mapr-samples), then create it:
    $ kubectl create -f testnamespace.yaml
    
  3. Next, create a Kubernetes secret; edit the file testsecureticketsecret.yaml.
    1. Create a security ticket on a MapR cluster as a “mapr” user:
      $ maprlogin password
    2. Encode (base64) the ticket generated:
      $ echo -n $(cat /tmp/maprticket_5000) | base64 -w 0
    3. Update CONTAINER_TICKET in testsecureticketsecret.yaml with the base64-encoded ticket from the previous step (the secret's general shape is sketched below), then create the secret:
      $ kubectl create -f testsecureticketsecret.yaml
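
      For reference, the secret in testsecureticketsecret.yaml has roughly this shape (a sketch: the name and namespace match the values referenced by testsecure.yaml in the next step, and the data value is the encoded ticket from above):
      apiVersion: v1
      kind: Secret
      metadata:
        name: mapr-ticket-secret
        namespace: mapr-samples
      type: Opaque
      data:
        CONTAINER_TICKET: <base64-encoded ticket>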
      
  4. Create and deploy a pod using the secret and namespace created in steps 2 and 3.
    Open testsecure.yaml and configure the cluster, cldbHosts, securityType, ticketSecretName, and ticketSecretNamespace. For example:
        cluster: "my.cluster.com"
        cldbHosts: "psdemo6282.mapr.com:7222 psdemo6282.mapr.com:7222"
        securityType: "secure"
        ticketSecretName: "mapr-ticket-secret"
        ticketSecretNamespace: "mapr-samples"
    
    $ kubectl create -f testsecure.yaml
    
  5. Verify that the MapR File System is mounted at /mapr. To verify, connect to the pod:
    $ kubectl get po --all-namespaces | grep mapr-samples
    mapr-samples   test-secure                            1/1       Running   0          2d
    $ kubectl exec -it test-secure  -n mapr-samples -- sh
    / $ ls /mapr
    apps       hbase      installer  oozie      opt        postgres   test       tmp        user       var
    / $ touch /mapr/tmp/file_from_test_secure
    
    Verify the file was created on the MapR File System in the cluster. Run this command on one of the MapR cluster nodes:
    $ hadoop fs -ls /tmp/
    Found 2 items
    -rw-r--r--   3 mapr mapr          0 2019-02-11 12:58 /tmp/file_from_test_secure
    drwxr-xr-x   - mapr mapr          4 2019-01-29 10:43 /tmp/nftest
    

Dynamically Provision a MapR Volume

Unlike static provisioning, there are use cases where dynamic provisioning is useful – specifically, when you do not want MapR and Kubernetes administrators to create storage manually for pod state/data. In this case, a Persistent Volume is created automatically, based on the parameters specified in the referenced StorageClass. Now that the MapR Data Platform is installed, we will launch a test application on the Kubernetes cluster that leverages the MapR Data Platform as the persistent data store for its containers.

  1. Clone the KubernetesDataFabric repository and change into its examples directory (git cannot clone a subdirectory, so clone the whole repository):
    $ git clone https://github.com/mapr/KubernetesDataFabric.git
    $ cd KubernetesDataFabric/examples
    
  2. Start by creating the mapr-samples namespace: edit testnamespace.yaml, set the name to mapr-samples, and create it:
    $ kubectl create -f testnamespace.yaml
    
  3. Create the provisioner secret using testsecurerestsecret.yaml: update MAPR_CLUSTER_USER and MAPR_CLUSTER_PASSWORD with the base64-encoded username and password (echo -n "mapr" | base64 -w0). Here, the username and password are both set to “mapr.” For example:
    MAPR_CLUSTER_USER: bWFwcg==
    MAPR_CLUSTER_PASSWORD: bWFwcg==
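
    You can sanity-check an encoded value by decoding it back:
    $ echo -n "bWFwcg==" | base64 -d
    mapr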
    
    $ kubectl create -f testsecurerestsecret.yaml
    
  4. Create the StorageClass: edit testsecureSC.yaml and update the MapR cluster REST server address, cldbHosts, and cluster name. For example (the complete file is sketched below):
        restServers: "psdemo6284.mapr.com:8443"
        cldbHosts: "psdemo6282.mapr.com:7222 psdemo6282.mapr.com:7222"
        cluster: "my.cluster.com"
        securityType: "secure"
        ticketSecretName: "mapr-ticket-secret"
        ticketSecretNamespace: "mapr-samples"
        namePrefix: "pv"
        mountPrefix: "/pv"
        advisoryquota: "100M"
    
    $ kubectl create -f testsecureSC.yaml
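
    Putting it together, the StorageClass has roughly this shape (a sketch: the parameters are the ones above, the secure-maprfs name is the storage class referenced in the next step, and the mapr.com/maprfs provisioner name is an assumption based on the plugin's conventions):
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: secure-maprfs
    provisioner: mapr.com/maprfs
    parameters:
      restServers: "psdemo6284.mapr.com:8443"
      cldbHosts: "psdemo6282.mapr.com:7222 psdemo6282.mapr.com:7222"
      cluster: "my.cluster.com"
      securityType: "secure"
      ticketSecretName: "mapr-ticket-secret"
      ticketSecretNamespace: "mapr-samples"
      namePrefix: "pv"
      mountPrefix: "/pv"
      advisoryquota: "100M"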
    
  5. Create the pod using testsecureprovisioner.yaml; this file references the storage class “secure-maprfs” through a PersistentVolumeClaim, sketched below.
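
    The claim has roughly this shape (a sketch: the maprfs-secure-pvc name and secure-maprfs storage class appear in the pod description below, while the access mode and requested size are illustrative assumptions):
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: maprfs-secure-pvc
      namespace: mapr-samples
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: secure-maprfs
      resources:
        requests:
          storage: 300M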

    $ kubectl create -f testsecureprovisioner.yaml
    $ kubectl describe pod test-secure-provisioner -n mapr-samples
    
    [user01@psdemo6279 examples]$ kubectl describe pod test-secure-provisioner -n mapr-samples
    Name:         test-secure-provisioner
    Namespace:    mapr-samples
    Node:         psdemo6280.mapr.com/10.12.205.43
    Start Time:   Thu, 14 Feb 2019 00:15:21 -0500
    Labels:       <none>
    Annotations:  <none>
    Status:       Running
    IP:           11.222.58.145
    Containers:
      busybox:
        Container ID:  docker://f7931943e6d3d302ce66b2d3c5da5d3c8b65f734a01f0126f6c5dee2cf0c5818
        Image:         busybox
        Image ID:      docker-pullable://docker.io/busybox@sha256:7964ad52e396a6e045c39b5a44438424ac52e12e4d5a25d94895f2058cb863a0
        Port:          <none>
        Host Port:     <none>
        Args:
          sleep
          1000000
        State:          Running
          Started:      Thu, 14 Feb 2019 00:15:30 -0500
        Ready:          True
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /dynvolume from maprfs-pvc (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-7sqkg (ro)
    Conditions:
      Type           Status
      Initialized    True
      Ready          True
      PodScheduled   True
    Volumes:
      maprfs-pvc:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  maprfs-secure-pvc
        ReadOnly:   false
      default-token-7sqkg:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-7sqkg
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:
      Type     Reason            Age                    From                                 Message
      ----     ------            ----                   ----                                 -------
      Warning  DNSConfigForming  3m31s (x622 over 13h)  kubelet, psdemo6280.mapr.com  Search Line limits were exceeded, some search paths have been omitted, the applied search line is: mapr-samples.svc.cluster.local svc.cluster.local cluster.local rancher.internal mapr.com amer.mapr.com
    [user01@psdemo6279 examples]$
    

    While the pod is being created, watch the logs in /opt/mapr/logs/provisioner-k8s.log on the node that is running the mapr-kdfprovisioner pod. The command below shows which node that is:

    $ kubectl get po -n mapr-system -o wide
    
    [user01@psdemo6279 examples]$ kubectl get po -n mapr-system -o wide
    NAME                                   READY   STATUS    RESTARTS   AGE   IP               NODE
    mapr-kdfplugin-k65vj                   1/1     Running   0          5d    11.222.207.147   psdemo6281.mapr.com
    mapr-kdfplugin-n27cw                   1/1     Running   0          5d    11.222.96.79     psdemo6279.mapr.com
    mapr-kdfplugin-np6c5                   1/1     Running   0          5d    11.222.46.39     psdemo6280.mapr.com
    mapr-kdfprovisioner-6766586754-f2h65   1/1     Running   0          16h   11.222.202.200   psdemo6280.mapr.com
    [user01@psdemo6279 examples]$
    

    Log in to that node and watch the log. As the volume is provisioned, there should be corresponding activity in the provisioner-k8s.log file indicating that a volume was created:

    $ tail -f /opt/mapr/logs/provisioner-k8s.log
    

    Verify the provisioned volume in MapR MCS. Log in to the MapR MCS web UI as a MapR user.

    Open the URL in your favorite browser: https://psdemo6284:8443

    Navigate to Data -> Volumes. Note the volume name from the log file in the previous step, and search for it in the volume search text box. The search should return the newly created volume (see also the command-line check below).
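
    Alternatively, list volume names from a MapR cluster node on the command line (this assumes maprcli access as a cluster user):
    $ maprcli volume list -columns volumename | grep pv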

  6. Connect to the pod, create a file in the volume provisioned, and verify it on the MapR File System.
    $ kubectl exec -it test-secure-provisioner -n mapr-samples --  sh
    $ cd /dynvolume
    $ touch create_file_from_pod
    
    On the MapR cluster, verify the file created:
    $ hadoop fs -ls /pv/pv.tcsccjvuhj
    Found 1 items
    -rw-r--r--   3 root root          0 2019-02-11 14:37 /pv/pv.tcsccjvuhj/create_file_from_pod
    

Summary

In this blog post, we looked at setting up the MapR Data Fabric in a Rancher Kubernetes environment running Kubernetes v1.11.

Recently, we released the MapR Container Storage Interface (CSI) driver, which is recommended if the Kubernetes version is v1.13 or greater.


This blog post was published March 12, 2019.