MapR Accelerates the Separation of Compute and Storage

Latest Release Integrates with Kubernetes to Better Manage Today's Bursty and Unpredictable AI/ML Workloads


Three Benefits of Separating Compute and Storage for Today's AI and Analytics Workloads


Today's AI/ML and analytical workloads can be bursty and unpredictable. Provisioning infrastructure for worst-case (maximum load) compute scenarios is costly and unnecessarily increases administrative overhead. Kubernetes partially solves this by letting organizations orchestrate containers and spin them up as compute needs arise.

Yet challenges persist. In particular, organizations are left with the difficult task of manually integrating their applications with Kubernetes constructs like Namespaces and Operators. Additionally, segregating and isolating resources within Kubernetes in today's de facto multi-tenant environment is far from automatic. Finally, organizations must make difficult choices as they try to leverage and persist data across tenant containers.
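To make the Namespace and resource-isolation ideas above concrete, here is a minimal, hypothetical Python sketch that builds the two Kubernetes manifests a tenant typically needs: a dedicated Namespace and a ResourceQuota capping its CPU and memory. The tenant name, function names, and limits are illustrative assumptions, not part of MapR's product.

```python
# Hypothetical sketch: build Kubernetes manifests (as plain dicts) for an
# isolated tenant -- a Namespace plus a ResourceQuota capping its resources.
# Names and limits below are illustrative assumptions only.

def tenant_namespace(tenant: str) -> dict:
    """Manifest for a dedicated Namespace that isolates one tenant."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": tenant, "labels": {"tenant": tenant}},
    }

def tenant_quota(tenant: str, cpu: str, memory: str) -> dict:
    """Manifest for a ResourceQuota so one tenant cannot starve the others."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{tenant}-quota", "namespace": tenant},
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }

if __name__ == "__main__":
    ns = tenant_namespace("analytics-team")
    quota = tenant_quota("analytics-team", cpu="16", memory="64Gi")
    print(ns["metadata"]["name"], quota["spec"]["hard"]["requests.cpu"])
```

In practice these manifests would be serialized to YAML and applied with `kubectl apply`, or submitted through a Kubernetes client library; the sketch only shows the shape of the isolation boundary a tenant operator automates.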


Key Capabilities in This Release*

MapR accelerates the separation of compute and storage with the following key capabilities:

  • Handling compute bursts – typical of AI/ML workloads – by spinning up additional compute containers without having to add more physical host servers
  • Deploying Spark and Drill container applications across multi-cloud environments, including private, hybrid, and public clouds
  • Running different versions of Spark and Drill on the same platform, facilitating the multiple stages of dev, test, and QA that are typical in a data engineer's workflow
  • Preventing applications from starving each other of resources by setting granular resource quotas, and isolating resources by using Spark job operators to create separate Spark clusters
  • Accommodating fluctuating query workloads by growing Drillbits dynamically based on load and demand
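The "grow Drillbits dynamically" capability above reduces to a scaling decision: pick a replica count from the current query load, clamped to sane bounds. A hypothetical sketch follows; the thresholds, bounds, and function name are assumptions for illustration, not MapR's implementation.

```python
# Hypothetical autoscaling decision: choose how many Drillbit (or other
# compute) replicas to run based on the number of queued queries.
# QUERIES_PER_REPLICA and the min/max bounds are illustrative assumptions.
import math

QUERIES_PER_REPLICA = 10   # target load each replica should absorb
MIN_REPLICAS = 2           # keep a warm baseline even when idle
MAX_REPLICAS = 20          # cap spend during extreme bursts

def desired_replicas(queued_queries: int) -> int:
    """Scale up under bursts, scale back down when the burst subsides."""
    needed = math.ceil(queued_queries / QUERIES_PER_REPLICA)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, needed))
```

An operator would evaluate a function like this on a loop and patch the replica count of the Drillbit deployment accordingly, so capacity follows demand instead of being provisioned for the worst case.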

Key Technical Integrations

MapR has introduced a number of key technical integrations that simplify this experience:

  • Tenant Operators create tenant namespaces (Kubernetes Namespaces) for running compute applications, providing a simple way to start complex containerized applications within Kubernetes.
  • Spark Job Operators create Spark jobs, allowing separate versions of Spark to be deployed in separate pods to support the dev, test, and QA stages of a data engineer's workflow.
  • Drill Operators start a set of Drillbits, allowing the query layer to auto-scale based on demand.
  • The CSI Driver Operator mounts persistent volumes through a standard plugin interface, enabling stateful applications to run in Kubernetes.
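To illustrate the CSI piece, here is a hypothetical sketch of the PersistentVolumeClaim manifest a stateful application would use to request storage through a CSI driver. The claim name, storage class, and size are illustrative assumptions, not MapR-specific values.

```python
# Hypothetical sketch: a PersistentVolumeClaim manifest (as a plain dict)
# that a stateful application uses to request storage via a CSI driver.
# The storage class name and size below are illustrative assumptions.

def persistent_volume_claim(name: str, namespace: str,
                            storage_class: str, size: str) -> dict:
    """Manifest requesting a persistent volume from the named storage class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }
```

Because the claim goes through the standard CSI plugin interface, the application pod can be rescheduled to another node while its data stays on the shared storage layer, which is the essence of separating compute from storage.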

Learn More

*GA is expected in Q2 2019.