Why Is Kubernetes with MapR Better?
MapR Data Fabric for Kubernetes provides persistent storage for containers and enables the deployment of stateful containers. It addresses the limitations of container use by providing full data access from within and across clouds and on-premises deployments. Now stateful applications can easily be deployed in containers for production use cases, machine learning pipelines, and multi-tenant use cases.
The combination of distributed computing, streaming analytics, and machine learning is accelerating the development of next-generation intelligent applications, which take advantage of modern computational paradigms and the infrastructure that powers them.
The MapR Data Platform combines a fully read/write distributed file system with the unusual features of a built-in NoSQL database and built-in stream transport. Data handled by MapR is directly accessible by AI and analytics tools, legacy programs, and via modern open source APIs. The MapR platform serves as dataware for building a comprehensive data system across on-premises, multi-cloud, or hybrid data centers.
The MapR XD Distributed File and Object Store is designed to store data at exabyte scale, support trillions of files, and combine analytics and operations into a single platform. MapR XD supports industry standard protocols and APIs, including POSIX, NFS, S3, and HDFS. Unlike Apache HDFS, which follows a write-once, append-only model, the MapR Data Platform delivers a true read-write, POSIX-compliant file system. Support for the HDFS API enables Spark and Hadoop ecosystem tools for both batch and streaming to interact with MapR XD. Support for POSIX enables Spark and all non-Hadoop libraries to read and write to the distributed data store as if the data were mounted locally, which greatly expands the possible use cases for next-generation applications. Support for an S3-compatible API means MapR XD can also serve as the foundation for Spark applications that leverage object storage.
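To make the POSIX point concrete, here is a minimal Python sketch of what "as if the data were mounted locally" means in practice: the program uses only ordinary file calls (open, seek, write) and knows nothing about the storage underneath. The function names, the readings.csv file, and the sensor values are hypothetical and chosen only for illustration. The sketch runs against a temporary local directory as a stand-in; on a real cluster you would instead pass the directory where the file system is mounted in your environment, which is an assumption about your setup rather than something this article specifies.

```python
import csv
import os
import tempfile


def append_reading(base_dir, sensor_id, value):
    """Append one row to a CSV file using ordinary POSIX-style file I/O."""
    path = os.path.join(base_dir, "readings.csv")  # hypothetical file name
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([sensor_id, value])
    return path


def overwrite_first_char(path, char):
    """Update the file in place, which a write-once, append-only store
    would not allow but a read-write POSIX file system does."""
    with open(path, "r+") as f:
        f.seek(0)
        f.write(char)


def read_all(path):
    """Read every row back with the same standard-library calls."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Stand-in for a mounted distributed file system path (assumption:
# on a real cluster this would be the local mount directory).
mount_point = tempfile.mkdtemp()

path = append_reading(mount_point, "s1", 20.5)
append_reading(mount_point, "s2", 21.0)
overwrite_first_char(path, "S")  # in-place update at offset 0
rows = read_all(path)
print(rows)
```

The in-place overwrite is the part that distinguishes a read-write file system from an append-only one: the code changes a byte at the start of an existing file without rewriting the file.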
The MapR Event Store for Apache Kafka is the first big-data-scale streaming system built into a unified data platform and the only big data streaming system to support global event replication reliably at IoT scale. Support for the Kafka API enables Spark streaming applications to interact with data in real time in a unified data platform, which minimizes maintenance and data copying.
MapR Database is a high-performance NoSQL database built into the MapR Data Platform. MapR Database is multi-model: wide-column, key-value with the HBase API, or JSON (document) with the OJAI API. Spark connectors are integrated for both HBase and OJAI APIs, enabling real-time and batch pipelines with MapR Database.
Kubernetes and the MapR Data Platform form a powerful pair for deploying applications in containers, whether on-premises, in the cloud or across multiple clouds, or in a hybrid on-premises/cloud architecture. Kubernetes provides the orchestration layer for containerized applications, and MapR acts as the dataware needed for data orchestration. In this context, you can think of MapR as being like Kubernetes for data.