The MapR Data Platform

The Industry’s Next-Generation Data Platform for AI and Analytics

THE MAPR DATA PLATFORM

The MapR Data Platform is an all-software data platform for AI and analytics that runs on commodity hardware across on-premises, cloud, and edge deployments. MapR users store, manage, process, and analyze all kinds of data with mission-critical reliability while meeting production SLAs. Unique to MapR is a core set of data services designed to ensure exabyte scale and high performance while providing unmatched data protection, disaster recovery, security, and management services for disparate data types, including files, objects, tables, events, and more. Open APIs and support for containerization ensure broad distributed application access and seamless portability of applications across disparate environments.

MapR includes a wide variety of analytics and open source tools such as Apache Hadoop, Apache Spark, Apache Drill, Apache Hive, and more. Additionally, support for POSIX allows cutting-edge AI and ML tools like new Python ML libraries to run on the same MapR cluster as your analytics.

[Diagram: the MapR Data Platform, showing APIs, XD Distributed File and Object Store, Database, Event Store for Apache Kafka, and MapR Core Services]

XD Distributed File and Object Store

MapR XD Distributed File and Object Store manages both structured and unstructured data. It is designed to store data at exabyte scale, supports trillions of files, and uniquely combines analytics and operations into a single platform. MapR XD Distributed File and Object Store supports a wide range of workloads, including AI/ML, analytics, and Hadoop. The XD Distributed File and Object Store is integrated with MapR Database and Event Store for Apache Kafka, allowing users to run nearly any workload on one cluster in production.

Database

MapR Database is a high-performance NoSQL database management system built into the MapR Data Platform. It is a highly scalable, multi-model database that supports wide-column, key-value, JSON (document), and time-series data in a single database, allowing developers to choose the model best suited to their use case. MapR Database brings together operations, analytics, real-time streaming, and database workloads to enable a broader set of next-generation data-intensive applications. The Database is integrated with MapR XD Distributed File and Object Store and Event Store for Apache Kafka, allowing users to run nearly any workload on one cluster in production.

Event Store for Apache Kafka

MapR Event Store for Apache Kafka is the first massively scalable publish-subscribe event streaming system built into a unified data platform. It is the only publish-subscribe streaming system to support global event replication reliably at IoT scale. Event Store for Apache Kafka supports the Kafka API and includes out-of-the-box integration with popular streaming frameworks such as Spark Streaming. The Event Store for Apache Kafka is integrated with MapR XD Distributed File and Object Store and MapR Database, allowing users to run nearly any workload on one cluster in production.

FEATURES

COMMON SECURITY AND GOVERNANCE MODEL

  • MapR is secure by default and provides unified platform-level security across all data. Users benefit from built-in auditing, expressive authorization, and flexible authentication supporting any username/password registry and Kerberos. An enterprise data catalog covers governance across the enterprise, not just for your MapR deployment.

RESILIENCE AT SCALE

  • The MapR Data Platform ensures critical data is never lost via configurable levels of replication. Automatic failover ensures the cluster is always available so applications can run on a 24x7 basis, helping organizations meet stringent business SLAs. Files are mirrored using consistent, point-in-time snapshots, while tables and event streams are instantly replicated to ensure enterprise-grade disaster recovery capabilities.

GLOBAL NAMESPACE

  • Global namespace gives you a single view of all your data wherever it is. Access data sets on a remote cluster (assuming you have permissions) as if they were part of the local cluster. Run multiple cluster deployments across the globe as a single, logical cluster.

POLICY-BASED DATA TIERING

  • MapR makes it easy to balance performance, cost, and capacity tradeoffs by storing, managing, and analyzing data in different data tiers: hot, warm, and cold. For example, automated tiering lets you seamlessly move "inactive" data to and from S3-compatible object stores, including cloud-based object stores.
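
The idea of hot, warm, and cold tiers can be illustrated with a small, generic sketch. It is not MapR's tiering engine or configuration syntax: the function name `choose_tier` and the 30-day and 180-day idle thresholds are assumptions made up for this example, and a real policy would be defined through the platform's own tooling.

```python
from datetime import datetime, timedelta


def choose_tier(
    last_access: datetime,
    now: datetime,
    warm_after: timedelta = timedelta(days=30),   # hypothetical threshold
    cold_after: timedelta = timedelta(days=180),  # hypothetical threshold
) -> str:
    """Return 'hot', 'warm', or 'cold' based on how long data has been idle."""
    idle = now - last_access
    if idle >= cold_after:
        return "cold"
    if idle >= warm_after:
        return "warm"
    return "hot"


if __name__ == "__main__":
    now = datetime(2020, 1, 1)
    for days_idle in (1, 45, 400):
        tier = choose_tier(now - timedelta(days=days_idle), now)
        print(f"idle {days_idle} days -> {tier}")
```

In this sketch, data idle for 400 days would be a candidate for the cold tier, such as an S3-compatible object store, while recently accessed data stays hot.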

UNIVERSAL DATA SERVICE

  • MapR has a single distributed data service that can serve files, tables, and events. The advantage of this universal data service is that resiliency, availability, recovery, and security are built in, simple to manage, and consistent across real-time, interactive, and batch data sets.

DISTRIBUTED METADATA

  • MapR automatically replicates its metadata along with application data, making high availability (HA) part of the core architecture. As a result, HA works out of the box: it does not require deploying specialized NameNodes on specialized hardware, and it needs only minimal configuration to set up and monitor.

SERVICE MANAGEMENT AND DISCOVERY

  • MapR provides a cohesive set of APIs to manage services on the cluster. These capabilities are essential when operating multiple multi-node clusters.

THE MAPR DATA PLATFORM IS OPEN AND FLEXIBLE

OPEN APIs

MapR supports open APIs like HDFS, S3, HBase, JSON, Kafka, and more. Unlike HDFS-based platforms, MapR also supports POSIX, which allows legacy applications, web servers, and other clients to read and write data directly to the MapR Data Platform.
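
In practice, POSIX access means an application can use ordinary file I/O with no platform-specific client library. The sketch below shows this with standard Python file calls only. The function name `write_and_read` and the file name are illustrative assumptions, and a temporary directory stands in for wherever the cluster is mounted on the client, since the actual mount point depends on the deployment.

```python
import os
import tempfile
from pathlib import Path


def write_and_read(mount_root: Path, relative_name: str, text: str) -> str:
    """Write text under mount_root with plain POSIX file I/O, then read it back."""
    target = mount_root / relative_name
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "w", encoding="utf-8") as handle:
        handle.write(text)
        handle.flush()
        os.fsync(handle.fileno())  # ask the OS to push the data to storage
    with open(target, "r", encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    # A temporary directory stands in for the cluster's mount point (assumption).
    with tempfile.TemporaryDirectory() as stand_in_mount:
        print(write_and_read(Path(stand_in_mount), "logs/app.txt", "hello from a POSIX client\n"))
```

Because the code relies only on `open`, `mkdir`, and `fsync`, the same script would run unchanged against any POSIX-compliant mount, which is the portability point the paragraph above makes for legacy applications.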

OPEN SOURCE

With the MapR Ecosystem Pack (MEP), MapR supports a wide variety of open source tools including Hadoop, Spark, Drill, Hive, and more. Additionally, support for POSIX allows cutting-edge AI and ML tools like new Python ML libraries to run on the same cluster as your analytics.

DEPLOYMENT OPTIONS

The MapR Data Platform can be deployed on-premises, in the cloud, at the edge, or in any combination of these at the same time. By mirroring and replicating data between these disparate environments, MapR can be the only data platform you need in an increasingly containerized, hybrid, and multi-cloud world.