With the MapR Data Science Refinery, MapR provides businesses with a suite of data science tools to help them distill insights from their data and turn those insights into operational next-gen applications.
MapR has recognized the need for agile, containerized solutions that can scale to fit the needs of all types of data science teams. The MapR Platform supports popular open source tools in a preconfigured package that can be distributed to many data science teams across a multitenant environment.
The MapR Data Science Refinery is an easy-to-deploy and scalable data science toolkit with native access to all platform assets and superior out-of-the-box security.
The MapR Data Science Refinery offers:
Access to All Platform Assets - The MapR FUSE-based POSIX Client allows app servers, web servers, and other client nodes and apps to read and write data directly and securely to a MapR cluster, as they would to a Linux filesystem. In addition, Apache Spark connectors are provided for interacting with both MapR-DB and MapR-ES.
Superior Security - The MapR Platform is secure by default, and Apache Zeppelin on MapR integrates with this security layer using the built-in capabilities provided by the MapR Persistent Application Client Container (PACC).
Extensibility - Apache Zeppelin is paired with the Helium framework to offer pluggable visualization capabilities.
Simplified Deployment - A preconfigured Docker container provides the ability to leverage MapR as a persistent data store.
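As a rough illustration of the POSIX-style access described above, the sketch below reads and writes a file with ordinary Python file I/O. It assumes the cluster is already mounted on the client through the FUSE-based POSIX client. The helper names `save_records` and `load_records` and the mount path mentioned in the comments are hypothetical and are not part of any MapR API.

```python
import csv
import tempfile
from pathlib import Path


def save_records(base_dir, name, rows):
    """Write rows (a non-empty list of dicts) as CSV under base_dir using plain file I/O."""
    target = Path(base_dir) / name
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return target


def load_records(path):
    """Read the CSV back into a list of dicts (all values come back as strings)."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # A temporary directory stands in for the mount point in this demo.
    # On a client with the cluster mounted, base_dir would instead be a
    # directory under that mount (assumed here to look something like
    # /mapr/<cluster name>/..., which is site-specific).
    with tempfile.TemporaryDirectory() as base_dir:
        path = save_records(
            base_dir, "demo/readings.csv", [{"sensor": "a", "value": 1.5}]
        )
        print(load_records(path))
```

Because the code only uses standard file operations, the same functions work unchanged against a local directory or a mounted cluster path. Only the `base_dir` argument differs.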
The MapR Data Science Refinery is the only data science offering with secured access to all data. It connects out of the box with:
MapR-XD: for files and containers
MapR-DB: a highly scalable, multi-model, NoSQL database management system
MapR-ES: a global publish-subscribe event streaming system
A core component of the MapR Platform, MapR-ES is a global publish-subscribe event streaming system for big data. With native integration between MapR-ES and machine learning libraries, organizations can now create real-time machine learning pipelines, allowing them to apply ML models to real-time data.
The MapR Data Science Refinery offers the Apache Zeppelin Data Science Notebook to provide the ability to work across many engines in one visual space.
Easy To Deploy
The MapR Data Science Refinery comes with eight out-of-the-box visualization libraries, including Matplotlib and ggplot2. Apache Zeppelin also provides a pluggable visualization framework.
The MapR Converged Data Platform is ideal for storing model and notebook repositories. Organizations can leverage the MapR Platform’s global namespace and superior replication capability. The MapR Platform also offers immutable snapshots to persist and deploy various versions of the same model, making it possible for data scientists to compare the performance and accuracy of each version of the model.
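To make the version comparison concrete, here is a minimal sketch. It assumes each model version is reachable as its own directory (for example, one directory per snapshot) and that each directory holds a `metrics.json` file written at training time. The file name, the directory layout, and the helper names `collect_metrics` and `best_version` are assumptions made for illustration, not MapR conventions.

```python
import json
import tempfile
from pathlib import Path


def collect_metrics(versions_root, metric="accuracy"):
    """Return {version_name: metric_value} for each version directory with a metrics.json."""
    results = {}
    for version_dir in sorted(Path(versions_root).iterdir()):
        metrics_file = version_dir / "metrics.json"
        if version_dir.is_dir() and metrics_file.is_file():
            metrics = json.loads(metrics_file.read_text())
            if metric in metrics:
                results[version_dir.name] = metrics[metric]
    return results


def best_version(results):
    """Pick the version with the highest metric value (raises ValueError if results is empty)."""
    return max(results, key=results.get)


if __name__ == "__main__":
    # Build a throwaway layout with two hypothetical versions for the demo.
    with tempfile.TemporaryDirectory() as root:
        for name, accuracy in [("v1", 0.81), ("v2", 0.86)]:
            version_dir = Path(root) / name
            version_dir.mkdir()
            (version_dir / "metrics.json").write_text(json.dumps({"accuracy": accuracy}))
        scores = collect_metrics(root)
        print(scores, best_version(scores))
```

Keeping the metrics next to each persisted model means the comparison reads only files that already sit alongside the model, so no separate tracking service is needed for this simple case.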
Machine learning models are only as good as the data they are trained on. With the MapR Data Science Refinery, data scientists get access to all of the data on the platform, which can improve the accuracy of their models.
Using MapR-ES, the MapR Data Science Refinery allows data scientists to create real-time machine learning pipelines. Organizations can now apply machine learning models to real-time data to gain instant business insights.
The MapR Data Science Refinery provides access to a broad range of popular data science tools and libraries, making it easy for data scientists to select the tool of their choice. As a result, data scientists are more productive.
The MapR Data Science Refinery is easy to deploy and manage. It also provides access to data in place, removing the need for additional hardware for copying data. As a result, the MapR Data Science Refinery has a lower total cost of ownership (TCO) than other data science offerings.
The MapR Data Science Refinery provides pluggable and broad visualization support, helping business leaders and decision makers to visualize the business as it happens.
The MapR Data Science Refinery helps organizations incorporate machine learning and AI into day-to-day business workflows, enabling intelligent processes that can operate without human intervention.
Get Started With The MapR Data Science Refinery
Machine learning is an active area of research and market innovation. Game-changing ML companies are investing to improve data science productivity and to build domain-specific machine learning solutions. As a data platform company, we want to be open and give our customers the flexibility to use these solutions on the petabytes of business data they rely on MapR to store and manage. MapR has a robust Converged Partner program, and we’re extending it with selected Refinery partnerships as a holistic approach to enabling the MapR Platform for all types of data science teams.
The MapR Community’s Data Science Refinery Page is the place to go for: