MapR Announces Complementary Data Management and Logistics for RAPIDS Open-Source Software from NVIDIA
October 10, 2018
Accelerates Data Access and Production Deployments for NVIDIA-Driven Data Science
Santa Clara, Calif. – October 10, 2018 – MapR Technologies, Inc., provider of the industry's leading data platform for Artificial Intelligence (AI) and Analytics, today announced support within the MapR Data Platform to accelerate data access and production deployments for data science through the RAPIDS open-source software.
MapR uniquely helps data scientists reach the training data they need faster by easing the onboarding, cleansing, and cataloging of data and by feeding it at high performance to GPUs and NVIDIA DGX systems. The MapR solution also manages the deployment of multiple models into production to speed business impact.
"The challenge for most data scientists is the data logistics to locate, prep and access the right data for training. In many cases, 90 percent of the time is spent data wrangling," said Anil Gadre, EVP and chief product officer, MapR Technologies. "MapR complements RAPIDS with a data management and logistics fabric to accelerate the high-scale processing and access of disparate data across geographies. The same fabric also speeds the deployment of models into production and coordinates the continuous deployment and updating of multiple models to impact business in real-time at scale."
Central to the solution is the ability to coordinate data flows from across the enterprise and, through a pre-built MapR container for GPUs, make those flows easy to integrate into NVIDIA's complete end-to-end data science training pipelines. The MapR Data Platform for RAPIDS enables data scientists to:
- Collect data at scale from a variety of sources and preserve raw data so that potentially valuable features are not lost
- Make input and output data available to many independent applications even across geographically distant locations, on premises, in the cloud or at the edge
- Manage multiple models during development and easily roll them into production
- Improve evaluation methods for comparing models during development and production, including use of a reference model as a baseline for successful performance
- Support rapid stream-based delivery of standard file formats, including Parquet, ORC, JSON, Avro, and CSV, directly into RAPIDS
"MapR's work with NVIDIA in the RAPIDS ecosystem is helping make broad adoption in the enterprise easy for the largest breadth of workloads," said Clément Farabet, vice president of AI infrastructure at NVIDIA. "MapR's ability to span on-prem and cloud, from IoT edge to core with a scalable, high-performance common platform means that more data can be fed to GPUs and more innovative applications can be created by data scientists faster."
The MapR container for GPUs, which is aimed at making it easy to integrate automated data logistics into RAPIDS, is available today in NVIDIA's repository at www.RAPIDS.ai.
A Reference Architecture providing detailed technical information on how to configure NVIDIA DGX systems with the MapR Data Platform for optimal use is also available.
About MapR Technologies
MapR Technologies is a visionary Silicon Valley software company and creator of the next-generation data platform for AI and analytics, with the scale and reliability required by enterprise-grade, mission-critical deployments. The MapR Data Platform delivers the power of dataware to accelerate data-driven innovation. Forward-leaning companies such as Cisco, Philips, and Société Générale are able to create new data-driven solutions to outperform the competition. Learn more: mapr.com.
MapR is a registered trademark of MapR Technologies, Inc. in the United States and other countries. Other names and brands may be the property of others.