The MapR Data Science Refinery (DSR) is an easy-to-deploy and scalable data science toolkit with native access to all MapR Data Platform assets and out-of-the-box security. The DSR includes the Apache Zeppelin notebook for collaboration, plus Apache Spark, Apache Drill, and Apache Hive for data processing and preparation. You can find out more about the role of the Data Science Refinery and the MapR Data Platform in AI and machine learning systems in the eBook Buyer’s Guide to AI and Machine Learning.
Ready to explore the DSR? We are proud to announce a set of tutorials allowing users to easily run the MapR Data Science Refinery in their local Docker environment and connect to their MapR cluster.
This tutorial is based on a set of step-by-step guides located in the following GitHub repository:
The content of the tutorial is the following:
The Data Science Refinery is deployed using containers. In this guide, you will learn how to configure and run the Data Science Refinery Docker container so that you can connect Apache Zeppelin to your MapR cluster.
Discover the power of Apache Zeppelin interpreters, which let you process data with your favorite tools, such as Apache Spark, Hive, or Drill.
Discover how to access or create notebooks with Zeppelin on MapR.
Use Helium in Apache Zeppelin to create a rich user experience in your notebooks.
This documentation explains in detail the MapR Data Science Refinery for MapR release 6.1.
eBook Machine Learning Logistics by Ted Dunning & Ellen Friedman
eBook Getting Started with Apache Spark 2.x by Carol McDonald with Ian Downard
Webinar recording: "Getting Started with Spark 2.x and GraphFrames to Analyze Flight Delays and Distances" by Carol McDonald