So, let's say you want to share your notebooks with a colleague or just utilize the persistent storage provided by MapR XD. How would you go about doing this?
Convergence and portability are at the heart of our design of the MapR Data Science Refinery container. The goal was to build something agile that lets our customers stand up turn-key development environments and spin them up or down as needed, while still leveraging their MapR Data Platform as a persistent store.
Being able to reach the global namespace from the container is exactly what you need to create a secure collaborative environment. Why create a new space, with all of the security and IT overhead involved, when you already have all of this securely in place in your cluster? Simplicity is key here.
A typical container can only access the space inside the container and the underlying filesystem on which you're running Docker. That's fine for some use cases, but it doesn't take advantage of the benefits that the MapR Data Platform offers.
So, we've included the MapR POSIX Client for Containers in our build, enabling you to access your global namespace from inside the container.
And, since security is handled by passing a security ticket into the container, you (and your collaborators) only have access to those parts of the file system that you've explicitly been granted permissions for.
So, let's say that you've created a space in your global namespace, where you want to share notebooks with others, and that this space is:
This space contains a notebook that our very own Ian Downard has shared with you, called Churn Prediction with Spark.
Accessing this notebook from the Data Science Refinery container is as simple as pointing Zeppelin to this directory in the Docker run command. The switch that you need for this is:
Now, when you spin up your container and log into Zeppelin, you'll see that this notebook has been added to your list.
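If you want to double-check from inside the container which notebooks Zeppelin should pick up, you can read the shared directory directly through the POSIX client. Here is a minimal Python sketch of that idea. It is not from the original walkthrough, and it rests on an assumption: that Zeppelin is using file-based notebook storage where each note lives in its own subdirectory as a note.json file with a "name" field (the layout used by older Zeppelin releases; newer ones may differ). The function name and the directory you pass in are hypothetical.

```python
import json
import sys
from pathlib import Path


def list_shared_notebooks(notebook_dir):
    """Return the sorted names of Zeppelin notes found under notebook_dir.

    Assumption: each note is stored as <notebook_dir>/<note-id>/note.json.
    This depends on the Zeppelin version and storage backend in use.
    """
    names = []
    for note_file in Path(notebook_dir).glob("*/note.json"):
        try:
            with note_file.open(encoding="utf-8") as fh:
                note = json.load(fh)
        except (OSError, json.JSONDecodeError):
            # Unreadable (for example, no permission on that path) or
            # malformed file: skip it rather than fail the whole listing.
            continue
        if not isinstance(note, dict):
            continue
        # Fall back to the directory name (the note ID) if no name is set.
        names.append(note.get("name") or note_file.parent.name)
    return sorted(names)


if __name__ == "__main__":
    # Hypothetical usage: pass the shared notebook directory as an argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in list_shared_notebooks(target):
        print(name)
```

Because the listing goes through ordinary file access, notes in locations that your ticket does not grant you permission to read are simply skipped, which is consistent with the security model described above.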
In addition, Zeppelin has built-in Git integration, including versioning support for notebooks. Read more here:
You can watch a demo of how to do this collaboration here (time pinned to the beginning of the relevant demo):