MapR Clarity: Your Clear Path to Data Management

  • Data access, visibility, and metrics provide you with information, knowledge, and insights.
  • HDFS, NFS, S3...Object storage or Hadoop? Bring your analytics, real-time, Spark, and AI/ML jobs; MapR supports it all today.

Learn more about the MapR Clarity program


Bill: 00:01 Hi, I'm Bill Peterson.

Suzy: 00:03 Hi, I'm Suzy Visvanathan.

Bill: 00:05 Today we're gonna talk about yet another aspect of our Clarity Program. We're gonna focus on data management. So Suzy, with regards to data management, what are some of the concerns around the Cloudera Hortonworks merger and what does it mean for customers?

Suzy: 00:20 When you say data management, several things come to a customer's mind. One, it could mean data access. Two, it could mean not just data access, but once I bring the data into the platform, what kind of visibility, what kind of metrics, what kind of information do I get at all times? So I have better knowledge and insight into what is happening in that cluster, how much of that information I can use to plan for the future, and so on.

Suzy: 00:49 So data management typically is a bigger topic. However, in order to talk about data management, the fundamental, foundational aspect is that you need to have a data platform. A data platform to access the data. Even as recently as six or seven years ago, we all took pride in specializing in one particular way of accessing the data.

Suzy: 01:13 For example, if it is only NFS, you always had vendors who popped up and said, "I'm a NAS vendor. I'm a NAS vendor." Prior to that it was block devices. Everybody took pride in being a block device vendor. Then came S3, of course. That totally changed the way we perceived things from a data platform perspective: how you bring in the data, how you store it, how you access it, how elastic it can be.

Suzy: 01:39 So one of the things that MapR did in its own way, and this is the pedigree of MapR, is we realized early on that in order to have a good analytics experience, in order to let customers run their big analytical jobs, not just today, but a decade from now or even a decade after that, the crux of the matter is that you need to have a solid foundational data platform. Then we also realized that instead of focusing on a single access mechanism, and on being just a NAS vendor or just an analytical platform or a Hadoop platform, having a platform and a way to access the data through any industry-standard protocol or mechanism is ideal. So that just about most applications can then beat our numbers.

Suzy: 02:44 Our focus is not on being a NAS vendor or a block vendor. That's not even our focus. Our focus is if a customer has an analytical job, be it AI/ML today, deep learning today, or it could be something else tomorrow, even GPU-based workloads tomorrow. Or it could just be good old Hadoop analytics, right? Whatever the case may be and whichever way they wanna access the data, we wanna give them a platform to do it.

Suzy: 03:15 Now it so happens that there are customers who bring in and access data using HDFS. Excellent, we do that. There are customers who bring in data using NFS, but they may be running analytics on it. They may be running real-time jobs on it. They could be running Spark jobs on it. Which is what we wanted to capture. So we support that as well.

Suzy: 03:37 S3, of course, is a huge developer-based protocol. A way to bring in your data there. And then, nowadays, what I'm hearing in the industry is that all of a sudden everybody's talking about object storage being the best platform to run your AI/ML. We know that already. We have done that already. We support it today.

Suzy: 03:57 So if you look at data management in itself, we have made the right choices and decisions, and instead of just focusing or cornering ourselves into one particular vendor, we have always had the vision to say, "Today it is Hadoop. Tomorrow it is AI/ML. Is MapR ready to run the AI/ML jobs on it? Yes, we are." By merely having a platform that is just so elastic, guess what: if a customer comes and says, "Forget about all of this. I am a container shop. I have switched over to a new environment and I'm using Kubernetes. Do you talk Kubernetes?" We can do that as well.

Suzy: 04:41 From a data management perspective, we are there already. We are continuing to invest in it. So when the customers are ready to make their journey, the MapR data platform is solid and ready for them to deploy on.

Suzy: 04:56 What happens with the merger is they have historically not had a data platform play. They have been very much an open source story. They have been focusing on Hadoop workloads. And I'm not saying that that is wrong. They have very much been focusing on that. Which has put the spotlight on them. Oh, can you run Spark? Can you run YARN? However, what has happened is now, when you're talking about Kubernetes or containers or object storage, us being ready and having a product already and a platform already is a huge plus for any customer. Because otherwise a company like Cloudera Hortonworks is just now reinventing wheels that are already mature in the marketplace.

Suzy: 05:49 Those are some of the challenges that I see.

Bill: 05:51 Okay great. So from the Clarity Program point of view, what is better about our approach in data management? What's new?

Suzy: 06:00 Like I mentioned, we are not focusing on just one or two protocols, per se, to bring in the data. We have a plethora, or a catalog, of choices which we like to offer to our customers.

Suzy: 06:14 Scale is one of the biggest things that resonates very well with our customers. We have time and again proven that when it comes to AI/ML, when it comes to analytics, there really are customers who may like to start small, but they quickly go into petabytes of data. So for us the scale, and just the ability to quickly be elastic and expand with the customer, is a huge plus.

Suzy: 06:45 But we have also made quite a few inventions, right? Meaning for us our hybrid cloud story is a huge thing. You'll hear a lot more about it later. We are already there. We have already talked about it. We have invested in it. And it's no secret; it's out there for customers to use.

Suzy: 07:07 Likewise, all the integral aspects, security for example. We have full end-to-end security. Secure by default. We have good integration with data governance. We have good integration with a lot of other ecosystem partners who specialize in those things.

Suzy: 07:24 So many of these factors weigh into why our platform is the ideal one.

Bill: 07:32 Right. Okay great. You mentioned customers there a couple of times. So put it in context for our listeners, what's a customer example? A good one for us for data management from MapR?

Suzy: 07:43 Yeah, so the customer approach varies, of course, from region to region. But I'll give one example. Almost invariably the customer first starts off with: what can your enterprise platform give me? Can it give me data protection? Can it give me ways to recover? Those are fundamentals, right? Ten years ago, those were things that were unique enough that a vendor could say, "Oh, I support data protection and thereby I'm unique." Nowadays it's table stakes.

Suzy: 08:17 So they start off with the bare-minimum ask as a checkpoint: do you have it? So you're engaging with the customers on that. However, the customers start differing from each other sort of in the upper layers, if you will. They all want the same enterprise features, there's no doubt about that. But what they differ in is how they bring in their data. Or rather their I/O profile. If you get into the nitty-gritty details, some of them would say, "I have a burst of Spark jobs today. And later on, I may not have any." Some say, "Well, you know, I actually have structured data. I need to put it in a database. But what I do with it is all primarily dictated around the database."

Suzy: 09:10 So they differ in how they store the data, how they access the data, but the underlying platform benefits are almost the same across all of them. Another difference among customers is that many of them will say, "I wanna keep active data here. But ideally I would like MapR to give me a cloud story." So that they can then figure out how to segregate the data. What to keep here. What to move out of MapR. And it becomes a managed service, basically.

Suzy: 09:51 So customers use it in several different ways. But their fundamental requirements on what our data management aspects should be, what the data platform should look like, are pretty much the same across all of our hundreds and hundreds of customers.

Bill: 10:08 Got it. Great. Thank you very much. Join us next time when we'll talk about yet another aspect of the MapR Clarity Program. Thank you very much.

Suzy: 10:15 Thank you.