Kubernetes Should Manage Your Containers. But How Are You Managing Your Data?


Krishna Mayuram

Engineering Architect, Cisco Systems

Suzy Visvanathan

Director of Product Management, MapR

Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" and progressively improve performance on a specific task with data, without being explicitly programmed. Many machine learning systems run best in containers managed by Kubernetes. Kubernetes, however, only manages the compute side of the problem. That computation is done by programs running in containers, and containers work best when they aren't bogged down by tons of data.

How can we enable containers to work best and avoid getting bogged down? We have a solution.

In this webinar you will:

  • Learn about the key benefits of the Cisco UCS Integrated Infrastructure for AI and ML with the MapR Converged Data Platform
  • Understand why MapR is the premier platform for the full range of tasks needed for machine learning and artificial intelligence, as well as the best data platform to augment Kubernetes
  • See how Cisco UCS Director Express for Big Data provides a holistic interface for end-to-end system management, with detailed and precise visibility and control over every part of an enterprise data platform, making cluster management simple and straightforward


Susie: Good morning, good afternoon, good evening. Thank you for joining this webinar. I am in product management here at MAPR Technologies. Today we'll be talking about how you can manage your data with containers, how Kubernetes allows you to orchestrate them, and what specific steps you need to look at in order to configure, manage, and scale your data in this environment.

Susie: A little context before we get into the exact details. This assumes that you are quite familiar with containers and Kubernetes. If those are concepts that you are not aware of, there are several other webinars and blog posts at mapr.com that I highly encourage you to go and read to get familiarized with them.

Susie: So let's talk about what MAPR is doing to enable a great data platform for your AI, ML, and analytics. A little bit of history, to walk you through the journey we have been on to get to where we are today: about nine years ago, we started off as just a Hadoop company. We quickly got a lot of traction in that area. We then came up with our database, which is now MAPR-DB, to address those customers who wanted to have not only unstructured data but also structured data on the same platform. Then we came up with innovative ideas for streaming, which is our MAPR Streams. And then we have our own SQL [inaudible 00:01:50] as well, which is our MAPR Drill.

Susie: So we have been steadily innovating, but at the same time, we have been keeping pace with industry trends as well, so that this platform, as you see it, has evolved over time to quickly adapt and offer capabilities and features for any of the new technology trends. If you look at our pedigree, we are pretty much a platform that can offer a steady foundation for you to run any kind of application.

Susie: So having said that, how is this made possible? The premise for us to be able to keep up with technology trends, and to progress steadily and innovate, lies in a foundational data platform. Our scalable, distributed data platform allows us to be quite versatile and varied in the kinds of applications that can be run. The first and foremost thing that I should point out here is the way we can handle different kinds of protocols and interfaces. We are quite unique in that: we support traditional NFS, and we have a POSIX-based client as well. We support REST and S3 APIs, we support HDFS, and quite a plethora of industry-standard protocols.

Susie: This is quite a foundational aspect in the big data space, mainly because as businesses grow, customers no longer stick to just one type of application. So it was very important for us to offer a platform where you can bring in any type of application, talking any kind of protocol, and host it there. That opened up quite a few possibilities. Just by having this ability, we can now go well beyond just Hadoop. We are no longer just Hadoop, and as many of you are familiar with, and as is the case with many customers, Hadoop is no longer the solution for big data. Big data has evolved quite a bit and has taken many forms.

Susie: We have steadily made sure that as the ecosystem grows, and as the tools you use change, we will be able to provide you with the right platform for them. Many of you are ardent users of Spark, or have moved on to TensorFlow and the plethora of tools that are available today for AI and analytics and for running your ML data pipeline.

Susie: The second aspect, of course, is how we are integrating with the other pieces that are integral to AI, ML, and analytics. You will be able to see in these slides that, first and foremost, the edge environments, or edge clusters, produce quite a huge volume of data and are a source of data where a lot of analytics happens. Now, edge can be deployed in a couple of different ways. It could be self-contained at the sources, where data resides there and you may have a need to run analytics right then and there. Or customers may have an environment where they bring in the data from these different edge sources into a central repository or data center, and then run analytics on top of it.

Susie: MAPR allows you to do that. We are not only catering to large on-premises data centers; if you have such an edge environment, with a bunch of IoT devices, we allow you to run analytics in place, or use our streams to transfer data from the edge nodes over to a central data center and run analytics there as well.

Susie: One of the biggest value adds here is this combination of keeping data in place and getting the value out of it right then and there, while at the same time combining that with an archival or long-term backup and DR strategy by bringing the data to an on-premises data center. This is an excellent use case when it comes to running quick, real-time analytics, and it is an area where MAPR has definitely seen a lot of traction.

Susie: Another area where we see quite a bit of adoption is in our cloud offerings. While we cater to on-premises data centers and edge clusters, we also see quite a lot of innovation, test/dev, and POCs being done in the public cloud, so we have quite a catalog of offerings. We are available in the cloud marketplaces, which are an easy go-to place where you can quickly download the environment you need, spin up a MAPR cluster, and see the benefit of running analytics in the cloud.

Susie: We also offer multi-cloud support. Depending on the cloud vendor of your choice, we have several integrated features with each of them, which you will see in their marketplaces and can avail yourself of. At the same time, we have a huge investment going on in hybrid cloud strategy as well. Given that customers predominantly keep sensitive data on premises, over the years almost every one of you has ended up with a data center strategy and a cloud strategy going on in parallel.

Susie: Many of you do use the cloud for long-term archival, but we also see use cases where customers bring live data or training data to the cloud to run modeling, and then bring the data back to the on-premises production environment. For those cases, we have strong hybrid cloud capabilities, where you can move data from on premises to the public cloud. You can spin up MAPR clusters in the cloud as and when you need them, and in the case of long-term archival, you can have one-directional data movement based on policies, which automatically tier your data from the on-premises cluster over to the cloud.
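The one-directional, policy-based tiering described above can be pictured with a small sketch. This is an illustration only, not the MAPR policy engine: the `Volume` record, the `apply_tiering_policy` function, the tier names, and the 90-day idle threshold are all hypothetical and chosen just to show the idea of a rule that moves cold data from the on-premises tier to a cloud tier.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Volume:
    """Hypothetical record describing a data volume on the on-premises cluster."""
    name: str
    last_accessed: date
    tier: str = "on_prem"


def apply_tiering_policy(volumes, today, max_idle_days=90):
    """Move volumes idle for longer than max_idle_days to the cloud tier.

    Movement is one-directional (on_prem -> cloud), as in the archival case.
    Returns the names of the volumes that were moved.
    """
    cutoff = today - timedelta(days=max_idle_days)
    moved = []
    for vol in volumes:
        if vol.tier == "on_prem" and vol.last_accessed < cutoff:
            vol.tier = "cloud"
            moved.append(vol.name)
    return moved


# Hypothetical example volumes: one cold, one recently used.
volumes = [
    Volume("training-2017", date(2018, 1, 10)),
    Volume("live-events", date(2018, 9, 1)),
]
moved = apply_tiering_policy(volumes, today=date(2018, 9, 15))
print("moved to cloud tier:", moved)
```

In this sketch the policy is evaluated on demand; a real system would presumably evaluate such rules on a schedule and move the underlying data rather than just flip a field.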

Susie: As we go through some of these capabilities that MAPR offers, you can see that we started off as just a Hadoop company, but over time we have made innovations in several areas. We have innovated in the ability to host different kinds of applications on our platform. Our entire ecosystem is quite rich and varied. We have also given you a scalable foundational platform to scale from mere terabytes to petabytes of data. One of our largest customers has more than 100 petabytes of data in a single cluster. That is quite phenomenal and quite a compelling story to share.

Susie: At the same time, innovation is also happening in the disparate environments each of you deals with on a day-to-day basis. If it is edge clusters, MAPR offers you a solution. If it is an on-premises data center, there is a solution for that as well. If you're combining the cloud with on premises, we have several capabilities for that.

Susie: Now having done all of this, one of the biggest and latest trends, after virtualization, has clearly been containers. So given all this premise of different types of data, different types of applications, a full ML/AI ecosystem hosted on MAPR, and the different kinds of environments that are supported, what is MAPR doing for containers and Kubernetes?

Susie: So, early on, we came up with a quick and dirty solution, if you will, for customers to be able to deploy applications in containers. At the very beginning of this presentation, I presented the different kinds of interfaces that we support, one of which I mentioned was the [inaudible 00:11:53] based POSIX client. The POSIX client is not just for communication and connecting your host clients to a MAPR cluster. It has quite a rich set of advantages over other connections, in that it is tuned and performance optimized.

Susie: Extending that concept, we allow you to run that same client in a container. You run that client container on your host server, and it allows your application containers to communicate, via this client, with the MAPR platform. The benefit of this quick and dirty version is that you can take advantage of the same existing MAPR benefits of accessing and analyzing the data. You can analyze the database, if you're using the database. You can analyze streams. All of that is accessible through this environment.

Susie: There are several customers who use this as an easy way to deploy. However, this environment is managed entirely by you. The customer or end user is responsible for managing the environment: how many containers are deployed, how many you create, how many you manage. All configuration management is left entirely to the end user.

Susie: Now what about running containers in production or at larger scale? For that, we have different choices and capabilities that we offer. The second option you have is our integration of the entire platform with Kubernetes. If you're familiar with Kubernetes, it offers a very good, robust, pluggable API architecture. By subscribing to those APIs, MAPR exports a volume driver. What happens with that is, you can now use Kubernetes to manage your containers and your compute, while we give you a way to manage your data through the volume driver. You expose these volumes to the Kubernetes constructs, and then leave it up to Kubernetes to create, delete, and move your containers, which have direct communication with the MAPR volumes.
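To make the volume-driver idea concrete, here is a minimal sketch of how a storage volume is typically surfaced to Kubernetes: a PersistentVolume backed by a driver, a PersistentVolumeClaim bound to it, and a pod that mounts the claim. The manifest fields follow the standard Kubernetes v1 API, but the driver name `example.com/data-volume`, the `volumePath` option, the image name, and the paths are placeholders assumed for illustration; they are not the actual MAPR driver name or options, which the webinar does not spell out.

```python
import json

# Hypothetical driver name and option key, for illustration only.
DRIVER = "example.com/data-volume"


def persistent_volume(name, volume_path, size="10Gi"):
    """PersistentVolume that hands a storage-platform volume to Kubernetes."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "flexVolume": {
                "driver": DRIVER,
                "options": {"volumePath": volume_path},
            },
        },
    }


def volume_claim(name, pv_name, size="10Gi"):
    """Claim bound explicitly to the PersistentVolume above."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            "volumeName": pv_name,
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_with_claim(name, image, claim_name, mount_path="/data"):
    """Application pod; Kubernetes schedules it, the driver supplies the data."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        persistent_volume("ml-data-pv", "/projects/ml"),
        volume_claim("ml-data-claim", "ml-data-pv"),
        pod_with_claim("ml-app", "example/ml-app:latest", "ml-data-claim"),
    ],
}
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`. The point of the split is the one made in the talk: Kubernetes owns the pod lifecycle, while the data stays in a volume that outlives any single container.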

Susie: If you're looking for an easy way to just plug and play and manage your own containers, then you would probably be better off with the first solution that I described. But if you are really looking for a scalable environment with full management and orchestration, where we bring all the salient features of persistent data and give you enterprise features like snapshots, mirroring, and replication, and you don't want to be worrying about how to manage your compute clusters, then you would want to use this second solution, where our volume driver is integrated with Kubernetes. So if you are thinking about when to use which, these are some of the guidelines that I would offer for choosing between the options.

Susie: Now, we talked about how MAPR sets up a good platform, and we talked about Kubernetes and containers. How are we then making it easy for ML/AI? Because of these combined capabilities, we have a global namespace. Because of the global namespace, we are able to let you see data independent of where you are. Whether it is an on-premises data center, the edge, or the cloud, the global namespace gives you a unified view. You can run analytics in place if need be, or you can move the data across different environments. You can even call them data tiers, if you will, where we allow policies to move the data and then run analytics where you want them to be. We allow that much flexibility across all of these different platforms. We give you a rich ecosystem package from which to pick and choose the apps that you want to run, and we give you the scalable, persistent, secure environment that you need.

Susie: So within the short span of 19 or 20 minutes, you should have a fairly good idea of how MAPR brings together all the salient points to give you a very good software platform to run on.

Susie: Now having said that, how is MAPR collaborating with Cisco to give you a full end-to-end turnkey solution to run your ML/AI? This is where my colleague and good friend Krishna Mayuram comes into the picture. He will take the story that I've been telling so far and extend it to how Cisco does it well from their end, and how the marriage of the two actually works well.

Susie: So Krishna, over to you.

Krishna: Thanks Susie, thanks for having me here. Again, good morning to folks based in other time zones. Wherever you are, good morning, good evening, and thanks for joining. So let me walk through the journey of how MAPR and Cisco together are enabling a platform for machine learning and analytics based on artificial intelligence to scale from terabytes to zettabytes and beyond.

Krishna: So, the journey starts here, with defining the major problems or challenges we see with big data and AI/ML together. One of them is distributed applications and data. As we all know, the source of data is applications. Today, applications in any enterprise deployment are of multiple types, physical, virtual, digital, across different silos. The second thing is human limitations: what kind of skills do we have today that will help us harness AI/ML capabilities from the data?

Krishna: That means if you want to unlock the data, you need specialized skills, not just the hardware and software, so how are we making things easier? The third thing is traditional management. We are building a monster to manage the monster; big data is going beyond anyone's imagination. I'm going to show in the next slide how we went from a scale of several terabytes to zettabytes and beyond. How are we going to manage such a monstrous data volume? These are the key challenges of AI/ML and big data.

Krishna: One other interesting thing this shows, based on [inaudible 00:19:58] research, is that it is better to have a larger data set to train the models. That means more data, better results, instead of relying on assumptions and weak correlations. The presence of more data results in better and more accurate models, which will always be helpful in identifying and solving a problem and getting better insights.

Krishna: So as I said, when we talk about building a monster to manage a monster, this is what the monster is. From 1986 to 2019, total data volume has gone from about three exabytes to four zettabytes. As you see, it is growing exponentially. And it is not only the volume of data but also the diversity of data. When we talk about diversity of data, we mean data coming from different types of sources, like audio sources and video sources, from POS, or point of sale, kinds of systems, and also IoT kinds of data sets.
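As a quick worked example of what that growth implies, the sketch below takes the two figures quoted in the talk at face value (about three exabytes in 1986, about four zettabytes in 2019), assumes decimal units and smooth compound growth between the two endpoints, and derives the implied annual growth rate and doubling time. Both assumptions are simplifications made here for illustration, not claims from the slide.

```python
import math

EXABYTE = 10**18    # decimal units assumed
ZETTABYTE = 10**21

start_year, end_year = 1986, 2019
start_bytes = 3 * EXABYTE      # figure quoted in the talk
end_bytes = 4 * ZETTABYTE      # figure quoted in the talk

years = end_year - start_year
growth_factor = end_bytes / start_bytes

# Compound annual growth rate: (end / start) ** (1 / years) - 1
annual_rate = growth_factor ** (1 / years) - 1

# Doubling time under constant compound growth
doubling_years = math.log(2) / math.log(1 + annual_rate)

print(f"growth over {years} years: {growth_factor:,.0f}x")
print(f"implied annual growth:   {annual_rate:.1%}")
print(f"implied doubling time:   {doubling_years:.1f} years")
```

Under these assumptions the volume grows by a factor of roughly 1,300, which works out to around 24 percent per year, or a doubling a little more often than every three and a half years.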

Krishna: So how are we going to manage this kind of data deluge, which is coming from different types of sources in different varieties? We have structured data, unstructured data, and whatnot. In order to manage that data, we have, together with MAPR, come up with a solution that helps us look into the insights.

Krishna: So to summarize, the key traits of a successful data-based approach start with seamless data access. As Susie mentioned, the MAPR data platform provides access to different types of data through different types of technologies, right from NFS all the way to S3, in different varieties. It could be on premises or off premises. So having seamless data access is one of the criteria. Then there are technical capabilities, and a strong digital foundation.

Krishna: So that means looking into the type of infrastructure we have, where I want to talk about the fifth generation of Cisco UCS systems, and what kind of digital foundation we can create together with the company and technologies of MAPR. And the fourth aspect is leadership, that is, the willingness to implement such a solution. These, we think, are the key criteria for an AI/ML-driven business approach to solving everyday problems.

Krishna: Now let's take a look at what we have done as a solution. We have used MAPR Data Science Refinery on top of Cisco UCS and the MAPR data platform. What we're trying to bring here is an industry-standard solution, through our Cisco Validated Designs with MAPR. We have also done some benchmarks, performance tuning, and automation to take the pain out of customer problems. We support a variety of tools, and a variety of applications that can run on top of those tools, like Jupyter, Zeppelin, or RStudio, and on MAPR streaming, Spark, TensorFlow, et cetera.

Krishna: All of those things are deployed in a secure way, together with Cisco ACI, Application Centric Infrastructure, which makes the management of the platform seamless, whether you are talking about tens of nodes, hundreds of nodes, or thousands of nodes. That I'm going to talk about in a bit. Now let's talk about the solution we are putting together. This is the new Cisco Validated Design with MAPR, version 6.0.1, which enables an ML platform at scale. The reason I'm talking about scale is that you can start with tens or hundreds of nodes and go all the way up to thousands of nodes. The current solution is based on Cisco UCS C240 M5 servers. We'll talk about the capabilities of Cisco UCS M5 servers in a little bit.

Krishna: This platform uses multiple layers, predominantly three. One is Cisco hardware, and on top of it we have Cisco UCS Manager. UCS Manager is an automation tool that helps us automate and scale the deployment of nodes in a programmatic way using what we call service profiles, and it also helps us scale as the data center requirement or the business requirement grows. On top of that, we have the SUSE operating system. It's an OS that also brings us Container as a Service, which is the key component of orchestration that spins up Docker containers managed by Kubernetes. As we said earlier, in order to do machine learning, we need to provision containers and move them to where the data is. As we get into a data lake, where the data can come from multiple sources and in a variety of forms, it is ideal to push the compute and attach the storage; that's where the MAPR data volume comes into the picture.

Krishna: In this scenario, as Susie said a few minutes back, we are attaching the data volumes where the data is, and adding compute as we grow. That means we are scaling your containers, and we are also scaling the data as the data needs go up. In other words, the compute and data are now part of this container, and you can bring different types of applications, like TensorFlow, and expose them through Zeppelin or Jupyter Notebook. All these things are now available in the form of a Cisco Validated Design.

Krishna: So let me talk about the benefits of Cisco and MAPR enabling this platform, which is highly scalable. First, we're talking about more accurate results. As we mentioned, MAPR can support a variety of file formats and a variety of data sources. With the Cisco UCS CVD, the solution will give you better access to all data, whether on-prem or off-prem. We are able to bring certain amounts of data from off-prem to on-prem and process it there, because with Hadoop, data locality is very important for processing.

Krishna: In order to do that, we have tools available to bring the data on premises and process the data volumes per the user requirements. One other interesting thing I would like to bring up here is that M5 also comes with GPUs. We have an option of enabling GPU-based workloads to run on on-premises Cisco UCS.

Krishna: The second thing is instant insights. The platform uses the capabilities of real-time ingestion using MAPR streaming, Spark, et cetera. With this ecosystem of tools, we are able to ingest the data in real time, process it with either GPU workloads or CPU workloads, based on the customer requirement, and bring instant insights into what's happening in near real time. The next benefit is higher data scientist productivity.

Krishna: With containers, the sampling of data becomes much easier. Data scientists can work on a small subset of data using containers, and then deploy the data and destroy it when they finish their experimentation. That means instead of going through the entire data set, they can work on a small subset of it, which can be part of the whole ecosystem, and using the tools and services, they can come up to speed on any special project in a fraction of the time compared to what we used to have.
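One common way to pull a small, unbiased subset out of a data set that is too large to load at once is reservoir sampling. The sketch below is a generic illustration of that technique in plain Python, under the assumption that records arrive as a stream; it is not something the speakers describe, and the function name and parameters are made up for the example.

```python
import random


def reservoir_sample(records, k, seed=None):
    """Return k records chosen uniformly at random from an iterable.

    Uses reservoir sampling (Algorithm R): the stream is read once and only
    k records are held in memory, so the full data set never has to fit in
    the container doing the experiment.
    """
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = record
    return sample


# Stand-in for a large data set: a stream of record ids.
subset = reservoir_sample(range(100_000), k=1_000, seed=42)
print(len(subset), "records sampled")
```

A data scientist could run an experiment against `subset` inside a short-lived container and discard it afterwards, leaving the full data set untouched on the platform.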

Krishna: And then a lower TCO. This is an important point. There are two aspects of TCO. As I said before, Cisco UCS Manager, along with Cisco UCS Director [inaudible 00:30:22] for Big Data, helps us automate the environment. What does that mean? With service profiles and the automation we have in place, it takes hours and days instead of weeks and months to get everything going, whether that is a new cluster setup or expanding an existing cluster by adding storage or compute. In either case, the time we save is enormous, and that helps us lower the total cost of ownership.

Krishna: Let's get into realizing value in your business. What we are providing together with MAPR is a way to spend more time understanding and analyzing your business problems, and to leave the infrastructure problems for Cisco and MAPR to deal with. That helps you focus, and also helps you realize your value using our tools, which include multiple pluggable and broad visualization tools. As you saw in the previous slide, we support a variety of tools, which helps you spend more time on realizing your business value rather than on infrastructure.

Krishna: And the last one: intelligent processes. This is a general thing happening in the market. If you look at some of the joint solutions Cisco has with MAPR, and other solutions we have, we are moving to a state of applications that are not rules based but rather intelligence based. Think of a use case like maintenance versus predictive maintenance. The amount of benefit businesses can get with AI is going to be enormous. You may talk about autonomous driving, you may talk about smart buildings or smart cities, you may talk about the next generation of industrialization, what we call the future of manufacturing or industry for [inaudible 00:32:48]; all are activated by AI.

Krishna: So now, let me show you what is behind this. Behind the scenes, we are working on a revolutionary system that works across all workloads, which is Cisco UCS and HyperFlex. We have several form factors, small and large, to augment the MAPR data system. As Susie was mentioning, we can work on the edge, we can work in the data center, and with multiple form factors. We go all the way from the edge, using Cisco UCS Mini [inaudible 00:33:35], to mainstream computing, which is the fifth-generation UCS, M5, that I was talking about. We can also talk about converged infrastructure with multiple partners, and Cisco's own HyperFlex systems, which are hyperconverged infrastructure. And we can talk about scale-out, which is all based on Cisco C-Series [inaudible 00:34:01] servers, and also the Cisco S-Series, the S3000 series.

Krishna: It's all managed by unified management with a single control plane and a single API. That makes it possible to deploy different types of devices, be it at the edge or in the enterprise data center, with a single application to control them. As we get into more details of the next generation, you'll see M5. The characteristics we have are mainly cloud scale, which means we're talking about the S3260, which offers extreme density, so when it is fully loaded, when in [inaudible 00:34:50], which is like almost half a terabyte, it is very comparable to, or better than, S3 in terms of cost and performance per gigabyte, and it is used for extreme-density use cases.

Krishna: Enterprise performance: when we talk about performance for either NoSQL or Hadoop use cases, we recommend these form factors. There is the C240 M5, which, compared to the M4, has more cores and more memory, and now supports up to two GPUs. The C240 is recommended for Hadoop types of workloads. The C220 M5, which again has increased performance compared to the previous generation, is ideal for NoSQL kinds of workloads. We also have the 200 series and the C480 series, which is for mission-critical workloads, and for certain workloads like deep learning the C480 is really well suited, because it supports up to six GPUs. And we also have the B480 series, which supports up to four GPUs and has more storage and more memory.

Krishna: So based on the different types of workloads, we can augment different form factors to support your journey right on the edge, or in the data center.

Krishna: Let's talk about the automation, which we've touched on. We go with Cisco Validated Designs, and what that means is automation to support your growth, whether for monitoring, maintaining, provisioning, or validating your solutions, and making it simpler with unified management. We also provide linear scalability of compute, storage, and network devices based on your data footprint.

Krishna: Now let's take a look at why Cisco UCS for big data, machine learning, and AI. One of the things I'm going to share is about the Transaction Processing Performance Council (TPC) Hadoop benchmark. We have several benchmarks, and we are number one in the TPCx-HS benchmark. It's easy to scale, as I said. We have a unified fabric for network, storage, and management traffic. Simplified management, again: a single pane of glass for thousands of nodes. And it is a proven solution. The validated design takes the pain and uncertainty away from customers, and with Cisco and MAPR backing it, we assure you even more performance and scale as you grow.

Krishna: Another thing we are bringing is our next-generation platform, which we call Cisco Intersight. What does it mean? You will be able to manage your global data centers from a single pane of glass in the cloud. So that is next. Along with that, we'll also provide many more monitoring aspects, change management, and analytic insights into your infrastructure for multiple things: your future requirements, your DR requirements, and your visibility into what's happening with system faults and fault detection. All these things are going to be analytics based. With the power of Cisco Intersight, you will have much more seamless management of the platform across multiple data centers.

Krishna: So let's talk about the Cisco and MAPR partnership. This is an interesting thing I would like to bring up here. I'm not sure how many of you are aware, but Cisco's first partner in the Hadoop space is MAPR. Cisco's first CVD, or what we call a Cisco Validated Design, for Hadoop, done jointly with Cisco UCS and MAPR, is deployed in one of the largest Hadoop deployments in the industry. Cisco supports MAPR in Cisco UCS Director Express. UCS Director Express is an automation tool. Using it, we can make the deployment of MAPR completely automated, and we can scale it as you add more infrastructure. And the industry's first TPC big data results were achieved together with MAPR.

Krishna: And on the performance benchmarks, as I mentioned before, these are the benchmarks we have done with MAPR, demonstrating high performance at multiple scale factors. You can also see the system availability, the RY, and the one-terabyte and ten-terabyte results. We have industry-leading, or among the best, performance results with MAPR for this kind of workload.

Krishna: Cisco is also a reseller of MAPR-XD and more. We have different BOMs, bills of materials. We carry Enterprise Premier, Enterprise Standard, and the POSIX client. As a single window, we can resell MAPR, which helps multiple customers based on their requirements.

Krishna: That brings us to the conclusion of our main session. To recap what we covered today: we talked about MAPR capabilities in terms of how we can scale data along with scaling the containers. We described some of the capabilities of the CVD with the new solution, which provides a container service on top of Cisco bare metal. We went through the use cases where it is most applicable in the AI and ML landscape, and through the different form factors, the Cisco next-generation UCS M5, and its capabilities with GPUs for GPU-accelerated workloads. And we talked about the performance metrics and the total cost of ownership that come from the CVDs and the research we have done with the Cisco UCS platform and with MAPR Technologies.

Krishna: Thank you for attending. Now it's over to David for any Q&A.

David: Thank you Krishna. Just a reminder, we're going to have some Q&A right now. If you have a question and you haven't submitted it yet, please submit it via the chat window in the bottom left corner of your browser. A few questions have come in, so we'll go through them and Susie and Krishna can answer. What do you suggest as a mechanism to manage the SLAs between edge and public cloud?

Susie: I think this is actually something to take up with the participant offline. There are several ways we allow you to manage the SLAs, especially for data that is on the edge and in the cloud. We give you something called policies. You can set up different data tiers, and because of the single global namespace, you don't really care whether the data is on the edge, in the cloud, or in the on-premises data center. We can surface it as data volumes sitting in the current environment, you apply policies to those, and then you can move the data. I'd be more than happy to take this offline and expand a little bit more for the participant.

David: Okay.

David: Okay. Next one that came in. How do you manage the compute cluster?

Susie: Yeah, so I suspect this question pertains more to our containers and Kubernetes environment. If that is the case, or even otherwise, we have a topology feature that allows you to segregate the nodes in your cluster according to where the data can be, so you can do innovative things. If locality is a requirement, you can make sure that you place the data on the nodes where your applications are running. In the case of ML/AI, if you have GPU-based servers, you can even use topology to set aside those GPU nodes and then run your ML models just on those nodes.

Susie: So we do give you features like that, where you can specifically configure your compute clusters. If the question means more than that, I would be happy to follow up on that as well.
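The topology idea described above can be sketched in a few lines of Python. This is only an illustrative model, not MapR's or Kubernetes' API: the `Node` class, the path-style labels such as `/data/gpu`, and the helper names are all hypothetical. It shows nodes being grouped under a topology label and an ML job being restricted to the GPU group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    topology: str          # hypothetical path-style label, e.g. "/data/gpu"
    has_gpu: bool = False


def nodes_under(nodes, topology_prefix):
    """Return the nodes whose topology label is at or below the given prefix."""
    prefix = topology_prefix.rstrip("/")
    return [
        n for n in nodes
        if n.topology == prefix or n.topology.startswith(prefix + "/")
    ]


# Hypothetical four-node cluster: two general-purpose nodes, two GPU nodes.
CLUSTER = [
    Node("node-1", "/data/general"),
    Node("node-2", "/data/general"),
    Node("node-3", "/data/gpu", has_gpu=True),
    Node("node-4", "/data/gpu", has_gpu=True),
]


def place_training_job(nodes):
    """Pick only the nodes set aside for GPU work under the assumed label."""
    gpu_nodes = nodes_under(nodes, "/data/gpu")
    return sorted(n.name for n in gpu_nodes if n.has_gpu)


if __name__ == "__main__":
    print(place_training_job(CLUSTER))
```

In a real deployment the same separation would be expressed through the platform's own topology and scheduling configuration; the sketch only shows the grouping logic.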

Susie: There was another question about the Cisco UCS which, Krishna, perhaps you should ...

David: Okay, so the question on the Cisco UCS: does the MapR data platform sit on top of a Kubernetes-managed cloud, or is it a separate cluster?

Krishna: To repeat the question: does the MapR cluster sit on Kubernetes? There are basically two things here. This is part of an integrated system, and the MapR cluster is closely integrated with Kubernetes. What we do is expose MapR volumes to the Kubernetes-managed containers. That's the solution architecture. The data coming from the applications needs to be there, and it is part of the MapR data volumes. The MapR data volumes are tagged using a configuration file, based on how you want to tag them, so that they can be attached to a particular container. If you want more details on how this configuration is done, please feel free to reach out.
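As a rough illustration of attaching a data volume to a Kubernetes-managed container, the sketch below builds a Pod manifest in Python. It uses the generic Kubernetes persistent volume claim mechanism rather than the specific configuration file mentioned in the talk, whose format is not shown here. The pod name, image, claim name, and mount path are all hypothetical.

```python
import json


def pod_with_data_volume(pod_name, image, claim_name, mount_path):
    """Build a Pod manifest that mounts an existing PersistentVolumeClaim.

    Only standard Kubernetes Pod fields are used; the claim is assumed to be
    bound already to a volume provided by the data platform.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [
                {
                    "name": pod_name,
                    "image": image,
                    # Where the data volume appears inside the container.
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                {
                    "name": "data",
                    "persistentVolumeClaim": {"claimName": claim_name},
                }
            ],
        },
    }


if __name__ == "__main__":
    # All values below are placeholders chosen for illustration.
    manifest = pod_with_data_volume(
        pod_name="trainer",
        image="example.com/ml/trainer:latest",
        claim_name="training-data-claim",
        mount_path="/mnt/training-data",
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, assuming a claim with that name exists in the cluster.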

David: Okay.

David: One more. With zettabytes of data distributed at endpoints around the country, even with Kubernetes handling the processing through containers over UCS, how can we do this over telecommunication links? T1, DS3, et cetera.

Krishna: That's a very interesting question, and a very timely one to bring up. One of the ML challenges we went through on the slide is that the data today is distributed across multiple data centers. So instead of bringing all the data together, what makes it easier is that, based on the data locale, we can deploy a container where the data is, and the compute it requires, whether GPU- or CPU-based workloads, can be run on or attached to that server. Then we can bring back the aggregated data and go on with the processing. That can be one way of doing it. That way, your network bandwidth requirement comes down. That is if it is in a MapR data center.

Krishna: Again, your bandwidth is very, very important, and that is one of the reasons we actually take into account the data locale and where the data belongs. So the idea is that, rather than moving the data, you have containers, attach the required compute, and send that to where the data is. If you want more details on how we do that, please feel free to reach out.
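To make the bandwidth argument concrete, here is a small back-of-the-envelope calculation in Python. The T1 and DS3 line rates are the standard nominal figures (1.544 Mbit/s and 44.736 Mbit/s); the data sizes are hypothetical, and the calculation ignores protocol overhead, so real transfers would be slower.

```python
T1_BPS = 1.544e6     # nominal T1 line rate, bits per second
DS3_BPS = 44.736e6   # nominal DS3 line rate, bits per second


def transfer_seconds(num_bytes, link_bps):
    """Idealized time to push num_bytes over a link, ignoring overhead."""
    return num_bytes * 8 / link_bps


def to_days(seconds):
    return seconds / 86_400


# Hypothetical sizes: raw data held at one site versus the aggregated
# result produced by running the container next to that data.
RAW_BYTES = 1e12        # 1 TB of raw data
AGGREGATE_BYTES = 1e9   # 1 GB of aggregated output

if __name__ == "__main__":
    for label, bps in (("T1", T1_BPS), ("DS3", DS3_BPS)):
        raw_days = to_days(transfer_seconds(RAW_BYTES, bps))
        agg_hours = transfer_seconds(AGGREGATE_BYTES, bps) / 3600
        print(f"{label}: raw {raw_days:.1f} days, aggregate {agg_hours:.2f} hours")
```

Under these assumptions, shipping the raw terabyte over a T1 takes on the order of two months, while shipping only the aggregate takes a matter of hours, which is the case for sending the compute to the data.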

David: Okay. So that is all the questions we have for now. Krishna and Susie, any last thoughts?

Susie: Yeah, so what I would like to leave the audience with is this: the AI/ML area itself is evolving as we speak, so there's a lot of innovation happening there. Customers themselves are figuring out what fits their environment. Do they even need an ML pipeline? What kind of ML pipelines would they need to build? How do they model? So there is an entire package that we offer, called MapR Data Science Refinery, which is specifically targeted at end users to be self-service, a way for them to quickly familiarize themselves and get up and running. This combination of Data Science Refinery, MapR, and Cisco is a very good place to start.

Susie: If you're still uncertain whether this is something you need, then I would highly recommend starting with either the Cisco side or the MapR side, and we will guide you through the journey. Because this trend is ever evolving, it is something we're all learning at the same time. So I would highly encourage people to reach out to presenters like us, or to your MapR or Cisco account reps, and get started; we can help hand-hold you through the journey. That is the one thing I would like to emphasize, because I see a lot of customers who think about it but are very uncertain, either because they don't have a company-wide mandate or the investment to continue on this process, or because they do and don't know where to start.

Susie: So this would be a great starting point for them.

Krishna: Yeah, I think that's a great answer, Susie. In addition, we can do a POC. We have a POC lab in Santa Fe, and depending on the type of engagement, we welcome that; please reach out to your MapR or Cisco account reps. We can work on a POC to make sure that your problem can be solved. We can support GPU-based workloads, we can support CPU-based workloads, and we can also make a variety of AI/ML tools available in our lab.

David: Okay. Thank you, Krishna, thank you, Susie, and thank you everyone for joining us. That is all the time we have for today. For more information on this topic and others, please visit mapr.com/resources. Thank you again, and have a great rest of your day.