MapR and Cisco Together: Performance at Scale for Data-Intensive Streaming Workloads

MapR Technologies and Cisco Systems announce the MapR-XD on Cisco UCS platform as the new performance leader, based on results from the SPEC SFS benchmark published by the Standard Performance Evaluation Corporation (SPEC).

MapR-XD is a cloud-scale storage platform that manages highly distributed networks of files, objects, and containers as a single unit.

The Cisco UCS S3260 storage server offers a modular architecture with dense capacity in a compact 4RU form factor. The MapR partnership with Cisco gives customers a turnkey solution that combines the horsepower of Cisco servers with the scalable data platform from MapR.

Goal of Benchmark

MapR combines a distributed, scalable platform with data analytics, which paves the way for various workloads on a single converged platform. The focus of the benchmark is to highlight the benefit of the converged platform to customers, especially those who need to store and analyze streaming workloads. MapR picked Cisco UCS S3260 as the underlying server infrastructure because of its robust applicability, in terms of both scalability and performance, to data-intensive workloads.

This benchmark is the first of many joint solutions from MapR and Cisco.

Highlights of Benchmark

SPEC SFS 2014_VDA (Video Data Acquisition) is the industry standard benchmark for streaming video workloads. In this setup, MapR-XD on Cisco UCS S3260 served as the recording target for streaming video:

  • 2070 streams
  • 34 msec avg latency
  • 20,668 stream ops/sec
  • 9.5GB/sec

With this benchmark, we have demonstrated that the platform can handle more than 2,000 streams at a streaming throughput of close to 10GB/s (9.5GB/sec measured). We also expect these numbers to scale considerably higher as the number of nodes increases.

The benchmark used six Cisco UCS S3260 dual node chassis with high density 8TB spinning disks and MapR-XD Enterprise Premier Edition.
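
As a rough illustration of what the figures above imply, the following sketch derives per-stream and per-node numbers from the published totals. It assumes that six dual-node chassis means 12 server nodes, that 1GB is 1,000MB, and that scaling with node count is linear, which is the expectation stated above rather than a measured result. The function names are hypothetical and exist only for this example.

```python
# Totals quoted in the benchmark highlights above.
STREAMS = 2070
THROUGHPUT_GB_PER_SEC = 9.5
STREAM_OPS_PER_SEC = 20_668
CHASSIS = 6
NODES_PER_CHASSIS = 2  # "dual node" chassis; assumed to mean 2 server nodes each


def per_stream_mb_per_sec(throughput_gb_per_sec, streams):
    """Average bandwidth per stream, assuming 1 GB = 1000 MB."""
    return throughput_gb_per_sec * 1000 / streams


def per_node(total, chassis, nodes_per_chassis):
    """Spread a cluster-wide total evenly across all server nodes."""
    return total / (chassis * nodes_per_chassis)


def projected_streams(streams_per_node, node_count):
    """Hypothetical projection that assumes perfectly linear scaling."""
    return int(streams_per_node * node_count)


if __name__ == "__main__":
    mb_per_stream = per_stream_mb_per_sec(THROUGHPUT_GB_PER_SEC, STREAMS)
    streams_per_node = per_node(STREAMS, CHASSIS, NODES_PER_CHASSIS)
    ops_per_stream = STREAM_OPS_PER_SEC / STREAMS

    print(f"Per-stream bandwidth: {mb_per_stream:.2f} MB/s")
    print(f"Streams per node:     {streams_per_node:.1f}")
    print(f"Ops/sec per stream:   {ops_per_stream:.2f}")
    print(f"Projected streams on 24 nodes: {projected_streams(streams_per_node, 24)}")
```

The projection is only as good as the linear-scaling assumption, so treat it as a starting point for sizing discussions, not as a benchmark result.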

Compared with results from some other vendors, the MapR-Cisco joint solution performed better, with the potential for even greater performance as the platform scales. Please note that the above results were obtained on spinning disks, so we expect performance to be better on SSDs.

The full details of the benchmark can be found here: https://www.spec.org/sfs2014/results/res2017q4/sfs2014-20171107-00022.html

Guideline on Use Cases

MapR and Cisco give you massive scalability, a global namespace, automated data placement, security, and multi-tenancy to support a wide range of use cases.

Massive Scalability: The ease and speed with which data is ingested is well showcased by this Cisco UCS and MapR-XD benchmark. In addition, system performance scales linearly as data grows. The combination of both platforms supports scale-up and scale-out environments. The MapR-XD distributed data platform allows you to start small and scale dynamically. This is complemented by Cisco UCS S3260, which supports up to 28 drives per node.

Global Namespace: Users get a unified view and access method to files for easy data retrieval without worrying about the physical file location. This allows for easy file sharing across remote office locations.

Automated Data Placement: A single mount point supports easy data ingestion and retrieval. This provides a great use case for data analytics and customer assessment based on data temperature, for both long-term and short-term trend analysis. For example, you can see customer buying patterns over days, months, and seasons - an ideal feature for log data ingestion and retrieval for post analysis.

Support for Multi-Tenancy and Security: This solution makes it easy to create isolation at the volume level, so that access to each data volume can be restricted by privilege. Multi-tenancy is extremely valuable in financial use cases where different privilege levels have access to different data sets for analysis.

Now that we have gone through the high-level features, let's talk about the use cases this benchmark serves. After all, the benchmark is but a guideline.

  • If you have streaming workloads generating 'x' capacity of data every day and require a central repository to store data for 'y' number of days, this benchmark gives you an idea of the rate at which you can stream data into the MapR-Cisco solution.

  • If you have scenarios in which you want to both store data and perform analytics on it, the MapR Converged Data Platform provides you with all the tools you need. For instance, if your organization requires you to run facial recognition on the video images you have generated, MapR offers you an ideal analytics platform. You can also make use of the big data ecosystem on MapR to run any big data analytics on the streaming data.

  • Once data is brought into the MapR-Cisco platform using MapR's distributed capabilities, you can leverage the inherent features of a highly scalable storage solution - including multi-tenancy, tiering to cloud, and analytics - all in a single platform.
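
The first use case above, 'x' capacity per day kept for 'y' days, can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: the function names, the example values, and the replication factor parameter are assumptions made for this example, and the 9.5GB/sec default is simply the throughput measured in this benchmark, which may not match your workload. It also assumes 1TB is 1,000GB.

```python
def retained_capacity_tb(daily_ingest_tb, retention_days, replication_factor=1):
    """Capacity needed to keep `retention_days` of data.

    `replication_factor` is a hypothetical knob for the number of copies
    stored; set it to match your own data protection policy.
    """
    return daily_ingest_tb * retention_days * replication_factor


def hours_to_ingest(daily_ingest_tb, throughput_gb_per_sec=9.5):
    """Hours needed to stream one day's data at a sustained rate.

    The default rate is the benchmark's measured 9.5 GB/sec; assumes
    1 TB = 1000 GB.
    """
    seconds = daily_ingest_tb * 1000 / throughput_gb_per_sec
    return seconds / 3600


if __name__ == "__main__":
    x_tb_per_day = 50   # example value for 'x'
    y_days = 30         # example value for 'y'

    print(f"Capacity, single copy:  {retained_capacity_tb(x_tb_per_day, y_days)} TB")
    print(f"Capacity, three copies: {retained_capacity_tb(x_tb_per_day, y_days, 3)} TB")
    print(f"Time to ingest one day: {hours_to_ingest(x_tb_per_day):.2f} hours")
```

If the ingest time for one day's data approaches 24 hours, the workload is at or beyond what a cluster of the benchmarked size sustained, and more nodes would likely be needed.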

If you are wondering whether your environment is best served by the MapR-Cisco partnership, check out the following links for more information:
https://blogs.cisco.com/datacenter/cisco-mapr-sds-world-record
https://mapr.com/partners/partner/cisco-delivering-advanced-performance-hadoop-workloads/

In Summary

Cisco and MapR offer a fast, scalable platform for your streaming workloads, whether video, audio, or any other workload that ingests and analyzes streaming data.

To learn more about MapR, go to mapr.com. To learn more about Cisco UCS servers, go to www.cisco.com/go/servers.


This blog post was published January 02, 2018.