NoSQL solutions deliver database functionality at large scale, well beyond what traditional databases like Oracle can do. To support real-time enterprise applications, NoSQL databases must provide high throughput and low latency for a wide variety of fetch-update workloads.
MapR M7 supports highly scalable NoSQL database applications, including 24x7 online applications as well as large-scale analytics. Strong data consistency, high reliability, and extreme performance with consistently low latency at both the 95th and 99th percentiles make M7 the best choice for NoSQL applications. Better yet, M7 comes integrated with Apache™ Hadoop®. M7 architectural details can be found at: https://mapr.com/products/
This benchmark report compares the performance of HBase applications on MapR M7 with their performance on other distributions of Hadoop. The report shows that M7 delivers throughput gains ranging from 4x to over 10x and reduces latency across varying workloads.
The tests were performed using the Yahoo! Cloud Serving Benchmark (YCSB), the widely accepted standard for testing NoSQL performance. Requests followed YCSB’s Zipfian data distribution model, in which a small set of records receives a disproportionate share of the accesses.
Two sets of tests were conducted: the first on a 10-node hard disk drive (HDD) cluster and the second on a 5-node flash-based solid-state drive (SSD) cluster. Five different workloads were run on each of the two clusters.
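To make the Zipfian access pattern concrete, the following is a minimal sketch of a skewed key sampler. It is not YCSB's own generator; the function name, the key count, and the skew value of 0.99 are assumptions chosen for illustration, since the report does not state the constant that was used.

```python
import bisect
import itertools
import random


def make_zipfian_sampler(num_keys, theta=0.99, seed=0):
    """Return a function that draws key indexes 0..num_keys-1 with Zipfian skew.

    theta is an assumed skew constant; the report does not specify one.
    """
    # The weight of rank r (1-based) is proportional to 1 / r**theta.
    weights = [1.0 / (rank ** theta) for rank in range(1, num_keys + 1)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    rng = random.Random(seed)

    def sample():
        # Inverse-CDF lookup: first rank whose cumulative weight covers the draw.
        point = rng.random() * total
        return min(bisect.bisect_left(cumulative, point), num_keys - 1)

    return sample


# Hypothetical key space of 1,000 records, far smaller than the tested data set.
sample = make_zipfian_sampler(num_keys=1000)
draws = [sample() for _ in range(100_000)]
hottest_share = draws.count(0) / len(draws)
```

With this skew, the single most popular key receives a far larger share of requests than a key in the middle of the range, which is the property that stresses caching and hot-spot handling in a database.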
Please see the Appendix for test environment details.
The plots below show the per node throughput results from the HDD and the SSD tests across the five workloads.
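Because the two clusters differ in size, results are compared per node rather than in aggregate. The normalization can be sketched as follows; the function names are hypothetical and the input figures are placeholders, not results from this report.

```python
def per_node_throughput(total_ops_per_sec, node_count):
    """Normalize aggregate cluster throughput by the number of nodes."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return total_ops_per_sec / node_count


def speedup(candidate_ops_per_node, baseline_ops_per_node):
    """Express one per-node figure as a multiple of another."""
    return candidate_ops_per_node / baseline_ops_per_node


# Placeholder numbers for illustration only; they are NOT measurements from this report.
hdd_per_node = per_node_throughput(total_ops_per_sec=100_000, node_count=10)
ssd_per_node = per_node_throughput(total_ops_per_sec=100_000, node_count=5)
ratio = speedup(ssd_per_node, hdd_per_node)
```

In this made-up case the same aggregate throughput on half as many nodes yields twice the per-node figure, which is why per-node numbers are the fairer basis for comparing a 10-node cluster with a 5-node one.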
The results show that Apache HBase applications on M7 outperform HBase applications on other distributions across all five workloads on both HDD and SSD hardware. On SSDs, M7’s read performance increases markedly, giving it an order-of-magnitude advantage over other distributions.
Along with the throughput tests, latency tests on HDD were also conducted to measure read and update latencies across the different workloads.
The figure below showcases the consistency of M7’s performance compared to other distributions. Consistent latency provides users with predictable performance at all times.
Read Latency: Lower is better. YCSB Mixed Workload (50% Fetch - 50% Update); 10 Nodes; 2TB (1K row size); 10 second moving average; Y-axis cap = 400 msec
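The 10-second moving average named in the caption can be sketched as a trailing time window over latency samples. This is an illustrative reconstruction, not the tooling used to produce the figure; the sample data and all names are assumptions.

```python
from collections import deque

Y_AXIS_CAP_MS = 400.0  # plotting cap taken from the figure caption


def moving_average(samples, window_sec=10.0):
    """Yield (timestamp, mean latency over the trailing window).

    samples must be (timestamp_sec, latency_ms) pairs in time order.
    """
    window = deque()
    running_sum = 0.0
    for timestamp, latency_ms in samples:
        window.append((timestamp, latency_ms))
        running_sum += latency_ms
        # Drop samples that have fallen out of the trailing window.
        while window[0][0] <= timestamp - window_sec:
            _, old_latency = window.popleft()
            running_sum -= old_latency
        yield timestamp, running_sum / len(window)


# Hypothetical samples, not measurements from this report.
samples = [(0.0, 10.0), (5.0, 20.0), (10.0, 30.0), (15.0, 40.0)]
smoothed = list(moving_average(samples))
# Values above the cap are clipped for display only.
plotted = [(t, min(value, Y_AXIS_CAP_MS)) for t, value in smoothed]
```

A moving average smooths individual outliers, so a line that still swings widely after smoothing indicates sustained periods of high latency rather than isolated slow requests.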
As the table below illustrates, M7 provides dramatically lower read latency across all of the workloads, including at the 95th and 99th percentiles.
The table below shows the update latency for the Batched Put and the Mix workloads. On average, M7 provides an update latency advantage of 3-15x over other distributions.
MapR M7 delivers superior database performance, including 95th and 99th percentile latencies, for wide-ranging workloads across HDDs and SSDs. Strong data consistency, high reliability, zero maintenance windows, and extreme performance with consistently low latency make M7 the best choice for NoSQL database online applications as well as for large-scale analytics.
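For readers unfamiliar with percentile latencies, the sketch below computes them with the nearest-rank method. This is one common definition and is an assumption here; the report does not say which method its tooling used, and the sample values are placeholders.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Placeholder latencies of 1..100 ms, not measurements from this report.
latencies_ms = list(range(1, 101))
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
```

The 95th and 99th percentiles describe the slowest requests that users actually experience, which an average can hide; a system with a good mean but a poor 99th percentile will still feel unpredictable under load.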
MapR M7 delivers high throughput and consistent low latencies because:
M7 provides 100% data locality
The database client has only a single network hop to any data
Low disk I/O coupled with a smaller disk footprint makes database operations on disk fast as well as predictable
Auto-tuning cache mechanism ensures performance is optimized across varying workloads
Implementation in C/C++ allows for greater system-level control and avoids performance inconsistencies
For more information on M7 (now called MapR Database), please visit: mapr.com/products/mapr-database/
MapR M7 is available from MapR for use on premises and is also available in the cloud through Amazon Web Services and the Google Cloud Platform.
Apache HBase Configuration
Appendix: Test Environment Details