The technology behind a dataware layer must have certain essential capabilities, among them seamless reliability and availability. The MapR Data Platform has these capabilities, as this short video demonstrates.
This is a demonstration of how the services on any node in a MapR cluster can be restarted at any time with minimal disruption.
The demonstration runs on a cluster with only a single node running the container location database (CLDB). The CLDB in a MapR cluster holds location data for containers, the large on-disk data structures that hold all file system data and metadata.
In production systems, you typically have three such CLDB nodes so that one or even two could be lost without stopping operations on the cluster. Even without such redundant services, the data in the CLDB is replicated for safety. If you restart a non-redundant CLDB node, however, you will see a short outage in the web interface, because the CLDB also stores some operational status information. Running programs may not even notice the restart if they already have enough file system information cached. In any case, running programs should automatically retry any operations and continue through the outage. To the program, the outage appears as a temporary hang in I/O operations.
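The paragraph above says operations are retried automatically through the outage, so most programs need nothing extra. To make that retry-and-continue behavior concrete, here is a minimal Python sketch of an explicit retry wrapper. The names `retry_io` and `make_flaky_read`, the parameters, and the choice of exponential backoff are all assumptions for illustration; none of this is a MapR API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_io(op: Callable[[], T],
             attempts: int = 5,
             base_delay: float = 0.5,
             sleep: Callable[[float], None] = time.sleep) -> T:
    """Call op(), retrying on OSError with exponential backoff.

    Hypothetical helper: to the caller, a short outage looks like a
    slow call rather than a failure.
    """
    for attempt in range(attempts):
        try:
            return op()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # wait longer after each failure
    raise ValueError("attempts must be at least 1")


def make_flaky_read(failures: int) -> Callable[[], bytes]:
    """Build a stand-in read that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def read() -> bytes:
        if state["left"] > 0:
            state["left"] -= 1
            raise OSError("simulated outage")
        return b"data"

    return read
```

Calling `retry_io(make_flaky_read(2))` rides out two simulated failures before returning the data. Passing a custom `sleep` makes the wrapper easy to exercise without real delays.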
Take some care when restarting services on a high-value cluster. You may not want to restart more than one node at a time unless you don't mind some data being temporarily unavailable.
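The one-node-at-a-time advice can be sketched as a rolling restart loop. This is a hedged illustration only: `rolling_restart`, `restart`, and `is_healthy` are hypothetical names, and the two callbacks stand in for whatever restart command and health check your own tooling provides.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(nodes: Iterable[str],
                    restart: Callable[[str], None],
                    is_healthy: Callable[[str], bool],
                    max_checks: int = 30,
                    poll_interval: float = 10.0,
                    sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Restart nodes one at a time, waiting for each to recover.

    `restart` and `is_healthy` are caller-supplied (hypothetical) hooks.
    Returns the list of nodes restarted successfully, in order.
    """
    done: List[str] = []
    for node in nodes:
        restart(node)
        for _ in range(max_checks):
            if is_healthy(node):
                break  # node is back; safe to move on to the next one
            sleep(poll_interval)
        else:
            # Never reported healthy: stop rather than take down another node.
            raise RuntimeError(f"{node} did not recover; restarted so far: {done}")
        done.append(node)
    return done
```

Stopping at the first node that fails to recover keeps the loop from taking a second node down while the first is still unavailable.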