Cluster Heatmap Pane

The Cluster Heatmap pane displays the health of the nodes in the cluster, by rack. Each node appears as a colored square to show its health at a glance.

Click the small wrench icon at the upper right of the Cluster Heatmap pane to slide a key to the color-coded heatmap display into view. At the top of the display, you can set the refresh rate (in seconds) and the number of columns to display. For example, with a 10-column display, 20 nodes appear in two rows. Click the wrench icon again to slide the display back out of view.
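The relationship between the column setting and the number of rows can be sketched as a simple ceiling division. This is only an illustration of the arithmetic in the example above; the function name `heatmap_rows` is hypothetical and not part of the product.

```python
import math


def heatmap_rows(node_count: int, columns: int) -> int:
    """Return how many rows are needed to lay out node_count squares
    in a grid that is `columns` squares wide (hypothetical helper)."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    # A partially filled last row still counts as a row.
    return math.ceil(node_count / columns)


print(heatmap_rows(20, 10))  # 2
```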

The left drop-down menu at the top of the pane lets you choose which data is displayed. Some of the choices are shown below.

Heatmap legend by category

The heatmap legend changes depending on the criteria you select from the drop-down menu. All the criteria and their corresponding legends are shown here.

Health

  • Healthy - all services up, MapR-FS and all disks OK, and normal heartbeat
  • Upgrading - upgrade in progress
  • Degraded - one or more services down, or no heartbeat for over 1 minute
  • Maintenance - routine maintenance in progress
  • Critical - MapR-FS Inactive/Dead/Replicate, or no heartbeat for over 5 minutes
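As a rough sketch, the Healthy, Degraded, and Critical rules in the legend above could be expressed as follows. The function name, its parameters, and the `"OK"` file system state are assumptions made for illustration, not the product's actual implementation. Upgrading and Maintenance are left out because the legend does not say how they interact with the other states, and disk status is not modeled because the legend does not say which state a disk failure maps to.

```python
# Hypothetical names throughout; only the thresholds come from the legend.
FS_CRITICAL_STATES = {"Inactive", "Dead", "Replicate"}


def classify_health(services_down: int, fs_state: str, heartbeat_age_s: float) -> str:
    """Map a node's status to a Health legend entry (illustrative only)."""
    # Critical: MapR-FS Inactive/Dead/Replicate, or no heartbeat for over 5 minutes.
    if fs_state in FS_CRITICAL_STATES or heartbeat_age_s > 5 * 60:
        return "Critical"
    # Degraded: one or more services down, or no heartbeat for over 1 minute.
    if services_down > 0 or heartbeat_age_s > 60:
        return "Degraded"
    # Otherwise all services are up and the heartbeat is normal.
    return "Healthy"


print(classify_health(0, "OK", 10))    # Healthy
print(classify_health(1, "OK", 10))    # Degraded
print(classify_health(0, "OK", 90))    # Degraded
print(classify_health(0, "Dead", 10))  # Critical
```

The Critical check runs first so that a node meeting both a Degraded and a Critical condition is reported with the more severe state; that ordering is an assumption as well.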

The following table shows the legend for the utilization heatmap displays: CPU, memory, and disk space.

CPU Utilization    Memory Utilization    Disk Space Utilization
CPU < 50%          Memory < 50%          Used < 50%
CPU < 80%          Memory < 80%          Used < 80%
CPU >= 80%         Memory >= 80%         Used >= 80%
Unknown            Unknown               Unknown
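The utilization thresholds in the table above amount to a small bucketing rule. The sketch below assumes a missing reading is represented as `None`; the function name `utilization_band` is hypothetical, and the returned strings are just the legend labels with the metric name removed.

```python
from typing import Optional


def utilization_band(percent: Optional[float]) -> str:
    """Return the legend band for a CPU, memory, or disk utilization value."""
    if percent is None:
        # No reading available for this node (assumed representation).
        return "Unknown"
    if percent < 50:
        return "< 50%"
    if percent < 80:
        return "< 80%"
    return ">= 80%"


for value in (12.5, 50, 79.9, 80, None):
    print(value, "->", utilization_band(value))
```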

Alarms

The following table lists each alarm with its normal state and its raised state.

 
Alarm                                   Normal state                             Alarm state
Too Many Containers Alarm               Containers within limit                  Containers exceeded limit
Duplicate HostId Alarm                  No duplicate host id detected            Duplicate host id detected
UID Mismatch Alarm                      No UID mismatch detected                 UID mismatch detected
No Heartbeat Detected Alarm             Node heartbeat detected                  Node heartbeat not detected
TaskTracker Local Dir Full Alarm        TaskTracker local directory is not full  TaskTracker local directory full
PAM Misconfigured Alarm                 PAM configured                           PAM misconfigured
High FileServer Memory Alarm            Fileserver memory OK                     Fileserver memory high
Cores Present Alarm                     No core files                            Core files present
Installation Directory Full Alarm       Installation Directory free              Installation Directory full
Metrics Write Problem Alarm             Metrics writing to Database              Metrics unable to write to Database
Root Partition Full Alarm               Root partition free                      Root partition full
HostStats Down Alarm                    HostStats running                        HostStats down
Webserver Down Alarm                    Webserver running                        Webserver down
NFS Gateway Down Alarm                  NFS Gateway running                      NFS Gateway down
HBase RegionServer Down Alarm           HBase RegionServer running               HBase RegionServer down
HBase Master Down Alarm                 HBase Master running                     HBase Master down
TaskTracker Down Alarm                  TaskTracker running                      TaskTracker down
JobTracker Down Alarm                   JobTracker running                       JobTracker down
FileServer Down Alarm                   FileServer running                       FileServer down
CLDB Down Alarm                         CLDB running                             CLDB down
Time Skew Alarm                         Time OK                                  Time skew alarm(s)
Software Installation & Upgrades Alarm  Version OK                               Version alarm(s)
Disk Failure(s) Alarm                   Disks OK                                 Disk alarm(s)
Excessive Logging Alarm                 No debug                                 Debugging

Zoomed view

To see a zoomed view of all the nodes in the cluster, move the zoom slide bar. The zoomed display reveals more details about each node, based on the criteria you chose from the drop-down menu. For example, if you chose CPU Utilization, the CPU utilization is displayed for each node.

Clicking a rack name navigates to the Nodes view, which provides more detailed information about the nodes in the rack.

Clicking a colored square navigates to the Node Properties View, which provides detailed information about the node.