Node Manager Metrics

Every 10 seconds, the collectd service uses a MapR plugin to gather the following Node Manager metrics on each node in the cluster.

Name Description
mapr.nm.allocated_GB The amount of memory allocated to the Node Manager in GB.
mapr.nm.allocated_containers The number of containers allocated to the Node Manager.
mapr.nm.allocated_vcores The number of virtual cores (vcores) allocated to the Node Manager.
mapr.nm.available_vcores The number of virtual cores (vcores) available to the Node Manager.
mapr.nm.available_GB The amount of memory available to the Node Manager in GB.
mapr.nm.containers_completed The number of containers that have completed.
mapr.nm.containers_failed The number of containers that have failed.
mapr.nm.containers_initing The number of containers that are initializing.
mapr.nm.containers_killed The number of containers that have been killed by the Node Manager.
mapr.nm.containers_running The number of containers that are running.
mapr.nm.containers_launched The number of containers started by the Node Manager.
mapr.nm.jvm.gc_count The number of garbage collections.
mapr.nm.jvm.gc_count_ps_mark_sweep The number of parallel scavenge mark sweep collections.
mapr.nm.jvm.gc_count_ps_scavenge The number of parallel scavenge collections.
mapr.nm.jvm.gc_time_millis The amount of time spent on garbage collection in milliseconds.
mapr.nm.jvm.gc_time_millis_ps_mark_sweep The amount of time spent on parallel scavenge mark sweep collection in milliseconds.
mapr.nm.jvm.gc_time_millis_ps_scavenge The amount of time spent on parallel scavenge collection in milliseconds.
mapr.nm.jvm.log_error The total number of ERROR logs.
mapr.nm.jvm.log_fatal The total number of FATAL logs.
mapr.nm.jvm.log_info The total number of INFO logs.
mapr.nm.jvm.log_warn The total number of WARN logs.
mapr.nm.jvm.mem_heap_committed_m The amount of heap memory committed to the Node Manager in megabytes.
mapr.nm.jvm.mem_heap_max_m The maximum amount of heap memory that can be committed to the Node Manager in megabytes.
mapr.nm.jvm.mem_heap_used_m The amount of heap memory used by the Node Manager in megabytes.
mapr.nm.jvm.mem_max_m The maximum amount of memory that can be committed to the Node Manager in megabytes.
mapr.nm.jvm.mem_non_heap_committed_m The amount of non-heap memory committed to the Node Manager in megabytes.
mapr.nm.jvm.mem_non_heap_max_m The maximum amount of non-heap memory that can be committed to the Node Manager in megabytes.
mapr.nm.jvm.mem_non_heap_used_m The amount of non-heap memory used by the Node Manager in megabytes.
mapr.nm.jvm.threads_blocked The number of Node Manager threads in BLOCKED state.
mapr.nm.jvm.threads_new The number of Node Manager threads in NEW state.
mapr.nm.jvm.threads_runnable The number of Node Manager threads in RUNNABLE state.
mapr.nm.jvm.threads_terminated The number of Node Manager threads in TERMINATED state.
mapr.nm.jvm.threads_time_waiting The number of Node Manager threads in TIMED_WAITING state.
mapr.nm.jvm.threads_waiting The number of Node Manager threads in WAITING state.
mapr.nm.shuffle.shuffle_connection The number of Node Manager shuffle connections.
mapr.nm.shuffle.shuffle_output_bytes The amount of Node Manager shuffle output in bytes.
mapr.nm.shuffle.shuffle_outputs_failed The number of failed Node Manager shuffle outputs.
mapr.nm.shuffle.shuffle_outputs_ok The number of completed Node Manager shuffle outputs.
mapr.nm.ugi.get_groups_avg_time The average amount of time spent by the Node Manager on group resolution.
mapr.nm.ugi.get_groups_num_ops The number of group resolutions completed by the Node Manager.
mapr.nm.ugi.login_failure_avg_time The average amount of time spent by the Node Manager on failed login attempts.
mapr.nm.ugi.login_failure_num_ops The number of failed login attempts by the Node Manager.
mapr.nm.ugi.login_success_avg_time The average amount of time spent by the Node Manager to log in successfully.
mapr.nm.ugi.login_success_num_ops The number of successful logins by the Node Manager.
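
The metrics above are raw values, and some are easier to read as derived ratios. The following Python sketch shows one possible way to compute a few of them from samples taken 10 seconds apart. It rests on assumptions that this page does not confirm: each sample is available as a plain dictionary keyed by metric name, mapr.nm.jvm.gc_time_millis is a cumulative counter, and the total number of vcores on a node equals allocated plus available. The function names are hypothetical and are not part of any MapR or collectd API.

```python
from typing import Dict

# Collection interval stated at the top of this page.
SAMPLE_INTERVAL_SECONDS = 10


def heap_utilization(sample: Dict[str, float]) -> float:
    """Return the fraction of the maximum heap currently in use (0.0 to 1.0)."""
    used = sample["mapr.nm.jvm.mem_heap_used_m"]
    maximum = sample["mapr.nm.jvm.mem_heap_max_m"]
    if maximum <= 0:
        raise ValueError("mem_heap_max_m must be positive")
    return used / maximum


def gc_time_fraction(previous: Dict[str, float],
                     current: Dict[str, float],
                     interval_seconds: float = SAMPLE_INTERVAL_SECONDS) -> float:
    """Return the fraction of wall-clock time spent in GC between two samples.

    Assumption: gc_time_millis only grows while the Node Manager is running.
    """
    key = "mapr.nm.jvm.gc_time_millis"
    delta_ms = current[key] - previous[key]
    if delta_ms < 0:
        # The counter went backwards, for example after a restart;
        # treat the current value as the time accumulated since then.
        delta_ms = current[key]
    return delta_ms / (interval_seconds * 1000.0)


def vcore_utilization(sample: Dict[str, float]) -> float:
    """Return the fraction of vcores currently allocated.

    Assumption: total vcores = allocated + available.
    """
    allocated = sample["mapr.nm.allocated_vcores"]
    available = sample["mapr.nm.available_vcores"]
    total = allocated + available
    return allocated / total if total else 0.0
```

With illustrative (not measured) samples, heap_utilization({"mapr.nm.jvm.mem_heap_used_m": 512, "mapr.nm.jvm.mem_heap_max_m": 2048}) evaluates to 0.25. A ratio is used instead of a raw value so that nodes with different heap sizes or core counts can be compared on the same scale.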