Resource Manager Metrics

Every 10 seconds, the collectd service uses a MapR plugin to gather Resource Manager metrics on the active Resource Manager. Collectd gathers metrics on the Resource Manager JVM process, YARN applications, and nodes that are managed by the Resource Manager. The method used to gather the metrics differs based on the metric type.

YARN Application Metrics

Collectd gathers YARN application metrics via JMX and the REST API. Application metrics collected via JMX are named mapr.rm.<metric_name>. Application metrics collected via the REST API are named mapr.rm_queue.<metric_name>.
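The naming convention above, together with the mapr.rm.jvm and mapr.rm_cluster metrics listed later on this page, can be sketched as a small lookup. This is only an illustration: the function name collection_method and the PREFIX_TO_METHOD table are hypothetical and are not part of collectd or the MapR plugin.

```python
# Hypothetical helper: infer how a Resource Manager metric is gathered
# from its name prefix, following the conventions described on this page.
PREFIX_TO_METHOD = {
    "mapr.rm_queue.": "REST API",    # YARN application metrics per queue
    "mapr.rm_cluster.": "REST API",  # Resource Manager node metrics
    "mapr.rm.": "JMX",               # application and JVM metrics
}


def collection_method(metric_name):
    """Return "JMX" or "REST API" for a Resource Manager metric name."""
    for prefix, method in PREFIX_TO_METHOD.items():
        if metric_name.startswith(prefix):
            return method
    raise ValueError("not a Resource Manager metric: " + metric_name)


if __name__ == "__main__":
    for name in ("mapr.rm.apps_running", "mapr.rm_queue.used_memory"):
        print(name, "->", collection_method(name))
```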

The following metrics are collected via JMX. To filter these metrics by queue using the rm_queue tag, see Configure Queue Filters for mapr.rm.<value> Metrics.

Name Description Additional Tags
mapr.rm.active_applications The number of active applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.active_users The number of users with active applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.aggregate_containers_allocated The number of allocated containers.
  • rm_queue: Display values for a specified queue.
mapr.rm.aggregate_containers_released The number of released containers.
  • rm_queue: Display values for a specified queue.
mapr.rm.allocated_MB The amount of memory allocated to the Resource Manager in MB.
  • rm_queue: Display values for a specified queue.
mapr.rm.allocated_vcores The number of CPUs allocated to the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.apps_completed The number of completed applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.apps_failed The number of failed applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.apps_killed The number of killed applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.apps_pending The number of pending applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.apps_running The number of running applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.apps_submitted The number of submitted applications.
  • rm_queue: Display values for a specified queue.
mapr.rm.available_MB The amount of memory available to the Resource Manager in MB.
  • rm_queue: Display values for a specified queue.
mapr.rm.available_disks The number of disks available to the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.available_vcores The number of CPUs available to the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.pending_MB The amount of memory, in MB, waiting to be allocated by the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.pending_containers The number of containers waiting to be allocated by the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.pending_disks The number of disks waiting to be allocated by the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.pending_vcores The number of CPUs waiting to be allocated by the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.reserved_MB The amount of memory reserved by the Resource Manager in MB.
  • rm_queue: Display values for a specified queue.
mapr.rm.reserved_containers The number of containers reserved by the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.reserved_disks The number of disks reserved by the Resource Manager.
  • rm_queue: Display values for a specified queue.
mapr.rm.reserved_vcores The number of CPUs reserved by the Resource Manager.
  • rm_queue: Display values for a specified queue.
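As an illustration of narrowing one of these metrics with the rm_queue tag, the following sketch builds a query URL. It assumes the collected metrics are stored in OpenTSDB and read through its HTTP /api/query endpoint, which this page does not state. The host name, the port, and the function name build_query_url are hypothetical.

```python
from urllib.parse import urlencode


def build_query_url(base_url, metric, queue, start="1h-ago", aggregator="sum"):
    """Build an OpenTSDB-style query URL for one metric, limited to one queue.

    Assumption: the metric store is OpenTSDB, whose metric query syntax is
    <aggregator>:<metric>{<tag>=<value>}.
    """
    m = "%s:%s{rm_queue=%s}" % (aggregator, metric, queue)
    # urlencode percent-escapes the braces, colon, and equals sign.
    return base_url + "/api/query?" + urlencode({"start": start, "m": m})


if __name__ == "__main__":
    # Hypothetical OpenTSDB address; replace with the address used on your cluster.
    print(build_query_url("http://opentsdb.example.com:4242",
                          "mapr.rm.apps_running", "default"))
```

The sketch only builds the URL and makes no network call, so it can be checked without a running cluster.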

The following YARN application metrics are collected via REST API.

Name Description Additional Tags
mapr.rm_queue.aggregate_containers_allocated The number of containers allocated for applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.appmaster_used_disks When queue resources are managed by the Capacity Scheduler, this is the number of disks used by the Application Master for applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.appmaster_used_memory When queue resources are managed by the Capacity Scheduler, this is the amount of memory, in MB, used by the Application Master for applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.appmaster_used_vcores When queue resources are managed by the Capacity Scheduler, this is the number of CPUs used by the Application Master for applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.apps_pending The number of pending applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.apps_running The number of applications running in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.fairshare_disks When queue resources are managed by the Fair Scheduler, this is the number of disks allocated to default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.fairshare_memory When queue resources are managed by the Fair Scheduler, this is the amount of memory, in MB, allocated to default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.fairshare_vcores When queue resources are managed by the Fair Scheduler, this is the number of CPUs allocated to default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.used_disks The number of disks used by applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.used_memory The amount of memory, in MB, used by applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.used_vcores The number of CPUs used by applications in the default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.max_disks When queue resources are managed by the Fair Scheduler, this is the maximum number of disks available to default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.max_memory When queue resources are managed by the Fair Scheduler, this is the maximum amount of memory, in MB, available to default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.max_vcores When queue resources are managed by the Fair Scheduler, this is the maximum number of CPUs available to default and custom queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.user_allocated_disks When queue resources are managed by the Capacity Scheduler, this is the number of disks allocated to the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
mapr.rm_queue.user_allocated_memory When queue resources are managed by the Capacity Scheduler, this is the amount of memory, in MB, allocated to the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
mapr.rm_queue.user_allocated_vcores When queue resources are managed by the Capacity Scheduler, this is the number of CPUs allocated to the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
mapr.rm_queue.user_appmaster_used_disks When queue resources are managed by the Capacity Scheduler, this is the number of disks used by the queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.user_appmaster_used_memory When queue resources are managed by the Capacity Scheduler, this is the amount of memory, in MB, used by the queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.user_appmaster_used_vcores When queue resources are managed by the Capacity Scheduler, this is the number of CPUs used by the queues.
  • rm_queue: Display values for a specified queue.
mapr.rm_queue.user_apps_pending When queue resources are managed by the Capacity Scheduler, this is the number of applications pending in the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
mapr.rm_queue.user_apps_running When queue resources are managed by the Capacity Scheduler, this is the number of applications running in the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
mapr.rm_queue.user_used_disks When queue resources are managed by the Capacity Scheduler, this is the number of disks used by the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
mapr.rm_queue.user_used_memory When queue resources are managed by the Capacity Scheduler, this is the amount of memory, in MB, used by the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
mapr.rm_queue.user_used_vcores When queue resources are managed by the Capacity Scheduler, this is the number of CPUs used by the queues.
  • rm_queue: Display values for a specified queue.
  • rm_user: Display values for a specified user.
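To show how the rm_queue and rm_user tags partition a metric, the sketch below totals mapr.rm_queue.user_used_memory by either tag. The data points are invented for illustration, and the dictionary layout and the function name total_by_tag are assumptions rather than a format produced by collectd.

```python
from collections import defaultdict

# Invented sample data points; real values come from the metric store.
SAMPLES = [
    {"metric": "mapr.rm_queue.user_used_memory",
     "tags": {"rm_queue": "default", "rm_user": "alice"}, "value": 2048},
    {"metric": "mapr.rm_queue.user_used_memory",
     "tags": {"rm_queue": "default", "rm_user": "bob"}, "value": 1024},
    {"metric": "mapr.rm_queue.user_used_memory",
     "tags": {"rm_queue": "etl", "rm_user": "alice"}, "value": 4096},
]


def total_by_tag(points, metric, tag):
    """Sum the values of one metric, grouped by the given tag."""
    totals = defaultdict(int)
    for point in points:
        if point["metric"] == metric and tag in point["tags"]:
            totals[point["tags"][tag]] += point["value"]
    return dict(totals)


if __name__ == "__main__":
    name = "mapr.rm_queue.user_used_memory"
    print(total_by_tag(SAMPLES, name, "rm_queue"))  # memory in MB per queue
    print(total_by_tag(SAMPLES, name, "rm_user"))   # memory in MB per user
```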

Resource Manager JVM Metrics

The following Resource Manager JVM metrics are collected via JMX.

Name Description
mapr.rm.jvm.gc_count The number of garbage collections. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.gc_count_ps_mark_sweep The number of parallel scavenge mark sweep collections. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.gc_count_ps_scavenge The number of parallel scavenge collections. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.gc_time_millis The amount of time spent on garbage collection in milliseconds. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.gc_time_millis_ps_mark_sweep The amount of time spent on parallel scavenge mark sweep collection in milliseconds. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.gc_time_millis_ps_scavenge The amount of time spent on parallel scavenge collection in milliseconds. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.log_error The total number of ERROR logs. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.log_fatal The total number of FATAL logs. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.log_info The total number of INFO logs. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.log_warn The total number of WARN logs. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.mem_heap_committed_m The amount of heap memory committed to the Resource Manager in megabytes. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.mem_heap_max_m The maximum amount of heap memory that can be committed to the Resource Manager in megabytes. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.mem_heap_used_m The amount of heap memory used by the Resource Manager in megabytes. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.mem_max_m The maximum amount of memory that can be committed to the Resource Manager in megabytes. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.mem_non_heap_committed_m The amount of non-heap memory committed to the Resource Manager in megabytes. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.mem_non_heap_max_m The maximum amount of non-heap memory that can be committed to the Resource Manager in megabytes. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.mem_non_heap_used_m The amount of non-heap memory used by the Resource Manager in megabytes. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.threads_blocked The number of Resource Manager threads in BLOCKED state. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.threads_new The number of Resource Manager threads in NEW state. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.threads_runnable The number of Resource Manager threads in RUNNABLE state. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.threads_terminated The number of Resource Manager threads in TERMINATED state. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.threads_time_waiting The number of Resource Manager threads in TIMED_WAITING state. This metric is available as of MEP 3.0.1.
mapr.rm.jvm.threads_waiting The number of Resource Manager threads in WAITING state. This metric is available as of MEP 3.0.1.
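Two derived values that are often useful for alerting can be computed from these metrics: heap utilization and average time per garbage collection. The sketch below assumes that gc_count and gc_time_millis are cumulative counters and that mem_heap_max_m is a positive value. Neither assumption is confirmed on this page, and both function names are hypothetical.

```python
def heap_utilization(mem_heap_used_m, mem_heap_max_m):
    """Fraction of the maximum heap currently in use (0.0 to 1.0)."""
    if mem_heap_max_m <= 0:
        # A JVM may report an undefined maximum; treat it as an error here.
        raise ValueError("mem_heap_max_m must be positive")
    return mem_heap_used_m / mem_heap_max_m


def avg_gc_pause_ms(gc_time_millis, gc_count):
    """Average milliseconds per collection, assuming cumulative counters."""
    if gc_count == 0:
        return 0.0
    return gc_time_millis / gc_count


if __name__ == "__main__":
    # Illustrative values only.
    print(heap_utilization(512, 2048))
    print(avg_gc_pause_ms(1500, 30))
```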

Resource Manager Node Metrics

The following Resource Manager node metrics are collected via REST API.

Name Description
mapr.rm_cluster.active_nodes The number of nodes in the cluster where containers are running.
mapr.rm_cluster.total_nodes The number of nodes in the cluster.
mapr.rm_cluster.unhealthy_nodes The number of nodes in the cluster that are unable to accept applications.
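This page does not say which REST endpoint the plugin calls. As a sketch, the YARN Resource Manager REST API exposes a cluster metrics resource (/ws/v1/cluster/metrics) whose clusterMetrics object includes activeNodes, totalNodes, and unhealthyNodes. The mapping of those fields to the metric names above is an assumption, the function name node_metrics is hypothetical, and the sample response values are invented.

```python
import json

# Assumed mapping from YARN clusterMetrics fields to the metric names above.
FIELD_TO_METRIC = {
    "activeNodes": "mapr.rm_cluster.active_nodes",
    "totalNodes": "mapr.rm_cluster.total_nodes",
    "unhealthyNodes": "mapr.rm_cluster.unhealthy_nodes",
}


def node_metrics(response_body):
    """Extract node metrics from a /ws/v1/cluster/metrics JSON response body."""
    cluster = json.loads(response_body)["clusterMetrics"]
    return {metric: cluster[field] for field, metric in FIELD_TO_METRIC.items()}


if __name__ == "__main__":
    # Invented, abbreviated response body used only for illustration.
    sample = '{"clusterMetrics": {"activeNodes": 4, "totalNodes": 5, "unhealthyNodes": 1}}'
    print(node_metrics(sample))
```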