Metric Aggregation and Storage

Using the REST API, each collectd service aggregates metrics and writes them to one OpenTSDB node at a time. If an OpenTSDB node is unavailable, collectd fails over metric aggregation and storage to another OpenTSDB node. All OpenTSDB nodes write to tables in the mapr.monitoring volume.

The collectd services can connect to any OpenTSDB node that is configured to aggregate and store metrics. The OpenTSDB nodes are set when you configure MapR Monitoring with the MapR Installer or when you run configure.sh with the -OT parameter.
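For example, the list of OpenTSDB nodes can be supplied to configure.sh as a comma-separated list with the -OT parameter. The hostnames below are illustrative:

```shell
# Refresh the node configuration (-R) and register three OpenTSDB nodes
# (hostnames are examples only)
/opt/mapr/server/configure.sh -R -OT otsdb1.example.com,otsdb2.example.com,otsdb3.example.com
```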

The collectd service on each node uses the following process when determining which OpenTSDB node to write metrics to:
  1. The collectd service determines how many metric entries it has written to the OpenTSDB node that it most recently wrote metrics to.
    • If it has written fewer than 10,000 entries to the current OpenTSDB node, it reopens a connection to that node.
    • If it has already written 10,000 entries to the current OpenTSDB node, it opens a connection to a different OpenTSDB node.
  2. If the connection to OpenTSDB fails, it opens a connection to a different OpenTSDB node.
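The rotation and failover behavior described in the steps above can be sketched as follows. This is a simplified illustration, not the actual collectd implementation; the class, the send callback, and the ENTRY_LIMIT name are hypothetical:

```python
import random

ENTRY_LIMIT = 10_000  # entries written before rotating to another node


class OpenTSDBWriter:
    """Illustrative sketch of collectd's per-node write rotation and failover."""

    def __init__(self, nodes):
        self.nodes = list(nodes)      # configured OpenTSDB nodes
        self.current = self.nodes[0]  # node currently receiving metrics
        self.entries_written = 0

    def _rotate(self):
        """Switch to a different OpenTSDB node and reset the entry counter."""
        others = [n for n in self.nodes if n != self.current]
        self.current = random.choice(others)
        self.entries_written = 0

    def write(self, metric, send):
        # Step 1: after ENTRY_LIMIT entries, open a connection to a
        # different OpenTSDB node; otherwise keep using the current one.
        if self.entries_written >= ENTRY_LIMIT:
            self._rotate()
        try:
            send(self.current, metric)  # e.g. an HTTP POST of the metric
        except ConnectionError:
            # Step 2: on connection failure, fail over to another node.
            self._rotate()
            send(self.current, metric)
        self.entries_written += 1
```

Because every collectd service knows the full list of configured OpenTSDB nodes, this rotation spreads write load across the cluster while the failover path keeps metrics flowing when a node goes down.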

The collectd services do not require additional configuration to enable automatic failover to an available OpenTSDB instance. However, configure at least three OpenTSDB nodes to aggregate and store metrics so that the failure of one or two nodes does not prevent metrics from being collected for monitoring. Depending on your environment, more OpenTSDB nodes may be required. See Example Cluster Designs.