MapR 5.0 Documentation : Using OpenTSDB with AsyncHBase 1.6 and MapR-DB 4.1

The OpenTSDB software package provides a time-series database that collects user-specified data. Because OpenTSDB depends on AsyncHBase, MapR provides a customized version of OpenTSDB that works with AsyncHBase for MapR-DB.

This document explains how to build OpenTSDB from source using the GitHub repository, or how to install the RPM or Debian distribution of OpenTSDB.

See also Documentation for OpenTSDB.

Prerequisites

OpenTSDB for MapR-DB Version 4.1 requires:

  • Version 4.1.x of the MapR Distribution for Hadoop
  • The latest release of the mapr-hbase package (0.98.7 or 0.98.9)
  • The latest release of the mapr-asynchbase package (1.6)

For information about OpenTSDB for MapR-DB Version 4.0.x, see Using OpenTSDB with AsyncHBase and MapR-DB.

To Build OpenTSDB from Source:

  1. Clone the opentsdb.git project and check out the v2.0.0 branch:

    $ git clone https://github.com/mapr/opentsdb.git
    Cloning into 'opentsdb'...
    remote: Counting objects: 5625, done.
    remote: Compressing objects: 100% (76/76), done.
    remote: Total 5625 (delta 51), reused 64 (delta 30)
    Receiving objects: 100% (5625/5625), 27.15 MiB | 2.67 MiB/s, done.
    Resolving deltas: 100% (3755/3755), done.
    Checking connectivity... done.
    $ cd opentsdb
    $ git tag -l
    mapr-1.1.0-release+5
    v1.0.0
    v2.0.0
    ...
    $ git checkout v2.0.0
    Switched to a new branch 'v2.0.0'


  2. Open the opentsdb/tsdb.in file and add the following MapR dependencies:

    1. BASEMAPRDIR: the root directory of the MapR installation.
    2. The Hadoop core-site.xml directory in the classpath (for Hadoop 0.20.x or Hadoop 2.x):
      BASEMAPRDIR/hadoop/hadoop-0.20.2/conf or BASEMAPRDIR/hadoop/hadoop-2.x/conf

    3. The MapR-specific jars in the classpath: BASEMAPRDIR/hadoop/hadoop-0.20.2/lib/*

      See tsdb.in Updates.

  3. Install dependencies for graph generation:

     $ yum install autoconf automake gnuplot
  4. Replace the asynchbase.jar file with the MapR version of that file:

    $ yum install mapr-asynchbase
  5. Run the build script:

    ./build.sh
  6. Edit the following file and add "/" before the table names so that MapR recognizes them as MapR-DB tables:

    <OPENTSDB_ROOT_INSTALL_DIR>/src/create_table.sh

    See create_table.sh Updates.
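The table-name edit in step 6 can be scripted with sed. A minimal sketch, run here against a demo copy of the four table-default lines; in practice, point it at <OPENTSDB_ROOT_INSTALL_DIR>/src/create_table.sh (the demo filename is only for illustration):

```shell
# Sketch: prefix the stock table-name defaults with "/" so MapR-DB
# treats them as path-style table names.
f=create_table_demo.sh
cat > "$f" <<'EOS'
TSDB_TABLE=${TSDB_TABLE-'tsdb'}
UID_TABLE=${UID_TABLE-'tsdb-uid'}
TREE_TABLE=${TREE_TABLE-'tsdb-tree'}
META_TABLE=${META_TABLE-'tsdb-meta'}
EOS
# Rewrite each default from 'tsdb...' to '/tsdb...'
sed "s/-'tsdb/-'\/tsdb/" "$f" > "$f.new" && mv "$f.new" "$f"
cat "$f"
```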


  7. Create tables in MapR-DB:

    env COMPRESSION=NONE HBASE_HOME=/opt/mapr/hbase/hbase-0.98.9 \
      <OPENTSDB_ROOT_INSTALL_DIR>/src/create_table.sh
  8. Run the following command to verify that the tables are created successfully:

    hadoop fs -ls /
  9. Create a simple metric to store, such as “sys.cpu.user”:

    ./build/tsdb mkmetric sys.cpu.user --table=/tsdb --uidtable=/tsdb-uid
  10. Run the OpenTSDB daemon (tsd):

    ./build/tsdb tsd --port=4242 --staticroot=build/staticroot \
      --cachedir=/tmp/opentsdb_tmp --zkquorum=10.10.101.50:5181 \
      --table=/tsdb --uidtable=/tsdb-uid

    Note: Instead of providing these options on the command line, you can configure the values in the opentsdb.conf file. This file must be in the root folder so that the option settings are read when tsd runs. The staticroot argument points to the static UI files. You do not need to create the cache directory: OpenTSDB creates it automatically when you specify the cachedir argument. You do need to explicitly specify the tsdb tables (/tsdb, /tsdb-uid) and the ZooKeeper quorum nodes.
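For example, a minimal opentsdb.conf matching the options above might look like this (a sketch; the quorum address is illustrative, so substitute your own ZooKeeper node and port):

```
# Port the TSD listens on
tsd.network.port = 4242
# Location of the static UI files
tsd.http.staticroot = build/staticroot
# Cache directory; tsd creates it if it does not exist
tsd.http.cachedir = /tmp/opentsdb_tmp
# MapR-DB tables (note the leading "/")
tsd.storage.hbase.data_table = /tsdb
tsd.storage.hbase.uid_table = /tsdb-uid
# ZooKeeper quorum (MapR uses port 5181)
tsd.storage.hbase.zk_quorum = 10.10.101.50:5181
```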

  11. Open the web UI: http://<TSD_Installed_Node_IP>:<Port>

    For example: http://10.10.10.230:4242/

  12. Run a simple test program that generates data and sends repeated puts for the metric over a socket connection to the TSD at <TSD_IP>:<Port>.
    See Data Generator Program.

  13. Check the plot in the UI.

    1. Select a From date and check Autoreload.

    2. Fill in the metric (in this case, sys.cpu.user) and the tag keys (host, cpu) with their values (host=webserver0 or webserver1; cpu=0 or 1). You should see a graph of the randomly generated data.
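The puts sent by the test program in step 12 use OpenTSDB's line-based, telnet-style protocol. A minimal shell sketch of one such data point follows; the nc line is commented out so the sketch runs without a live TSD, and TSD_HOST is a placeholder:

```shell
# Build one data point in OpenTSDB's telnet-style put format:
#   put <metric> <epoch-seconds> <value> <tagk>=<tagv> ...
ts=$(date +%s)
line="put sys.cpu.user $ts 42 host=webserver0 cpu=0"
echo "$line"
# To actually send it to a running TSD:
#   printf '%s\n' "$line" | nc "$TSD_HOST" 4242
```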

tsdb.in Updates

Note the changes to the Base of MapR installation section.

#!/bin/bash

set -e
me=`basename "$0"`
mydir=`dirname "$0"`
# Either:
#  abs_srcdir and abs_builddir are set: we're running in a dev tree
#  or pkgdatadir is set: we've been installed, we respect that.
abs_srcdir='@abs_srcdir@'
abs_builddir='@abs_builddir@'
pkgdatadir='@pkgdatadir@'
configdir='@configdir@'
# Either we've been installed and pkgdatadir exists, or we haven't been
# installed and abs_srcdir / abs_builddir aren't empty.
test -d "$pkgdatadir" || test -n "$abs_srcdir$abs_builddir" || {
  echo >&2 "$me: Uh-oh, \`$pkgdatadir' doesn't exist, is OpenTSDB properly installed?"
  exit 1
}

if test -n "$pkgdatadir"; then
  localdir="$pkgdatadir"
  for jar in "$pkgdatadir"/*.jar; do
    CLASSPATH="$CLASSPATH:$jar"
  done
  # Add pkgdatadir itself so we can find logback.xml
  CLASSPATH="$CLASSPATH:$pkgdatadir"

  if test -d "$pkgdatadir/bin"; then
    CLASSPATH="$CLASSPATH:$pkgdatadir/bin"
  fi

  if test -d "$pkgdatadir/lib"; then
    for jar in "$pkgdatadir"/lib/*.jar; do
      CLASSPATH="$CLASSPATH:$jar"
    done
  fi

  if test -n "$configdir" && test -d "$configdir"; then
    CLASSPATH="$CLASSPATH:$configdir"
  fi
else
  localdir="$abs_builddir"
  # If we're running out of the build tree, it's especially important that we
  # know exactly what jars we need to build the CLASSPATH.  Otherwise people
  # cannot easily pick up new dependencies as we might mix multiple versions
  # of the same dependencies on the CLASSPATH, which is bad.  Looking for a
  # specific version of each jar prevents this problem.
  # TODO(tsuna): Once we jarjar all the dependencies together, this will no
  # longer be an issue.  See issue #23.

  # Base of MapR installation
  BASEMAPR=${MAPR_HOME:-/opt/mapr}

  #add MapR specific asynchbase jar in classpath
  if test -d "$BASEMAPR/asynchbase"; then
    for jar in "$BASEMAPR"/asynchbase/asynchbase-*/*.jar; do
      if [[ `echo $jar | grep sources` != "" ]] || [[ `echo $jar | grep javadoc` != "" ]]; then
        continue
      fi
      CLASSPATH="$CLASSPATH:$jar"
     done
  fi

  for jar in `make -C "$abs_builddir" printdeps | sed '/third_party.*jar/!d'`; do
    if [[ `echo $jar | grep asynchbase` != "" ]]; then
      continue
    fi
    for dir in "$abs_builddir" "$abs_srcdir"; do
      test -f "$dir/$jar" && CLASSPATH="$CLASSPATH:$dir/$jar" && continue 2
    done
    echo >&2 "$me: error: Couldn't find \`$jar' either under \`$abs_builddir' or \`$abs_srcdir'."
    exit 2
  done
  # Add the src dir so we can find logback.xml
  CLASSPATH="$CLASSPATH:$abs_srcdir/src"
fi
# Remove any leading colon.
CLASSPATH="${CLASSPATH#:}"

# Add MapR hadoop jars to classpath
if test -d "$BASEMAPR/hadoop/hadoop-0.20.2/lib"; then
  # hadoop conf directory to beginning of classpath (for core-site.xml)
  CLASSPATH="$BASEMAPR/hadoop/hadoop-0.20.2/conf:$CLASSPATH"

  for jar in "$BASEMAPR"/hadoop/hadoop-0.20.2/lib/*.jar; do
    if [[ `echo $jar | grep slf4j` != "" ]]; then
      continue
    fi
    CLASSPATH="$CLASSPATH:$jar"
  done
fi

# MapR native library path
JVMARGS="${JVMARGS} -Djava.library.path=${BASEMAPR}/lib"

# TSD compactions are not required with M7
JVMARGS="${JVMARGS} -Dtsd.feature.compactions=false"

usage() {
  echo >&2 "usage: $me <command> [args]"
  echo 'Valid commands: fsck, import, mkmetric, query, tsd, scan, uid'
  exit 1
}

case $1 in
  (fsck)
    MAINCLASS=Fsck
    ;;
  (import)
    MAINCLASS=TextImporter
    ;;
  (mkmetric)
    shift
    set uid assign metrics "$@"
    MAINCLASS=UidManager
    ;;
  (query)
    MAINCLASS=CliQuery
    ;;
  (tsd)
    MAINCLASS=TSDMain
    ;;
  (scan)
    MAINCLASS=DumpSeries
    ;;
  (uid)
    MAINCLASS=UidManager
    ;;
  (*)
    echo >&2 "$me: error: unknown command '$1'"
    usage
    ;;
esac
shift

JAVA=${JAVA-'java'}
JVMARGS=${JVMARGS-'-enableassertions -enablesystemassertions'}
test -r "$localdir/tsdb.local" && . "$localdir/tsdb.local"
exec $JAVA $JVMARGS -classpath "$CLASSPATH" net.opentsdb.tools.$MAINCLASS "$@"


create_table.sh Updates

Note the changed sections for the *_TABLE variables.

#!/bin/sh
# Small script to setup the HBase tables used by OpenTSDB.

test -n "$HBASE_HOME" || {
  echo >&2 'The environment variable HBASE_HOME must be set'
  exit 1
}
test -d "$HBASE_HOME" || {
  echo >&2 "No such directory: HBASE_HOME=$HBASE_HOME"
  exit 1
}

TSDB_TABLE=${TSDB_TABLE-'/tsdb'}
UID_TABLE=${UID_TABLE-'/tsdb-uid'}
TREE_TABLE=${TREE_TABLE-'/tsdb-tree'}
META_TABLE=${META_TABLE-'/tsdb-meta'}
BLOOMFILTER=${BLOOMFILTER-'ROW'}
# LZO requires lzo2 64bit to be installed + the hadoop-gpl-compression jar.
COMPRESSION=${COMPRESSION-'LZO'}
# All compression codec names are upper case (NONE, LZO, SNAPPY, etc).
COMPRESSION=`echo "$COMPRESSION" | tr a-z A-Z`

case $COMPRESSION in
  (NONE|LZO|GZIP|SNAPPY)  :;;  # Known good.
  (*)
    echo >&2 "warning: compression codec '$COMPRESSION' might not be supported."
    ;;
esac

# HBase scripts also use a variable named `HBASE_HOME', and having this
# variable in the environment with a value somewhat different from what
# they expect can confuse them in some cases.  So rename the variable.
hbh=$HBASE_HOME
unset HBASE_HOME
exec "$hbh/bin/hbase" shell <<EOF
create '$UID_TABLE',
  {NAME => 'id', COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'},
  {NAME => 'name', COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'}

create '$TSDB_TABLE',
  {NAME => 't', VERSIONS => 1, COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'}

create '$TREE_TABLE',
  {NAME => 't', VERSIONS => 1, COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'}

create '$META_TABLE',
  {NAME => 'name', COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'}
EOF

 

Data Generator Program

import java.io.PrintWriter;
import java.net.Socket;
import java.util.Date;
import java.util.Random;

public class TestOpenTsdbAPI {
   public static Random random = new Random();
   public static long timeStamp = new Date().getTime()/1000; //in secs
   public static void testTSDBConnection() throws Exception {
       Socket sock = null;
       PrintWriter pw = null;
       String hostname = "10.10.10.230";
       int port = 4242;
       int count=1;
       while(true) {
           if(null==sock) {
               sock = new Socket(hostname, port);
               pw = new PrintWriter(sock.getOutputStream(), true);
           }
           pw.println(dataGen(0, 0, count));
           pw.flush();
           pw.println(dataGen(0, 1, count));
           pw.flush();
           pw.println(dataGen(1, 0, count));
           pw.flush();
           pw.println(dataGen(1, 1, count));
           pw.flush();
           
           if(++count==Integer.MAX_VALUE) break;
           Thread.sleep(60000);
       }
   }
   public static void main(String [] args) {
       try {
           testTSDBConnection();
       } catch(Exception ex) {
           ex.printStackTrace();
       }
   }

   public static String dataGen(int web, int cpu, int count) {
       int Low = 1;
       int High = 99;
       int val = random.nextInt(High-Low) + Low;
       long timeStamp1 = new Date().getTime()/1000;
        String dat = "put sys.cpu.user " + timeStamp1 + " " + val + " host=webserver" + web + " cpu=" + cpu;
       System.out.println(dat);
       return dat;
   }
}

Note: This program tries to put metrics for 2 hosts (webserver 0 and webserver 1). Each host has 2 CPUs (cpu 0 and cpu 1). Sample puts look like this:

put sys.cpu.user 1415300810 87 host=webserver0 cpu=0
put sys.cpu.user 1415300810 66 host=webserver0 cpu=1
put sys.cpu.user 1415300810 18 host=webserver1 cpu=0
put sys.cpu.user 1415300810 26 host=webserver1 cpu=1
 
put <metric> <timestamp> <value> <tagk1>=<tagv1> <tagk2>=<tagv2>

Note: When you run the program, you should see entries indicating that the tags for the metric were created, and the tag values should auto-complete in the UI.

UniqueId: Creating an ID for kind='tagv' name='webserver0'

You can also verify this from command line instead of the UI:

<OpenTSDB-Root>/build/tsdb query 1y-ago sum sys.cpu.user

Installing the RPM or Debian Distribution

Follow these steps to install the RPM or Debian distribution for OpenTSDB:

  1. Install MapR Version 4.1 and HBase 0.98.7 or 0.98.9.
  2. Install the latest mapr-asynchbase-1.6.0.* package.
  3. Install the OpenTSDB RPM:

    1. mkdir /root/opentsdbrpm
    2. cd /root/opentsdbrpm
    3. wget https://github.com/OpenTSDB/opentsdb/releases/download/v2.0.0/opentsdb-2.0.0.noarch.rpm -O opentsdb-2.0.0.noarch.rpm
    4. rpm -ivh opentsdb-2.0.0.noarch.rpm
  4. Configure OpenTSDB to work with MapR:
    1. Edit the following tsdb scripts to cover MapR-specific dependencies: /usr/share/opentsdb/bin/tsdb and /usr/bin/tsdb

      # Base of MapR installation
      BASEMAPR=${MAPR_HOME:-/opt/mapr}
      
      # Add MapR hadoop jars to classpath
      if test -d "$BASEMAPR/hadoop/hadoop-0.20.2/lib"; then
       # hadoop conf directory to beginning of classpath (for core-site.xml)
       CLASSPATH="$BASEMAPR/hadoop/hadoop-0.20.2/conf:$CLASSPATH"
      
       for jar in "$BASEMAPR"/hadoop/hadoop-0.20.2/lib/*.jar; do
         if [ "`echo $jar | grep slf4j`" != "" ]; then
           continue
         fi
         CLASSPATH="$CLASSPATH:$jar"
       done
      fi
    2. Replace the asynchbase jar file (provide the current jar file name in the cp command):

      cp /opt/mapr/asynchbase/asynchbase-1.6.0/asynchbase-1.6.0-mapr-*.jar \
         /usr/share/opentsdb/lib/
      rm -f /usr/share/opentsdb/lib/asynchbase-1.5.0.jar
    3. Configure the opentsdb.conf files:

      /usr/share/opentsdb/etc/opentsdb/opentsdb.conf
      /etc/opentsdb/opentsdb.conf

      These files must have the following settings:

      tsd.network.port = 4242
      tsd.http.staticroot = /usr/share/opentsdb/static/
      tsd.core.auto_create_metrics = false (for testing purposes only)
      tsd.storage.hbase.data_table = /tsdb
      tsd.storage.hbase.uid_table = /tsdb-uid
      tsd.storage.hbase.zk_quorum = <zookeeperNode>:<zookeeperP>
    4. Edit the <OPENTSDB_ROOT_INSTALL_DIR>/src/create_table.sh file and add "/" before the table names so that MapR recognizes them as MapR-DB tables. 

      Then create tables in MapR-DB:

      export COMPRESSION=NONE; export HBASE_HOME=/opt/mapr/hbase/hbase-0.98.9; /usr/share/opentsdb/tools/create_table.sh

      See create_table.sh Updates.

       

    5. Confirm that the tables are created:

      hadoop fs -ls /
      tr--------   3 root root            2 2014-12-12 01:47 /tsdb
      tr--------   3 root root            2 2014-12-12 01:47 /tsdb-meta
      tr--------   3 root root            2 2014-12-12 01:47 /tsdb-tree
      tr--------   3 root root            2 2014-12-12 01:47 /tsdb-uid
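This check can also be scripted. The following is a sketch that assumes the listing format shown above; it is fed a canned listing here so it runs without a cluster:

```shell
# Verify that the four OpenTSDB table paths appear in an ls listing.
check_tsdb_tables() {
  for t in /tsdb /tsdb-uid /tsdb-tree /tsdb-meta; do
    printf '%s\n' "$1" | grep -q "[[:space:]]$t\$" || { echo "missing: $t"; return 1; }
  done
  echo "all tables present"
}

listing='tr--------   3 root root  2 2014-12-12 01:47 /tsdb
tr--------   3 root root  2 2014-12-12 01:47 /tsdb-meta
tr--------   3 root root  2 2014-12-12 01:47 /tsdb-tree
tr--------   3 root root  2 2014-12-12 01:47 /tsdb-uid'

check_tsdb_tables "$listing"
# In practice: check_tsdb_tables "$(hadoop fs -ls /)"
```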
  5. Start the tsd daemon. Give the tsdb script in /usr/share/opentsdb/bin executable permissions, or run tsdb directly (it picks up the dependencies you added earlier).

    chmod +x /usr/share/opentsdb/bin/tsdb
    /usr/share/opentsdb/bin/tsdb tsd --port=4242 \
      --staticroot="/usr/share/opentsdb/static/" \
      --cachedir="/tmp/opentsdb" --auto-metric
  6. Create a metric: /usr/share/opentsdb/bin/tsdb mkmetric mymetric.stock
  7. Test the metric:
    1. Run a test program that reads from a tmp_input file and sends put requests to OpenTSDB, which saves the data to the MapR-DB tables (/tsdb, /tsdb-uid).
    2. Run aggregation queries (such as SUM) from the command line:
      /usr/share/opentsdb/bin/tsdb query 1y-ago sum mymetric.stock
      or
      tsdb query 1y-ago sum mymetric.stock
    3. Check the results.

Test Program

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.PrintWriter;
import java.net.Socket;

public static void testTSDBConnection() throws Exception {
    String hostname = "10.10.10.220"; // replace with the node where tsd runs
    int port = 4242;                  // replace with your port
    Socket sock = new Socket(hostname, port);
    PrintWriter pw = new PrintWriter(sock.getOutputStream(), true);
    // Read put lines from ./tmp_input and forward each one to the TSD.
    File dir = new File(".");
    File fin = new File(dir.getCanonicalPath() + File.separator + "tmp_input");
    BufferedReader br = new BufferedReader(new FileReader(fin));
    String line;
    while ((line = br.readLine()) != null) {
        System.out.println(line);
        pw.println(line);
        pw.flush();
    }
    br.close();
    pw.close();
    sock.close();
}


tmp_input File

=====
put mymetric.stock 1407165399 196.30 symbol=VOD.L
put mymetric.stock 1407165399 484.20 symbol=BP.L
put mymetric.stock 1407165401 224.15 symbol=BARC.L
put mymetric.stock 1407165402 196.30 symbol=VOD.L
put mymetric.stock 1407165403 484.15 symbol=BP.L
put mymetric.stock 1407165404 224.15 symbol=BARC.L
put mymetric.stock 1407165405 196.30 symbol=VOD.L
put mymetric.stock 1407165405 484.15 symbol=BP.L
put mymetric.stock 1407165406 224.15 symbol=BARC.L
put mymetric.stock 1407165407 196.30 symbol=VOD.L
put mymetric.stock 1407165408 484.15 symbol=BP.L
put mymetric.stock 1407165409 224.15 symbol=BARC.L
put mymetric.stock 1407165410 196.30 symbol=VOD.L
put mymetric.stock 1407165411 484.15 symbol=BP.L
put mymetric.stock 1407165412 224.15 symbol=BARC.L
put mymetric.stock 1407165413 196.30 symbol=VOD.L
put mymetric.stock 1407165414 484.15 symbol=BP.L
put mymetric.stock 1407165415 224.15 symbol=BARC.L
put mymetric.stock 1407165416 196.30 symbol=VOD.L
put mymetric.stock 1407165417 484.15 symbol=BP.L
put mymetric.stock 1407165417 224.15 symbol=BARC.L
put mymetric.stock 1407165418 196.30 symbol=VOD.L
put mymetric.stock 1407165419 484.15 symbol=BP.L
put mymetric.stock 1407165422 224.15 symbol=BARC.L
put mymetric.stock 1407165422 196.30 symbol=VOD.L
put mymetric.stock 1407165423 484.255 symbol=BP.L
====
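As a rough cross-check of the SUM results below, you can total the put values per timestamp with awk. Note this only approximates the tsdb output: OpenTSDB's sum aggregator interpolates series that have no point at a given timestamp, and the stored values show single-precision rounding (for example 680.500015 rather than 680.50), so exact matches are not expected.

```shell
# Sum the put values per timestamp (field 3 = timestamp, field 4 = value).
sum_by_ts() {
  awk '$1 == "put" { sum[$3] += $4 }
       END { for (t in sum) printf "%s %.6f\n", t, sum[t] }' "$1" | sort
}

# Demo on the first two lines of the tmp_input above:
cat > /tmp/tmp_input_demo <<'EOF'
put mymetric.stock 1407165399 196.30 symbol=VOD.L
put mymetric.stock 1407165399 484.20 symbol=BP.L
EOF
sum_by_ts /tmp/tmp_input_demo
# → 1407165399 680.500000
```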

Expected Results of SUM query

====
mymetric.stock 1407165399000 680.500015 {}
mymetric.stock 1407165401000 904.625000 {}
mymetric.stock 1407165402000 904.612495 {}
mymetric.stock 1407165403000 904.599991 {}
mymetric.stock 1407165404000 904.599991 {}
mymetric.stock 1407165405000 904.599991 {}
mymetric.stock 1407165406000 904.599991 {}
mymetric.stock 1407165407000 904.599991 {}
mymetric.stock 1407165408000 904.599991 {}
mymetric.stock 1407165409000 904.599991 {}
mymetric.stock 1407165410000 904.599991 {}
mymetric.stock 1407165411000 904.599991 {}
mymetric.stock 1407165412000 904.599991 {}
mymetric.stock 1407165413000 904.599991 {}
mymetric.stock 1407165414000 904.599991 {}
mymetric.stock 1407165415000 904.599991 {}
mymetric.stock 1407165416000 904.599991 {}
mymetric.stock 1407165417000 904.599991 {}
mymetric.stock 1407165418000 904.599991 {}
mymetric.stock 1407165419000 904.599991 {}
mymetric.stock 1407165422000 904.678749 {}
mymetric.stock 1407165423000 484.255005 {}
====