MapR 5.0 Documentation: Using Apache Phoenix on HBase

Apache Phoenix Overview

Apache Phoenix is an SQL layer on top of Apache HBase that enables you to run low-latency SQL queries against HBase, a NoSQL data store. Phoenix is delivered as a client-embedded JDBC driver and uses native HBase APIs instead of MapReduce.

You can use standard JDBC clients, such as SQuirreL, to connect to Phoenix and query HBase data, or you can access Phoenix from SQLLine, a pure-Java, console-based utility for connecting to relational databases and executing SQL commands. SQLLine serves as the shell for Phoenix.

When you issue a query to Phoenix, the Phoenix query engine transforms the query into one or more native HBase scans. The query engine executes the scans in parallel to produce regular JDBC result sets. During query execution, Phoenix uses the HBase API, HBase coprocessors, and custom filters to optimize query performance.

Installation Overview

To install Phoenix, download and extract the Phoenix archive file. Copy the required Phoenix JAR files to the classpath of each HBase region server, and then restart the region servers. Copy the Phoenix client JAR file to the classpath of the Phoenix client.

After you complete the installation steps, invoke SQLLine to run Phoenix. When you invoke SQLLine, specify the Zookeeper node or Zookeeper quorum so that SQLLine knows which cluster to connect to.

Prerequisites

  • Apache Phoenix version 3.0 requires HBase 0.94.x.
  • Apache Phoenix version 4.0 requires HBase 0.98.1 or later.

HBase Master and HBase RegionServer packages are prerequisites for installing and using Apache Phoenix on a MapR cluster. Make sure these packages are installed before following the procedure to install Phoenix. See Installing HBase for details.
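
For example, you can confirm that the packages are present before you proceed. The following check is a minimal sketch that assumes RPM-based nodes and the standard MapR HBase package names; use dpkg -l instead of rpm -qa on Debian or Ubuntu nodes:

    # List the installed MapR HBase packages; expect entries such as
    # mapr-hbase-master and mapr-hbase-regionserver.
    rpm -qa | grep mapr-hbase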

Installing Phoenix

Complete the following steps to install Phoenix:

  1. Download the Phoenix archive file associated with the version that you want to install:
    http://phoenix.apache.org/download.html
  2. Run the following command to extract the content of the archive file to a directory:
    tar -xf phoenix-<version>-incubating.tar.gz
  3. Add the phoenix-<version>-incubating-client-minimal.jar and the phoenix-core-<version>-incubating.jar files to the classpath of every HBase region server.

    For example, navigate to the directory where you extracted phoenix-3.0.0-incubating.tar.gz, and copy the phoenix-3.0.0-incubating-client-minimal.jar and phoenix-core-3.0.0-incubating.jar files into the /opt/mapr/hbase/hbase-<version>/lib directory. (A consolidated copy sequence is sketched after this procedure.)

    Note: The phoenix-3.0.0-incubating-client-minimal.jar and phoenix-core-3.0.0-incubating.jar files are located in phoenix-3.0.0-incubating/common.

  4. Run the following command to restart all of the HBase region servers:
    maprcli node services -hbregionserver restart -nodes <hostname>
  5. Optionally, you can run the following command to verify that the Phoenix JARs were copied to the HBase classpath:
    hbase classpath | grep 'phoenix'
  6. On client-only nodes, add the phoenix-<version>-incubating-client.jar to the classpath of the Phoenix client (only for use with Hadoop 1.x).

    For example, copy the following JAR file:

    <phoenix installation directory>/phoenix-<version>-incubating/hadoop-1/phoenix-<version>-incubating-client.jar

    to

    /opt/mapr/hadoop/hadoop-<version>/lib
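
The following shell sketch consolidates the copy commands from steps 3 and 6. It is an example only: it assumes that you extracted the Phoenix 3.0.0 archive into your current directory, and you must replace the <version> placeholders with the HBase and Hadoop versions installed on your nodes.

    # Example only: replace <version> with your installed HBase and Hadoop versions.
    # Step 3: copy the server-side JARs to the HBase lib directory on every region server.
    cp phoenix-3.0.0-incubating/common/phoenix-3.0.0-incubating-client-minimal.jar /opt/mapr/hbase/hbase-<version>/lib/
    cp phoenix-3.0.0-incubating/common/phoenix-core-3.0.0-incubating.jar /opt/mapr/hbase/hbase-<version>/lib/

    # Step 6: copy the client JAR on client-only nodes (Hadoop 1.x).
    cp phoenix-3.0.0-incubating/hadoop-1/phoenix-3.0.0-incubating-client.jar /opt/mapr/hadoop/hadoop-<version>/lib/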

Now you can start Phoenix.

Starting Phoenix

To start Phoenix, complete the following steps:

  1. Optionally, run the following command to identify your Zookeeper quorum:
    maprcli node listzookeepers
    Example output: centos1:5181,centos2:5181,centos3:5181
  2. Navigate to the phoenix-3.0.0-incubating directory:

    cd phoenix-3.0.0-incubating
  3. Run the following command, indicating a Zookeeper node, to launch SQLLine:
    bin/sqlline.py <hostname>:<port>
    Example: bin/sqlline.py centos1:5181

When SQLLine starts, it displays the following prompt:
0: jdbc:phoenix:<hostname>:<port>>
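
You can also pass the comma-separated Zookeeper quorum from step 1 instead of a single node, because Phoenix accepts a list of Zookeeper hosts in the connection string. For example, with the placeholder hostnames used above:

    bin/sqlline.py centos1,centos2,centos3:5181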

Now you can use Phoenix to query and manage HBase tables. 

Phoenix Example

In this example, you will use Phoenix to create a STOCK_SYMBOL table, load data into the table, and then query the table. 

SQLLine must be running to complete the steps in this example.

To try out Phoenix, complete the following steps:

  1. At the 0: jdbc:phoenix:<hostname>:<port> prompt, issue the following SQL statement to drop the STOCK_SYMBOL table in case it exists from a previous run (on a new cluster, the statement returns a table-not-found error that you can ignore):
    DROP TABLE STOCK_SYMBOL;

  2. Create a table named STOCK_SYMBOL with two columns, SYMBOL and COMPANY.
    CREATE TABLE STOCK_SYMBOL (SYMBOL VARCHAR NOT NULL PRIMARY KEY, COMPANY VARCHAR);
  3. Insert data into the STOCK_SYMBOL table.
    UPSERT INTO STOCK_SYMBOL VALUES ('Hadoop','MapR'); 
  4. Query the STOCK_SYMBOL table. 

    SELECT * FROM STOCK_SYMBOL;

    The query returns the following results:

    +------------+------------+
    |   SYMBOL   |  COMPANY   |
    +------------+------------+
    | Hadoop     | MapR       |
    +------------+------------+
    1 row selected (0.098 seconds)
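
To run the same statements non-interactively, you can use the psql.py utility that ships with Phoenix, which executes .sql files (and loads .csv data files) against the cluster. The following is a minimal sketch that assumes the statements above are saved in a hypothetical stock_symbol.sql file and reuses the example Zookeeper node from the previous section:

    # stock_symbol.sql is a hypothetical file containing the statements shown above.
    bin/psql.py centos1:5181 stock_symbol.sql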

More Information

For more information about Apache Phoenix, refer to the Apache Phoenix documentation: http://phoenix.incubator.apache.org/