MapR 5.0 Documentation : Getting Started with MapR Native Tables

We'll be working with MapR tables from the Linux shell. Open a terminal by selecting Applications > Accessories > Terminal (see A Tour of the MapR Virtual Machine).

Note: Although this tutorial was originally designed for users of the MapR Virtual Machine, you can easily adapt these instructions for a node in a cluster, for example by using a different directory structure.

In this tutorial, we'll create a MapR table on the cluster, enter some data, query the table, then clean up the data and exit.

MapR tables are organized by column, rather than by row. Furthermore, the columns are organized in groups called column families. Within a column family, each column is identified by a name known as its qualifier. When creating a MapR table, define the column families before inserting any data. Changing your column families can be difficult after creating the table, so it is important to think carefully about what column families will be useful for your particular data. Each column family can contain a very large number of columns. Columns are named using the format family:qualifier.
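As a small illustration of the family:qualifier format, the sketch below splits a column name into its two parts with plain shell parameter expansion. The helper names column_family and column_qualifier, and the column stage:alpha, are hypothetical and not part of any MapR tool.

```shell
# Hypothetical helpers, not part of MapR: split a column name of the
# form family:qualifier at the first colon.

# Everything before the first colon is the column family.
column_family() {
    printf '%s\n' "${1%%:*}"
}

# Everything after the first colon is the qualifier.
column_qualifier() {
    printf '%s\n' "${1#*:}"
}

column_family stage:alpha
column_qualifier stage:alpha
```

Splitting at the first colon means a qualifier that itself contains a colon is kept whole.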

In a MapR table, columns don't exist for rows where they have no values, a quality called sparseness. Sparse tables save space, and different rows can have different columns. Use whatever columns you need for your data on a per-row basis.
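Sparseness can be pictured with a toy model in the shell. This is only a sketch of the idea, not how MapR stores data: each cell is one line of text, and a row has no line at all for a column it does not use. The row keys, columns, and values are made up for illustration.

```shell
# Toy model only: one line per cell, in the order
# row key, column (family:qualifier), value.
# All names and values here are hypothetical.
cells='row1 stage:alpha done
row1 status:owner alice
row2 stage:beta pending'

# Print the columns that actually exist for the given row key.
columns_for_row() {
    printf '%s\n' "$cells" | awk -v row="$1" '$1 == row { print $2 }'
}

columns_for_row row1
columns_for_row row2
```

Here row1 and row2 carry different columns, and nothing is stored for the columns a row lacks.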

Before you start: The user directory

MapR tables are stored natively in your cluster's filesystem, just as files are. The virtual machine's cluster is mounted over NFS in the /mapr/ directory. The cluster already has a /user directory in it. Make a mapr directory for the MapR user under the cluster's /user directory (cluster path /user/mapr) with this command:

$ mkdir /mapr/

Now MapR can track the activity on the tables you create.
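Commands such as mkdir and ls work on the NFS mount, while the HBase shell and maprcli take cluster paths such as /user/mapr. The sketch below maps one to the other, assuming the common layout in which a cluster is mounted at /mapr/ followed by the cluster name. The function name nfs_path and the cluster name my.cluster.com are placeholders; substitute the name of your own cluster.

```shell
# Hypothetical helper: map a cluster path to its location under the NFS
# mount, assuming the layout /mapr/<cluster name>/<cluster path>.
nfs_path() {
    cluster_name=$1   # placeholder, for example my.cluster.com
    cluster_path=$2   # path inside the cluster, for example /user/mapr
    printf '/mapr/%s%s\n' "$cluster_name" "$cluster_path"
}

# Print the local path for the MapR user's directory.
nfs_path my.cluster.com /user/mapr
```

The printed path can then be passed to mkdir or ls on the machine where the cluster is mounted.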

Example: Creating a MapR Table

You can create a MapR table from the HBase shell, from the MapR Control System, or with the MapR CLI. The following sections give detailed instructions for each method.

With the HBase shell

This example creates a table called development in directory /user/mapr with a column family called stage, using system defaults. In this example, we first start the HBase shell from the command line with hbase shell, and then use the create command to create the table. After creating the table, we use the alter command to add a second column family called status.

$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.92.2, rUnknown, Mon Dec 17 09:23:31 PST 2012

hbase(main):001:0> create '/user/mapr/development', 'stage'
hbase(main):002:0> alter '/user/mapr/development', {NAME => 'status'}
  1. Type get '/user/mapr/development', 'row1' to display the contents of the row whose key is row1. Sample output:
    COLUMN                CELL
     stats:daily          timestamp=1321296699190, value=test-daily-value
     stats:weekly         timestamp=1321296715892, value=test-weekly-value
    2 row(s) in 0.0330 seconds
  2. Type drop '/user/mapr/development' to drop the table and delete all data.
  3. Type exit to exit the HBase shell.
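A get only returns data once cells have been written, which is normally done with the HBase shell's put command before the table is dropped. The following sketch prints a short sequence of put and get commands for the stage and status column families. The function name, row key, qualifiers, and values are hypothetical, and piping the commands into hbase shell assumes your version of the shell reads commands from standard input.

```shell
# Hypothetical helper: print HBase shell commands that write two cells
# to one row and then read the row back. The row key, qualifiers, and
# values are made up for illustration.
hbase_demo_commands() {
    table=$1
    cat <<EOF
put '$table', 'row1', 'stage:name', 'alpha'
put '$table', 'row1', 'status:owner', 'alice'
get '$table', 'row1'
EOF
}

# Review the generated commands first. To run them, pipe them into the
# HBase shell (assumes the shell accepts commands on standard input):
#   hbase_demo_commands /user/mapr/development | hbase shell
hbase_demo_commands /user/mapr/development
```

Printing the commands before running them makes it easy to check the table path and column names.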
With the MapR Control System
  1. From the terminal, create the directory /analysis/tables under the cluster's /user directory with the following command:
    $ mkdir /mapr/
  2. In the MCS Navigation pane under the MapR Data Platform group, click Tables. The Tables tab appears in the main window.
  3. Click the New Table button.
  4. Type a complete path for the new table: /user/analysis/tables/table01
  5. Click OK. The MCS displays a tab for the new table.

The screen-capture below demonstrates the creation of a table table01 in location /user/analysis/tables/.

To add a column family with the MapR Control System:

  1. In the MCS Navigation pane under the MapR Data Platform group, click Tables. The Tables tab appears in the main window.
  2. Find the table you want to work with, using one of the following methods.
    • Scan for the table under Recently Opened Tables on the Tables tab.
    • Enter a regular expression for part of the table pathname in the Go to table field and click Go.
  3. Click the desired table name. A Table tab appears in the main MCS pane, displaying information for the specific table.
  4. Click the Column Families tab.
  5. Click New Column Family. The Create Column Family dialog appears.
  6. Enter values for the following fields:
    • Column Family Name - Required.
    • Max Versions - The maximum number of versions of a cell to keep in the table.
    • Min Versions - The minimum number of versions of a cell to keep in the table.
    • Compression - The compression algorithm used on the column family's data. Select a value from the drop-down. The default value is Inherited, which uses the same compression type as the table. Available compression methods are LZF, LZ4, and ZLib. Select OFF to disable compression.
    • Time-To-Live - The minimum time-to-live for cells in this column family. Cells older than their time-to-live stamp are purged periodically.
    • In memory - Preference for a column family to reside in memory for fast lookup.

You can change any column family properties at a later time using the MCS or maprcli table cf edit from the command line.

The screen-capture below demonstrates the creation of a column family userinfo in the table at location /user/analysis/tables/table01.

With the MapR CLI
  1. Use the maprcli table create command at a command line. For details, type maprcli table create -help at a command line. The following example demonstrates creation of a table table02 in cluster location /user/analysis/tables/. The cluster is mounted at /mapr/.
    $ maprcli table create -path /user/analysis/tables/table02
  2. List the tables in the directory to verify that table02 was successfully created:
    $ ls -l /mapr/
    lrwxr-xr-x 1 mapr mapr 2 Oct 24 16:14 table01 -> mapr::table::2056.62.17034
    lrwxr-xr-x 1 mapr mapr 2 Oct 24 16:13 table02 -> mapr::table::2056.56.17022
  3. Use the maprcli table listrecent command to show recent table activity.
    $ maprcli table listrecent
  4. Add a column family with the maprcli table cf create command. For details see table cf create or type maprcli table cf create -help at a command line. The following example demonstrates addition of a column family named casedata in table /user/analysis/tables/table01, using lzf compression, and keeping a maximum of 5 versions of cells in the column family.
    $ maprcli table cf create -path /user/analysis/tables/table01 \
        -cfname casedata -compression lzf -maxversions 5
    $ maprcli table cf list -path /user/analysis/tables/table01
    inmemory  cfname    compression  ttl  maxversions  minversions
    true      userinfo  lz4          0    3            0
    false     casedata  lzf          0    5            0

You can change the properties of a column family with the maprcli table cf edit command.
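As a sketch of such an edit, the wrapper below builds a maprcli table cf edit command that changes the maximum number of versions kept for a column family. It assumes that cf edit accepts the same -path, -cfname, and -maxversions options shown for cf create above; confirm this with maprcli table cf edit -help. The function name cf_edit and the DRY_RUN variable are hypothetical. By default the command is only printed, not executed.

```shell
# Hypothetical wrapper: build a "maprcli table cf edit" command.
# Assumes cf edit takes -path, -cfname, and -maxversions like cf create.
# By default the command is only printed; set DRY_RUN=0 to execute it.
cf_edit() {
    tbl_path=$1
    cf_name=$2
    max_versions=$3
    # Store the full command in the positional parameters.
    set -- maprcli table cf edit -path "$tbl_path" \
        -cfname "$cf_name" -maxversions "$max_versions"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Print the command that would keep at most 3 versions in casedata.
cf_edit /user/analysis/tables/table01 casedata 3
```

After running the real command, maprcli table cf list on the same path shows whether the new value took effect.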