MapR 5.0 Documentation : hadoop mfs

The hadoop mfs command performs operations on directories in the cluster. The main purposes of hadoop mfs are to display directory information and contents, to create symbolic links, and to set compression and chunk size on a directory.

Syntax

hadoop mfs
    [ -ln <target> <symlink> ]
    [ -ls <path> ]
    [ -lsd <path> ]
    [ -lsr <path> ] 
    [ -Lsr <path> ] 
    [ -lsrv <path> ]
    [ -lss <path> ]
    [ -setcompression on|off|lzf|lz4|zlib <dir> ]
    [ -setaudit on|off <dir|file|table> ]
    [ -setchunksize <size> <dir> ]
    [ -setnetworkencryption on|off <target> ]
    [ -help <command> ]

Parameters

The normal command syntax is to specify a single option from the following table, along with its corresponding arguments. If compression and chunk size are not set explicitly for a given directory, the values are inherited from the parent directory.

Parameter

Description

-ln <target> <symlink>

Creates a symbolic link <symlink> that points to the target path <target>, similar to the standard Linux ln -s command.

-ls <path>

Lists files in the directory specified by <path>. The hadoop mfs -ls command corresponds to the standard hadoop fs -ls command, but provides the following additional information:

  • Chunks used for each file
  • Server where each chunk resides
  • Whether compression is enabled for each file
  • Whether encryption is enabled for each file
  • Whether audit is enabled (A) or disabled (U) for each file

-lsd <path>

Lists files in the directory specified by <path>, and also provides information about the specified directory itself:

  • Whether compression is enabled for the directory (indicated by z)
  • The configured chunk size (in bytes) for the directory

-lsr <path>

Lists files in the directory and subdirectories specified by <path>, recursively, including dereferencing symbolic links. The hadoop mfs -lsr command corresponds to the standard hadoop fs -lsr command, but provides the following additional information:

  • Chunks used for each file
  • Server where each chunk resides

-Lsr <path>

Equivalent to -lsr, but additionally dereferences symbolic links.

-lsrv <path>

Lists all paths recursively without crossing volume links.

-lss <path>

Lists files in the directory specified by <path>, with an additional column that displays the number of disk blocks per file. Disk blocks are 8192 bytes.

-setcompression on|off|lzf|lz4|zlib <dir>

Turns compression on or off on the directory specified in <dir>, and sets the compression type:

  • on: turns on compression using the default algorithm (LZ4)
  • off: turns off compression
  • lzf: turns on compression and sets the algorithm to LZF
  • lz4: turns on compression and sets the algorithm to LZ4
  • zlib: turns on compression and sets the algorithm to ZLIB

-setaudit on|off <dir|file|table>

Enables or disables auditing of the specified directory, file, or MapR-DB table.

Enabling auditing of a directory does not enable auditing of the files and subdirectories that already exist in that directory; you must enable auditing on each of those separately. However, any new files and subdirectories that you create in the directory are automatically enabled for auditing. See Checking Whether Auditing is Enabled for a Directory, File, or MapR-DB Table.

For operations on the object to be logged, auditing must also be enabled on the cluster and on the volume in which the object is located. See Enabling Auditing for details.

-setchunksize <size> <dir>

Sets the chunk size in bytes for the directory specified in <dir>. The <size> parameter must be a multiple of 65536.

-setnetworkencryption on|off <target>

Sets network encryption on or off for the filesystem object defined in <target>. The cluster encrypts network traffic to or from a file, directory, or MapR table with network encryption enabled.

-help <command>

Displays help for the hadoop mfs command.
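The argument constraints in the table above can be checked before the command is run. The following Python sketch builds argument lists for -setchunksize and -setcompression and rejects values the table rules out. The helper names are hypothetical, and rejecting negative sizes is an assumption; the sketch only constructs the command line and does not execute it.

```python
CHUNK_MULTIPLE = 65536  # <size> must be a multiple of this value, per the table above
COMPRESSION_MODES = ("on", "off", "lzf", "lz4", "zlib")


def setchunksize_args(size, directory):
    """Build the argument list for `hadoop mfs -setchunksize` (hypothetical helper)."""
    # Assumption: negative sizes are never meaningful, so they are rejected too.
    if size < 0 or size % CHUNK_MULTIPLE != 0:
        raise ValueError(f"chunk size {size} is not a multiple of {CHUNK_MULTIPLE}")
    return ["hadoop", "mfs", "-setchunksize", str(size), directory]


def setcompression_args(mode, directory):
    """Build the argument list for `hadoop mfs -setcompression` (hypothetical helper)."""
    if mode not in COMPRESSION_MODES:
        raise ValueError(f"unknown compression setting: {mode!r}")
    return ["hadoop", "mfs", "-setcompression", mode, directory]
```

On a node where the hadoop client is installed, the returned list could be passed to subprocess.run.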

Examples

The hadoop mfs command can be used to view directory contents and file attributes. For example, you can use it to check whether compression is turned off in a directory or mounted volume:

# hadoop mfs -ls /
Found 23 items
vrwxr-xr-x Z U U   3 root root         13 2012-04-29 10:24  268435456 /.rw
               p mapr.cluster.root writeable 2049.35.16584 -> 2049.16.2  scale-50.scale.lab:5660 scale-51.scale.lab:5660 scale-52.scale.lab:5660
vrwxr-xr-x U U U   3 root root          7 2012-04-28 22:16   67108864 /hbase
               p mapr.hbase default 2049.32.16578 -> 2050.16.2  scale-50.scale.lab:5660 scale-51.scale.lab:5660 scale-52.scale.lab:5660
drwxr-xr-x Z U U   3 root root          0 2012-04-29 09:14  268435456 /tmp
               p 2049.41.16596  scale-50.scale.lab:5660 scale-51.scale.lab:5660 scale-52.scale.lab:5660
vrwxr-xr-x Z U A   3 root root          1 2012-04-27 22:59  268435456 /user
               p users default 2049.36.16586 -> 2055.16.2  scale-50.scale.lab:5660 scale-52.scale.lab:5660 scale-51.scale.lab:5660
drwxr-xr-x Z U U   3 root root          1 2012-04-27 22:37  268435456 /var
               p 2049.33.16580  scale-50.scale.lab:5660 scale-51.scale.lab:5660 scale-52.scale.lab:5660 

In the above example, the letter Z in the first flag column indicates LZ4 compression on the directory; the letter U in that column indicates that the directory is uncompressed. In the following example, the listed item is uncompressed (first U), unencrypted (second U), and has auditing disabled (third U).

[root@node1-302 ~]# hadoop mfs -ls /hbase
Found 10 items
drwxr-xr-x U U U   3 mapr mapr          3 2014-05-28 12:05   67108864 /hbase/-ROOT-
                 p 2050.34.3674200  node2-302:5660 node1-302:5660 node3-302:5660
...
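The three single-letter columns in the listings above can be decoded mechanically. The following Python sketch maps each letter to its meaning, using the codes listed in the Output section of this page; the function name decode_flags is hypothetical.

```python
# Letter codes as listed in the Output section of this page.
COMPRESSION = {"U": "uncompressed", "L": "LZF", "Z": "LZ4", "z": "ZLIB"}
ENCRYPTION = {"U": "unencrypted", "E": "encrypted"}
AUDIT = {"U": "disabled", "A": "enabled"}


def decode_flags(compression, encryption, audit):
    """Translate the three flag letters of a listing line into readable text."""
    return {
        "compression": COMPRESSION[compression],
        "encryption": ENCRYPTION[encryption],
        "audit": AUDIT[audit],
    }


# Flags of the /user entry in the first example ("Z U A").
print(decode_flags("Z", "U", "A"))
# {'compression': 'LZ4', 'encryption': 'unencrypted', 'audit': 'enabled'}
```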

Output

When used with -ls, -lsd, -lsr, or -lss, hadoop mfs displays information about files and directories. For each file or directory, hadoop mfs displays a line of basic information followed by lines listing the chunks that make up the file, in the following format:

{mode} {compression} {encryption} {audit} {replication} {owner} {group} {size} {date} {chunk size} {name}
                          {chunk} {fid} {host} [{host}...]
                          {chunk} {fid} {host} [{host}...]
                          ...
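The basic information line can be split into named fields for scripting. The following Python sketch assumes the fields are whitespace-separated and that {date} is printed as two tokens (date and time), as in the examples on this page; the names Entry and parse_entry are hypothetical.

```python
from collections import namedtuple

Entry = namedtuple(
    "Entry",
    "mode compression encryption audit replication owner group "
    "size date time chunk_size name",
)


def parse_entry(line):
    """Parse the first line of a hadoop mfs listing entry into an Entry."""
    # Split at most 11 times so that a name containing spaces stays intact.
    parts = line.split(None, 11)
    if len(parts) != 12:
        raise ValueError(f"unexpected listing line: {line!r}")
    (mode, comp, enc, audit, repl, owner, group,
     size, date, time, chunk_size, name) = parts
    # Replication stays a string because directories may display a dash.
    return Entry(mode, comp, enc, audit, repl, owner, group,
                 int(size), date, time, int(chunk_size), name.rstrip())


entry = parse_entry(
    "drwxr-xr-x Z U U   3 root root          0 2012-04-29 09:14  268435456 /tmp"
)
print(entry.name, entry.compression, entry.chunk_size)
```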

Volume links are displayed as follows:

{mode} {compression} {encryption} {audit} {replication} {owner} {group} {size} {date} {chunk size} {name}
                          {chunk} {target volume name} {writability} {fid} -> {fid} [{host}...]

For volume links, the first fid is the chunk that stores the volume link itself; the fid after the arrow (->) is the first chunk in the target volume.
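The two chunk-line layouts can be told apart by the arrow. The following Python sketch parses either layout into a dictionary, assuming whitespace-separated tokens and volume names without spaces; the function name parse_chunk_line and the dictionary keys are hypothetical.

```python
def parse_chunk_line(line):
    """Parse a chunk line, handling both regular chunks and volume links."""
    tokens = line.split()
    if "->" in tokens:
        # Volume link: chunk, target volume name, writability, fid, "->", fid, hosts
        if tokens.index("->") != 4 or len(tokens) < 6:
            raise ValueError(f"unexpected volume link line: {line!r}")
        return {
            "chunk": tokens[0],
            "target_volume": tokens[1],
            "writability": tokens[2],
            "fid": tokens[3],          # chunk that stores the link itself
            "target_fid": tokens[5],   # first chunk in the target volume
            "hosts": tokens[6:],
        }
    if len(tokens) < 2:
        raise ValueError(f"unexpected chunk line: {line!r}")
    # Regular chunk: chunk, fid, hosts
    return {"chunk": tokens[0], "fid": tokens[1], "hosts": tokens[2:]}
```

For the /hbase entry in the first example, this returns mapr.hbase as the target volume name, 2049.32.16578 as the fid of the link, and 2050.16.2 as the fid in the target volume.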

The following table describes the values:

mode

A text string indicating the read, write, and execute permissions for the owner, the group, and others. See also Managing Permissions.

compression

  • U: uncompressed
  • L: LZF
  • Z (Uppercase): LZ4
  • z (Lowercase): ZLIB

encryption

  • U: unencrypted
  • E: encrypted

audit

  • U: disabled
  • A: enabled

replication

The replication factor of the file (directories display a dash instead)

owner

The owner of the file or directory

group

The group of the file or directory

size

The size of the file or directory

date

The date the file or directory was last modified

chunk size

The chunk size of the file or directory

name

The name of the file or directory

chunk

The chunk number. The first chunk is a primary chunk labeled "p", a 64K chunk containing the root of the file. Subsequent chunks are numbered in order.

fid

The chunk's file ID, which consists of three parts:

  • The ID of the container where the file is stored
  • The inode of the file within the container
  • An internal version number

host

The host on which the chunk resides. When several hosts are listed, the first host is the first copy of the chunk and subsequent hosts are replicas.

target volume name

The name of the volume pointed to by a volume link.

writability

Displays whether the volume is writable.
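The fid entry in the table above lists three parts. The following Python sketch splits a fid into those parts, assuming they are dot-separated integers in the order listed (container ID, inode, version), as the fids in the examples on this page suggest; the function name split_fid is hypothetical.

```python
def split_fid(fid):
    """Split a fid string into (container_id, inode, version) integers."""
    # Assumption: three dot-separated integers, e.g. "2049.41.16596".
    # Any other number of parts raises ValueError during unpacking.
    container_id, inode, version = (int(part) for part in fid.split("."))
    return container_id, inode, version


print(split_fid("2049.41.16596"))  # (2049, 41, 16596)
```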