gfsck

Describes how you can use the gfsck command, under the supervision of MapR Support or Engineering, to perform consistency checks and appropriate repairs on a volume, or a volume snapshot.

You can use the gfsck command when the local fsck either repairs or loses some containers at the highest epoch.

For an overview of using the GFSCK command, see Using Global File System Checking.

Permissions Required

Although you need to be the root user to run this command, checking tiering-enabled volumes requires you to be the mapr user.

Syntax

/opt/mapr/bin/gfsck
    [-h] [--help]
    [-c] [--clear]
    [-d] [--debug]
    [-b] [--dbcheck]
    [-r] [--repair]
    [-y] [--assume-yes]
    [-Gquick] [--check-gateway-consistency-only] [-Gfull] [--check-gateway-vcd-consistency] [-Dquick] [--check-object-presence] [-GdelLog] [--delete-gateway-log] [-Dfull] [--check-object-data] [-D] [--crc] [cluster=cluster-name (default=default)]
    [rwvolume=volume-name (default=null)]
    [snapshot=snapshot-name (default=null)]
    [snapshotid=snapshot-id (default=0)]
    [fid=fid (default=null)]
    [cid=cid (default=0)]
    [scanthreads=inode scanner threads count (default:10, max:1000)]

Parameters

Parameter

Description

User who must use this option
-h|--help Prints usage text. N.A
-c|--clear Clears previous warnings before performing the global file system check. N.A
-d|--debug Provides information for debugging. N.A
-b|--dbcheck Checks that every key in a tablet is within that tablet's startKey and endKey range. As this option is I/O intensive, use this option only if you suspect database inconsistency. root
-r --repair Indicates and repairs the inconsistencies detected by -GQuick, -GFull, and -DQuick. root
-y --assume-yes If specified, assumes that containers without valid copies (as reported by CLDB) are deleted automatically. If not specified, gfsck pauses for user input: yes to delete, no to exit gfsck, or ctrl-C to quit. N.A
-Gquick|--check-gateway-consistency-only Checks if the entries in the metadata tables maintained internally for objects and tiers (the mapping between the Virtual Cluster Descriptor (VCD) map and object map) , are consistent, and reports an error if not. mapr
-Gfull|--check-gateway-vcd-consistency Checks if the entries in the metadata tables maintained internally for objects and containers (the mapping between the VCD map and object map, along with the mapping between the VCD map and the MFS meta data), are consistent and reports an error if not. mapr
-Dquick|--check-object-presence Specified with either -Gquick or -Gfull, checks and reports if the object in the metadata tables exists in the tier or not. mapr
-GdelLog|--delete-Gateway-Log Deletes the metalog created by MAST Gateway if it crashes. mapr
cluster Specifies the name of the cluster (default: default cluster) N.A
rwvolume Specifies the name of the volume (default: null) N.A
scanthreads Specifies the number of threads for scanning inodes (default:10, max:1000) N.A
snapshot Specifies the name of the snapshot (default: null) N.A
snapshotid Specifies the snapshot ID (default: 0) N.A

Example (Debug mode)

In debug mode, run the gfsck command on the read/write volume named mapr.cluster.root:

/opt/mapr/bin/gfsck rwvolume=mapr.cluster.root -d

Sample output is as follows:

Starting GlobalFsck:
  clear-mode            = false
  debug-mode            = true
  dbcheck-mode          = false
  repair-mode           = false
  assume-yes-mode       = false
  cluster               = my.cluster.com
  rw-volume-name        = mapr.cluster.root
  snapshot-name         = null
  snapshot-id           = 0
  user-id               = 0
  group-id              = 0

  get volume properties ...
    rwVolumeName = mapr.cluster.root (volumeId = 205374230, rootContainerId = 2049, isMirror = false)

  put volume mapr.cluster.root in global-fsck mode ...

  get snapshot list for volume mapr.cluster.root ...

  starting phase one (get containers) for volume mapr.cluster.root(205374230) ...
    container 2049 (latestEpoch=3, fixedByFsck=false)
    got volume containers map
  done phase one

  starting phase two (get inodes) for volume mapr.cluster.root(205374230) ...
    get container inode list for cid 2049
      +inodelist: fid=2049.32.131224 pfid=-1.16.2 typ=4 styp=0 nch=0 dMe:false dRec: false
      +inodelist: fid=2049.33.131226 pfid=-1.16.2 typ=2 styp=0 nch=0 dMe:false dRec: false
      +inodelist: fid=2049.34.131228 pfid=-1.33.131226 typ=4 styp=0 nch=0 dMe:false dRec: false
      +inodelist: fid=2049.35.131230 pfid=-1.16.2 typ=4 styp=0 nch=0 dMe:false dRec: false
      +inodelist: fid=2049.36.131232 pfid=-1.16.2 typ=4 styp=0 nch=0 dMe:false dRec: false
      +inodelist: fid=2049.38.262312 pfid=-1.16.2 typ=2 styp=0 nch=0 dMe:false dRec: false
      +inodelist: fid=2049.39.262314 pfid=-1.38.262312 typ=1 styp=0 nch=0 dMe:false dRec: false
    got container inode lists (totalThreads=1)
  done phase two

  starting phase three (get fidmaps & tabletmaps) for volume mapr.cluster.root(205374230) ...
    got fidmap lists (totalFidmapThreads=0)
    got tabletmap lists (totalTabletmapThreads=0)
  done phase three

  === Start of GlobalFsck Report ===

  file-fidmap-filelet union --
   2049.39.262314:P     --> primary (nchunks=0)      --> AllOk
    no errors

  table-tabletmap-tablet union --
    empty

  orphan directories --
    none

  orphan kvstores --
    none

  orphan files --
    none

  orphan fidmaps --
    none

  orphan tables --
    none

  orphan tabletmaps --
    none

  orphan dbkvstores --
    none

  orphan dbfiles --
    none

  orphan dbinodes --
    none

  containers that need repair --
    none

  incomplete snapshots that need to be deleted --
    none

  user statistics --
    containers          = 1
    directories         = 2
    kvstores            = 0
    files               = 1
    fidmaps             = 0
    filelets            = 0
    tables              = 0
    tabletmaps          = 0
    schemas             = 0
    tablets             = 0
    segmaps             = 0
    spillmaps           = 0
    overflowfiles       = 0
    bucketfiles         = 0
    spillfiles          = 0

  === End of GlobalFsck Report ===

  remove volume mapr.cluster.root from global-fsck mode (ret = 0) ...

GlobalFsck completed successfully (7142 ms); Result: verify succeeded

To verify if the object is present on the tier, run the gfsck command on the tiering-enabled read/write volume named for_test5:

/opt/mapr/bin/gfsck rwvolume=for_test5 -Gfull -Dquick

Sample output is as follows:

Starting GlobalFsck:
  clear-mode            = false
  debug-mode            = false
  dbcheck-mode          = false
  repair-mode           = false
  assume-yes-mode       = false
  cluster               = Cloudpool19
  rw-volume-name        = for_test5
  snapshot-name         = null
  snapshot-id           = 0
  user-id               = 2000
  group-id              = 2000

  get volume properties ...

  put volume for_test5 in global-fsck mode ...

  get snapshot list for volume for_test5 ...

  starting phase one (get containers) for volume for_test5(16558233) ...  
    got volume containers map

done phase one

  starting phase two (get inodes) for volume for_test5(16558233) ...
    got container inode lists
  done phase two

  starting phase three (get fidmaps & tabletmaps) for volume for_test5(16558233) ...
    got fidmap lists
    got tabletmap lists
    completed secondary index field path info gathering
    completed secondary index consistency check
    Starting DeferMapCheck..
    completed DeferMapCheck
  done phase three

  === Start of GlobalFsck Report ===

  file-fidmap-filelet union --
    no errors

  table-tabletmap-tablet union --
    empty

  containers that need repair --
    none

  user statistics --
    containers          = 6
    directories         = 6
    files               = 1
    filelets            = 2
    tables              = 0
    tablets             = 0

  === End of GlobalFsck Report ===

Putting volume for_test5 in GWgfsckMode . . . . .
        Tier stats----
        numVerifiedVcds         5719
        numVerifiedObjects      32
        numCorruptedVcd         0
        numCorruptedObjects     0
Removing volume for_test5 from GWgfsckMode
  remove volume for_test5 from global-fsck mode (ret = 0) ...

GlobalFsck completed successfully (37039 ms); Result: verify succeeded