MapR-XD Object Tiering

Extend the Data Fabric to the Cloud with Transparent Tiering of Cold Data

Using Object Tiering, enterprises can address rapid data growth and optimize cluster resources, on-premises or off-premises, by using object storage as a highly economical tier for “cold” or “frozen” data that is rarely accessed. Object Tiering provides policy-based, automated tiering that lets you seamlessly move “inactive” data to and from the cloud.

Reduce the Cost of Aging Data

Enterprises today are looking to leverage cost-effective cloud storage solutions to cut costs, simplify IT management, and gain virtually limitless storage capacity. Object Tiering lets enterprises reserve valuable cluster resources for more active data by moving cold data to cloud storage, where it can be retained at minimal cost. Additionally, customers can eliminate the need for archiving software and cloud gateways that add complexity.

Secure Data

Object Tiering automatically protects data transmitted to cloud storage with built-in data encryption. Once encryption is enabled, all data is encrypted before it leaves the primary cluster, so it is secure both over the network and at rest in cloud storage.

Transparent Cloud Integrations

Once set up, Object Tiering is transparent to users and applications. Administrators have the flexibility to integrate with their choice of public or private cloud that exposes an S3-compatible API (including Amazon S3).

Simple & Automated

Object Tiering can be deployed in minutes and is easy for administrators to set up, configure, and manage using simple policies. Object Tiering policies are dynamic, flexible, and scalable, giving administrators granular control over data placement to meet business objectives. In a given policy, administrators identify the data to be tiered, the criteria for tiering, and the public or private cloud target. For example, Object Tiering may be used to tier all large files from the “hot” tier, all files that have not changed in the last 12 months from the “warm” tier, or all files owned by “Alice”.

One Global Namespace

Object Tiering seamlessly integrates with the Global Namespace functionality. Customers can transparently store hot, warm, and cold data under one global namespace and eliminate the need for multiple silos that create segregated namespaces and operational complexity. With a global namespace, the path name stays intact irrespective of where the data is placed at any given time (e.g., data placed on fast on-premises nodes today can be moved to the cloud a month later as it loses value).


Benefits

  • Reduce the cost of aging data
  • Leverage economical OPEX/CAPEX cloud pricing models
  • Have the flexibility to choose between public or private cloud storage solutions
  • Eliminate app reconfiguration with a single global namespace for hot/warm/cold data
  • Ensure security in the cloud with enterprise-grade encryption
  • Automatically move data, based on policies
  • Reduce operations overhead in managing cold data
  • Eliminate the need for silos or cloud gateways
  • Maintain snapshots and compression benefits, even for cold data

Key Features

File Rules & Policy Engine

Rules and policies are the essential control mechanism for Object Tiering. MapR runs these policies on a regular basis. Each policy specifies the volumes or files to be managed, the actions to take on those files, and the schedule to follow. File-matching criteria are defined as a rule in the system and identify which files must be moved to the cloud. After defining a rule on a volume, administrators must specify the Object Tiering actions to perform on the identified files, including the cloud storage target and encryption.
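The file-matching idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FileInfo`, `TieringRule`, and `select_for_offload` are hypothetical names invented for this example and are not part of any MapR interface. The criteria (file size, time since last change, owner) are the ones mentioned in this document.

```python
from dataclasses import dataclass
from typing import List, Optional

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical file metadata record (not a MapR type)."""
    path: str
    size_bytes: int
    owner: str
    mtime: float  # last-modified time, in seconds since the epoch


@dataclass(frozen=True)
class TieringRule:
    """Hypothetical file-matching rule; criteria left as None are ignored."""
    min_size_bytes: Optional[int] = None
    min_idle_days: Optional[int] = None
    owner: Optional[str] = None

    def matches(self, f: FileInfo, now: float) -> bool:
        # A file matches only if it satisfies every criterion that is set.
        if self.min_size_bytes is not None and f.size_bytes < self.min_size_bytes:
            return False
        if self.min_idle_days is not None:
            idle_days = (now - f.mtime) / SECONDS_PER_DAY
            if idle_days < self.min_idle_days:
                return False
        if self.owner is not None and f.owner != self.owner:
            return False
        return True


def select_for_offload(files: List[FileInfo], rule: TieringRule, now: float) -> List[str]:
    """Return the paths of the files that the rule selects, in input order."""
    return [f.path for f in files if rule.matches(f, now)]
```

For instance, `TieringRule(min_idle_days=365)` would select every file unchanged for a year, and adding `owner="alice"` would narrow that selection to one owner.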

Offloading Files to Cloud

This process involves extracting the data from the file and placing it in one or more cloud objects. In the background, MapR automatically identifies the files to move, based on rules and policies, and then moves the data blocks of these files to cloud storage as objects, leaving all the metadata in place on the local cluster. Files handled this way are referred to as “offloaded files”.
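As a rough mental model of offloading, the sketch below splits a file's data into fixed-size objects and keeps only a small metadata stub locally. Everything here is an assumption made for illustration: the `offload` and `read_back` functions, the `path#index` key scheme, and the use of a plain dictionary in place of a cloud object store. It does not describe MapR's actual on-disk or object layout.

```python
from typing import Dict, List


def offload(path: str, data: bytes, object_size: int, store: Dict[str, bytes]) -> dict:
    """Split data into objects of at most object_size bytes and put them in store.

    Returns a local metadata stub that records where the data went.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    keys: List[str] = []
    for index, start in enumerate(range(0, len(data), object_size)):
        key = f"{path}#{index}"  # hypothetical object-naming scheme
        store[key] = data[start:start + object_size]
        keys.append(key)
    # Only metadata stays on the local cluster in this toy model.
    return {"path": path, "length": len(data), "object_keys": keys}


def read_back(stub: dict, store: Dict[str, bytes]) -> bytes:
    """Reassemble the file's data from its objects, in order."""
    return b"".join(store[key] for key in stub["object_keys"])
```

Because the stub keeps the ordered list of object keys, the original bytes can be rebuilt without the local copy of the data.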

External Cloud Tiers

To leverage cloud storage, administrators must set up one or more accounts with a third-party cloud provider. Once the accounts are set up, Object Tiering allows customers to define these external cloud providers within the MapR Converged Data Platform, so that they can be used in policies for offloading data.

Inline Transparent File Access

Object Tiering enables users and applications to transparently access offloaded data as if it were stored locally. It allows reads, overwrites, and all other filesystem operations on offloaded files without disrupting applications. When a user opens an offloaded file, Object Tiering retrieves the data from the cloud and caches it locally for a given period of time. The user can view and edit the file as usual. Object Tiering automatically sends any updated file data back to the cloud so that the cloud always contains the latest version.
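The access pattern described above resembles a read-through cache with write-back to the cloud. The following toy model shows that pattern only; the `TieredFile` class is hypothetical, a dictionary stands in for cloud storage, and the immediate write to the cloud on every update is a simplification rather than a statement about when MapR actually synchronizes data.

```python
from typing import Dict, Optional


class TieredFile:
    """Toy model of one offloaded file: the cloud holds the data, the cache is local."""

    def __init__(self, cloud: Dict[str, bytes], key: str) -> None:
        self.cloud = cloud
        self.key = key
        self.cache: Optional[bytes] = None  # None means "not cached locally"
        self.cloud_fetches = 0              # counts retrievals, for illustration

    def read(self) -> bytes:
        # Fetch from the cloud only on a cache miss; later reads are local.
        if self.cache is None:
            self.cache = self.cloud[self.key]
            self.cloud_fetches += 1
        return self.cache

    def write(self, data: bytes) -> None:
        # Update the local copy and push the change so the cloud stays current.
        self.cache = data
        self.cloud[self.key] = data

    def evict(self) -> None:
        # Drop the local copy, for example once the caching period ends.
        self.cache = None
```

The caller only ever uses `read` and `write`, which is the sense in which access is transparent: the location of the data never appears in the calling code.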

Snapshots & Mirroring Interoperability

Object Tiering is compatible with all MapR core features. For protection against disaster and accidental deletion, Object Tiering seamlessly integrates with and extends mirroring and snapshot functionality. Snapshot and mirrored data is moved to and retrieved from the cloud explicitly, on an as-needed basis.

Recalling From Cloud

Object Tiering also offers recall functionality, which is designed to bring a portion of the data back into the local cluster as the need arises. When a recall of a volume or file is issued, Object Tiering asynchronously brings a copy of the offloaded data back into the local cluster for local access. Customers can also set the amount of time to keep the recalled data cached locally before it is offloaded to the cloud again.
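The retention setting for recalled data can be pictured as a simple expiry check. The sketch below is hypothetical: `RecallTracker`, its method names, and the use of seconds as the time unit are assumptions made for illustration, not MapR parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RecallTracker:
    """Hypothetical bookkeeping for recalled files and their retention period."""
    retention_seconds: float
    recalled_at: Dict[str, float] = field(default_factory=dict)

    def recall(self, path: str, now: float) -> None:
        # Record when the data was brought back into the local cluster.
        self.recalled_at[path] = now

    def due_for_offload(self, now: float) -> List[str]:
        # Files whose retention period has elapsed, in sorted path order.
        return sorted(
            path
            for path, when in self.recalled_at.items()
            if now - when >= self.retention_seconds
        )
```

A periodic job could call `due_for_offload` and hand the resulting paths back to the offload process.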
