Data tiers on MapR let you store, manage, and analyze data in different tiers, based on performance, cost, and capacity trade-offs, regardless of the underlying physical storage infrastructure. Different data sets have different characteristics and therefore place different requirements on the underlying data platform. With MapR data tiers, you can segregate data and balance performance, cost, and capacity requirements.
We are in the exabyte era, where most data has been generated in the past few years or is being generated daily. More and more organizations are embracing a continuous analysis model for their business decisions, which requires them to handle different types of data with varying SLAs. As organizations store more historical data, the characteristics of that data change: data that is active today becomes less frequently used over time. Data platforms must therefore offer capabilities that cover the whole data life cycle.
MapR data tiers allow you to store, manage, and analyze your ever-growing data based on different SLAs. MapR introduces a three-tiered approach to placing and managing data, made up of a performance tier, a capacity tier, and an archive tier:
In the performance tier, use replicas to protect and spread your extremely active data across the cluster, knowing it’s always available. Associate the tier with all-flash storage to achieve maximum performance and availability. For example, an organization building a machine learning (ML) pipeline will invariably have large volumes of training data that are frequently accessed and updated for building initial and subsequent models. Depending on the rate at which the training data is updated, replicating and storing it on a high-performance tier will accelerate ML training jobs, so that models are built faster.
In the capacity tier, apply erasure coding to protect data while maximizing capacity efficiency. The cost of storing large volumes of data can vary widely, but with a MapR capacity tier, you can plan on reducing your overall cost of managing data. In the same ML example, large volumes of training data require efficient storage that still allows continuous updates and ingestion of new data.
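To make the capacity-efficiency argument concrete, the following is a minimal sketch that compares the raw storage needed by replication with that needed by erasure coding. The three-copy replication factor and the 4+2 erasure coding scheme are illustrative assumptions, not values taken from this document, and the function names are hypothetical rather than part of any MapR API.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per usable byte when every byte is kept `copies` times."""
    return Fraction(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw bytes stored per usable byte for a data+parity erasure coding scheme."""
    return Fraction(data_fragments + parity_fragments, data_fragments)


def raw_capacity_tb(usable_tb: int, overhead: Fraction) -> float:
    """Raw capacity (TB) required to hold `usable_tb` of user data."""
    return float(usable_tb * overhead)


# Illustrative comparison for 100 TB of user data (assumed values).
usable_tb = 100
print(raw_capacity_tb(usable_tb, replication_overhead(3)))  # 300.0
print(raw_capacity_tb(usable_tb, erasure_overhead(4, 2)))   # 150.0
```

Under these assumptions, the erasure-coded layout needs half the raw capacity of three-way replication for the same amount of user data, which is the trade-off a capacity tier is meant to exploit.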
In the archive tier, use the MapR integration with S3-compatible APIs to store data long-term. Typically used for archival purposes, a MapR archive tier allows you to move data to the cloud or to any low-cost S3-compatible store, with the ability to quickly bring the data back into an active, operational mode.
Migrate data and manage its movement across the tiers with policies. For example, a policy can move data once it is older than a certain number of days.
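The age-based rule mentioned above can be sketched as a simple selection step. This is only an illustration of the idea, not MapR's policy engine or its configuration syntax: `FileRecord`, `select_for_tiering`, and the example paths are hypothetical names introduced here, and the sketch assumes that age is measured from the last modification time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical description of a file that a tiering policy evaluates."""
    path: str
    last_modified: datetime


def select_for_tiering(
    files: Iterable[FileRecord], max_age_days: int, now: datetime
) -> List[str]:
    """Return paths of files last modified more than `max_age_days` before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    return [f.path for f in files if f.last_modified < cutoff]


# Example: with a 7-day rule, only the file last touched on January 1 qualifies.
now = datetime(2024, 1, 31)
files = [
    FileRecord("/data/training/batch-001", datetime(2024, 1, 1)),
    FileRecord("/data/training/batch-030", datetime(2024, 1, 30)),
]
print(select_for_tiering(files, max_age_days=7, now=now))
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the rule deterministic and easy to test against a fixed date.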
Associate any tier with all-flash storage or spinning disks to form a combination of highly performant or highly dense tiers. For long-term archiving, tier data to any S3-compatible object store, whether in a public cloud or from an on-premises third-party vendor supported by MapR, for a highly efficient configuration. MapR lets you mix and match these tiers within a single cluster, simplifying provisioning and management of data.