MapR Object Store with S3-Compatible API

Starting with MapR Ecosystem Pack (MEP) 6.0.0, MapR Object Store with S3-Compatible API (MapR Object Store) is included in MEP repositories. The MapR Object Store provides an S3 gateway to data in the MapR Data Platform.

The MapR Object Store manages all inbound S3 API requests to store data in or retrieve data from a MapR cluster. The MapR Data Platform stores data objects in the form of folders and files: folders correspond to S3 buckets, and files correspond to S3 data objects. A bucket is a logical container that groups data objects. A data object can be of any data type, but the S3-compatible API call must identify it by a unique name.

You can use the S3-compatible API to create, list, or delete a bucket. You can also use it to get, put, list, or delete a data object within a bucket.
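The folder-to-bucket and file-to-object correspondence can be illustrated with a small toy model. This is only a sketch of the mapping described above, not the MapR Object Store implementation: the class name ToyObjectStore and all of its method names are hypothetical, it works on a local directory rather than the MapR File System, and it assumes flat object names without path separators.

```python
import tempfile
from pathlib import Path


class ToyObjectStore:
    """Toy model: a bucket is a folder, a data object is a file inside it."""

    def __init__(self, backend_dir):
        # Hypothetical backend directory that holds one folder per bucket.
        self.root = Path(backend_dir)

    # Bucket operations: create, list, delete.
    def create_bucket(self, bucket):
        (self.root / bucket).mkdir()

    def list_buckets(self):
        return sorted(p.name for p in self.root.iterdir() if p.is_dir())

    def delete_bucket(self, bucket):
        # rmdir raises OSError if the folder still contains files.
        (self.root / bucket).rmdir()

    # Object operations: put, get, list, delete.
    def put_object(self, bucket, name, data):
        (self.root / bucket / name).write_bytes(data)

    def get_object(self, bucket, name):
        return (self.root / bucket / name).read_bytes()

    def list_objects(self, bucket):
        return sorted(p.name for p in (self.root / bucket).iterdir() if p.is_file())

    def delete_object(self, bucket, name):
        (self.root / bucket / name).unlink()
```

For example, creating a store on a directory from tempfile.TemporaryDirectory(), calling create_bucket("logs"), and then put_object("logs", "a.txt", b"hi") leaves a folder named logs that contains a file named a.txt.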

The MapR Object Store also supports object notification through MapR Streams. See Using Kafka Streams for S3 Bucket Event Notifications.

The following image depicts an inbound S3 API request from a web application in the cloud to the MapR Data Platform:

How Object Tiering Differs from the MapR Object Store

The Object Tiering feature leverages its own outbound version of the S3 API to directly store archived data in the cloud. For more information about object tiering, see Data Tiering.

The following image depicts the outbound S3 API request to archive data in the cloud:

S3 Deployment Mode

The MapR Object Store supports only Amazon S3 standalone deployment mode because each instance of the MapR Object Store can interact with only one bucket or set of buckets at a time.

When you use the S3 API in standalone mode, each MapR Object Store instance must have its own backend directory in the MapR File System. You can either map a volume mount point to the directory or use the directory path itself. An S3 instance exclusively uses the allocated directory or volume in the MapR File System to serve an exclusive set of buckets.
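When planning several instances, it can help to check that no two of them would be given overlapping backend directories. The following is a minimal sketch of such a check, not a MapR tool: the function name find_backend_conflicts, the instance names, and the example paths are hypothetical, and treating a nested directory as a conflict is an assumption drawn from the exclusivity requirement above.

```python
from pathlib import PurePosixPath


def find_backend_conflicts(backends):
    """Return pairs of instance names whose backend directories overlap.

    `backends` maps a (hypothetical) instance name to its backend directory.
    Two directories overlap if they are equal or one contains the other.
    """
    conflicts = []
    names = sorted(backends)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            a = PurePosixPath(backends[first])
            b = PurePosixPath(backends[second])
            if a == b or a in b.parents or b in a.parents:
                conflicts.append((first, second))
    return conflicts
```

A plan such as {"gw1": "/s3/a", "gw2": "/s3/b"} yields no conflicts, while {"gw1": "/s3/a", "gw2": "/s3/a/sub"} reports the pair ("gw1", "gw2").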

If you need to migrate buckets to another S3 instance, you can move or copy the buckets to another directory or volume. See AWS CLI. If a bucket does not exist, an application can create a bucket through any of the S3 gateways; however, the bucket created will only be served through that gateway.

The following deployment scenario shows one MapR Object Store per cluster that supports multiple applications and multiple buckets with bucket sharing.

This scenario is useful when you want an application to access multiple buckets without knowing about bucket locations beforehand. The single MapR Object Store instance serves all requests without the need to partition any buckets.

The following deployment scenario shows two instances of the MapR Object Store in a cluster that supports multiple applications and buckets. Each instance serves its own set of buckets; bucket sharing across S3 instances is not supported.

Authorization to Access Data

By default, the MapR Object Store provides a two-tier authorization model that starts with an S3 bucket policy check at the S3 REST API level, followed by a file permissions check on the MapR Data Platform.

The following image shows the two tiers of authorization:

When a MapR Object Store instance receives a request from a tenant to access a bucket or object, it first checks for bucket policies that reference that particular tenant. If the tenant does not have access via the bucket policy, the request fails and no other checks are performed.

If the tenant has access via the bucket policy, the MapR File System performs the next check using the mapped UID and GID credentials for the tenant.
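The order of the two checks can be sketched as follows. This is an illustrative model only, not the MapR Object Store code: the names FileEntry, file_allows, and authorize, the shape of the bucket_policies and tenant_ids dictionaries, and the use of POSIX-style mode bits for the file permissions check are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    owner_uid: int
    group_gid: int
    mode: int  # POSIX-style permission bits, for example 0o640


def file_allows(entry, uid, gid, want_write):
    """Tier 2 (assumed model): owner, then group, then other permission bits."""
    bit = 0o2 if want_write else 0o4
    if uid == entry.owner_uid:
        return bool(entry.mode & (bit << 6))
    if gid == entry.group_gid:
        return bool(entry.mode & (bit << 3))
    return bool(entry.mode & bit)


def authorize(tenant, bucket, want_write, bucket_policies, tenant_ids, entry):
    """Run the bucket policy check first; stop there if it fails."""
    action = "write" if want_write else "read"
    # Tier 1: bucket policies that reference this tenant.
    allowed_actions = bucket_policies.get(bucket, {}).get(tenant, set())
    if action not in allowed_actions:
        return "denied: bucket policy"
    # Tier 2: file permissions check with the tenant's mapped UID and GID.
    uid, gid = tenant_ids[tenant]
    if not file_allows(entry, uid, gid, want_write):
        return "denied: file permissions"
    return "allowed"
```

In this model, a tenant that no bucket policy references is rejected before its UID and GID are ever looked up, which mirrors the rule that no other checks are performed once the bucket policy check fails.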

The Configuring the MapR Object Store with S3-Compatible API topic describes how to modify the type of authorization, configure tenants and credentials, and secure data. For a configuration example, see Multi-Tiered Authorization Example.