If you want to convert source data (which is stored as byte arrays) to Elasticsearch types that MapR-DB supports, you can create each destination index explicitly with Elasticsearch’s create index API and then define the mapping of data types with Elasticsearch’s put mapping API. MapR gateways perform the data conversion.
Here is a list of the data types that gateways can convert your source data into by using this method:
- Binary data, as a base64 representation that can be stored in an index.
- Core Elasticsearch data types
- IP addresses
java.nio.ByteBuffer is used to convert source data to the boolean, byte, double, float, integer, long, short, and date data types. IP addresses and geolocations are passed as strings.
MapR-DB can convert source data to these data types only if the data meets these requirements:
- Boolean values must be represented by single bytes.
- Timestamps must be long integers representing the time in milliseconds since the epoch.
- Geolocations must be pairs of latitude and longitude coordinates or geohash data types encoded as UTF-8 strings.
- IP addresses must be UTF-8 encoded strings.
If your data cannot meet these requirements, then you must write Java routines to tell MapR-DB how to perform custom conversions.
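For source data that does meet these requirements, the following is a minimal Python sketch of how a client might encode values as byte arrays before writing them to a source table. It rests on several assumptions that this page does not confirm: the big-endian byte order that java.nio.ByteBuffer uses by default, 0x01 and 0x00 as the single-byte boolean values, and a "lat,lon" string for geolocations. The function names are hypothetical.

```python
import ipaddress
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_boolean(value: bool) -> bytes:
    # Requirement: a boolean is a single byte.
    # Assumption: 0x01 means true and 0x00 means false.
    return b"\x01" if value else b"\x00"


def encode_timestamp(moment: datetime) -> bytes:
    # Requirement: a long integer holding milliseconds since the epoch.
    # Assumption: 8 bytes, signed, big-endian (the java.nio.ByteBuffer default).
    # `moment` must be timezone-aware so the subtraction is well defined.
    millis = (moment - EPOCH) // timedelta(milliseconds=1)
    return struct.pack(">q", millis)


def encode_geo_point(lat: float, lon: float) -> bytes:
    # Requirement: a latitude/longitude pair encoded as a UTF-8 string.
    # Assumption: the "lat,lon" string form.
    return f"{lat},{lon}".encode("utf-8")


def encode_ip(address: str) -> bytes:
    # Requirement: a UTF-8 encoded string.
    # ip_address() raises ValueError if the text is not a valid IP address.
    ipaddress.ip_address(address)
    return address.encode("utf-8")
```

Values that cannot be produced in these forms fall under the custom-conversion case described above.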
To specify how to convert source data to the Elasticsearch data types that MapR-DB supports for indexing in Elasticsearch, follow these steps for each source table:
- Create the index in Elasticsearch by calling Elasticsearch’s create index API. See Index API in the Elasticsearch documentation.
- Call Elasticsearch’s put mapping API to register specific data-type mapping definitions for the type. When MapR-DB first puts data into the index, it calls Elasticsearch’s get mapping API to retrieve the mapping definitions. See Put Mapping in the Elasticsearch documentation.
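The two steps above can be sketched in Python as plain HTTP requests. This is an illustration under stated assumptions, not a definitive procedure: it assumes an Elasticsearch release that still uses mapping types (as this page does), a cluster reachable at the hypothetical address http://localhost:9200, and hypothetical index, type, and field names. How source-table columns correspond to Elasticsearch field names is not covered here.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # hypothetical cluster address


def create_index_request(index: str) -> urllib.request.Request:
    # Step 1: create index API, PUT /<index>
    return urllib.request.Request(f"{ES_URL}/{index}", method="PUT")


def put_mapping_request(index: str, doc_type: str, properties: dict) -> urllib.request.Request:
    # Step 2: put mapping API, PUT /<index>/_mapping/<type>
    # The body nests the field definitions under the type name.
    body = json.dumps({doc_type: {"properties": properties}}).encode("utf-8")
    return urllib.request.Request(
        f"{ES_URL}/{index}/_mapping/{doc_type}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request: urllib.request.Request) -> dict:
    # Sends the request and returns the parsed JSON response.
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical field names, one per kind of data type discussed on this page.
EXAMPLE_PROPERTIES = {
    "active": {"type": "boolean"},
    "created": {"type": "date"},
    "location": {"type": "geo_point"},
    "client_ip": {"type": "ip"},
    "payload": {"type": "binary"},
}
```

Against a running cluster, calling send(create_index_request("customers")) followed by send(put_mapping_request("customers", "profile", EXAMPLE_PROPERTIES)) would perform both steps for one source table. The request builders are kept separate from send so that the URLs and bodies can be inspected without a cluster.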
What to do next
If you have not done so already, register your Elasticsearch cluster or clusters with your MapR source cluster.
If you have already registered your Elasticsearch cluster, configure replication to types in Elasticsearch.
If you ever change how your source data is mapped to Elasticsearch data types, you must restart the MapR gateways that you are using for indexing. Follow these steps:
- Pause indexing of your MapR-DB source tables. To get a list of the Elasticsearch types that are used for each source table, use the maprcli table replica elasticsearch list command. For each Elasticsearch type, issue the maprcli table replica elasticsearch pause command to pause indexing.
- Restart the MapR gateways that you are using for indexing. See the section "On clusters where gateways are running" in Configuring MapR Gateways for Table Replication or Indexing.
- Resume indexing by issuing the maprcli table replica elasticsearch resume command for each Elasticsearch type in which you are indexing your data.