Tungsten Replicator 3.0 provides real-time replication from MySQL and Oracle into Hadoop, translating row changes from your transactional store into change data within Hadoop. This change data can either be used directly as change information, or materialised into carbon-copy tables within Hive that mirror the source data.
Tungsten Replicator reads data from the transactional SQL store using either the binary log (MySQL) or Oracle Change Data Capture (CDC), converting individual transactions into row-based events. This row data is then replicated into HDFS as change data in CSV format. Tungsten Replicator includes DDL translation tools to convert MySQL or Oracle DDL into Hive format, and a materialisation process that translates the source transactional table data into corresponding table data within Hive. Tungsten Replicator is open source software; the Hadoop-specific tools are provided separately through GitHub.
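To illustrate the change-data idea, the sketch below parses CSV rows of the kind described above. The column layout here (an opcode such as "I" for insert or "D" for delete, a transaction sequence number, a row id, then the table's column values) is a simplified assumption for illustration; the actual staging-file layout is defined in the Continuent documentation.

```python
import csv
import io

def parse_change_rows(csv_text):
    """Parse simplified change-data CSV rows into dictionaries.

    Assumed layout per row (an illustration, not the exact Tungsten
    format): opcode, seqno, row_id, followed by the table columns.
    """
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        rows.append({
            "opcode": record[0],        # "I" = insert, "D" = delete
            "seqno": int(record[1]),    # transaction sequence number
            "row_id": int(record[2]),   # row id within the transaction
            "columns": record[3:],      # the replicated column values
        })
    return rows

# Example: an insert followed by a delete of the same row.
sample = "I,4501,1,42,Alice\nD,4502,1,42,Alice\n"
for change in parse_change_rows(sample):
    print(change["opcode"], change["seqno"], change["columns"])
```

A materialisation step would then reduce such a stream of inserts and deletes per key down to the latest surviving row for each table.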
| MapR Distribution | 3.0, 3.1 | HDFS API |
A basic outline for installation is:

1. Install a replicator on the source database to extract transactions from the MySQL binary log or Oracle CDC.
2. Install a replicator on the Hadoop side to write the row-based change data into HDFS as CSV files.
3. Use the DDL translation tools to generate the matching Hive table definitions.
4. Run the materialisation process to build the carbon-copy tables within Hive.
The full process is documented at: https://docs.continuent.com/tungsten-replicator-3.0/deployment-hadoop.html
Tungsten Replicator runs as an active background process. As long as the replicator is running, and data changes are being logged in your master transactional database, the replicator will continue to replicate data into Hadoop.
The materialisation process must be run regularly to translate the change data into carbon-copy tables. This can be managed through a simple cron job or an Oozie workflow.
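As a sketch, a cron entry could run the materialisation nightly. The script path and name below are placeholders, not the documented command; substitute the actual materialisation tool and paths from your installation.

```shell
# Hypothetical crontab entry: run the Hive materialisation nightly at 01:00.
# The tool path below is a placeholder; adapt it to your installation.
0 1 * * * /opt/continuent-tools-hadoop/bin/materialise >> /var/log/tungsten-materialise.log 2>&1
```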
Support for open-source users and developers is provided through the mailing list.
Paid support options are available through the Continuent Support portal.