Read MapR Streams Using a Kafka Source

The following parameters are required when configuring a Kafka source to read data from MapR Streams topics:
Property Name Description
channels The name of the channel that the source writes events to.
type For Flume 1.7, this must be set to org.apache.flume.source.kafka.v09.KafkaSource. For Flume 1.6, this must be set to org.apache.flume.source.kafka.v09.KafkaSource.
kafka.consumer.group.id A unique identifier of the consumer group. This defaults to flume. Setting the same id in multiple sources or agents indicates that they are part of the same consumer group; when these sources or agents are running at the same time, they will each read a unique set of partitions for the topics in the group.
kafka.topics A comma-separated list of MapR Streams topics, where each topic is specified with the volume path and stream name. For example: /volume_path/stream_name:topic_name1, /volume_path/stream_name:topic_name2.
Note: The path to the topic must start with a slash (/), because the slash is what identifies the topic as a MapR Streams topic.
kafka.topics.regex
Note: Flume 1.7 only
Regular expression that defines the set of topics the source is subscribed to. This property overrides kafka.topics. Example: /streaming_data/flume_stream:topic[0-9]$

For additional properties that you may want to configure, see the Flume documentation. However, note that kafka.bootstrap.servers is not required for reading MapR Streams.
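
As an illustration, the required properties might be combined in an agent configuration file along the following lines. This is a minimal sketch, not a complete agent: the agent, source, and channel names (a1, r1, c1) and the stream path /sample_volume/sample_stream are hypothetical placeholders, the type value shown is the one listed above for Flume 1.6, and the sink definition is omitted.

```properties
# Hypothetical agent "a1" with one Kafka source "r1" and one memory channel "c1".
a1.sources = r1
a1.channels = c1

# Kafka source reading from MapR Streams (type value as listed for Flume 1.6).
a1.sources.r1.type = org.apache.flume.source.kafka.v09.KafkaSource
a1.sources.r1.channels = c1

# Sources that share this id belong to the same consumer group.
a1.sources.r1.kafka.consumer.group.id = flume

# Each topic starts with a slash so that it is treated as a MapR Streams topic.
# The stream path and topic names below are placeholders.
a1.sources.r1.kafka.topics = /sample_volume/sample_stream:topic1, /sample_volume/sample_stream:topic2

# Flume 1.7 only: subscribe by regular expression instead (overrides kafka.topics).
# a1.sources.r1.kafka.topics.regex = /sample_volume/sample_stream:topic[0-9]$

# Note: kafka.bootstrap.servers is intentionally not set.

# Simple in-memory channel that receives the events.
a1.channels.c1.type = memory
```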

Tip: To increase throughput, set batchSize to a higher value. batchSize is the maximum number of messages written to the channel in one batch. By default, it is set to 1000.
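
A larger batch size could be configured as sketched below. The agent, source, and channel names (a1, r1, c1) and the specific values are hypothetical and chosen only for illustration. The example also assumes a memory channel; because a batch is written to the channel in a single transaction, the channel's transactionCapacity generally needs to be at least as large as batchSize.

```properties
# Hypothetical names (a1, r1, c1); the values are illustrative, not recommendations.
# Write up to 5000 messages to the channel per batch (default is 1000).
a1.sources.r1.batchSize = 5000

# Assumes a memory channel: size it so that one full batch fits in a transaction.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 50000
a1.channels.c1.transactionCapacity = 5000
```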