Stream Design

Streams are created in volumes and contain topics, which in turn contain messages. Security, replication, retention, and compression policies are applied at the stream level.

When designing the architecture, take into account the following factors:
  • Security Permissions (using access-control expressions)

    Security permissions are set at the stream level, and topics inherit the permissions of their stream. See Stream Security for more information.

  • Data Replication

    MapR Event Store For Apache Kafka streams can be replicated to other streams in the same MapR cluster or in different MapR clusters. For example, you can create a backup copy of a stream so that producers and consumers fail over to the backup if the original stream goes offline. See Stream Replication for more information.

  • Data Retention

    The time-to-live of a message is the elapsed time (in seconds) between the publication of the message in a topic in the stream and the expiration of that message. See Time-to-Live for Messages for more information.

  • Data Compression

    Topic messages can be sent to the server compressed. When compression is enabled, the messages are stored compressed, replicated to other containers compressed, and (if stream replication is configured) replicated to replica streams compressed. See Managing Compression for more information.
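
Because all four factors are set once per stream, it can help to capture them in one place before creating anything. The following Python sketch is an illustration only: it builds, but does not run, an argument list for maprcli stream create. The StreamPolicy class, the stream_create_args helper, and the path /pollution/pollution_monitors are hypothetical names, and the -ttl, -compression, and -consumeperm options should be checked against the maprcli reference for your release.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StreamPolicy:
    """Stream-level settings; every topic in the stream shares them."""
    path: str                           # hypothetical stream path inside a volume
    ttl_seconds: Optional[int] = None   # data retention (time-to-live), in seconds
    compression: Optional[str] = None   # assumed values: off, lz4, lzf, zlib
    consumeperm: Optional[str] = None   # access-control expression for consumers


def stream_create_args(policy: StreamPolicy) -> List[str]:
    """Build (but do not run) an argument list for `maprcli stream create`."""
    args = ["maprcli", "stream", "create", "-path", policy.path]
    # Only pass the options that were set, so cluster defaults apply otherwise.
    if policy.ttl_seconds is not None:
        args += ["-ttl", str(policy.ttl_seconds)]
    if policy.compression is not None:
        args += ["-compression", policy.compression]
    if policy.consumeperm is not None:
        args += ["-consumeperm", policy.consumeperm]
    return args


# Example: keep messages for 7 days and compress them with lz4.
policy = StreamPolicy(
    path="/pollution/pollution_monitors",
    ttl_seconds=7 * 24 * 60 * 60,
    compression="lz4",
)
print(" ".join(stream_create_args(policy)))
```

Note that the time-to-live is expressed in seconds, so a 7-day retention period is 7 * 24 * 60 * 60 = 604800.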

Pollution Monitor Example

Suppose that you plan to create the stream pollution_monitors to collect various measurements about pollution levels in European cities. However, during a planning session, the representative from Amsterdam says that her country wants to analyze the data for its cities and would like your company to replicate the data to its own MapR cluster, where its own consumers can read the replicated messages.

In this scenario, you might do the following:
  • Create a stream dedicated to the Netherlands' pollution data, or even one stream for every country you are monitoring. For example, create streams named pollution_monitors_netherlands, pollution_monitors_sweden, pollution_monitors_france, and so on.
  • Within each stream, create a topic for each city in that country. For example, create a topic named amsterdam that contains data from Amsterdam's pollution sensors.
  • Because, in this scenario, the Amsterdam representative also requested stream replication to the Netherlands' own MapR cluster, set up stream replication from your MapR cluster to that cluster. See Managing Stream Replication for information about setting up and managing replication.
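
The layout described in this list can be sketched as a small planning script. This is a minimal illustration, not a definitive procedure: BASE_PATH, the replica path under /mapr/nl.example.com, and the extra cities are hypothetical, and the maprcli stream topic create and maprcli stream replica autosetup command lines are only printed, not executed. Verify the options against the maprcli reference before using them.

```python
from typing import Dict, List

BASE_PATH = "/pollution"  # hypothetical volume mount path on your cluster


def stream_path(country: str) -> str:
    """One stream per country, following the naming used in this example."""
    return f"{BASE_PATH}/pollution_monitors_{country}"


def plan_commands(cities_by_country: Dict[str, List[str]],
                  replicas: Dict[str, str]) -> List[str]:
    """Return the command lines for streams, topics, and replication."""
    commands = []
    for country, cities in cities_by_country.items():
        path = stream_path(country)
        commands.append(f"maprcli stream create -path {path}")
        # One topic per city inside the country's stream.
        for city in cities:
            commands.append(
                f"maprcli stream topic create -path {path} -topic {city}")
        # Replicate only the streams whose owners asked for a copy.
        if country in replicas:
            commands.append(
                f"maprcli stream replica autosetup -path {path} "
                f"-replica {replicas[country]}")
    return commands


cities_by_country = {
    "netherlands": ["amsterdam", "rotterdam"],
    "sweden": ["stockholm"],
}
replicas = {
    # Hypothetical path of the replica stream on the Netherlands' cluster.
    "netherlands": "/mapr/nl.example.com/pollution/pollution_monitors_netherlands",
}
for command in plan_commands(cities_by_country, replicas):
    print(command)
```

Keeping one stream per country means that replication can be configured for the Netherlands alone, without copying other countries' data to that cluster.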

Alternatively, suppose that the Netherlands did not request replication to its own MapR cluster, but you want to restrict access to the pollution data so that consumers can read only the data for their respective countries.

In this scenario, you might do the following:
  • Create streams for each country.
  • Create topics for each city in that country.
  • Set each stream's consumeperm permission so that only consumers associated with that country can read from the stream. See Stream Security for more information about security permissions used with MapR Event Store For Apache Kafka streams. For general information about access-control expressions, see ACE Syntax.
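
As a rough sketch of the last step, the following Python snippet derives a consumeperm access-control expression for each country's stream. It assumes that each country's consumers belong to a dedicated group; the group names, the user alice, the /pollution path, and the helper functions are hypothetical. The expression form (g: for groups, u: for users, | for OR) follows the general ACE syntax, but confirm the details in ACE Syntax before relying on them.

```python
from typing import Dict, Iterable, List


def consume_ace(group: str, extra_users: Iterable[str] = ()) -> str:
    """Build an ACE that admits one group and, optionally, named users."""
    terms = [f"g:{group}"] + [f"u:{user}" for user in extra_users]
    return "|".join(terms)  # "|" combines the terms with OR


def restricted_stream_args(groups_by_country: Dict[str, str]) -> List[List[str]]:
    """Return one `maprcli stream create` argument list per country."""
    result = []
    for country, group in groups_by_country.items():
        path = f"/pollution/pollution_monitors_{country}"  # hypothetical path
        result.append(["maprcli", "stream", "create", "-path", path,
                       "-consumeperm", consume_ace(group)])
    return result


# Hypothetical groups that hold each country's consumers.
groups_by_country = {
    "netherlands": "nl_consumers",
    "sweden": "se_consumers",
}
for args in restricted_stream_args(groups_by_country):
    print(args)
```

The arguments are kept as lists rather than joined into one string so that an expression containing | would not need shell quoting if the commands were later passed to a process launcher.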