Selecting an Erasure Coding Scheme

An erasure coding scheme is the stripe layout m+n. Each stripe in an erasure coded volume is created by the same number of data fragments from all containers in the group of EC containers.

Containers for erasure coded volumes are created in groups of m+n and each container is placed on a different physical node. The stripe is an array of m data fragments and n parity fragments with a fragment size of 4MB. In the event of disk failure, any m piece can be used to get back the original file. That is:

  • m is the number of data fragments
  • n is the number of redundant fragments (referred to as parity fragments)
  • m/(m+n) is the encoding rate
  • m+n is the number of encoded fragments
For example, suppose m=4 and n=2, stripe depth=4MB. Here:
  • The number of data fragments is 4 and the number of parity fragments is 2.
  • The number of encoded fragments is 6.
  • The stripe size is 16MB (4x4MB) of user data and 8MB (2x4MB) of parity fragments.
  • MapR can handle 2 failures and any chunk can be recovered from any 4 chunks.
The following illustration shows the distribution of data and parity fragments for the above example:


When specifying an erasure coding scheme, keep the following in mind:

  • The number of data fragments must be greater than or equal to 3 and less than or equal to 10.
  • The number of parity fragments must be greater than 1 and less than (not equal to) the number of data fragments.
  • The number of data fragments must be greater than (not equal to) the number of parity fragments.
  • The number of nodes must be greater than or equal to the sum of data and parity fragments.

For example, you can choose from the following schemes for erasure-coded volumes:

EC Scheme Description
CLI MCS
3+<x> 3, 2 Specifies 3 data fragments and parity fragments less than the number of data fragments. Ensure that the required number of nodes, which is calculated by adding the number of data fragments and parity fragments, are available on the cluster.

For example, MCS allows a setting of 3 data fragments and 2 parity fragments. For this scheme, a minimum of 5 nodes (3+2) is required. MapR can tolerate up to 2 failures.

4+2 4, 2 (Default) Specifies 4 data fragments and 2 parity fragments. A minimum of 6 nodes is required for this option. MapR can tolerate up to 2 failures.
5+2 5, 2 Specifies 5 data fragments and 2 parity fragments. A minimum of 7 nodes is required for this option. MapR can tolerate up to 2 failures.
6+3 6, 3 Specifies 6 data fragments and 3 parity fragments. A minimum of 9 nodes is required for this option. MapR can tolerate up to 3 failures.
10+<x> N/A Specifies 10 data fragments and parity fragments less than the number of data fragments. Ensure that the required number of nodes, which is calculated by adding the number of data fragments and parity fragments, are available on the cluster.

For example, suppose a scheme of 10 data fragments and 2 parity fragments. A minimum of 12 nodes (10+2) is required for this scheme. MapR can tolerate up to 2 failures.

Note: In the event of a node failover, the amount of time to rebuild the objects (stripes) on another node depends on the number of data fragments in the chosen erasure coding scheme. For example, rebuild of erasure-coding scheme 10+2 could take more time compared to the erasure-coding scheme 3+2.
Note: Although you can create a volume even without the required number of nodes for a specific scheme, volume offload will fail if the required number of nodes are not present.