Geo-Distribution of Big Data and Analytics

Data Where You Want It

Many organizations have begun to rethink the strategy of allowing regional teams to maintain independent databases that are periodically consolidated with the head office. As businesses extend their reach globally, these hierarchical approaches no longer work. Instead, an enterprise’s entire data infrastructure—including multiple types of data persistence—needs to be shared and updated everywhere at the same time with fine-grained control over who has access.

This practical report examines the requirements and challenges of constructing a geo-distributed data platform and gives examples of specific technologies designed to meet them. Authors Ted Dunning and Ellen Friedman also provide real-world use cases that show how low-latency geo-distribution of very large-scale data and computation provides a competitive edge.

With this report, you’ll explore:

  • How replication and mirroring methods for data movement provide the large scale, low latency, and low cost that systems demand
  • The importance of multimaster replication of data streams and databases
  • Advantages (and disadvantages) of cloud neutrality, cloud bursting, and hybrid cloud architecture for transferring data
  • Why effective data governance is a complex process that requires the right tools for controlling and monitoring geo-distributed data
  • How to make containers work for geo-distributed data at scale, even where stateful applications are involved
  • Use cases that demonstrate how telecoms and online advertisers distribute large quantities of data