Persistence in the Age of Microservices: Introducing MapR Converged Data Platform for Docker

February 07, 2017 | BY Will Ochandarena

Today we took a big step in our convergence vision by announcing the MapR Converged Data Platform for Docker, which includes the MapR Persistent Application Client Container (PACC). This is noteworthy because the new generation of converged applications is as much about deploying operational, user-facing applications as performing analytics and machine learning, and containers are the future of application deployment. In this post I’ll talk about what exactly persistence means in the new age of microservice-based applications, contrasting with what persistence used to mean for applications.

Traditional Application Persistence

Historically, applications have been written assuming a stable, reliable infrastructure: server, storage, network. As such, it’s been common for applications to treat attached file storage as a state store, creating and maintaining files for configuration, history, user profiles, and working state. In the event the application restarts, it simply loads its working state from storage and picks up where it left off.

And then containers came along, making developers more efficient by offering a consistent operating environment from development to production, and making data centers more efficient with smaller footprints and faster startup times. By default, though, data written to a container's own filesystem does not outlive the container: when a failed container is replaced with a fresh one, all file data it wrote is gone. This can be catastrophic to applications that had saved critical user data or configuration in storage.

The solution? Have your containerized application write state data to a reliable storage layer. By building your application container on top of the MapR PACC, you can connect it to MapR for reliable persistence. With this, containerized applications can survive application failure, hardware failure, or even whole data center failure.
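As a minimal sketch of this idea, the Python below saves and reloads application state from a directory that is assumed to sit on a reliable storage layer mounted into the container. The `STATE_DIR` environment variable, the `app_state.json` file name, and both function names are hypothetical and chosen only for illustration; nothing here is specific to the MapR PACC.

```python
import json
import os
import tempfile
from pathlib import Path

# Assumption: STATE_DIR points at a volume backed by reliable storage.
# It falls back to the system temp directory so the sketch runs anywhere.
STATE_DIR = Path(os.environ.get("STATE_DIR", tempfile.gettempdir()))
STATE_FILE = "app_state.json"  # hypothetical file name


def save_state(state: dict, directory: Path = STATE_DIR) -> Path:
    """Write state as JSON via a temp file and an atomic rename,
    so a crash mid-write never leaves a half-written state file."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / STATE_FILE
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())
    os.replace(tmp_name, target)  # atomic on the same filesystem
    return target


def load_state(directory: Path = STATE_DIR) -> dict:
    """Return the last saved state, or an empty dict on first start."""
    target = directory / STATE_FILE
    if not target.exists():
        return {}
    with target.open() as f:
        return json.load(f)
```

On startup the application would call `load_state()` and continue from whatever it returns; because the directory lives outside the container's own filesystem, a replacement container sees the same file.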

Microservice Persistence

With the rise of microservices, developers are thinking differently about persistence. In this new world, a few things have changed:

  • Operational state isn’t stored in files; it’s stored in a database table, typically one that is dedicated to the microservice (to accomplish isolation of duties).
  • Units of work are no longer transactions; they’re events, typically received via a pub/sub channel from upstream microservices.
  • Application history (logs, metrics) is still produced and has to be persisted somewhere for monitoring. Logs are usually saved to a file, metrics to a stream.

The important point is that these microservices are not “stateless.” They have state - it’s just persisted in a new suite of stores. Persistence in the age of microservices therefore means persisting database tables and event streams, in addition to traditional files. Furthermore, the state is distributed across these stores - the complete state of an application at time X is the collective state of what has been read off of the input stream, what has been stored in the operational database, and what has been sent to the output stream.
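To make the three-part state concrete, here is a small Python sketch of a hypothetical microservice. Plain lists and a dict stand in for the event streams and the database table; the `OrderCounter` class, its fields, and the event shape are all invented for illustration and do not reflect any particular MapR API.

```python
from dataclasses import dataclass, field


@dataclass
class OrderCounter:
    """Hypothetical microservice that counts orders per customer."""
    input_offset: int = 0                       # how far the input stream has been read
    table: dict = field(default_factory=dict)   # stand-in for the service's database table
    output: list = field(default_factory=list)  # stand-in for the output stream

    def process(self, input_stream: list) -> None:
        """Consume every event past the current offset."""
        for event in input_stream[self.input_offset:]:
            customer = event["customer"]
            self.table[customer] = self.table.get(customer, 0) + 1
            self.output.append({"customer": customer, "orders": self.table[customer]})
            self.input_offset += 1

    def snapshot(self) -> dict:
        """The service's complete state: stream position, table, and output sent."""
        return {
            "input_offset": self.input_offset,
            "table": dict(self.table),
            "output_len": len(self.output),
        }


events = [{"customer": "a"}, {"customer": "b"}, {"customer": "a"}]
service = OrderCounter()
service.process(events)
state = service.snapshot()
```

Losing any one of the three pieces breaks recovery: without the offset the service would reprocess or skip events, without the table it would lose its counts, and without the output record downstream services could miss or double-count results. In this sketch all three live in memory, which is exactly what a real deployment would need to persist.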

What happens when your platform doesn’t provide converged persistence? The load falls on application developers to bring their own data services - deploying databases and message brokers alongside their apps. When this happens, it’s the developer’s responsibility to ensure each data service is reliable and scalable, has a DR strategy, and covers all of the other things that are usually solved at the infrastructure level.

With the MapR Converged Data Platform for Docker, persistent data services, especially those appropriate for microservices, are universally available, reliable, and scalable. This means application developers can spend their time writing applications and business logic, instead of managing infrastructure, and all enterprise applications inherit consistent security and reliability from the platform.
