Editor’s Note: This is the fourth blog post in a five-part series that describes how modern enterprises struggle to handle data and make it available to applications without creating new silos, and how MapR solves these challenges by introducing a new layer of abstraction called dataware. The previous three blog posts can be found here (#1), here (#2), and here (#3).
Persistence, that is, the storage and retrieval of data in applications, is not as simple as it used to be. In the past, there was often one database that served an application, usually based on SQL. But in the modern world, where applications have become systems of engagement, bringing together data and services from multiple sources, persistence has become a lot more complex.
Today, the question is where to put all the logic that grabs data from multiple sources and makes it useful for an application. If you put it in the application, the application becomes more complex and less efficient, and a lot of code ends up duplicated. Dataware is a better choice. Dataware is an abstraction layer that allows data to be managed as a first-class enterprise resource, decoupled from any other dependencies. Dataware effectively handles the diversity of data types, data access, and ecosystem tools needed to manage data as an enterprise resource, regardless of the underlying infrastructure and location.
Persistent data has a different life cycle than the applications that rely on it. It’s called persistent for a reason — it must persist and exist outside applications.
With today’s applications and the tasks required of them, much of the work of integrating and retrieving data has to occur at the persistence layer. Ideally, the persistence layer should be a platform that manages data integration and persistence for many applications and allows applications to communicate with one another. That is exactly what dataware enables. In this model, dataware becomes the new persistence layer in the application stack that replaces direct access to the database. Making this shift will have a dramatic impact on application development.
When you look at most enterprise applications, it is rare that the data for an application comes from a single repository. Usually, applications draw on data from many underlying systems. These are systems of record that provide the fundamental truths of the enterprise. The application essentially takes bits of new data, integrates them with those sources of truth, and presents the result for use through automation or a user experience. In the process, it creates new data and writes any changes back into the repositories.
What’s interesting about this structure is that it assumes a lot of data integration must take place. That integration simply should not occur in the application itself, for reasons of both complexity and efficiency. Instead, dataware acts as an intermediary layer that presents the data the application needs, in the form the application needs it.
With dataware, data integration can occur across applications. Dataware integrates the data into objects once and presents those objects to every application that needs them. It can also integrate or create new objects for a specific application. Under the dataware model, the application handles the smallest amount of integration possible. Access to all repositories is handled through a unified approach or API that allows for real-time analytics and processing. A global namespace ensures access to the latest data no matter where the application is running. Multi-tenancy is required for security, as well as to ensure consistency and real-time synchronization across the enterprise without contention when the same data is accessed at the same time.
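To make the idea concrete, here is a minimal sketch in Python of a layer that integrates data once and serves the result to applications through a single interface. It is an illustration only and not MapR's API: the `DataLayer` class, its methods, and the `crm` and `billing` sources are all hypothetical names, and in-memory dictionaries stand in for real systems of record.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


class DataLayer:
    """Hypothetical stand-in for a dataware-style persistence layer."""

    def __init__(self) -> None:
        # Each source is a system of record, modeled here as key -> record.
        self._sources: Dict[str, Dict[str, Record]] = {}
        # Each integrated object lists the sources it is assembled from.
        self._objects: Dict[str, List[str]] = {}

    def register_source(self, name: str, records: Dict[str, Record]) -> None:
        self._sources[name] = records

    def define_object(self, name: str, sources: List[str]) -> None:
        """Describe the integration once, so applications do not repeat it."""
        self._objects[name] = list(sources)

    def get(self, object_name: str, key: str) -> Record:
        """Return the integrated view, assembled from current source data."""
        merged: Record = {}
        for source in self._objects[object_name]:
            merged.update(self._sources[source].get(key, {}))
        return merged

    def put(self, source: str, key: str, changes: Record) -> None:
        """Write changes back to the system of record that owns them."""
        self._sources[source].setdefault(key, {}).update(changes)


# Two assumed systems of record holding different facts about one customer.
layer = DataLayer()
layer.register_source("crm", {"c1": {"name": "Ada"}})
layer.register_source("billing", {"c1": {"balance": 120}})
layer.define_object("customer", ["crm", "billing"])

# Any application asks for the integrated object, not the individual sources.
customer = layer.get("customer", "c1")

# A write goes back through the layer; later reads see the latest data.
layer.put("billing", "c1", {"balance": 90})
updated = layer.get("customer", "c1")
```

In this sketch the application code never joins `crm` and `billing` itself. It only names the `customer` object, which is the division of labor the dataware model argues for. A real platform would also have to address security, multi-tenancy, and synchronization, which this example leaves out.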
New applications can therefore start with most integration done and can add new objects if needed. This type of data availability reduces the complexity of existing applications and accelerates the pace of creating new ones.