Architect's Guide to Implementing a Digital Transformation

by George Demarest

Introduction to Digital Transformation

This document provides enterprise architects, IT architects, and other IT strategists with guidance on how organizations can progress through the stages of becoming a data-driven business. It describes four phases of the journey toward a digital transformation.

The Four Phases:

Phase 1. Experimentation

Phase 2. Implementation

Phase 3. Expansion

Phase 4. Optimization

Where Did These Phases Come From?

These phases represent discernible patterns that we have observed from hundreds of engagements with MapR customers. While the phases are presented sequentially, individual experiences can vary significantly depending on the commitment of your organization, whether you have a motivated executive sponsor, external pressure from competitors in your industry, and other factors such as budget, politics, and culture. MapR customer examples (both named and anonymized) will be used to illustrate key points of the big data journey.

The rapid growth of big data development is the result of major shifts in the broader information technology space. Growing data volumes, accelerating adoption of open source technologies, and the trend toward shorter application development cycles have all contributed to the big data phenomenon. The rate of growth in the volume, variety, and velocity of data is accelerating. In the fixed Internet of the 90s, there were 1 billion Internet connections. With the mobile Internet of the 2000s, that grew to 6 billion. And by 2020, Cisco projects 50 billion connections with the Internet of Things.

Digital Transformation, Open Source, and Cloud Economics

Legacy systems were never designed to handle this scale of data, yet a true digital transformation demands that you find a way. What many IT executives are discovering is that legacy practices, and more importantly, legacy economics, are being challenged by new digital platforms that exploit open source tools and cost-effective distributed computing. Couple that with the ability to develop these new applications on premises or in the cloud on virtualized infrastructure, and you have new financial models for IT that could be termed cloud economics.

More and more application architects and developers are asking questions like:

  • Does my application really need an RDBMS?
  • Is there a free or open source alternative to the commercial software I am using?
  • Can I run a mission-critical application without any commercial software?

In our own customer base, there are dozens of examples of re-platforming applications or analytics from legacy platforms such as mainframes, data warehouses/RDBMSs, and premium storage arrays. The Hadoop/Spark ecosystem first grew in popularity because it provided an economical way to store massive amounts of data and do bulk processing on these large data sets.
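To make that bulk-processing pattern concrete, the sketch below shows the kind of job that drove early Hadoop/Spark adoption: raw event files landed on inexpensive distributed storage and aggregated in parallel, with no commercial RDBMS in the pipeline. It is a minimal PySpark illustration only; the paths, column names, and aggregation are assumptions made for the example, not details from any MapR customer engagement.

  # Minimal PySpark sketch (illustrative only): bulk aggregation of raw event
  # files stored on a distributed file system such as HDFS or MapR-FS.
  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("bulk-event-aggregation").getOrCreate()

  # Read a large set of raw JSON event files straight from commodity storage;
  # no upfront schema design or load into an RDBMS is required.
  events = spark.read.json("hdfs:///data/raw/events/")

  # Aggregate across the full data set in parallel on the cluster.
  daily_counts = (events
                  .withColumn("day", F.to_date("timestamp"))
                  .groupBy("day", "event_type")
                  .count())

  # Write results back to inexpensive distributed storage for downstream analytics.
  daily_counts.write.mode("overwrite").parquet("hdfs:///data/curated/daily_event_counts")

  spark.stop()

The same storage and compute scale out on commodity hardware, which is where the cost advantage over premium storage arrays and licensed database engines comes from.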

A once-in-30-years shift is underway. Legacy enterprise applications, message buses, middleware, RDBMSs, and expensive specialized compute/storage systems are being replaced by commodity hardware armed with big data applications.

This proliferation of new data, new tools, and new thinking offers organizations tremendous opportunities to reach and serve customers through new application architectures. The growing utility and influence of machine learning, artificial intelligence, advanced analytics, data engineering, statistical analysis, and data science vastly expand the lexicon of “business intelligence”. Such a sea change in the approach to data-driven business processes suggests an opportunity to reimagine the role of IT in the business.

The Progression of Big Data Use Cases

While every organization has a unique entry point into the world of big data, we have observed some fairly consistent patterns of use case development from our customer engagements. During the early stages of big data adoption, IT departments sensibly pursue easy wins that deliver cost savings, build valuable practical experience, and lay the groundwork for more ambitious and sophisticated projects.

The graphic below roughly outlines how use case development progresses in most customer environments. There are, of course, exceptions. Typically, those exceptions stem from the customer having a critical use case or competitive situation that accelerates adoption of the technology. Security analytics, fraud detection, marketing, and Internet of Things (IoT) projects can fall into this category.

Figure: Data implementation progress model for businesses

While use cases are a key measure of the maturity and sophistication of a digital transformation, this document presents a broader set of indicators that are meant to inform enterprise architects, IT leadership, and application architects. While it is difficult to generalize the economic benefits of big data or the MapR Converged Data Platform, we have hundreds of examples of customers achieving significant, sometimes radical, cost savings, new revenue streams, and high returns on investment. See the IDC report.

Finally, the fast pace of change in the big data ecosystem, the great strides being made in data engineering and data science, and the growing number of “data-driven” business and IT leaders mean that the phases described below will also evolve over time. This version reflects conditions on the ground today and is based entirely on real data and experience from MapR customers to date.