A Practical Guide to Microservices and Containers

by James A. Scott

In Search of Agility

This chapter will focus on the concept of agility as it relates to digital transformation. The various aspects of agility – namely data agility, application agility, and infrastructure agility – will be examined in far greater detail in the following chapters. The aim here is to separate agility from the many other buzzwords that flood the IT and business worlds, and to demonstrate the intimate link between agility, digital transformation, and enterprise success. As noted in the introduction, digital transformation is a phrase used frequently today by managers of all stripes, and converged infrastructure is a key on-ramp to it. The most coveted result or ‘output’ of this newly forming infrastructure is agility, which also happens to be one of the more overused terms in both the IT and business suites. Stripped of all else, agility describes how quickly an enterprise can respond to new opportunities and new threats.

In a recent major cloud study1, respondents were asked why cloud solutions were used for a variety of workloads, including email, analytics, big data, application development, and several others. For virtually every workload, the top one or two reasons selected were ‘responding faster to changing business needs.’ In other words, organizations are seeking greater agility from cloud, as well as from other advanced technologies such as big data analytics and containers. Cost savings, which for several years was the top justification offered to higher-ups for cloud investments, is fading in importance as C-level executives grasp the business value of agility.

Data agility

Organizations have traditionally been hamstrung in their use of data by incompatible formats, rigid database limitations, and the inability to flexibly combine data from multiple sources. Users who needed a new report would submit requirements to the IT organization, which would place them in a queue where they might sit for a month or more. Worse still, users had to know in advance precisely what data they needed. Ad hoc queries were permitted only within the confines of an extract database, which often contained incomplete and outdated information, and queries were limited to structured data. Data agility encompasses several components:

  • Business users are freed from rigidity and given the freedom to combine data from multiple sources in an ad hoc manner without long cleansing or preparation times.
  • The path between inquiries and answers is shortened so that decisions can be made on current data.
  • Structured and unstructured data can be combined in meaningful ways without extensive transformation procedures.
  • Data can be combined from both operational and analytical (historic) sources to enable immediate comparisons and to highlight anomalies.
  • Data can be combined from both streaming and static data sources in real time.
  • Users can create their own integrations using visual programming tools, without relying on time-consuming extract/transform/load procedures.
  • New data sources can be quickly integrated into existing analytical models.
  • Schema-less data is supported in flexible formats such as JSON.
  • Complex structures such as JSON documents can be combined with simple key-value constructs and tabular formats.
  • Block-level, file-level, and object data can be combined in the same model.
  • Rich visualization tools enable business users to create graphical representations of data that reveal trends and relationships that would be otherwise hidden.
  • Instead of specifying which data they need, users can access all available data for experimentation and discovery.
  • Users can create and share their own analytical models without disturbing production data.

In a nutshell, data agility is about removing barriers to data usage. The rigid structure of yesterday’s data warehouses made data a precious asset that could cost upwards of $10,000 per terabyte. With Hadoop, those costs fall by more than 90%, which removes many of the cost and technical barriers to data agility. The most formidable barriers at many organizations, however, aren’t technical but cultural. Functional managers may jealously guard data within their groups, believing it to be a source of power, or the IT organization may see itself as data steward and tightly limit access. Appropriate protections should always be applied to sensitive data, of course, but the difference between agility and rigidity often comes down to the organization’s willingness to trust its people to use data responsibly and strategically.

Another example of data agility comes from a major Europe-based telecommunications giant, which collects veritable mountains of data from its far-flung network operations. The mountains are not important in and of themselves; rather, it is the mother lode of information locked within them that matters, as forward-thinking organizations have come to realize. According to an IT manager there, “Our applications allow mobile operators to proactively monitor and improve customer experience. We collect data from the mobile network, then process the collected data to come up with information about individual subscriber experience with mobile services like video usage, web browsing, file transfer, etc. We also create individual subscriber profiles from this data. All of the data is stored in our MapR Converged Data Platform. We want to enable mobile operators to do interactive ad-hoc analysis and build reports using BI tools. This is where Apache Drill comes into the picture. We’re using Drill to enable interactive ad-hoc analysis using BI tools like Tableau and Spotfire on the data we produce. Currently we’re building canned reports using Drill to show how the data we produce can be used to derive insights.”
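As a small illustration of this kind of ad-hoc access, the sketch below uses Python and Drill’s REST API to run a SQL query directly against raw JSON files, with no schema definition or ETL step up front. The host name, file path, and field names are hypothetical, and the query is only meant to suggest the pattern, not to reproduce the operator’s actual reports.

    import requests

    # Apache Drill accepts SQL over a REST endpoint; the drillbit host and
    # the JSON file path below are placeholders for illustration only.
    DRILL_URL = "http://drillbit.example.com:8047/query.json"

    # Query raw JSON subscriber records in place; no schema or ETL step required.
    sql = """
        SELECT subscriber_id,
               AVG(video_quality_score) AS avg_video_quality
        FROM dfs.`/data/subscriber_experience/*.json`
        GROUP BY subscriber_id
        ORDER BY avg_video_quality
        LIMIT 10
    """

    response = requests.post(DRILL_URL, json={"queryType": "SQL", "query": sql}, timeout=60)
    response.raise_for_status()

    for row in response.json().get("rows", []):
        print(row)

The same kind of query can also be issued from a BI tool such as Tableau through Drill’s ODBC/JDBC drivers; the point is that the data is queried where it lands, in whatever shape it arrived.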

Application agility

The containers section in chapter 2 and the fuller treatment of containers in chapter 4 offer a front-row seat to application development agility, which also explains the very rapid adoption of and enthusiasm for containers. To appreciate the value of application agility more fully, it is important first to understand just how completely the application development environment is changing.

As shown in Fig. 3-1 below, this environment has become far more complex and multifaceted in the last decade. The complexity has arisen chiefly from the difficulty of dealing with separate clusters or silos of data, a reality that in some organizations has made application agility almost a misnomer. Clearly, a way is needed to move data seamlessly from one environment to another without having to customize it for each environment. In fact, it is widely believed that within a few years developers leveraging containers will be able to do exactly that – move their test/dev projects into production without major code rewrites to accommodate new data locations, whether on-premises or in the cloud.
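One common way to work toward that portability today is to keep data locations out of the application code entirely and inject them through the container’s environment. The sketch below is a generic illustration of that pattern rather than anything prescribed in this chapter; the variable name, default path, and URI schemes are hypothetical.

    import os

    # Read the data location from the environment so the same container image
    # can run unchanged in test/dev, in on-premises production, or in the cloud.
    # DATA_URI is an invented variable name for this sketch.
    DATA_URI = os.environ.get("DATA_URI", "file:///data/dev/events")

    def open_event_source(uri: str):
        """Return a readable handle for the configured data location."""
        if uri.startswith("file://"):
            return open(uri[len("file://"):], "r")
        # Other schemes (s3://, maprfs://, ...) would be handled by the
        # appropriate client library in a real deployment.
        raise ValueError(f"Unsupported data location: {uri}")

    if __name__ == "__main__":
        with open_event_source(DATA_URI) as events:
            print(events.readline())

The container image never changes between environments; only the environment variable does, which is what lets a test/dev project move into production without a code rewrite.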

Enterprise applications have traditionally been built with a monolithic, vertically integrated structure. All application logic was contained within a single program, which often took months or even years to develop. Detailed functional specs were required and extensive testing and code reviews were part of the process. The resulting application may have fit the stated requirements, but there was little latitude for embellishment. Enhancing an application required the preparation of new functional specifications, extensive testing and more code reviews. Even modest enhancement requests could consume months. The legacy applications used by airlines, banks and credit card processors are typical of these very large, robust, but inflexible programs.

The rapidly changing business and technology landscape of today no longer tolerates this approach. Year-long development schedules create applications that are often irrelevant by the time they are complete. New technologies like bots, automated assistants and mobile devices must be accommodated quickly. New payment mechanisms such as Bitcoin and Ethereum come seemingly out of nowhere to change the way customers conduct transactions. Organizations that want to expose data or services to business partners have no way to do so because those needs weren’t anticipated when the application was built.

Application agility reimagines development and deployment around a modular, loosely coupled, and dynamic architecture. Like Lego blocks, the components of the application can be moved and reassembled as needed, with communication between components handled by a messaging plane. New or enhanced functionality is delivered by revising individual components and slipstreaming them back into the service network without disrupting the application as a whole. These components, or services, can also be stored in a library and shared by multiple applications. Services are called only when needed, which reduces program size and complexity while improving performance.
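The sketch below shows what one of these Lego-block services might look like in Python, using the kafka-python client as one example of a messaging plane; the broker address, topic names, and enrichment rule are all invented for illustration. The service’s only contract with the rest of the application is the message format, so it can be revised and redeployed on its own.

    import json
    from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

    # Consume order events, apply this service's own logic, and publish the
    # result to another topic. Broker and topic names are placeholders.
    consumer = KafkaConsumer(
        "orders",
        bootstrap_servers="broker.example.com:9092",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    producer = KafkaProducer(
        bootstrap_servers="broker.example.com:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    for message in consumer:
        order = message.value
        # This rule can change and the service can be redeployed without
        # touching any other component of the application.
        order["priority"] = "high" if order.get("total", 0) > 1000 else "normal"
        producer.send("orders-enriched", order)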

Figure 3-1. Application Development and Deployment

Application agility also changes the nature of the data that applications process. Less and less of the data feeding leading-edge applications comes from data warehouses or data marts; the growth area is event-based data (Fig. 3-2). Whether it’s collecting machine sensor data to predict and prevent failures, presenting timely offers to customers, or identifying and preventing fraud before it happens, all such use cases are enabled by event-based data flows and a converged platform.
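As a toy illustration of the event-based pattern, the sketch below evaluates each transaction as it arrives rather than analyzing it later in a warehouse. The event fields and the fraud rule are invented for the example; a real pipeline would read from a message stream and apply far richer models.

    from typing import Dict, Iterator

    def transactions() -> Iterator[Dict]:
        """Stand-in for a live event stream (message queue, streaming platform, etc.)."""
        yield {"card": "4111...", "amount": 42.50, "country": "DE"}
        yield {"card": "4111...", "amount": 9800.00, "country": "NG"}

    def is_suspicious(event: Dict) -> bool:
        # Invented rule, purely for illustration.
        return event["amount"] > 5000 and event["country"] not in {"DE", "FR"}

    for event in transactions():
        if is_suspicious(event):
            print("flag for review:", event)  # act before the transaction settles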

Figure 3-2. Event-Based Data Drives Applications

Infrastructure agility

As we noted earlier, converged infrastructure is the key on-ramp to digital transformation. The once-in-a-generation replatforming currently underway is synonymous with infrastructure agility, because enabling next-generation applications requires a next-generation infrastructure.

IT infrastructure is undergoing a transformation that is no less radical than that being seen in data and applications. Virtualization has brought unprecedented flexibility to resource provisioning, a foundation that containers build upon. Software-defined everything is close to becoming a reality. In the future, infrastructure components such as storage, networks, security and even desktops will be defined in software, enabling resources to be quickly reconfigured and re-allocated according to capacity needs.

Agile infrastructure is also highly automated. Processes that once took weeks, such as the introduction of a new storage controller or network switch, can be reduced to minutes with minimal operator intervention. Policy-based automation predicts and adjusts resource requirements automatically. A data-intensive process requiring rapid response can be automatically provisioned with flash storage, while a batch process uses lower-cost spinning disk.
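A policy of this sort might boil down to something as simple as the following sketch. The workload attributes, thresholds, and tier names are purely illustrative; in a real system the decision would feed an orchestration or software-defined storage API rather than a print statement.

    # Illustrative policy rule: choose a storage tier from declared workload
    # attributes. Thresholds and tier names are invented for this sketch.
    def choose_storage_tier(workload: dict) -> str:
        if workload.get("latency_sensitive") and workload.get("iops", 0) > 10_000:
            return "flash"
        if workload.get("type") == "batch":
            return "spinning-disk"
        return "standard"

    print(choose_storage_tier({"type": "interactive", "latency_sensitive": True, "iops": 50_000}))
    print(choose_storage_tier({"type": "batch", "iops": 500}))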

This kind of agility will be essential to achieving comparable nimbleness in applications and data. Users shouldn’t have to care whether their software is running on spinning disk or flash. Networks should automatically re-provision according to capacity needs, so that a videoconference doesn’t break down for lack of bandwidth. Storage will be available in whatever quantity is needed, and containers will spin up fully configured with required services.

Most importantly, the distinctions between on-premises and cloud infrastructure will fade. Open standards and on-premises mirrors of cloud environments, such as Microsoft Azure Stack and Oracle Cloud at Customer, are among the forces moving toward complete cloud transparency. Developers will be able to build on-premises and deploy in the cloud, or vice versa. Users should expect workloads to shift back and forth between environments without their knowledge or intervention. Infrastructure agility is, in effect, infrastructure transparency.

This kind of flexible, self-provisioning infrastructure will be required to support big data and analytics. In that scenario, agile infrastructure includes the following five foundational principles:

  1. Massive, multi-temp, reliable, global storage with a single global namespace.
  2. High-scale, asynchronous, occasionally connected, global streaming data layer that is persistent.
  3. Support for multiple techniques of analytics or compute engines.
  4. Ability to operationalize whatever happens; operational applications combined in the same platform.
  5. Utility grade cloud architecture with DR, workload management, scale-out.

Seen this way, next-generation infrastructure is not an incremental improvement on existing approaches. It is truly a radical replatforming, able to bridge modern applications and legacy systems.

Big data platforms are changing the way we manage data. Legacy systems often require throwing away older data, making tradeoffs about which data to retain, moving large data sets from one silo to another, or spending exorbitant amounts to handle growth. Those practices are fast becoming a thing of the past. Scale, speed, and agility are front and center in modern data architectures designed for big data, while data integrity, security, and reliability remain critical goals. The notion of a ‘converged application’ represents the next generation of business applications, for today and for the future.

1. 451 Research, Voice of the Enterprise: Cloud survey, Q2 2016, https://451research.com/dashboard/customer-insight/voice-of-the-enterprise/voice-of-the-enterprise_cloud