In Search of Data Agility: What It Actually Means and How to Attain It

Contributed by Jim Scott

Editor’s Note: This is an excerpt from the book, “A Practical Guide to Microservices and Containers: Mastering the Cloud, Data, and Digital Transformation” – you can download the ebook here.

Businesses pursuing digital transformation are, above all, in search of agility. There are three areas of agility on which to focus: data agility, application agility, and infrastructure agility. The aim here is to separate agility from the many other buzzwords that flood the IT and business worlds and to demonstrate the intimate link between agility, digital transformation, and enterprise success.

Digital transformation is a phrase that managers of all stripes overuse and misuse throughout organizations today. Converged infrastructure is a key on-ramp to digital transformation. The most coveted result or ‘output’ of this newly forming infrastructure is agility, which also happens to be one of the more overused terms in both the IT and business suites. Stripped of all else, agility describes how quickly an enterprise can respond to new opportunities and new threats. Do you want your business to steer like a cruise ship or like a speedboat that can turn on a dime?

In a recent major cloud study, respondents were asked why they used cloud solutions for a variety of workloads, including email, analytics, big data, application development, and several others. For virtually every workload, ‘responding faster to changing business needs’ was among the top two reasons selected. In other words, organizations are seeking greater agility from cloud as well as from other advanced technologies, including big data analytics and containers. Cost savings, which for several years was the top justification offered to the C-level for cloud investments, is fading in importance as those executives grasp the business value of agility.

Data Agility

Organizations have traditionally been hamstrung in their use of data by incompatible formats, rigid database limitations, and the inability to flexibly combine data from multiple sources. Users who needed a new report would submit requirements to the IT organization, which would place them in a queue, where they might sit for a month or more. Worse, users had to know in advance precisely what data they needed. Ad hoc queries were only permitted within the confines of an extract database, which often contained incomplete and outdated information. Queries were limited to structured data.

Data agility encompasses several components:

  • Business users are free to combine data from multiple sources in an ad hoc manner without long cleansing or preparation times.
  • The path between inquiries and answers is shortened so that decisions can be made on current data.
  • Structured and unstructured data can be combined in meaningful ways without extensive transformation procedures.
  • Data can be combined from both operational and analytical (historic) sources to enable immediate comparisons and to highlight anomalies.
  • Data can be combined from both streaming and static data sources in real time.
  • Users can create their own integrations using visual programming tools without relying on time-consuming extract/transform/load procedures.
  • New data sources can be quickly integrated into existing analytical models.
  • Schemaless data is supported in flexible formats like JSON.
  • Complex structures, such as JSON documents, can be combined with simple key-value constructs and tabular formats.
  • Block-level, file-level, and object data can be combined in the same model.
  • Rich visualization tools enable business users to create graphical representations of data that reveal trends and relationships that would be otherwise hidden.
  • Instead of specifying which data they need, users can access all available data for experimentation and discovery.
  • Users can create and share their own analytical models without disturbing production data.
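
To make a few of these components concrete, here is a minimal sketch of combining a tabular source with schemaless JSON records in an ad hoc manner, without a separate transformation step. Everything in it is an assumption made for illustration: the sample data, the field names (customer_id, plan, usage, mb), and the helper functions are hypothetical and do not come from any particular platform.

```python
import csv
import io
import json

# Hypothetical tabular source: customer accounts exported as CSV.
ACCOUNTS_CSV = """customer_id,plan
1001,basic
1002,premium
"""

# Hypothetical schemaless source: one JSON document per line.
# Fields vary from record to record; the last one has no "usage" at all.
EVENTS_JSONL = """{"customer_id": "1001", "service": "video", "usage": {"mb": 512}}
{"customer_id": "1002", "service": "web", "usage": {"mb": 40, "pages": 12}}
{"customer_id": "1001", "service": "web"}
"""


def load_accounts(text):
    """Index the tabular rows by customer_id for quick lookups."""
    return {row["customer_id"]: row for row in csv.DictReader(io.StringIO(text))}


def load_events(text):
    """Parse each non-empty line as an independent JSON document."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def usage_by_plan(accounts, events):
    """Join JSON events to tabular accounts and total megabytes per plan."""
    totals = {}
    for event in events:
        account = accounts.get(event["customer_id"])
        if account is None:
            continue  # skip events with no matching account row
        # Tolerate missing fields instead of requiring a fixed schema.
        mb = event.get("usage", {}).get("mb", 0)
        totals[account["plan"]] = totals.get(account["plan"], 0) + mb
    return totals


accounts = load_accounts(ACCOUNTS_CSV)
events = load_events(EVENTS_JSONL)
totals = usage_by_plan(accounts, events)
print(totals)
```

The point of the sketch is that the JSON records are read as they arrive, and missing fields are handled at query time rather than by forcing every record into a schema up front.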

In a nutshell, data agility is about removing barriers to data usage. The rigid structure of yesterday’s data warehouses made data a precious asset that could cost upwards of $10,000 per terabyte. With Hadoop, those costs fell by more than 90%, which removes many of the cost and technical barriers to enabling data agility.

One example of data agility comes from a major European telecommunications company, which collects veritable mountains of data from its far-flung network operations. The mountains are not important in and of themselves. Rather, what matters is the mother lode of information locked within them, as forward-thinking organizations have come to realize. According to an IT manager there:

“Our applications allow mobile operators to proactively monitor and improve customer experience. We collect data from the mobile network, then process the collected data to come up with information about individual subscriber experience with mobile services like video usage, web browsing, file transfer, etc. We also create individual subscriber profiles from this data. All of the data is stored in our MapR Converged Data Platform. We want to enable mobile operators to do interactive ad hoc analysis and build reports using BI tools. This is where Apache Drill comes into the picture. We’re using Drill to enable interactive ad hoc analysis using BI tools like Tableau and Spotfire on the data we produce. Currently, we’re building canned reports using Drill to show how the data we produce can be used to derive insights.”

The most formidable barriers to data agility at many organizations aren’t technical but cultural. Functional managers may jealously guard data within their groups to maintain their own little fiefdoms, believing it to be a source of power, or the IT organization may see itself as the data steward and tightly limit access. Appropriate protections should always be applied to sensitive data, of course, but the difference between agility and rigidity often comes down to the organization’s willingness to trust its people to use data responsibly and strategically. As always, if you have any questions or comments, please put them in the comments section below.


This blog post was published February 05, 2018.