April 03, 2014 | BY Dr. Kirk Borne
The explosive growth in data and in big data technologies (that process and transform the data into knowledge) corresponds to a new industrial revolution. The raw materials and the machinery are different from past revolutions, but the fundamental features are not so different – new markets, new opportunities, new tools, and new wealth are being created at a remarkable pace.
It is said that “knowledge is power”. Given that data science and data mining are sometimes referred to as KDD (Knowledge Discovery from Data), it is therefore imperative for business executives to seize the day, to discover the knowledge encoded in their data collections, to become data-driven entrepreneurs, to apply the new data tools (such as Hadoop), and to develop capabilities, products, and branding that tap into this new oil: Big Data!
The World Economic Forum report “Personal Data: The Emergence of a New Asset Class” states the following: “Data records are collected on who we are, who we know, where we are, where we have been, and where we plan to go. Mining and analyzing these data give us the ability to understand and even predict where humans focus their attention and activity at the individual, group and global level. These personal data – digital data created by and about people – are generating a new wave of opportunity for economic and societal value creation.” The revolution is already underway on a worldwide scale, but the best is yet to come. This is because the data volumes are increasing at an exponential rate, and the new tools and technologies are similarly growing in capacity and capabilities.
Up until very recently, the field of high-performance computing (HPC) was primarily devoted to large-scale simulation and modeling efforts, such as weather prediction, hurricane prediction (tracking and intensification), climate change, cosmological evolution, nuclear fusion, aircraft design, and more. Now, a new era of HPC is emerging: data-intensive computing! The development of new computing paradigms is a part of this revolution, including MapReduce and Hadoop for big data processing (e.g., The MapR Big Data Platform), and graph computing architectures for networked (linked, graph) data (e.g., YarcData’s Urika Appliance).
Here are 5 comprehensive sources that deliver excellent insights and opportunities for the new data-driven business seeking competitive advantage in the era of big data and Hadoop:
1. In the article “9 Amazing Ways Big Data Is Used Today to Change the World”, we read about these business applications and opportunities:
- Understanding, targeting, and serving customers
- Understanding and optimizing business processes
- Personal quantification and performance optimization
- Improving health
- Improving sports performance
- Optimizing machine and device performance
- Improving security and law enforcement
- Improving and optimizing cities and countries
- Financial trading
2. In the summary report “Big Data Best Practices”, we see 45 different examples of big data exploitation by businesses and industries of all sorts: entertainment, manufacturing, retail, financial, food services, travel, sports, fashion, politics, gaming, and much more. This report refers to the now famous 2011 McKinsey Global Institute report (Big Data: The Next Frontier for Innovation, Competition, and Productivity) when making this declaration: “there are no [Big Data] best practices. I’d say there are emerging next practices.” Consequently, it is now time for all businesses to get on board that train and exploit big data for competitive advantage.
3. The white paper “25 Data Stories from GNIP” provides another rich compilation of stories that demonstrate the “unlimited value and near limitless application” of big data, with a focus on business growth and competitive advantage through social data exploitation. The use of social media data to drive customer interaction, engagement, conversion, and loyalty has reached a fever pitch. There is no reason for any business to stay outside of this arena – even small data collections (such as customer reviews, analytics of your web logs, and “voice of the customer” in social networks) can yield a rich harvest of insights and actionable intelligence that enable objective data-driven business decisions.
4. An excellent white paper on Marketing Automation by Marketo reveals 7 of the primary functions of a big data project that your business can utilize in order to convert data to knowledge to powerful insights:
- Lead generation
- Lead nurturing and lead scoring
- Relationship marketing
- Cross-sell and up-sell
- Marketing ROI measurement
Applications of data science methods to these tasks within a Hadoop environment will help you refine the “crude oil” of raw data into improved business capabilities, products, and branding.
5. Finally, the modern data-driven business can learn from the lessons and experiences of its corporate peers who are trailblazers in this field. A conference in Boston in June 2014 provides a great opportunity to do exactly that – The Useful Business Analytics Summit – from the team at datadrivenbiz.com. Summit attendees can choose from many different sessions that promote and facilitate data-driven decisions in your organization by helping you to:
- “Make fact-based decisions and maximize ROI - develop a data monetization strategy and implementation plan to extract latent value with Kayak and Angie's List;
- Build a data-driven organization to realize true value from analytics insights with Google, Aetna, and RBS Citizen;
- Understand how big data technologies continue to reshape business analytics and intelligence with Equifax;
- Utilize visualization and dashboards to make data available, understandable and actionable across multiple departments with McGraw-Hill Education and ESPN;
- Bring together the right data from multiple sources to create the most useful business insights with Nokia and JP Morgan;
- Implement business-focused predictive analytics for more strategic planning with Johnson&Johnson and MasterCard;
- Hear examples of using data to sell the right product at the right time at the right price to the right customer with Travelocity; and
- Use your CRM system data to drive successful loyalty programs and identify new products and services with Sears, Toyota and L'Oréal Paris.”
In a previous post, I presented a “Big Data A to Z Glossary of my Favorite Data Science Things”. One of those entries (for the letter H) was Hadoop. If Big Data is the “new oil”, then Hadoop is the new CDU (Crude Oil Distillation Unit) that distills the incoming raw data into various fractions for further processing into refined data products. The distillation process may employ the MapReduce programming model, which performs the initial filtering and sorting of the incoming data during the Map stage. Those distilled products are then processed through aggregate and summary operations (similar to aggregate and ‘group by’ functions in SQL) during the Reduce stage. MapReduce is specifically designed to perform parallel, distributed processing of big data on clusters. Businesses can use this technology to refine their big data collections for competitive advantage. One of the best Hadoop implementations (as determined in an independent evaluation) is the MapR M7 Enterprise Database Edition for Hadoop.
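To make the Map and Reduce stages described above a little more concrete, here is a minimal single-machine sketch in plain Python. It is an illustration of the idea only, not Hadoop's API and not a distributed implementation: the function names (map_stage, shuffle, reduce_stage, run_job) and the "category,amount" record format are assumptions invented for this example. The Map stage filters out malformed records and emits key/value pairs, an intermediate step groups and sorts the pairs by key, and the Reduce stage sums each group, much like a SQL ‘group by’ with SUM.

```python
from collections import defaultdict


def map_stage(records):
    """Map: filter out malformed records and emit (category, amount) pairs.

    The "category,amount" record layout is a hypothetical format chosen
    for this sketch.
    """
    for record in records:
        parts = record.split(",")
        if len(parts) != 2:
            continue  # drop records that do not have exactly two fields
        category, amount = parts
        try:
            yield category.strip(), float(amount)
        except ValueError:
            continue  # drop records whose amount is not numeric


def shuffle(pairs):
    """Group values by key and sort the keys (done by the framework in a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_stage(groups):
    """Reduce: aggregate each group, here a sum per key (like GROUP BY with SUM)."""
    return {key: sum(values) for key, values in groups.items()}


def run_job(records):
    """Chain the three steps into one small job."""
    return reduce_stage(shuffle(map_stage(records)))


if __name__ == "__main__":
    sample = ["books,12.50", "music,7.25", "books,3.00", "bad line", "music,abc"]
    print(run_job(sample))
```

In a real Hadoop cluster the map and reduce functions would run in parallel across many machines, with the framework handling the grouping step and the movement of data between nodes; the sketch runs them sequentially only to show how the data flows from raw records to summarized results.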