5 Big Data Production Examples in Healthcare


Editor's Note: This blog post is an excerpt from the MapR Guide to Big Data in Healthcare. To read more, you can download it here.

Healthcare costs are driving the demand for big data-driven healthcare applications. Technology decision-makers in healthcare systems cannot ignore the increased efficiencies, the attractive economics, and the rapid pace of innovation that can now be applied to delivering and paying for healthcare. Many are finding that new standards and incentives for the digitizing and sharing of healthcare data — along with improvements and decreasing costs in storage and parallel processing on commodity hardware — are causing a big data revolution in healthcare with the goal of better care at lower cost.

The healthcare industry can benefit immensely from the use of advanced analytics and big data technologies, and the MapR Converged Data Platform offers the perfect solution. In this post, we will look at 5 big data production examples in healthcare.

1. UnitedHealthcare: Fraud, Waste, and Abuse

UnitedHealthcare provides health benefits and services to nearly 51 million people and contracts with more than 850,000 physicians and care professionals and approximately 6,100 hospitals nationwide. Their Payment Integrity group has the tough job of ensuring that claims are paid correctly and on time. Their previous approach to managing more than one million claims every day (10 TB of data daily) was ad hoc, heavily rule-based, and limited by data silos and a fragmented data environment. In response, UnitedHealthcare adopted a dual-model strategy: operationalize savings while simultaneously pursuing innovation to keep leveraging the latest technologies.

Here’s how they are doing it: in terms of operationalizing savings, the group is building a predictive analytics “factory,” where they can identify inaccurate claims in a systematic, repeatable way. Hadoop is now the data framework for a single platform that’s equipped with tools to analyze a slew of information from claims, prescriptions, plan participants, contracted care providers, and associated claim review outcomes.

They integrated data from more than 36 data assets held in silos across the business, and they now have multiple predictive models (PCR, True Fraud, Ayasdi, and others) at their fingertips that produce a rank-ordered list of potentially fraudulent providers they can pursue in a targeted, systematic way.
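To make the pattern concrete, here is a minimal sketch of what one scoring step in such a factory could look like as a Spark job: score incoming claims with a pre-trained model and roll the scores up into a rank-ordered provider list. This is not UnitedHealthcare's actual code; the paths, column names, and model are hypothetical.

```python
# Illustrative sketch only: score claims with a pre-trained model and
# produce a rank-ordered list of providers by aggregate fraud risk.
# Paths, column names, and the model itself are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.ml import PipelineModel

spark = SparkSession.builder.appName("claims-fraud-scoring").getOrCreate()

# Claims landed on the cluster by the daily ingest (assumed location/schema).
claims = spark.read.parquet("/data/claims/daily")

# A model previously trained and published by the analytics "factory".
model = PipelineModel.load("/models/claims_fraud_v1")

# Score each claim, then roll the scores up to the provider level.
scored = model.transform(claims)
ranked_providers = (
    scored.groupBy("provider_id")
          .agg(F.avg("prediction").alias("avg_risk"),
               F.count("*").alias("claim_count"))
          .orderBy(F.desc("avg_risk"))
)

# The top of this list becomes the targeted, systematic review queue.
ranked_providers.limit(100).write.mode("overwrite").parquet("/data/review_queue")
```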

2. Liaison Technologies: Streaming System of Record for Healthcare

Liaison Technologies provides cloud-based solutions to help organizations integrate, manage, and secure data across the enterprise. One of its vertical solutions serves the healthcare and life sciences industry, which brings two challenges: meeting HIPAA compliance requirements and handling the proliferation of data formats and representations. MapR Streams solves the data lineage portion of the compliance challenge because the stream becomes a system of record: an infinite, immutable log of every data change. As for the second challenge, a patient record may be consumed in different ways (as a document, as a graph, or via search) by different users, such as pharmaceutical companies, hospitals, clinics, and physicians. By streaming data changes in real time to MapR-DB (HBase-compatible and JSON document), graph, and search databases, users always have the most up-to-date view of the data in the most appropriate format. Further, by implementing this service on the MapR Converged Data Platform, Liaison can secure all of the data components together, avoiding the data and security silos that alternative solutions require.
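As a rough illustration of the streaming pattern described above (not Liaison's actual code), the sketch below publishes a single patient-record change event to a stream topic so that downstream document, graph, and search views can be kept current. It uses the open-source kafka-python client as a stand-in for the Kafka-compatible API that MapR Streams exposes; the broker address, topic name, and record fields are all assumptions.

```python
# Illustrative sketch only: publish a patient-record change event to a
# stream topic. The kafka-python client stands in for the Kafka-compatible
# MapR Streams API; broker, topic, and fields are hypothetical.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

change_event = {
    "patient_id": "p-12345",
    "field": "medication_list",
    "new_value": ["atorvastatin 20mg"],
    "changed_at": "2017-02-27T10:15:00Z",
}

# Keying by patient keeps each patient's changes in order, which is what
# lets the stream act as an immutable, replayable system of record.
producer.send("patient-record-changes", key=b"p-12345", value=change_event)
producer.flush()
```

Downstream consumers subscribe to the same topic and materialize the changes into their own views, whether that is a document store, a graph database, or a search index.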

3. Novartis Genomics

Next Generation Sequencing (NGS) is a classic big data application, facing the dual challenge of vast amounts of raw heterogeneous data and the fact that best practices in NGS research are a constantly moving target. Additionally, much of the cutting-edge research requires heavy interaction with diverse data from external organizations. It requires workflow tools that are robust enough to process vast amounts of raw NGS data yet flexible enough to keep up with quickly changing research techniques. It also requires a way to meaningfully integrate Novartis's data with data from large external resources such as 1000 Genomes, NIH's GTEx (Genotype-Tissue Expression), and TCGA (The Cancer Genome Atlas), paying particular attention to clinical, phenotypic, experimental, and other associated data.

The Novartis team chose Hadoop and Apache Spark to build a workflow system that allows them to integrate, process, and analyze diverse data for NGS research while staying responsive to advances in the scientific literature.
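As a simplified illustration of that kind of integration step (not Novartis's actual pipeline), the PySpark sketch below joins in-house variant calls with annotations derived from an external resource such as 1000 Genomes. The file paths and column names are assumptions made for the example.

```python
# Illustrative sketch only: join in-house NGS variant calls with externally
# sourced annotations. Paths and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ngs-annotation-join").getOrCreate()

# In-house variant calls, already converted to a columnar format upstream.
variants = spark.read.parquet("/ngs/variants/cohort_a")

# External annotations, e.g. population allele frequencies from 1000 Genomes.
annotations = spark.read.parquet("/external/1000genomes/annotations")

# Join on genomic coordinates so downstream analyses can combine population
# frequency with clinical, phenotypic, and experimental data.
annotated = variants.join(
    annotations,
    on=["chromosome", "position", "ref", "alt"],
    how="left",
)

annotated.write.mode("overwrite").parquet("/ngs/variants/cohort_a_annotated")
```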

4. Healthcare IoT Startup: Working to Classify Heart Conditions Faster

The current heart rhythm analysis process is slow, and classification is done manually: data is batch-uploaded from the devices into analysis software, medical analysts review the classifications, and a report is then submitted to the doctors and the hospital, who make medical decisions about the patient. The process takes over 24 hours, a long lag before doctors can access the patient data that increases the risk of medical emergencies.

With MapR-FS, Telemed can now ingest data from various medical devices directly into its cluster via NFS for real-time patient insight. The solution needed to be highly available and to support multi-tenancy (a HIPAA requirement) as Telemed begins hosting patient data for multiple hospitals along with data from medical device companies; the ability to segment that data by customer was essential.
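Because MapR-FS is exposed over NFS, ingest can be as simple as ordinary file writes from a device gateway, with each hospital's data kept under its own directory or volume for multi-tenant segmentation. The sketch below is purely illustrative; the mount point, directory layout, and record format are assumptions rather than Telemed's actual design.

```python
# Illustrative sketch only: write device readings into an NFS-mounted
# cluster path so analytics jobs see them immediately. Mount point,
# tenant layout, and record format are hypothetical.
import json
import os
from datetime import datetime, timezone

CLUSTER_MOUNT = "/mapr/cluster1"  # NFS mount of the cluster file system

def write_reading(hospital_id: str, device_id: str, reading: dict) -> str:
    """Append one reading under the hospital's own directory, keeping each
    tenant's data segmented from the others."""
    day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    directory = os.path.join(CLUSTER_MOUNT, "tenants", hospital_id, "ecg", day)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"{device_id}.jsonl")
    with open(path, "a") as f:
        f.write(json.dumps(reading) + "\n")
    return path

# Example: one sample from a wearable heart monitor.
write_reading("hospital-042", "device-7781",
              {"ts": "2017-02-27T10:15:00Z", "rhythm": "sinus", "bpm": 72})
```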

With the help of MapR Professional Services, Telemed built out a solution in time for its July 18th HIPAA review and delivered an architecture that meets all of its requirements for high availability, multi-tenancy, and real-time insight. The CEO met his commitment and deadline to investors, and the company is on track to start selling its SaaS solution in Q3/Q4.

Conclusion

Improving patient outcomes at the same or even lower cost is an extraordinarily tall order for any healthcare provider, given that overall healthcare costs in the US are rising at a lofty 15% clip. Full-scale digital transformation is the key to making this goal a reality, with digitization, enhanced communications, and big data analytics being the legs that support the transformation effort. The many emerging use cases for big data analytics are intimately tied to the ability of Hadoop-based solutions to acquire and store massive quantities of disparate data, structured and unstructured, from just about any source and present it for in-depth analysis.

In selecting a big data platform and in particular a Hadoop distribution, be sure the platform is highly adept at handling the mix of data types in healthcare typically housed in silos, with clinical data in one silo, pharmaceutical data in another, and logistics information on hospital supplies in yet another. This platform should be flexible enough so that caregivers can use complex data like doctors’ notes and imaging files for real patient analysis, not just for archiving.


This blog post was published February 27, 2017.