Big Data and Apache Hadoop for the Healthcare Industry

Challenges in Healthcare

All of the major segments of the healthcare industry—payers, providers, healthcare IT, and pharmaceutical companies—are under increased pressure to improve the quality of patient care at a lower cost.

Each of these organizations is being tasked with accessing and finding value in an ever-growing pool of patient data. This data is diverse (claims data, mobile data, EMR notes, medical correspondence, output from health wearables, biomedical research, and social media conversations about health) and is generated by multiple siloed data sources. Seamless interoperability, the extent to which these silos can exchange and interpret shared data, is therefore difficult to achieve. In addition, about 80 percent of healthcare data is unstructured, which makes it difficult for organizations to access and to integrate with other data. The ability to securely integrate this wealth of data and apply predictive analytics can increase efficiency of care, reduce fraudulent claims, help discover more efficacious therapies, and improve physician enablement.

Solution Overview

The MapR Distribution including Hadoop brings together the high volume of structured and unstructured healthcare data into one central repository that can utilize existing hardware and network components. Multiple groups in healthcare organizations can inexpensively store and access this data simultaneously within a secure HIPAA-compliant Hadoop-enabled architecture.

With access to comprehensive patient data and medical research, doctors can detect and diagnose diseases in their early stages, assign more effective therapies based on a patient’s genetic makeup, and adjust drug doses to minimize side effects and improve therapeutic effectiveness. This complete view also provides opportunities for improvement in care coordination and outcomes-based reimbursement, population health management, and patient engagement and outreach.

Fraud Detection

Payers need to be able to detect fraud by analyzing anomalies in billing data, procedural benchmark data, or patient records. Examples include a hospital’s overutilization of services in short time periods, patients receiving healthcare services from different hospitals at the same time, or identical prescriptions for the same patient filled in multiple locations. MapR uses anomaly detection to identify these incidents in real time and alert providers to investigate them before payment is made.
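
As a rough illustration of two of the checks described above, the following Python sketch flags patients with overlapping stays at different hospitals and prescriptions filled at more than one location. It is a minimal sketch under stated assumptions, not MapR's implementation: the record layout (`patient_id`, `hospital`, `start`, `end`, `drug`, `pharmacy`) and the function names are hypothetical, and a production system would run comparable logic over streaming claims rather than in-memory lists.

```python
from collections import defaultdict
from datetime import date


def overlapping_stays(claims):
    """Return (patient_id, hospital_a, hospital_b) for stays at
    different hospitals whose date ranges overlap.

    Each claim is a dict with hypothetical keys:
    patient_id, hospital, start, end (dates, inclusive).
    """
    by_patient = defaultdict(list)
    for claim in claims:
        by_patient[claim["patient_id"]].append(claim)

    flagged = []
    for patient_id, stays in by_patient.items():
        stays.sort(key=lambda c: c["start"])
        for i, a in enumerate(stays):
            for b in stays[i + 1:]:
                if b["start"] > a["end"]:
                    break  # sorted by start, so no later stay overlaps a
                if a["hospital"] != b["hospital"]:
                    flagged.append((patient_id, a["hospital"], b["hospital"]))
    return flagged


def duplicate_prescriptions(fills):
    """Return {(patient_id, drug): [pharmacies]} for prescriptions
    filled at more than one location.

    Each fill is a dict with hypothetical keys: patient_id, drug, pharmacy.
    """
    locations = defaultdict(set)
    for fill in fills:
        locations[(fill["patient_id"], fill["drug"])].add(fill["pharmacy"])
    return {key: sorted(locs) for key, locs in locations.items() if len(locs) > 1}


claims = [
    {"patient_id": "p1", "hospital": "A", "start": date(2024, 1, 1), "end": date(2024, 1, 5)},
    {"patient_id": "p1", "hospital": "B", "start": date(2024, 1, 3), "end": date(2024, 1, 4)},
    {"patient_id": "p2", "hospital": "A", "start": date(2024, 1, 1), "end": date(2024, 1, 2)},
    {"patient_id": "p2", "hospital": "B", "start": date(2024, 1, 10), "end": date(2024, 1, 12)},
]
fills = [
    {"patient_id": "p1", "drug": "drug-x", "pharmacy": "north"},
    {"patient_id": "p1", "drug": "drug-x", "pharmacy": "south"},
    {"patient_id": "p2", "drug": "drug-x", "pharmacy": "north"},
]

suspicious_stays = overlapping_stays(claims)
suspicious_fills = duplicate_prescriptions(fills)
```

Because the date ranges are treated as inclusive, a same-day transfer between hospitals would also be flagged; whether that counts as an anomaly is a policy decision the sketch leaves open.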

Patient Monitoring

Healthcare providers want to provide more proactive care for their patients by constantly monitoring patient vital signs. The data from these monitors can be used in real time to alert care providers about changes in a patient’s condition. MapR can help collect this data and stream it in real time, which can help in detecting changes. Improved algorithms that run against larger sets of data can improve the likelihood of knowing when a particular patient might have an emergency, which helps providers plan for effective interventions.
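
One simple way to turn a stream of readings into alerts is to compare each new value against a rolling baseline of recent values. The Python sketch below does this with a rolling mean and standard deviation. It is illustrative only: the function name, window size, and threshold are assumptions rather than anything specified by MapR, and real clinical alerting would rely on validated models and thresholds.

```python
from collections import deque
from statistics import mean, stdev


def vital_sign_alerts(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings that deviate from the rolling
    baseline by more than `threshold` standard deviations.

    `readings` can be any iterable, including a live stream.
    `window` and `threshold` are illustrative defaults.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            baseline = mean(recent)
            spread = stdev(recent)
            if spread > 0 and abs(value - baseline) > threshold * spread:
                yield i, value
        # The reading joins the baseline even if it raised an alert.
        recent.append(value)


heart_rate = [72, 74, 73, 75, 74, 73, 120, 74]
alerts = list(vital_sign_alerts(heart_rate))
```

Because the function is a generator, it can consume readings as they arrive instead of waiting for a complete batch. Note that a flagged reading still enters the rolling window, so it temporarily widens the baseline for the readings that follow.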

Real-Time Analytics for Healthcare

Personalized Treatment

Personalized treatment planning customizes treatment for each patient by continuously monitoring the effects of medication. The medication or dosage can be changed based on how well it is working, so the analysis is tailored to each patient’s specific needs. MapR provides real-time access to data at both the summary and detailed levels, so treatment decisions can be adjusted in a timely manner.

Assisted Diagnosis

Clinical researchers can access broad knowledge pools across multiple data sources to improve the accuracy of diagnosing patient conditions. Bringing together individual data sets into a big data repository and applying algorithms for predictive modeling provides more accurate insights by identifying nuances in subpopulations. These nuances may be so rare that they are not seen in small research samples, but applying algorithms across these data sets makes them clearly detectable.

About MapR

MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. MapR brings unprecedented dependability, ease of use, and world-record speed to Hadoop, NoSQL, database, and streaming applications in one unified distribution for Hadoop. MapR is used by more than 700 customers across financial services, government, healthcare, manufacturing, media, retail, and telecommunications, as well as by leading Global 2000 and Web 2.0 companies. Investors include Google Capital, Lightspeed Venture Partners, Mayfield Fund, NEA, Qualcomm Ventures, and Redpoint Ventures.

MapR Distribution Including Hadoop Highlights

  • High availability and disaster recovery. Enables business continuity and higher business-level SLAs.
  • Multi-tenancy. Supports multiple business groups and applications in one cluster without conflicts.
  • Integrated security. Built-in data access controls.
  • Direct Access NFS. Direct data ingestion, familiar access methods, existing tools/libraries continue to work.
  • High performance. Fast, responsive access to data, and higher throughput.
  • Volume support. Separates disparate user groups and data by logical volumes.
  • Job placement control and resource management. Jobs run simultaneously in the same cluster.
  • Data protection. Consistent snapshots with point-in-time audits and recovery.
  • Support for structured, semistructured, and unstructured data. All data in the enterprise data architecture.

MapR Key Benefits

  • Simplified architecture with easy data access to all enterprise data in a single repository
  • Fast, responsive access to data to enable real-time operations
  • Low-cost storage along with the benefits of high-end storage platforms
  • High uptime for the reliability to meet stringent SLAs and avoid costly downtime
  • Support for big data-driven operational applications
  • Built for extreme scalability at low costs