Informatica and MapR: Harness Real-Time Insight from Your Streaming Data

At a Glance

Informatica® provides data integration software and services that enable over 5,000 organizations worldwide to gain competitive advantage by empowering them with timely, relevant and trustworthy data for their top business initiatives.

Streaming Data Collection Requires a New Architecture

Businesses today have an unprecedented opportunity to gain insight from a steady stream of real-time data: clickstreams from web servers, application and infrastructure log data, real-time systems, and data coming from sensors or agents resident on the plethora of devices and machines comprising the “Internet of Things.” This continuous flow of small messages and events can drive decision-making and operational intelligence to new heights of agility and responsiveness. However, because many small pieces of data flow in at high rates and accumulate quickly into large volumes, organizations can derive maximum value from the data only if they can gather and analyze it immediately. Traditional file-based, batch-oriented collection architectures are not well suited to streaming data because they use intermediate collection tiers, don’t support real-time processing, need to be carefully managed to prevent processing errors, and do not scale easily.

Production-Ready Architecture for Advanced Analysis

Informatica Vibe Data Stream™ for Machine Data efficiently collects all forms of streaming data and delivers it directly to both real-time and batch processing technologies, so companies can leverage it for holistic operational intelligence and big data analytics. Integration with the industry-leading MapR Distribution including Hadoop offers organizations a full production-ready architecture for the continuous collection, ingestion, processing and advanced analysis required to make commercial decisions and adjustments on the fly.

Informatica and MapR Combine to Support Real-Time Enterprise Data Hub

Built for Enterprise-Grade Reliability and Performance

Vibe Data Stream is also a distributed, scalable system that uses Informatica’s established, high-performance brokerless messaging technology to greatly simplify streaming data collection. Embeddable agents on sources collect data in real time and stream millions of records per second into the MapR Distribution. Vibe Data Stream also streams data directly into Informatica PowerCenter™ Real Time Edition, Informatica RulePoint™ (CEP), and Apache Storm, enabling real-time event processing and operational intelligence via:

  • Lightweight agents for an ecosystem of sources and targets.
  • Brokerless messaging transport using a publish/subscribe model.
  • Flexibility to connect sources and targets in numerous patterns.
  • High performance delivery direct to targets over LAN/WAN.
  • Simplified configuration, deployment, administration and monitoring.
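
To make the brokerless publish/subscribe idea in the list above more concrete, here is a minimal Python sketch. It is not the Vibe Data Stream API: the class names (SourceAgent, TargetAgent), method names, topics and records are all hypothetical, and everything runs in one process rather than over a LAN/WAN. It only illustrates the pattern in which a source delivers records straight to its subscribed targets, with no intermediate broker or collection tier, and in which sources and targets can be wired together in different combinations.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Record = str
Handler = Callable[[str, Record], None]


class SourceAgent:
    """Hypothetical source-side agent that hands records directly to subscribers."""

    def __init__(self, name: str) -> None:
        self.name = name
        # topic -> handlers registered by targets; there is no broker or queue in between
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a target's handler for one topic on this source."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, record: Record) -> int:
        """Deliver one record to every subscriber of the topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(self.name, record)
        return len(handlers)


class TargetAgent:
    """Hypothetical target-side agent that simply stores what it receives."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[Tuple[str, Record]] = []

    def on_record(self, source: str, record: Record) -> None:
        self.received.append((source, record))


def run_demo() -> Tuple[List[Tuple[str, Record]], List[Tuple[str, Record]]]:
    """Wire two sources to two targets: one-to-many and many-to-one."""
    web = SourceAgent("web-01")
    sensor = SourceAgent("sensor-07")
    batch_store = TargetAgent("batch-store")
    event_processor = TargetAgent("event-processor")

    # One source fans out to both targets; a second source feeds only the batch store.
    web.subscribe("clickstream", batch_store.on_record)
    web.subscribe("clickstream", event_processor.on_record)
    sensor.subscribe("telemetry", batch_store.on_record)

    web.publish("clickstream", "GET /index.html")
    sensor.publish("telemetry", "temp=21.5")
    return batch_store.received, event_processor.received
```

Calling run_demo() returns the records seen by each target: the batch store receives records from both sources, while the event processor receives only the clickstream record. A real deployment would replace the in-process handler calls with network delivery and add the quality-of-service guarantees described in this brief.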

Product Snapshot: Vibe Virtual Data Machine / Vibe Data Stream

Vibe™ is an embeddable data management engine that can access, aggregate, and manage any type of data. Vibe gives developers the power to map data once and deploy anywhere—in any application and on any appliance or device.

Why MapR and Informatica

The MapR Distribution including Hadoop and Informatica Vibe Data Stream are key components of a high-speed, real-time data analytics platform. Environments with high-speed data feeds are especially at risk of disastrous failures, and MapR and Informatica provide the enterprise-grade reliability to tolerate node failures with no impact on overall system operation. Both are designed for easy administration, which helps to lower the risk of error. The unique support for billions of small files in MapR, along with the easy and scalable setup of Vibe Data Stream, makes them ideal together for “Internet of Things” deployments that ingest potentially billions of files from millions of devices.

About Informatica

Informatica Corporation (Nasdaq:INFA) is the world’s number one independent provider of data integration software. Informatica Vibe, the industry’s first and only embeddable virtual data machine (VDM), powers the unique “Map Once. Deploy Anywhere.” capabilities of the Informatica Platform. Worldwide, over 5,000 enterprises depend on Informatica to fully leverage their information assets from devices to mobile to social to big data residing on premise, in the cloud and across social networks.

About MapR

MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. MapR brings unprecedented dependability, ease-of-use and world-record speed to Hadoop, NoSQL, database and streaming applications in one unified distribution for Hadoop. MapR is used by more than 500 customers across financial services, government, healthcare, manufacturing, media, retail and telecommunications as well as by leading Global 2000 and Web 2.0 companies. Investors include Google Capital, Lightspeed Venture Partners, Mayfield Fund, NEA, Qualcomm Ventures and Redpoint Ventures.

Solution Highlights

  • Real-Time Operational Intelligence and Big Data Analytics. Quickly transform data to intelligence for a holistic picture of business activities.
  • High Performance Streaming Data Collection. Reliable quality of service for data collection over LAN/WAN. Adapts quickly to new streaming data between multiple sources.
  • Highly Available and Future-Proof Big Data Solution. Enterprise-grade Apache™ Hadoop® distribution from MapR is easy, dependable and fast. Direct access through NFS interface for all business applications.
  • Simplifies Development While Increasing Operational Efficiency. Centralized GUI for simplified configuration, deployment, administration and monitoring.

Benefits of Informatica

  • Holistic Approach. Vibe is a foundational element in the Informatica Information Network Architecture™ that scales from individual to enterprise-wide needs globally.
  • Improved Time to Market. Comprehensive IT portfolio enables business leaders to bring products/services to market faster while improving business operations.
  • IT/Organizational Flexibility. Provides solutions to top CIO concerns, such as the need for IT agility to support a rapidly changing business landscape.

MapR Benefits

  • Top-Ranked Hadoop Distribution. One unified big data platform for Hadoop, NoSQL, database and streaming applications.
  • Proven Production Readiness. Benefit from both open source community innovation as well as MapR architectural enhancements.
  • Consistent High Performance. Eliminate downtime/bottlenecks while ensuring business continuity.

Try Informatica Today!

Get the MapR Sandbox for Hadoop, a fully functional Hadoop cluster running on a virtual machine. Visit

Download the Informatica Assessment tools from resource-library.