Partner App: Pentaho Business Analytics Platform

Pentaho tightly couples data integration with full business analytics to solve data integration challenges while providing business analytics in a single, seamless platform.

Application Description

Pentaho is delivering the future of business analytics. Pentaho's open source heritage drives our continued innovation in a modern, integrated, embeddable platform built for the future of analytics, including diverse and big data requirements. Pentaho tightly couples data integration with full business analytics to solve data integration challenges while providing business analytics in a single, seamless platform.

As the first major BI vendor to introduce its big data capabilities in May 2010, Pentaho has led the charge in big data analytics. This first-mover advantage enabled Pentaho to engage with big data customers early and to continually roll out technology updates that keep its users ahead of the big data curve. With the ability to drastically reduce the time to design, develop and deploy big data analytics solutions, Pentaho counts numerous big data customers, both large and small, across the financial services, retail, travel, healthcare and government industries around the world.


Pentaho’s Java-based data integration engine integrates with the MapR Hadoop cache for automatic deployment as a MapReduce task across every data node in a Hadoop cluster, taking full advantage of Hadoop's massively parallel processing and high availability. Pentaho can natively connect to Hadoop in the following ways:

  1. HDFS – input and output directly to the Hadoop Distributed File System
  2. MapReduce – visual MapReduce for input and output directly to MapReduce programs
  3. HBase – input and output directly to HBase, a NoSQL database optimized for use with Hadoop that provides real-time response times
  4. Hive – a JDBC driver that enables interaction with Hadoop via HiveQL (HQL), a SQL-like query and data definition language (see the Java sketch after this list)
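
As an illustration of the Hive connectivity described in item 4, the minimal Java sketch below connects to a HiveServer2 endpoint over JDBC and runs a HiveQL query. The host name, port, database, credentials, and table are placeholder assumptions rather than values from this page, and the sketch uses the standard Hive JDBC driver directly rather than any Pentaho-specific tooling.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcSketch {
        public static void main(String[] args) throws Exception {
            // Register the Hive JDBC driver (HiveServer2 flavor).
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // Placeholder host, port, database, and credentials -- adjust for your cluster.
            String url = "jdbc:hive2://hive-host:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "mapr", "");
                 Statement stmt = conn.createStatement();
                 // HiveQL is SQL-like; Hive compiles the query into work that runs on the cluster.
                 ResultSet rs = stmt.executeQuery(
                         "SELECT category, COUNT(*) AS cnt FROM sales GROUP BY category")) {
                while (rs.next()) {
                    System.out.println(rs.getString("category") + "\t" + rs.getLong("cnt"));
                }
            }
        }
    }

To compile and run, the Hive JDBC driver and its Hadoop client dependencies need to be on the classpath.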

The Pentaho Data Integration engine is multi-threaded, with each step in a job executing on one or more threads. The multi-core processors in each data node of the cluster are fully leveraged, eliminating the need for specialized multi-threaded programming techniques.
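
The sketch below is illustrative only and is not Pentaho code; it shows the general pattern this paragraph describes, with two hypothetical "steps" running on separate threads and handing rows through a bounded queue, so multiple cores are used without the author of the job writing any specialized threading logic.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class StepPipelineSketch {
        private static final String EOF = "__END_OF_ROWS__"; // sentinel marking end of stream

        public static void main(String[] args) throws InterruptedException {
            // Bounded buffer between the two "steps"; back-pressure keeps memory use flat.
            BlockingQueue<String> rows = new ArrayBlockingQueue<>(1_000);

            // "Input" step: produces rows on its own thread.
            Thread input = new Thread(() -> {
                try {
                    for (int i = 0; i < 10; i++) {
                        rows.put("row-" + i);
                    }
                    rows.put(EOF);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            // "Transform" step: consumes and processes rows concurrently on another thread.
            Thread transform = new Thread(() -> {
                try {
                    for (String row = rows.take(); !EOF.equals(row); row = rows.take()) {
                        System.out.println(row.toUpperCase());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            input.start();
            transform.start();
            input.join();
            transform.join();
        }
    }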

In addition, the Pentaho Data Integration engine executes as a single MapReduce task, rather than the multiple tasks typically produced by machine-generated or hand-coded MapReduce programs and Pig scripts. As a result, Pentaho MapReduce jobs typically execute many times faster.

Click here for the latest Pentaho-MapR compatibility list.

Application Version: Pentaho 5.3 is verified with MapR v4.0.1.

Download App

Installation Instructions

Click here for installation instructions.

Click here for configuration instructions.


Use Instructions

Click here for Pentaho tutorials to help you get started evaluating and learning Pentaho.


Big data blueprints: http://www.pentaho.com/big-data-blueprints

Support Information

Click here for InfoCenter documentation.

In addition, comprehensive support packages are available for different product levels and include issue identification, problem resolution and developer assistance.