A Simple Architecture to Use AI/ML for Predictive Maintenance Success

Predictive maintenance (PdM) has emerged as a primary advanced analytics use case as manufacturers have sought increased operational efficiency and productivity and as a response to technological innovations like the Internet of Things (IoT) and edge computing.

Predictive maintenance techniques are intended to determine the condition of operating equipment in order to predict when failure will occur or when maintenance will be required. Ideally, they also recognize the patterns that precede a failure early enough to act on them. The ultimate aim is to provide cost savings over schedule-based preventive maintenance and over unplanned reactive maintenance, which can leave machinery unavailable during critical periods.


There are many approaches to predicting failure, but the most common rely on the following analyses:

  • Infrared Thermography (IRT): for seeing variations in temperature
  • Ultrasonic Analysis: for measuring changes in frequency that could indicate an issue
  • Current Analysis: tracking the voltage and current of electricity, usually as it's supplied to a motor
  • Vibration Analysis: an indicator of misalignment, wear, or imbalance
  • Oil Analysis: a way to assess the condition of lubrication throughout the system

But these analyses can't stand by themselves. An architecture is needed to support the end-to-end workflow that allows for processing at the edge and moves the data from the sensors to a central repository for analysis and monitoring.

In his "Streaming Predictive Maintenance for IoT using TensorFlow" blogs (Part 1 and Part 2), Justin Brandenburg describes how to determine whether a failure is impending on a SCARA robotic arm performing some work along a manufacturing assembly line.

As part of this process, he must complete the following tasks:

Process and Prepare IoT Data for Analysis

In Justin's blogs, historical data was provided for training. So, the primary action is transformation, which is handled by converting the imported data into a DataFrame so that PySpark applications can easily interact with it.
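As a rough illustration of that transformation step, the sketch below parses raw CSV text into typed records using only the Python standard library. The column names (`timestamp`, `sensor_id`, `value`), the `Reading` type, and the sample rows are hypothetical and are not taken from Justin's dataset; in his blogs, the equivalent step loads the data into a PySpark DataFrame.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Reading:
    """One sensor reading (hypothetical schema, for illustration only)."""
    timestamp: datetime
    sensor_id: str
    value: float


def parse_readings(raw_csv: str) -> List[Reading]:
    """Parse CSV text into typed readings, skipping rows that fail to convert."""
    readings = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        try:
            readings.append(
                Reading(
                    timestamp=datetime.fromisoformat(row["timestamp"]),
                    sensor_id=row["sensor_id"],
                    value=float(row["value"]),
                )
            )
        except (KeyError, ValueError, TypeError):
            # Missing column, malformed number, or malformed timestamp.
            continue
    return readings


# Made-up sample data; the second row is deliberately malformed.
SAMPLE = """timestamp,sensor_id,value
2018-05-22T10:00:00,arm-1,0.42
2018-05-22T10:00:01,arm-1,not-a-number
2018-05-22T10:00:02,arm-1,0.47
"""

readings = parse_readings(SAMPLE)
```

Dropping malformed rows silently is a simplification; a production pipeline would more likely count or log them.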

But, in a real-world scenario, we'd be looking at thousands of IoT sensors streaming data into the central repository. Such a situation needs robust streaming pipelines at a minimum and could even benefit from an edge cluster that preps and transforms data before transmission to cut down on bandwidth and latency. As the number of sensors grows in production, the platform and the streaming solution need to scale with it.
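One way an edge node might cut bandwidth is to send a summary of each window of samples instead of every raw sample. The sketch below is a minimal example of that idea, not a description of any particular platform feature; the function name `summarize_windows`, the window size, and the choice of summary fields are all assumptions made for illustration.

```python
import statistics
from typing import Dict, Iterable, Iterator, List


def summarize_windows(
    samples: Iterable[float], window_size: int
) -> Iterator[Dict[str, float]]:
    """Yield one small summary per window of raw samples."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")

    def summarize(window: List[float]) -> Dict[str, float]:
        return {
            "count": len(window),
            "mean": statistics.fmean(window),
            "min": min(window),
            "max": max(window),
        }

    window: List[float] = []
    for sample in samples:
        window.append(sample)
        if len(window) == window_size:
            yield summarize(window)
            window = []
    # Emit the trailing partial window so no samples are silently dropped.
    if window:
        yield summarize(window)


# Five raw samples become three summaries with a window of two.
summaries = list(summarize_windows([1.0, 2.0, 3.0, 4.0, 5.0], window_size=2))
```

The trade-off is that summaries discard detail, so the window size and the statistics kept would have to be chosen with the downstream model in mind.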

MapR Data Platform - Pub/Sub

Model the Problem

To create a model that can predict failure or label anomalous behavior, you must first explore and understand the data that you're working with. In this case, the data comes from a SCARA, which is a type of robotic arm that performs a task and then returns to its original position.


Because the movement of the arm wasn't linearly consistent and the variance fluctuated with time, Justin chose to approach this with a time series forecasting model. This is useful for understanding trends and reacting to changes in behavioral patterns. For a deeper read on this topic, I recommend his companion blog, "Applying Deep Learning to Time Series Forecasting with TensorFlow."
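A common way to frame time series forecasting is to turn the series into pairs of (recent window, next value) that a model can learn from. The sketch below shows only that framing step in plain Python; the helper name `make_windows` and the `lookback` parameter are hypothetical, and this is not the preprocessing code from Justin's blogs.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], lookback: int
) -> Tuple[List[List[float]], List[float]]:
    """Split a series into (past window, next value) training pairs."""
    if lookback <= 0 or lookback >= len(series):
        raise ValueError("lookback must be between 1 and len(series) - 1")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - lookback):
        # The window covers `lookback` consecutive points ...
        inputs.append(list(series[start:start + lookback]))
        # ... and the target is the point immediately after it.
        targets.append(series[start + lookback])
    return inputs, targets


inputs, targets = make_windows([1.0, 2.0, 3.0, 4.0, 5.0], lookback=2)
# inputs  -> [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
# targets -> [3.0, 4.0, 5.0]
```

The resulting pairs could then be fed to whichever forecasting model is being prototyped.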

To have the variety you need in order to prototype different solutions, you need a platform with open APIs and POSIX compliance so that many different types of algorithms can run on the data in place. This is important because you're unlikely to know which solution will work best until you've prototyped the problem, so you should have as many options available to you as possible.

Core Services

Deploy the Solution

Once you have a model that is trained on the problem you plan to solve, it's important to have the capability to deploy this model close to the data. In this case, you want the capability to score new data as it arrives in real time. This requires a robust streaming system to feed data into the model and object storage to serve the model to the streams.
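To make "scoring new data as it arrives" concrete, the sketch below consumes readings one at a time, asks a model for a forecast of the next value, and flags a reading when it deviates from the forecast by more than a threshold. Everything here is an assumption for illustration: `moving_average_forecast` is a stand-in for a trained model, and the `lookback` and `threshold` values are arbitrary. In a real deployment the readings would come from a stream consumer and the model would be loaded from storage.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, Sequence, Tuple


def moving_average_forecast(history: Sequence[float]) -> float:
    """Stand-in model: predict the next value as the mean of recent values."""
    return sum(history) / len(history)


def score_stream(
    readings: Iterable[float],
    predict: Callable[[Sequence[float]], float],
    lookback: int,
    threshold: float,
) -> Iterator[Tuple[float, float, bool]]:
    """Yield (value, forecast, is_anomalous) once enough history exists."""
    history: Deque[float] = deque(maxlen=lookback)
    for value in readings:
        if len(history) == lookback:
            forecast = predict(list(history))
            yield value, forecast, abs(value - forecast) > threshold
        # The deque drops the oldest reading automatically when full.
        history.append(value)


results = list(
    score_stream(
        [1.0, 1.0, 1.0, 5.0, 1.0],
        predict=moving_average_forecast,
        lookback=3,
        threshold=2.0,
    )
)
```

Note that in this sketch an anomalous reading still enters the history and skews later forecasts; whether to exclude flagged readings is a design choice that depends on the use case.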


So, in order to support the end-to-end workflow for predictive maintenance use cases, it's critical to have a platform that works at the edge and over streams and that has a central processing cluster that can function at scale and support all sorts of ML/AI engines. The right infrastructure lets you deploy solutions faster and see a return on investment sooner, which is where machine learning derives the most value.

MapR Stack

In addition, versatile and open APIs let enterprises use the monitoring and alerting tools of their choice, so that end users can act on the insights provided by their model.

This blog post was published May 22, 2018.
