Predictive maintenance: before you even start coding

Dr. Marco Berta
4 min read · Jun 5, 2023

What is predictive maintenance?

Predictive maintenance is a hot topic in manufacturing. Being able to fix something before it actually fails reduces downtime costs and/or allows a longer service time. Replacing a healthy component in a machine is, of course, also a cost. Predictive maintenance relies on correctly interpreting a sensor signal: when the signal starts to look abnormal, it is time to start worrying about a failure. The most common example is setting a threshold on the amplitude of a vibration signal.
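To make the threshold idea concrete, here is a minimal sketch in Python. The amplitude values and the limit of 2.5 are made up for illustration; a real limit would come from the machine manufacturer or from experience with that machine.

```python
from typing import List, Sequence


def exceeds_threshold(amplitudes: Sequence[float], limit: float) -> List[int]:
    """Return the indices of samples whose absolute amplitude is above the limit."""
    return [i for i, a in enumerate(amplitudes) if abs(a) > limit]


# Hypothetical vibration amplitudes (arbitrary units) and an assumed limit.
vibration = [0.8, 1.1, 0.9, 1.0, 2.7, 3.1, 1.0]
alarm_indices = exceeds_threshold(vibration, limit=2.5)
print(alarm_indices)
```

Simple as it is, this already needs two decisions that only someone who knows the machine can make: which signal to watch and where to put the limit.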

Anomalies automatically detected

If the temperature of a cooling tower suddenly rises, the cooling system may be faulty and need servicing. But it is not always that simple. A change in signal shape or seasonality may also be a valuable indication, even while the temperature is still below a given threshold. The term commonly used for these cases is “time series anomaly detection”.
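A small synthetic example shows why a threshold alone can miss this. In the sketch below the temperature never crosses an assumed limit of 40, yet its daily swing triples halfway through. Comparing the spread of each window with that of the first window (assumed healthy) picks up the change. The data, the 24-sample window and the factor of 2 are all assumptions for illustration.

```python
import math
import statistics
from typing import List, Sequence


def window_std(values: Sequence[float], window: int) -> List[float]:
    """Population standard deviation of each consecutive, non-overlapping window."""
    return [
        statistics.pstdev(values[start:start + window])
        for start in range(0, len(values) - window + 1, window)
    ]


THRESHOLD = 40.0  # assumed alarm limit, never reached in this example

# Synthetic hourly temperature: four calm days, then four days with a larger swing.
temps = [30 + 2 * math.sin(2 * math.pi * i / 24) for i in range(96)]
temps += [30 + 6 * math.sin(2 * math.pi * i / 24) for i in range(96)]

stds = window_std(temps, window=24)
baseline = stds[0]  # assumption: the first day represents healthy behavior
changed_days = [day for day, s in enumerate(stds) if s > 2 * baseline]

print(max(temps) < THRESHOLD, changed_days)
```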

Big hopes for AI

Several algorithms have been developed for this. “ARIMA”, for example, is an established one. Packages have been developed in both Python and R [1], but even without deep coding knowledge the task can be performed by cloud services such as Azure Anomaly Detector [2] or Google BigQuery ML [3]. These use unsupervised machine learning to extract information from a time signal and detect when its behavior at a given time is significantly different from what the algorithm has been trained on.
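The common idea behind many of these tools is to learn what "normal" looks like, predict the next value, and raise a flag when the prediction error is unusually large. The sketch below is not ARIMA and not what any of the cloud services run internally. It is a deliberately simplified stand-in, a first-order autoregressive model fitted by least squares, applied to synthetic data with one injected fault. The function names and parameters are my own.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def fit_ar1(train: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi, residual std)."""
    prev, curr = train[:-1], train[1:]
    mean_prev, mean_curr = statistics.fmean(prev), statistics.fmean(curr)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    residuals = [q - (c + phi * p) for p, q in zip(prev, curr)]
    return c, phi, statistics.pstdev(residuals)


def flag_anomalies(series: Sequence[float], c: float, phi: float,
                   sigma: float, k: float = 4.0) -> List[int]:
    """Indices where the one-step-ahead prediction error exceeds k * sigma."""
    return [
        t for t in range(1, len(series))
        if abs(series[t] - (c + phi * series[t - 1])) > k * sigma
    ]


def simulate_ar1(n: int, c: float = 2.0, phi: float = 0.8,
                 noise: float = 0.5, start: float = 10.0) -> List[float]:
    """Generate synthetic 'healthy' data from a known AR(1) process."""
    series = [start]
    for _ in range(n - 1):
        series.append(c + phi * series[-1] + random.gauss(0.0, noise))
    return series


random.seed(0)  # fixed seed so the synthetic example is reproducible
train = simulate_ar1(500)   # history assumed to be fault-free
live = simulate_ar1(100)    # new data to be checked
live[60] += 10.0            # injected fault at a known position

c, phi, sigma = fit_ar1(train)
flagged = flag_anomalies(live, c, phi, sigma)
print(flagged)
```

Note the assumption hidden in `train`: the model only learns "normal" if the training period really was free of faults, which is exactly the kind of thing that has to be asked before coding.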

But if you think that the main hurdle is just streaming a signal to the cloud, and that afterwards software powered by artificial intelligence (AI) will send you a message as soon as things start to go wrong... think again. While it is true that even in this domain coding tasks are getting more and more automated [4], a lack of understanding of the data may lead to false alarms in the best case.

Some questions to ask before getting your hands dirty

The thing is, several sensors are placed on a single machine, and different machines may need completely different approaches. Different failures may affect the measurements in completely different ways. Also, an anomaly does not always indicate an actual failure or malfunction, and for some of those, detection is not even needed. For example, a sudden temperature increase may be due to the liquid entering the cooling tower rather than to the cooling system itself. The problem may lie in another part of the production plant. I would therefore put the following questions to the machine guru before sending signals to an algorithm:

- What type of failure are you trying to prevent?

- Have any of these been recorded?

The latter is sometimes an inconvenient one. Before starting the project, noting down when a failure happened or when the machine was serviced can save weeks, if not months, of data recording and analysis. Most importantly, it can help spot useful anomalies even if unsupervised methods are used once the sensor signals are fed to an AI.
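Even a handful of logged dates makes a difference. As a rough sketch, with made-up timestamps and an assumed two-day lookback, this is how detected anomalies could be compared against a maintenance log to see which ones preceded a recorded failure and which ones are candidates for false alarms:

```python
from datetime import datetime, timedelta
from typing import Dict, List, Sequence


def anomalies_before_events(
    anomalies: Sequence[datetime],
    events: Sequence[datetime],
    lookback: timedelta,
) -> Dict[datetime, List[datetime]]:
    """For each logged event, list the anomalies detected within `lookback` before it."""
    return {
        event: [a for a in anomalies if timedelta(0) <= event - a <= lookback]
        for event in events
    }


# Hypothetical maintenance log and hypothetical detector output.
failures = [datetime(2023, 3, 10, 14, 0)]
detected = [
    datetime(2023, 3, 1, 8, 0),
    datetime(2023, 3, 9, 22, 0),
    datetime(2023, 3, 10, 11, 0),
]

matched = anomalies_before_events(detected, failures, lookback=timedelta(days=2))
unmatched = [a for a in detected if not any(a in hits for hits in matched.values())]

print(matched)
print(unmatched)
```

Without the log there is no way to tell these two groups apart.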

The temperature data from cooling tower sensors may not come simply as a raw signal, like the one I had from a Raspberry Pi last summer [5]. Root mean square (RMS) and peak values are time series that may come from the same sensor, each with its own information and importance [6].
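To illustrate how the two differ, the sketch below computes RMS and peak over fixed windows of a made-up raw signal. The sample values and the window length are assumptions. A single short spike in the second window raises the peak fivefold, while the RMS grows much less.

```python
import math
from typing import List, Sequence, Tuple


def rms_and_peak(samples: Sequence[float], window: int) -> List[Tuple[float, float]]:
    """Return (RMS, peak absolute value) for each consecutive non-overlapping window."""
    features = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        peak = max(abs(x) for x in chunk)
        features.append((rms, peak))
    return features


# Hypothetical raw samples: a steady oscillation with one short spike.
raw = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 5.0, -1.0]
features = rms_and_peak(raw, window=4)

for rms, peak in features:
    print(f"RMS={rms:.2f}  peak={peak:.2f}")
```

Which of the two is more informative depends on the failure you are looking for, so it is worth asking which one the sensor actually delivers.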

Discussing the data beforehand and selecting only the significant signals is far more beneficial than dumping everything into the cloud and hoping that something will come out of the AI calculation. Two more questions to ask are very simple and often overlooked:

- What does this sensor measure?

- Does the sensor shut off, or does it record data continuously?

Also, the measurements are not always correct, and in some (hopefully rare) cases they may be unrelated to the relevant physical phenomena due to poor calibration, faulty equipment, etc.
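Two basic checks can be run on the data before any anomaly detection: looking for gaps in the timestamps (the sensor may have shut off) and for readings outside a physically plausible range. The sketch below uses made-up readings, an assumed one-minute sampling interval and an assumed valid range of -40 to 120 degrees.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def find_gaps(
    timestamps: Sequence[datetime],
    expected: timedelta,
    tolerance: float = 1.5,
) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) pairs where consecutive samples are further apart than tolerance * expected."""
    return [
        (a, b)
        for a, b in zip(timestamps, timestamps[1:])
        if b - a > expected * tolerance
    ]


def out_of_range(values: Sequence[float], low: float, high: float) -> List[int]:
    """Return the indices of readings outside the plausible [low, high] range."""
    return [i for i, v in enumerate(values) if not low <= v <= high]


# Hypothetical one-minute temperature log with a pause and one error code.
start = datetime(2023, 6, 1, 0, 0)
timestamps = [start + timedelta(minutes=m) for m in (0, 1, 2, 3, 30, 31)]
readings = [21.5, 21.7, 21.6, -999.0, 22.0, 22.1]

gaps = find_gaps(timestamps, expected=timedelta(minutes=1))
bad = out_of_range(readings, low=-40.0, high=120.0)

print(gaps)
print(bad)
```

A gap or an error code fed straight into an algorithm would very likely show up as an "anomaly", although nothing is wrong with the machine.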

Last but not least:

- Are the data streamed in real time, or do they come in batches?

This is also important for the choice of an appropriate anomaly detection algorithm.
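The difference shows up directly in the code. With batches you can look at the whole history at once. With a stream, each value has to be judged as it arrives, using only what has been seen so far. Below is a minimal sketch of the streaming case, keeping a running mean and variance with Welford's algorithm and raising an alarm on a large z-score. The class name, the limit of 4 and the warm-up of 10 samples are assumptions for illustration.

```python
import math


class StreamingDetector:
    """Running mean/variance (Welford's algorithm) with a simple z-score alarm."""

    def __init__(self, z_limit: float = 4.0, warmup: int = 10) -> None:
        self.z_limit = z_limit
        self.warmup = warmup  # samples to collect before raising any alarm
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> bool:
        """Score x against the history seen so far, then add it to the history."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            is_anomaly = std > 0 and abs(x - self.mean) > self.z_limit * std
        # Welford update of the running statistics.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


# Hypothetical stream of temperature readings arriving one at a time.
stream = [20.0, 20.2, 19.9, 20.1, 20.0, 19.8, 20.1, 20.0, 20.2, 19.9, 20.1, 26.0, 20.0]
detector = StreamingDetector()
flags = [i for i, value in enumerate(stream) if detector.update(value)]
print(flags)
```

In this sketch every value, anomalous or not, is added to the running statistics. Whether that is acceptable is again a question for whoever knows the machine.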

Final note

Understanding the data, assessing its quality and clarifying project goals should be done BEFORE coding, and time series anomaly detection is no exception. That helps predictive maintenance much more than a better algorithm or a faster computer.

References

1. Introducing practical and robust anomaly detection in a time series (twitter.com)

2. What is Anomaly Detector? — Azure Cognitive Services | Microsoft Learn

3. What’s new with BigQuery ML: Unsupervised anomaly detection for time series and non-time series data | Google Cloud Blog

4. Anomaly Detection in Time Series using ChatGPT | by István Szatmári | Medium

5. IoT signal to MS Azure cloud: Raspberry Pi temperature data | by Dr. Marco Berta | Medium

6. Why is True-peak measurement a better method for detecting bearing damage? | LinkedIn

Dr. Marco Berta

Senior Data Scientist @ ZF Wind Power, Ph.D. Materials Science in Manchester University