Leland Teschler – Executive Editor
On Twitter @DW_LeeTeschler
Teschler on Topic
I have had a vision of what “big data” will look like on the factory floor. And it is scary. Really scary.
Big data, of course, refers to the use of predictive analytics to extract value from data, usually lots of data. Big data is thought to hold a lot of promise for improving industrial operations and making them more productive. In a perfect world, big data techniques will help predict when complicated production equipment will break down, well before there are obvious signs of trouble. They will also eliminate minute variations in manufacturing processes that degrade product quality.
In reality, most manufacturers could experience big data as an expensive exercise in chasing their own tails. To understand why, consider what goes on at a National Science Foundation facility in Cincinnati called the IMS Center for Intelligent Maintenance Systems. The IMS Center directs a lot of high-powered research at eliminating “fix and fail” maintenance, replacing it with “predict and prevent” efforts where production machines never go down and the products coming off manufacturing lines never vary.
Big data is the heart and soul of the Center’s efforts. It’s not unusual for production machines the Center studies to each generate a terabyte of data every day.
The IMS Center had a meeting a while back that makes clear the kind of intellectual horsepower it takes to profit from big data coming out of factories. The meeting was basically a room full of super-sharp Ph.D. candidates. One of them applied something called the Mahalanobis-Taguchi method to predict the health of rolling-element bearings on a CNC machine spindle. The math got pretty deep pretty fast. He looked at a virtual sea of sensor readings while using a “Mahalanobis distance” to gauge the abnormality of patterns within them, then applied Taguchi methods to evaluate the accuracy of the machine health predictions.
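To give a feel for the idea, here is a minimal sketch of the Mahalanobis-distance part of that work: fit a baseline from readings taken while a machine is known to be healthy, then score new readings by their distance from that baseline. The simulated sensor data and the two example readings are illustrative assumptions, not the researcher’s actual data or method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "healthy" baseline: correlated vibration and temperature readings.
healthy = rng.multivariate_normal(mean=[0.5, 40.0],
                                  cov=[[0.01, 0.02], [0.02, 0.25]],
                                  size=500)

mu = healthy.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(healthy, rowvar=False))

def mahalanobis(x):
    """Distance of reading x from the healthy baseline, scaled by its covariance."""
    d = x - mu
    return np.sqrt(d @ cov_inv @ d)

normal_reading = np.array([0.52, 40.3])
odd_reading = np.array([0.9, 41.0])   # vibration high relative to temperature

print(mahalanobis(normal_reading))  # small: consistent with the healthy pattern
print(mahalanobis(odd_reading))     # large: flags an abnormal combination
```

The point of the covariance scaling is that a reading can sit within the normal range of every individual sensor yet still be abnormal as a combination, which is exactly what a naive per-sensor threshold misses.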
Another researcher predicted defect levels in semiconductor wafers using sensor data massaged with autoregressive moving-average models and ensemble methods. Ensemble methods are machine-learning techniques that combine many models into one prediction, using arcane approaches such as Bayes optimal classifiers, bootstrap aggregation and Bayesian parameter averaging.
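For readers unfamiliar with the term, below is a toy illustration of bootstrap aggregation (“bagging”), the simplest of the ensemble methods mentioned: many models are fit on resampled copies of the data and their predictions averaged. The wafer features and defect numbers here are made up for the example; this is a sketch of the general technique, not the researcher’s actual pipeline.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(1)

# Pretend process-sensor features for 200 wafers and a noisy defect measure.
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=200)

# 50 regression trees (the default base model), each trained on a bootstrap
# sample of the data; the ensemble's prediction is the average of all 50.
model = BaggingRegressor(n_estimators=50, random_state=0)
model.fit(X, y)

print(model.predict(X[:3]))  # averaged ensemble predictions
```

Averaging over bootstrap samples is what gives bagging its value: each individual tree overfits its own sample, but the errors partly cancel in the average.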
If you get the idea these big-data projects involved a lot of rocket-science calculations, you’d be right.
Trouble is, most industrial concerns are likely to apply big data without the insights of the IMS Center. Here’s the scene that could unfold: The people trying to make sense of data coming from production-line machines will be engineers, not data scientists. These engineers have all had a basic statistics class in school. They have never heard of Mahalanobis-Taguchi or Bayesian parameter averaging. So they may simply assume all the sensor readings they see follow bell-curve distributions.
Real data scientists, like those at the IMS Center, would look at the bell curves and basic statistics going into these efforts and realize they are just too simplistic to ever work. But the people on the shop floor are unlikely to see the problem. The sinister thing about this scenario is that the underlying behavior models will be wrong but will occasionally yield correct predictions about machine behavior purely by chance. The engineers involved will believe they are just a few tweaks away from a system that works wonderfully. In reality, they will be spinning their wheels, guided by a model filled with unknown unknowns.
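A short sketch of how that bell-curve pitfall plays out, using assumed lognormal data (a common shape for vibration amplitudes, and decidedly not a bell curve) judged with a textbook mean-plus-three-sigma alarm rule:

```python
import numpy as np

rng = np.random.default_rng(2)

# Skewed "sensor" data: lognormal, not Gaussian.
readings = rng.lognormal(mean=0.0, sigma=0.8, size=10_000)

mu, sigma = readings.mean(), readings.std()
flagged = readings > mu + 3 * sigma  # the classic bell-curve alarm threshold

# Under a true bell curve, about 0.13% of readings would exceed mu + 3*sigma.
# The skewed data blows well past that, so most "alarms" are just noise.
print(f"flagged: {flagged.mean():.2%} vs. ~0.13% expected under normality")
```

The alarms fire often enough to look meaningful, and once in a while one coincides with a real fault, which is precisely the trap: random hits keep the engineers convinced the model almost works.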
Until expectations catch up with the reality of the resources it takes to mount a successful big-data effort in factories, big data will be a big disappointment and possibly an expensive nightmare.