A Spurious Outlier Detection System for High-Frequency Time Series Data


As we are living in the age of IoT, more and more processes rely on data gathered from well-placed sensors to draw inferences and make better predictions about the business. Sensor data are typically continuous and of enormous volume. Like any other data source, they are contaminated by noise (outliers), which may or may not be preventable. The presence of these outliers adversely affects the performance of any analytical model. Note that we differentiate between contextual anomalies and noisy outliers: the former carry information that matters to us for building predictive models, whereas the latter are the spurious points this system targets. Here we propose an integrated and scalable approach to detecting spurious outliers. The main modules of the proposed system are taken from the literature, but to our knowledge no comparable end-to-end, robust system has been proposed before. Although the method was developed using manufacturing IoT data, it is equally applicable to any domain dealing with time-series data, such as CPG, retail, healthcare, and agrotech.


Soham Chakraborty is a Senior Data Scientist with a background in statistics. He works mostly in manufacturing, creating AI solutions using machine learning and deep learning techniques.