DIFusio@

2023-03-31; 10:30 DOCTORAL THESIS DEFENSE: JOSU IRCIO FERNÁNDEZ


Josu Ircio Fernández:  "Anomaly Detection in Multivariate Time Series".

Supervisors: José Antonio Lozano Alonso / Aizea Lojo Novo.

2023-03-31, 10:30, Ada Lovelace room.

Abstract:

"The Fourth Industrial Revolution has brought about many advances in the monitoring of industrial systems. The development of new types of sensors and their low cost make it possible to obtain information about machines’ performance more efficiently. Hence, tasks such as predictive maintenance have become crucial for the competitiveness of companies. Predictive maintenance consists of analyzing machines’ operation through the data reported by the sensors to try to detect anomalies that may indicate a possible forthcoming failure [1]. Anticipating these failures can be vital to reduce the chance of unexpected breakdowns, thereby reducing the associated maintenance costs, reducing the time required to repair or recondition malfunctioning equipment, and mitigating the risk of accidents related to machine failures [2].
An anomaly in machines’ performance can be defined as a non-permitted deviation of the system from the acceptable, usual, or standard condition [3]. In most cases, an anomaly is characterized not only by an abnormal value, but also by the time at which it occurs and by its inconsistency with previous or future values. Therefore, to exploit the information reported by these sensors over time when searching for anomalies, it is common to use time series theory [4, 5, 6].
A time series is defined as a sequence of observations ordered in time [7]. In general, observations close together in time will be more correlated than observations further apart [8]. This is one of the features that distinguishes time series data from non-temporal data, in which there is no natural ordering of the observations [7, 9].
Therefore, the techniques used in time series analysis must account for this temporal correlation. In addition, time series data have some characteristics that make their analysis difficult, such as large volumes of data, high dimensionality, and continuous updating [7].
It should be noted that in real-life scenarios, the system to be monitored is often complex and needs to be described using more than one temporal variable. In this case, apart from the temporal correlation between the observations of each variable, inter-correlations between the variables can also exist. Consequently, all the variables need to be considered together in order to analyze the complete system. To handle these situations, multivariate time series (MTS) are defined. An MTS is a set of univariate time series which provides information about a complex system.
The importance of this kind of data lies in the fact that nowadays, they can be extracted from any component that contains sensors and whose operation is monitored.
For this reason, in the past few years, the use of techniques to extract knowledge and useful information from this type of temporal data has increased. Specifically, a whole field of research, called time series data mining, has been devoted to extending classical machine learning tasks and algorithms to time series data and to creating new specific algorithms. Among the most important tasks that have been studied with time series are the following: time series forecasting [10], clustering [7], classification [11], temporal pattern discovery [12], rule discovery [10], segmentation [13], and anomaly detection [4].
The research presented in this dissertation focuses on anomaly detection in multivariate time series. Although it is a general problem with diverse applications, in this case the target is anomaly detection in the operation of industrial systems.
In these scenarios it is common to have a set of example time series where correct and abnormal operation series are identified (labeled). Consequently, the MTS anomaly detection problem will be approached as a supervised MTS classification (MTSC) problem.
Thus, the objective will be to learn a classifier that is able to distinguish between correct operation and abnormal operation time series.
Apart from the inherent difficulty of the anomaly detection problem in MTS, anomaly detection in industrial systems involves the following additional challenges that must also be taken into account:
• Streaming data. In real system monitoring scenarios, data arrives continuously and the incoming time series streams must be processed almost immediately. Once a new series has been examined, it is necessary to react to the results as soon as possible, e.g., by issuing a malfunction alert. Therefore, the learning methodology should take this scenario into account.
• High dimensionality. In complex industrial systems it is common to have a large number of sensors to analyze. This high dimensionality demands considerable computational time and resources, which conflicts with the requirements of a streaming scenario. In addition, the large number of variables can complicate the analysis and even make the anomaly detection results less accurate [14]. Several variables, represented as univariate time series in this case, might be redundant in the presence of others or might not provide relevant information for the target task. Therefore, being able to select only the relevant time series for classification can be decisive and can improve the final results [15]. For all these reasons, the existing methods for feature subset selection will be investigated, specifically in the field of multivariate time series classification.
• Imbalance. When anomaly detection is addressed with a supervised approach, it is almost inevitably linked to the class imbalance problem, since the malfunctions that cause anomalies are usually rare. Most of the available time series will correspond to the machine in normal operation, as opposed to a very small percentage in which the machine has been malfunctioning. These imbalanced scenarios pose a problem for traditional models, which implicitly assume equally distributed classes and are prone to produce predictions biased in favor of the majority class. This implies that the minority class is predicted with low recall or, equivalently, that anomalies remain largely undetected. Consequently, another key aspect of the research will be to work on multivariate time series classifiers that are able to cope with class imbalance.
• Degradation. In real scenarios where the aim is to predict a failure through continuous monitoring and anomaly detection, it must be taken into account that, prior to a failure, the system undergoes a certain degradation. At some point, the system’s normal operation is altered and gradually deteriorates until the failure occurs. In most cases, information about when the system breaks down is available. However, we usually do not have information about when it began to malfunction. Being able to identify these moments is fundamental for anticipating failures. Hence, it is necessary to investigate this aspect and to develop new learning methodologies that consider it.
This thesis includes an introductory chapter and four main chapters. The introduction establishes the concepts necessary to contextualize and facilitate the understanding of the contributions made throughout the dissertation. First, time series data and the different problems that have been studied in the state of the art with this type of data are presented. Among them, the time series classification problem, which is the one considered in this thesis, is covered. Specifically, the time series classification problem is formally defined and a literature review of the existing methods and techniques is carried out.
Since the main objective of this dissertation is the anomaly detection task, to conclude this first introductory chapter, the anomaly detection problem is presented and the approach used to solve it is detailed together with the different challenges faced. In particular, the anomaly detection problem is approached as a supervised multivariate time series classification problem.
Once the necessary concepts have been introduced, the following three chapters present the contributions made to solve the different challenges that arise when addressing the problem of anomaly detection with the proposed approach.
In Chapter 2, the high dimensionality in anomaly detection problems is addressed. To that end, a filtering method is proposed to select the most representative subsets of variables (univariate time series) in an MTS classification problem. To measure the relevance of each subset of series, an approach is adopted whose key point is the computation of the mutual information between features. Since in this case the variables are time series, an adaptation of existing nonparametric mutual information estimators based on k-nearest neighbors is used. Specifically, to extend these methods to the time series scenario, the dynamic time warping (DTW) dissimilarity is used. Finally, the proposed method is evaluated on benchmark datasets and its sensitivity to the different parameters that define it is analyzed.
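As a rough illustration of the ingredients named above, the following Python sketch computes the classic DTW dissimilarity between two univariate series and, from it, the distance of each series to its k-th nearest neighbor, which is the quantity that k-nearest-neighbor mutual information estimators are typically built on. The function names `dtw_distance` and `kth_neighbor_distances` are hypothetical, the absolute difference is assumed as the local cost, and this is not the estimator developed in the thesis.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping dissimilarity between two 1-D series.

    Assumption: the local cost is the absolute difference and no warping
    window is applied.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # repeat b[j-1]
                                 cost[i, j - 1],      # repeat a[i-1]
                                 cost[i - 1, j - 1])  # advance both
    return cost[n, m]

def kth_neighbor_distances(series, k=1):
    """DTW distance from each series to its k-th nearest neighbor."""
    n = len(series)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    # After sorting each row, index 0 is the zero self-distance,
    # so index k holds the distance to the k-th neighbor.
    return np.sort(dist, axis=1)[:, k]
```

Because DTW allows one series to be locally stretched against the other, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the two series differ in length, which is what makes it a natural replacement for the Euclidean distance when the "points" are time series.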
In Chapter 3, the remaining challenges are addressed: the streaming scenario, degradation, and class imbalance. To this end, the research focuses on the use case of hard drive failure prediction. The objective is therefore to detect when a hard drive starts to operate abnormally, indicating a possible future failure. The problem is approached as a supervised MTSC problem based on sliding windows. Although information about the moments when the disks break down completely is available, the moment when they start to work abnormally is not identified. In order to be able to anticipate failures, a new learning methodology is developed that takes into account the degradation that the hard drives suffer. Additionally, as a solution to the highly imbalanced situation between the two classes (there are not many failed hard drives available), a novel technique is implemented so that the classifier is trained to maximize the minimum recall of the classes. Finally, the developments are evaluated on the Backblaze benchmark dataset and the results are compared with those obtained by other state-of-the-art solutions.
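The two generic building blocks mentioned in this chapter summary, sliding windows over an MTS and the minimum recall of the classes, can be sketched in Python as follows. The helper names `sliding_windows` and `minimum_recall`, the array layout (time steps by variables), and the toy labels are assumptions made for illustration only; the sketch does not reproduce the learning methodology of the thesis.

```python
import numpy as np

def sliding_windows(mts, width, step=1):
    """Cut an MTS of shape (T, d) into windows of shape (width, d).

    Returns an array of shape (n_windows, width, d).
    """
    mts = np.asarray(mts)
    if mts.shape[0] < width:
        raise ValueError("series is shorter than the window width")
    starts = range(0, mts.shape[0] - width + 1, step)
    return np.stack([mts[s:s + width] for s in starts])

def minimum_recall(y_true, y_pred):
    """Smallest per-class recall over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return min(np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true))

# Toy example: 98 normal windows (class 0) and 2 abnormal ones (class 1).
y_true = np.array([0] * 98 + [1] * 2)
y_pred = np.zeros(100, dtype=int)  # a classifier that always says "normal"
accuracy = np.mean(y_true == y_pred)
worst_recall = minimum_recall(y_true, y_pred)
```

In the toy example the always-normal classifier reaches an accuracy of 0.98 while its minimum recall is 0, which illustrates why accuracy is a misleading target when failed drives are rare.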
Chapter 4 focuses on the imbalanced time series classification problem. Building on the solution provided in the previous chapter, this chapter improves the proposal by designing a more refined classifier that maximizes the minimum recall of the classes rather than the accuracy. To this end, neural network classifiers are used to take advantage of the loss function that they explicitly define. Specifically, the aim is to change this loss function so that the classifiers learn to maximize the minimum recall of the classes. However, neural networks are typically trained using gradient descent-based learning methods, which do not allow the use of non-differentiable functions such as the minimum recall. Consequently, several smooth (differentiable) approximations of the minimum recall function are applied and evaluated. To evaluate the proposed approach, it is compared with other state-of-the-art methods used for imbalanced time series classification.
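The abstract does not state which smooth approximations are used, so the following Python sketch shows only one common possibility, under stated assumptions: per-class recall is relaxed by averaging the predicted probability of the true class, and the minimum is replaced by a log-sum-exp soft minimum with a sharpness parameter `alpha`. The names `soft_recalls`, `smooth_min`, and `alpha` are hypothetical, and every class is assumed to appear at least once in `y_true`.

```python
import numpy as np

def soft_recalls(probs, y_true):
    """Differentiable surrogate of per-class recall.

    probs: array (n_samples, n_classes) of predicted class probabilities.
    For class c, average the probability assigned to c over the samples
    whose true label is c (assumes every class occurs in y_true).
    """
    probs = np.asarray(probs, dtype=float)
    y_true = np.asarray(y_true)
    n_classes = probs.shape[1]
    return np.array([probs[y_true == c, c].mean() for c in range(n_classes)])

def smooth_min(values, alpha=20.0):
    """Log-sum-exp soft minimum: -(1/alpha) * log(sum(exp(-alpha * v))).

    It never exceeds the true minimum and is at most log(n)/alpha below it.
    """
    z = -alpha * np.asarray(values, dtype=float)
    z_max = z.max()  # subtract the maximum for numerical stability
    return -(z_max + np.log(np.exp(z - z_max).sum())) / alpha

# A loss to minimize could then be 1 - smooth_min(soft_recalls(probs, y)).
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]])
y = np.array([0, 0, 1])
loss = 1.0 - smooth_min(soft_recalls(probs, y), alpha=50.0)
```

Larger values of `alpha` make the soft minimum track the true minimum more closely at the price of sharper gradients; in a real training loop the same expressions would be written with the automatic differentiation framework of the neural network rather than NumPy.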
Finally, Chapter 5 summarizes the findings and solutions presented in the dissertation to cope with the different challenges raised in addressing the anomaly detection problem. To conclude, several proposals and new challenges are pointed out for future research.
References
[1] T. P. Carvalho, F. A. Soares, R. Vita, R. d. P. Francisco, J. P. Basto, S. G. Alcalá, A systematic literature review of machine learning methods applied to predictive maintenance, Computers & Industrial Engineering 137 (2019) 106024.
[2] Z. Li, Y. Wang, K.-S. Wang, Intelligent predictive maintenance for fault diagnosis and prognosis in machine centers: Industry 4.0 scenario, Advances in Manufacturing 5 (4) (2017) 377–387.
[3] R. Isermann, Fault-diagnosis applications: model-based condition monitoring: actuators, drives, machinery, plants, sensors, and fault-tolerant systems, Springer Science & Business Media, 2011.
[4] M. Canizo, I. Triguero, A. Conde, E. Onieva, Multi-head cnn–rnn for multi-time series anomaly detection: An industrial case study, Neurocomputing 363 (2019) 246–260.
[5] K. Choi, J. Yi, C. Park, S. Yoon, Deep learning for anomaly detection in time-series data: Review, analysis, and guidelines, IEEE Access (2021).
[6] A. A. Cook, G. Mısırlı, Z. Fan, Anomaly detection for iot time-series data: A survey, IEEE Internet of Things Journal 7 (7) (2019) 6481–6494.
[7] T.-c. Fu, A review on time series data mining, Engineering Applications of Artificial Intelligence 24 (1) (2011) 164–181.
[8] R. H. Shumway, D. S. Stoffer, Time series analysis and its applications, Vol. 3, Springer, 2000.
[9] O. L. Hasna, R. Potolea, Time series - A taxonomy based survey, in: 13th IEEE International Conference on Intelligent Computer Communication and Processing, ICCP 2017, Cluj-Napoca, Romania, September 7-9, 2017, 2017, pp. 231–238. doi:10.1109/ICCP.2017.8117009.
[10] A. Fakhrazari, H. Vakilzadian, A survey on time series data mining, in: 2017 IEEE International Conference on Electro Information Technology (EIT), IEEE, 2017, pp. 476–481.
[11] W. Jiang, Time series classification: Nearest neighbor versus deep learning models, SN Applied Sciences 2 (4) (2020) 1–17.
[12] A. K. Shekar, M. Pappik, P. I. Sánchez, E. Müller, Selection of relevant and non-redundant multivariate ordinal patterns for time series classification, in: Discovery Science - 21st International Conference, DS 2018, Limassol, Cyprus, October 29-31, 2018, Proceedings, 2018, pp. 224–240. doi:10.1007/978-3-030-01771-2_15.
[13] P. Esling, C. Agon, Time-series data mining, ACM Computing Surveys (CSUR) 45 (1) (2012) 1–34.
[14] H. Li, C. Lin, X. Wan, Z. Li, Feature representation and similarity measure based on covariance sequence for multivariate time series, IEEE Access 7 (2019) 67018–67026.
[15] S. Sharmin, M. Shoyaib, A. A. Ali, M. A. H. Khan, O. Chae, Simultaneous feature selection and discretization based on mutual information, Pattern Recognit. 91 (2019) 162–174. doi:10.1016/j.patcog.2019.02.016."

