Selective Imputation for Multivariate Time Series Datasets with Missing Values

Research output: Contribution to journal › Journal article › Research › peer-review

  • Ane Blazquez-Garcia
  • Kristoffer Wickstrom
  • Shujian Yu
  • Karl Oyvind Mikalsen
  • Ahcene Boubekki
  • Angel Conde
  • Usue Mori
  • Robert Jenssen
  • Jose A. Lozano

Multivariate time series often contain missing values for reasons such as failures in data collection mechanisms. Since these missing values can complicate the analysis of time series data, imputation techniques are typically used to deal with this issue. However, the quality of the imputation directly affects the performance of downstream tasks. In this paper, we propose a selective imputation method that identifies a subset of timesteps with missing values to impute in a multivariate time series dataset. This selection, which results in shorter and simpler time series, is based on both reducing the uncertainty of the imputations and representing the original time series as well as possible. In particular, the method uses multi-objective optimization techniques to select the optimal set of points, and in this selection process, we leverage the beneficial properties of the Multi-task Gaussian Process (MGP). The method is applied to different datasets to analyze the quality of the imputations and the performance obtained in downstream tasks, such as classification or anomaly detection. The results show that much shorter and simpler time series are able to maintain or even improve both the quality of the imputations and the performance of the downstream tasks.
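
The multi-objective selection described in the abstract can be illustrated, very loosely, with a Pareto-front filter. The sketch below is not the paper's method: the candidate names, the two scores (imputation uncertainty and representation error, both to be minimized), and their values are hypothetical placeholders. It only shows what "optimal" means when two objectives are traded off against each other.

```python
from typing import Dict, List, Tuple

# Hypothetical scores for a candidate subset of timesteps (assumed, not from the paper):
# (imputation uncertainty, representation error); both are minimized.
Scores = Tuple[float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, score in candidates.items()
        if not any(
            dominates(other_score, score)
            for other, other_score in candidates.items()
            if other != name
        )
    ]


# Made-up example values, for illustration only.
candidates = {
    "subset_a": (0.10, 0.40),
    "subset_b": (0.20, 0.20),
    "subset_c": (0.25, 0.30),  # worse than subset_b on both objectives
    "subset_d": (0.40, 0.05),
}
front = pareto_front(candidates)
print(front)
```

Here `subset_c` is dropped because `subset_b` is better on both objectives; the remaining three represent different trade-offs, and choosing among them requires an additional preference that this sketch does not model.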

Original language: English
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 35
Issue number: 9
Pages (from-to): 9490-9501
Number of pages: 12
ISSN: 1041-4347
Publication status: Published - 2023

Bibliographical note

Publisher Copyright:
© 1989-2012 IEEE.

Research areas

  • imputation, irregular sampling, missing data, Multivariate time series

ID: 364497888