Variable importance determination in a dynamic context
Bio: Barbara Hammer is a full Professor for Machine Learning at the CITEC Cluster at Bielefeld University, Germany. She received her Ph.D. in Computer Science in 1999 and her venia legendi (permission to teach) in 2003, both from the University of Osnabrueck, Germany, where she was head of an independent research group on the topic ‘Learning with Neural Methods on Structured Data’. In 2004, she accepted an offer for a professorship at Clausthal University of Technology, Germany, before moving to Bielefeld in 2010. Barbara’s research interests cover theory and algorithms in machine learning and neural networks and their application to technical systems and the life sciences, including explainability, learning with drift, nonlinear dimensionality reduction, recursive models, and learning with non-standard data. Barbara has chaired the IEEE CIS Technical Committee on Data Mining and Big Data Analytics, the IEEE CIS Technical Committee on Neural Networks, and the IEEE CIS Distinguished Lecturer Committee. She has been elected as a member of the IEEE CIS Administrative Committee and the INNS Board. She has been an associate editor of the IEEE Computational Intelligence Magazine, IEEE TNNLS, and IEEE TPAMI. Currently, she is involved in a number of large-scale projects, including the DFG collaborative research center on Constructing Explainability, the EU Doctoral Network on Learning with Multiple Representations (LEMUR), and the ERC Synergy Grant Smart Water Futures.
Abstract: Variable importance determination refers to the challenge of identifying the most relevant input dimensions or features for a given learning task and quantifying their respective relevance. It constitutes a foundation for feature selection, and it enables an intuitive insight into the rationale behind model decisions. Indeed, it is one of the oldest and most prominent global explanation technologies for machine learning models. A large number of measures have been proposed, such as mutual information, permutation feature importance, or Shapley values, to name just a few. As models are increasingly used in everyday life, they face an open environment, possibly changing dynamics, and the necessity of model adaptation to account for changes of the underlying distribution. At present, feature relevance determination focuses almost exclusively on static scenarios and batch training. In the talk, I will target two specific challenges which occur in feature relevance determination when learning with data streams: (1) How to efficiently and effectively accompany a model which learns incrementally with feature relevance determination methods that are capable of providing insight into the most relevant features at every point in time for an adaptive model? I will present incremental variants of popular feature relevance determination approaches. (2) How to identify causes for an observed drift? I will present mechanisms which are capable of identifying the features that are most characteristic of an observed drift, targeting their spatial characteristics, on the one hand, and the identification of minimal feature subsets which characterize an observed drift, on the other hand.
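To give a rough flavour of the setting the abstract describes (and not of the speaker's actual methods, which are not detailed here), the following minimal Python sketch evaluates classical permutation feature importance on a sliding window alongside an incrementally trained model, so that importances can be reported at every point in time; the function name, the synthetic drifting stream, and the window size are purely illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

def permutation_importance_window(model, X_win, y_win, rng, n_repeats=5):
    """Permutation feature importance on the current sliding window.

    The importance of feature j is the average drop in window accuracy
    when column j is randomly permuted (breaking its association with y).
    """
    base = accuracy_score(y_win, model.predict(X_win))
    importances = np.zeros(X_win.shape[1])
    for j in range(X_win.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X_win.copy()
            rng.shuffle(X_perm[:, j])  # destroy the signal carried by feature j
            drops.append(base - accuracy_score(y_win, model.predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances

# Toy streaming loop (hypothetical setup): the relevant feature changes at t=1000,
# i.e. the stream exhibits drift, and the reported importances should follow it.
rng = np.random.default_rng(0)
d, window_size = 5, 200
model = SGDClassifier(loss="log_loss")
X_win, y_win = [], []

for t in range(2000):
    x = rng.normal(size=d)
    y = int(x[0] > 0) if t < 1000 else int(x[1] > 0)
    model.partial_fit(x.reshape(1, -1), [y], classes=[0, 1])  # incremental model update
    X_win.append(x); y_win.append(y)
    if len(X_win) > window_size:
        X_win.pop(0); y_win.pop(0)
    if t % 500 == 499:  # report importances at a few time points
        imp = permutation_importance_window(model, np.array(X_win), np.array(y_win), rng)
        print(f"t={t}: importances = {np.round(imp, 3)}")
```

This windowed re-computation is only a naive baseline; the incremental variants mentioned in the abstract aim to maintain such importance estimates efficiently as the model and the distribution change.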