LIDTA 2020 is a half-day tutorial to be held at ECML/PKDD’2020. The tutorial covers Learning with Imbalanced Domains and Rare Event Detection. It targets both newcomers to the subject and researchers and professionals with previous experience.
Many real-world data-mining applications involve obtaining and evaluating predictive models using data sets with strongly imbalanced distributions of the target variable. Frequently, the least common values of this target variable are associated with rare events that are highly relevant for end users. Examples span diverse domains, such as the diagnosis of rare diseases, intrusion detection, and popularity prediction in social media. Tackling the issues raised by imbalanced domains is crucial to both academia and industry.
This tutorial describes the full pipeline for rare event detection. It covers i) the fundamentals and principles, ii) methods and evaluation, iii) rare event detection in classified data, iv) explanation, v) a case study on fraud detection in data streams, and vi) open challenges.