Please feel free to browse the articles and reports I have published over the years. They are organized by publication type, sorted by year, and available for download. Papers marked with [code] have code available.







Book Contributions
AI Applied to Air Pollution and Environmental Health: A Case Study on Hypothesis Generation
C. Bellinger, M.S.M. Jabbar, O. Wine, C. Nielsen, J.S. Lomelin, A. Osornio-Vargas, O. Zaiane
In Emerging Paradigms in Machine Learning (Ramanna, S. and Jain, L. C. and Howlett, R. J., ed.), Springer Berlin Heidelberg, pp. 161–195, 2013.
...

Emerging Trends in Machine Learning: Classification of Stochastically Episodic Events
B.J. Oommen and C. Bellinger
In Emerging Paradigms in Machine Learning (Ramanna, S. and Jain, L. C. and Howlett, R. J., ed.), Springer Berlin Heidelberg, pp. 161–195, 2013.
In this chapter we report some Machine Learning (ML) and Pattern Recognition (PR) techniques applicable for classifying Stochastically Episodic (SE) events. Researchers in the field of Pattern Recognition (PR) have traditionally presumed the availability of a representative set of data drawn from the classes of interest, say u1 and u2 in a 2-class problem. These samples are typically utilized in the development of the system's discriminant function. It is, however, widely recognized that there exists a particularly challenging class of PR problems for which a representative set is not available for the second class, which has motivated a great deal of research into the so-called domain of One Class (OC) classification. In this chapter, we primarily report the novel results found in [2, 4, 6], where we extend the frontiers of novelty detection by the introduction of a new field of problems open for analysis. In particular, we note that this new realm deviates from the standard set of OC problems based on the presence of three characteristics, which ultimately amplify the classification challenge. They involve the temporal nature of the appearance of the data, the fact that the data from the classes are “interwoven”, and that a labelling procedure is not merely impractical - it is almost, by definition, impossible. As a first attempt to tackle these problems, we present two specialized classification strategies denoted by Scenarios S1 and S2 respectively. In Scenario S1, the data is such that standard binary and one-class classifiers can be applied. Alternatively, in Scenario S2, the labelling challenge prevents the application of binary classifiers and instead dictates the novel application of one-class classifiers. The validity of these scenarios has been demonstrated for the exemplary domain involving the Comprehensive Nuclear-Test-Ban Treaty (CTBT), for which our research endeavour has also developed a simulation model. As far as we know, our research in this field is of a pioneering sort, and the results presented here are novel.








Journal Publications
The Class Imbalance Problem in Deep Learning
K. Ghosh, C. Bellinger, R. Corizzo, P. Branco, B. Krawczyk and N. Japkowicz
In Machine Learning, 2022
Deep learning has recently unleashed the ability for machine learning (ML) to make unparalleled strides. It did so by confronting and successfully addressing, at least to a certain extent, the knowledge bottleneck that paralyzed ML and Artificial Intelligence (AI) for decades. The community is currently basking in deep learning’s success, but a question that comes to mind is: have all of the issues previously affecting machine learning systems been solved by deep learning, or do some issues remain for which deep learning is not a bulletproof solution? This question, in the context of class imbalance, is the motivation for this paper. The class imbalance problem was first recognized almost three decades ago and has remained a critical challenge, at least for traditional learning approaches. Our goal is to investigate whether the tight dependency between class imbalance, concept complexity, dataset size and classifier performance, known to exist in traditional learning systems, is alleviated in any way in deep learning approaches, and to what extent, if any, network depth and regularization can help. To answer these questions we conduct a survey of the recent literature focused on deep learning and the class imbalance problem, as well as a series of controlled experiments on both artificial and real-world domains. This allows us to formulate lessons learned about the impact of class imbalance on deep learning models, as well as pose open challenges that should be tackled by researchers in this field.

RB-CCR: Radial-Based Combined Cleaning and Resampling algorithm for imbalanced data classification
M. Koziarski, C. Bellinger, and M. Wozniak
In Machine Learning, 2021
Real-world classification domains, such as medicine, health and safety, and finance, often exhibit imbalanced class priors and asymmetric misclassification costs. In such cases, the classification model must achieve a high recall without significantly impacting precision. Resampling the training data is the standard approach to improving classification performance on imbalanced binary data. However, the state-of-the-art methods ignore the local joint distribution of the data or correct it as a post-processing step. This can cause sub-optimal shifts in the training distribution, particularly when the target data distribution is complex. In this paper, we propose Radial-Based Combined Cleaning and Resampling (RB-CCR). RB-CCR utilizes the concept of class potential to refine the energy-based resampling approach of CCR. In particular, RB-CCR exploits the class potential to accurately locate sub-regions of the data-space for synthetic oversampling. The category of sub-region for oversampling can be specified as an input parameter to meet domain-specific needs or be automatically selected via cross-validation. Our 5×2 cross-validated results on 57 benchmark binary datasets with 9 classifiers show that RB-CCR achieves a better precision-recall trade-off than CCR and generally outperforms the state-of-the-art resampling methods in terms of AUC and G-mean.
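The core of RB-CCR is a radial-basis-function "class potential" used to decide where synthetic points should be placed. The sketch below is a minimal illustration of that idea only: the Gaussian kernel, the gamma value, and the acceptance rule are assumptions made for this example, not the published algorithm or its parameters.

```python
# Minimal sketch of a Gaussian RBF "class potential" (illustration only).
import numpy as np

def class_potential(points, class_samples, gamma=1.0):
    """Sum of Gaussian RBF contributions of class_samples at each query point."""
    diffs = points[:, None, :] - class_samples[None, :, :]   # (n_points, n_class, d)
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-gamma * sq_dists).sum(axis=1)

rng = np.random.default_rng(0)
majority = rng.normal(0.0, 1.0, size=(200, 2))
minority = rng.normal(2.0, 0.5, size=(10, 2))

# Candidate synthetic points sampled around a minority seed; keep those that fall
# in a low-majority-potential region (i.e., away from the majority class).
seed = minority[0]
candidates = seed + rng.normal(0.0, 0.5, size=(50, 2))
potential = class_potential(candidates, majority, gamma=0.5)
threshold = np.median(class_potential(minority, majority, gamma=0.5))
accepted = candidates[potential <= threshold]
print(f"accepted {len(accepted)} of {len(candidates)} candidate synthetic points")
```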


Machine Learning for Work Disability Prevention: Introduction to the Special Series
D.P. Gross, I.A. Steenstra, F.E. Harrell, C. Bellinger, and O. Zaïane
In Journal of Occupational Rehabilitation, 2020
Rapid development in computer technology has led to sophisticated methods of analyzing large datasets with the aim of improving human decision making. Artificial Intelligence and Machine Learning (ML) approaches hold tremendous potential for solving complex real-world problems such as those faced by stakeholders attempting to prevent work disability. These techniques are especially appealing in work disability contexts that collect large amounts of data such as workers’ compensation settings, insurance companies, large corporations, and health care organizations, among others. However, the approaches require thorough evaluation to determine if they add value to traditional statistical approaches. In this special series of articles, we examine the role and value of ML in the field of work disability prevention and occupational rehabilitation.


SMOTEFUNA: Synthetic minority over-sampling technique based on furthest neighbour algorithm
A.S. Tarawneh, A.B.A. Hassanat, K. Almohammadi, D. Chetverikov and C. Bellinger
In IEEE Access, 2020
Class imbalance occurs in classification problems in which the “normal” cases, or instances, significantly outnumber the “abnormal” instances. Training a standard classifier on imbalanced data leads to predictive biases which cause poor performance on the class(es) with lower prior probabilities. The less frequent classes are often critically important events, such as system failure or the occurrence of a rare disease. As a result, the class imbalance problem has been considered to be of great importance for many years. In this paper, we propose a novel algorithm that utilizes the furthest neighbor of a candidate example to generate new synthetic samples. A key advantage of SMOTEFUNA over existing methods is that it does not have parameters to tune (such as K in SMOTE). Thus, it is significantly easier to utilize in real-world applications. We evaluate the benefit of resampling with SMOTEFUNA against state-of-the-art methods including SMOTE, ADASYN and SWIM using Naive Bayes and Support Vector Machine classifiers. Also, we provide a statistical analysis based on the Wilcoxon signed-rank test to validate the significance of the SMOTEFUNA results. The results indicate that the proposed method is an efficient alternative to the current methods. Specifically, SMOTEFUNA achieves better 5-fold cross-validated ROC and precision-recall space performance.
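As a rough illustration of the furthest-neighbour idea described above, the sketch below generates synthetic minority points inside the region spanned by a candidate instance and its furthest minority-class neighbour. The uniform sampling rule and the bounding hyper-rectangle are assumptions made for the example; they are not taken from the paper.

```python
import numpy as np

def furthest_neighbour_oversample(minority, n_new, rng=None):
    """Generate synthetic points between each candidate and its furthest minority neighbour."""
    rng = np.random.default_rng(rng)
    synthetic = []
    for _ in range(n_new):
        candidate = minority[rng.integers(len(minority))]
        dists = np.linalg.norm(minority - candidate, axis=1)
        furthest = minority[np.argmax(dists)]
        low, high = np.minimum(candidate, furthest), np.maximum(candidate, furthest)
        synthetic.append(rng.uniform(low, high))   # draw inside the bounding hyper-rectangle
    return np.array(synthetic)

rng = np.random.default_rng(1)
minority = rng.normal(3.0, 0.4, size=(15, 2))
new_samples = furthest_neighbour_oversample(minority, n_new=30, rng=2)
print(new_samples.shape)   # (30, 2)
```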


A Framework for Extreme Imbalance Classification: Sampling With the Majority Class (SWIM)
Bellinger C., Sharma S., Japkowicz N., Zaiane O.
In Knowledge and Information Systems, 2019
The class imbalance problem is a pervasive issue in many real-world domains. Oversampling methods that inflate the rare class by generating synthetic data are amongst the most popular techniques for resolving class imbalance. However, they concentrate on the characteristics of the minority class and use them to guide the oversampling process. By completely overlooking the majority class, they lose a global view on the classification problem and, while alleviating the class imbalance, may negatively impact learnability by generating borderline or overlapping instances. This becomes even more critical when facing extreme class imbalance, where the minority class is strongly underrepresented and on its own does not contain enough information to conduct the oversampling process. We propose a framework for synthetic oversampling that, unlike existing resampling methods, is robust on cases of extreme imbalance. The key feature of the framework is that it uses the density of the well-sampled majority class to guide the generation process. We demonstrate implementations of the framework using the Mahalanobis distance and a radial basis function. We evaluate over 25 benchmark datasets, and show that the framework offers a distinct performance improvement over the existing state-of-the-art in oversampling techniques.
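The sketch below illustrates the Mahalanobis-distance variant of the idea: synthetic minority points are generated so that they keep approximately the same Mahalanobis distance to the majority class as the seed minority instances. The whitening-and-rescaling procedure and the spread parameter are illustrative assumptions, not the published SWIM implementation.

```python
import numpy as np

def swim_mahalanobis_sketch(majority, minority, n_per_seed=5, spread=0.15, rng=None):
    """Generate synthetic minority points that preserve each seed's Mahalanobis
    distance to the majority class (illustrative sketch, not the published algorithm)."""
    rng = np.random.default_rng(rng)
    mu = majority.mean(axis=0)
    cov = np.cov(majority, rowvar=False)
    L = np.linalg.cholesky(cov)            # whitening / un-whitening transforms
    L_inv = np.linalg.inv(L)

    synthetic = []
    for seed in minority:
        z = L_inv @ (seed - mu)            # seed in whitened majority space
        target_norm = np.linalg.norm(z)    # Mahalanobis distance of the seed
        for _ in range(n_per_seed):
            z_new = z + rng.normal(0.0, spread, size=z.shape)
            z_new *= target_norm / np.linalg.norm(z_new)   # keep the same distance
            synthetic.append(mu + L @ z_new)
    return np.array(synthetic)

rng = np.random.default_rng(0)
majority = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=500)
minority = rng.multivariate_normal([3, 3], [[0.05, 0], [0, 0.05]], size=5)
print(swim_mahalanobis_sketch(majority, minority).shape)   # (25, 2)
```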


Interdisciplinary-driven hypotheses on spatial associations of mixtures of industrial air pollutants with adverse birth outcomes
Serrano-Lomelin, J., Nielsen, C., Jabbar, S.M., Wine, O., Bellinger, C., Villeneuve, P.J., Stieb, D., Aelicks, N., Aziz, K., Buka, I., Chandra, S., Crawford, S., Demers, P., Erickson, A.C., Hystad, P., Kumar, M., Phipps, E., Shah, P.S., Yuan, Y., Zaiane, O.R., Osornio-Vargas, A.R.
In Environment International, 2019
Background: Adverse birth outcomes (ABO) such as prematurity and small for gestational age confer a high risk of mortality and morbidity. ABO have been linked to air pollution; however, relationships with mixtures of industrial emissions are poorly understood. The exploration of relationships between ABO and mixtures is complex when hundreds of chemicals are analyzed simultaneously, requiring the use of novel approaches.

Objectives: We aimed to generate robust hypotheses spatially linking mixtures and the occurrence of ABO using a spatial data mining algorithm and subsequent geographical and statistical analysis. The spatial data mining approach aimed to reduce data dimensionality and efficiently identify spatial associations between multiple chemicals and ABO.

Methods: We discovered co-location patterns of mixtures and ABO in Alberta, Canada (2006–2012). An ad-hoc spatial data mining algorithm allowed the extraction of primary co-location patterns of 136 chemicals released into the air by 6279 industrial facilities (National Pollutant Release Inventory), wind-patterns from 182 stations, and 333,247 singleton live births at the maternal postal code at delivery (Alberta Perinatal Health Program), from which we identified cases of preterm birth, small for gestational age, and low birth weight at term. We selected secondary patterns using a lift ratio metric from ABO and non-ABO impacted by the same mixture. The relevance of the secondary patterns was estimated using logistic models (adjusted by socioeconomic status and ABO-related maternal factors) and a geographic-based assignment of maternal exposure to the mixtures as calculated by kernel density.

Results: From 136 chemicals and three ABO, spatial data mining identified 1700 primary patterns from which five secondary patterns of three-chemical mixtures, including particulate matter, methyl-ethyl-ketone, xylene, carbon monoxide, 2-butoxyethanol, and n-butyl alcohol, were subsequently analyzed. The significance of the associations (odds ratio > 1) between the five mixtures and ABO provided statistical support for a new set of hypotheses.

Conclusion: This study demonstrated that, in complex research settings, spatial data mining followed by pattern selection and geographic and statistical analyses can catalyze future research on associations between air pollutant mixtures and adverse birth outcomes.


Validity of the Work Assessment Triage Tool for Selecting Rehabilitation Interventions for Workers’ Compensation Claimants with Musculoskeletal Conditions
Gross P.D., Steenstra A.I., Shaw W., Yousefi P., Bellinger C., Zaiane O.
In Journal of Occupational Rehabilitation, 2019
Background: The Work Assessment Triage Tool (WATT) is a clinical decision support tool developed using machine learning to help select interventions for patients with musculoskeletal disorders. The WATT categorizes patients based on individual characteristics according to likelihood of successful return to work following rehabilitation. A previous validation showed acceptable classification accuracy, but we re-examined accuracy using a new dataset drawn from the same system 2 years later.

Methods: A population-based cohort design was used, with data extracted from a Canadian compensation database on workers considered for rehabilitation between January 2013 and December 2016. Data were obtained on demographic, clinical, and occupational characteristics, type of rehabilitation undertaken, and return to work outcomes. Analysis included classification accuracy statistics of WATT recommendations.

Results: The sample included 28,919 workers (mean age 43.9 years, median duration 56 days), of whom 23,124 experienced a positive outcome within 30 days following return to work assessment. Sensitivity of the WATT for selecting successful programs was 0.13, while specificity was 0.87. Overall accuracy was 71%, while human recommendations were higher at 85%.

Conclusions: The overall accuracy of the WATT for selecting successful rehabilitation programs declined in a more recent cohort, and the tool proved less accurate than human clinical recommendations. Algorithm revision and further validation are needed.


The Index Lift in Data Mining has a Close Relationship with the Association Measure Relative Risk in Epidemiological Studies
Vu K., Clark R., Bellinger C., Erickson G., Osornio-Vargas A., Zaiane O., Yuan Y.
In BMC Medical Informatics and Decision Making, 2019
Background: Data mining tools have been increasingly used in health research, with the promise of accelerating discoveries. Lift is a standard association metric in the data mining community. However, health researchers struggle with the interpretation of lift. As a result, dissemination of data mining results can be met with hesitation. The relative risk and odds ratio are standard association measures in the health domain, due to their straightforward interpretation and comparability across populations. We aimed to investigate the lift-relative risk and the lift-odds ratio relationships, and provide tools to convert lift to the relative risk and odds ratio.

Methods: We derived equations linking lift-relative risk and lift-odds ratio. We discussed how lift, relative risk, and odds ratio behave numerically with varying association strengths and exposure prevalence levels. The lift-relative risk relationship was further illustrated using a high-dimensional dataset which examines the association of exposure to airborne pollutants and adverse birth outcomes. We conducted spatial association rule mining using the Kingfisher algorithm, which identified association rules using its built-in lift metric. We directly estimated relative risks and odds ratios from 2 by 2 tables for each identified rule. These values were compared to the corresponding lift values, and relative risks and odds ratios were computed using the derived equations.

Results: As the exposure-outcome association strengthens, the odds ratio and relative risk move away from 1 faster numerically than lift, i.e. |log(odds ratio)| ≥ |log(relative risk)| ≥ |log(lift)|. In addition, lift is bounded by the smaller of the inverse probability of outcome or exposure, i.e. lift ≤ min(1/P(O), 1/P(E)). Unlike the relative risk and odds ratio, lift depends on the exposure prevalence for fixed outcomes. For example, when an exposure A and a less prevalent exposure B have the same relative risk for an outcome, exposure A has a lower lift than B.

Conclusions: Lift, relative risk, and odds ratio are positively correlated and share the same null value. However, lift depends on the exposure prevalence, and thus is not straightforward to interpret or to use to compare association strength. Tools are provided to obtain the relative risk and odds ratio from lift.
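A small worked example helps make the lift, relative risk, and odds ratio relationships concrete. The counts in the 2×2 table below are invented for illustration; the identity lift = RR / (1 + P(E)(RR - 1)) follows directly from the definitions, although the paper's own conversion equations may be expressed differently.

```python
# Worked example: lift, relative risk and odds ratio from a 2x2 exposure-outcome table.
# The counts below are made up for illustration.
a, b = 30, 70     # exposed:   outcome / no outcome
c, d = 20, 180    # unexposed: outcome / no outcome
n = a + b + c + d

p_o_given_e = a / (a + b)           # P(O | E)
p_o_given_ne = c / (c + d)          # P(O | ~E)
p_o = (a + c) / n                   # P(O)
p_e = (a + b) / n                   # P(E)

lift = p_o_given_e / p_o
rr = p_o_given_e / p_o_given_ne
odds_ratio = (a * d) / (b * c)

# Identity that follows from P(O) = P(E)P(O|E) + (1-P(E))P(O|~E):
#   lift = RR / (1 + P(E) * (RR - 1))
assert abs(lift - rr / (1 + p_e * (rr - 1))) < 1e-12

# Upper bound noted in the abstract: lift <= min(1/P(O), 1/P(E))
assert lift <= min(1 / p_o, 1 / p_e) + 1e-12

print(f"lift={lift:.3f}  RR={rr:.3f}  OR={odds_ratio:.3f}")
```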


One-Class Classification - From Theory to Practice: A case-study in radioactive threat detection
Colin Bellinger, Shiven Sharma, Nathalie Japkowicz
In Expert Systems with Applications, 2018
Over the years, the acceptance of machine learning as a valuable tool in the real world has caused much interest in the research community; this is particularly the case as the field of Big Data is coming into prominence. However, real-world data comes with a myriad of challenges, amongst the most prominent of which is the fact that it can exhibit a high level of imbalance. This can come in the form of both within- and between-class imbalance. While a significant amount of research has been devoted to the impact of within-class imbalance over binary classifiers, very little attention has been given to its impact on one-class classifiers, which are typically used in situations of extreme between-class imbalance. During our collaboration with Health Canada on the identification of anomalous gamma-ray spectra, the issue of within-class imbalance in a one-class classification setting was highly significant. In this setting, the imbalance comes from the fact that the background data that we wish to model is composed of two concepts (background no-rain and rain); the rain sub-concept is rare and corresponds to spectra affected by the presence of water in the environment. In this article, we present our work into developing systems for detecting anomalous gamma-rays that are able to handle both the inherent between-class and within-class imbalance present in the domain. We test and validate our system over data provided to us by Health Canada from three sites across Canada.


A Systematic Review of Data Mining and Machine Learning for Air Pollution Epidemiology
Colin Bellinger, Mohomed Shazan Mohomed Jabbar, Osmar Zaïane, and Alvaro Osornio-Vargas
In BMC Public Health, 2017
Background: Data measuring airborne pollutants, public health and environmental factors are increasingly being stored and merged. These big datasets offer great potential, but also challenge traditional epidemiological methods. This has motivated the exploration of alternative methods to make predictions, find patterns and extract information. To this end, data mining and machine learning algorithms are increasingly being applied to air pollution epidemiology.

Materials and Methods: We conducted a systematic literature review on the application of data mining and machine learning methods in air pollution epidemiology. We carried out our search process in PubMed, the MEDLINE database and Google Scholar. Research articles applying data mining and machine learning methods to air pollution epidemiology were queried and reviewed.

Results: Our search queries resulted in 400 research articles. Our fine-grained analysis employed our inclusion/exclusion criteria to reduce the results to 47 articles, which we separate into three primary areas of interest: 1) source apportionment; 2) forecasting/prediction of air pollution/quality or exposure; and 3) generating hypotheses. Early applications had a preference for artificial neural networks. In more recent work, decision trees, support vector machines, k-means clustering and the APRIORI algorithm have been widely applied. Our survey shows that the majority of the research has been conducted in Europe, China and the USA, and that data mining is becoming an increasingly common tool in environmental health. For potential new directions, we have identified that deep learning and geo-spatial pattern mining are two burgeoning areas of data mining that have good potential for future applications in air pollution epidemiology.

Conclusions: We carried out a systematic review identifying the current trends, challenges and new directions to explore in the application of data mining methods to air pollution epidemiology. This work shows that data mining is increasingly being applied in air pollution epidemiology. The potential to support air pollution epidemiology continues to grow with advancements in data mining related to temporal and geo-spatial mining, and deep learning. This is further supported by new sensors and storage mediums that enable larger, better quality data. This suggests that many more fruitful applications can be expected in the future.

Discovering Co-location Patterns with Aggregated Spatial Transactions and Dependency Rules
Mohomed Shazan Mohomed Jabbar, Colin Bellinger, Osmar R. Zaiane, Alvaro Osornio-Vargas
In International Journal of Data Science and Analytics, 2017
Co-location pattern mining focuses on finding associations among spatial features. Existing co-location pattern mining techniques mainly rely on frequency-based thresholds, which discard rare patterns and still admit noisy ones. This could be avoided by evaluating co-location patterns based on their statistical significance. Recent studies focused on association rule mining have successfully adopted statistical tests to find significant rules. By transforming spatial data to transaction data, the co-location pattern mining problem can be reduced to an association rule mining problem and such methods can be used to find co-location patterns robustly. A transactionization mechanism has been recently proposed to achieve this. However, this method ignores the effect of general instances, with non-overlapping buffer regions, on the reference instances in their proximity. Addressing this, we propose a novel approach, AGT-Fisher, to robustly transform spatial data to transaction data and use statistically significant dependency rule searching methods to find co-location rules from them. Our work is motivated by an application in environmental health to investigate potential associations between air pollution and adverse birth outcomes in Canada. We used AGT-Fisher to find such associations from real datasets. The discovered co-location patterns were evaluated based on their statistical dependency and the empirical evidence, and the results showed that our approach is more robust. Furthermore, we evaluated the resulting patterns to find spatial common and contrast sets, which are two special types of co-location patterns, to compare spatial regions and gain more insights.
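Once spatial data have been transactionized, a single candidate co-location rule can be screened for statistical dependency with Fisher's exact test, which is the kind of significance test underlying the dependency-rule search mentioned above. The snippet below shows only that screening step on invented counts; the transactionization, buffer handling, and rule search of AGT-Fisher are not reproduced.

```python
# Screen one candidate co-location rule with Fisher's exact test on a 2x2 table of
# transactions. The counts are illustrative only.
from scipy.stats import fisher_exact

table = [[40, 160],    # transactions with the spatial feature: outcome / no outcome
         [55, 745]]    # transactions without the feature:      outcome / no outcome

odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(f"odds ratio = {odds_ratio:.2f}, one-sided p = {p_value:.4g}")
# A small p-value indicates a statistical dependency between the spatial feature and
# the outcome that a frequency threshold alone would not establish.
```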


Manifold-Based Synthetic Oversampling with Manifold Conformance Estimation
C. Bellinger, C. Drummond and N. Japkowicz
In Machine Learning, 2017
Classification domains such as those in medicine, national security and the environment regularly suffer from a lack of training instances for the class of interest. In many cases, classification models induced under these conditions have poor predictive performance on the important minority class. Synthetic oversampling can be applied to mitigate the impact of imbalance by generating additional training instances. In this field, the majority of research has focused on refining the SMOTE algorithm. We note, however, that the generative bias of SMOTE is not appropriate for the large class of learning problems that conform to the manifold property. These are high-dimensional problems, such as image and spectral classification, with implicit feature spaces of lower dimension than their physical data spaces. We show that ignoring this can lead to instances being generated in erroneous regions of the data space. We propose a general framework for manifold-based synthetic oversampling that helps users to select a domain-appropriate manifold learning method, such as PCA or an autoencoder, and apply it to model and generate additional training samples. We evaluate data generation on theoretical distributions and image classification tasks that are standard in the manifold learning literature, and empirically show its positive impact on the classification of high-dimensional image and gamma-ray spectra tasks, along with 16 UCI datasets.
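A minimal PCA-based instance of the framework looks roughly like the sketch below: model the minority class in a low-dimensional latent space, generate new points there, and map them back to the data space. The jitter scheme and parameters are assumptions for illustration, not the procedure evaluated in the paper.

```python
# Minimal PCA-based instance of manifold-based synthetic oversampling (sketch only).
import numpy as np
from sklearn.decomposition import PCA

def pca_manifold_oversample(minority, n_new, n_components=2, noise=0.1, rng=0):
    rng = np.random.default_rng(rng)
    pca = PCA(n_components=n_components).fit(minority)
    latent = pca.transform(minority)                      # minority on the learned manifold
    seeds = latent[rng.integers(len(latent), size=n_new)]
    jitter = rng.normal(0.0, noise * latent.std(axis=0), size=seeds.shape)
    return pca.inverse_transform(seeds + jitter)          # back to the data space

rng = np.random.default_rng(0)
# A noisy 1-D curve embedded in 10 dimensions, to mimic the manifold property.
t = rng.uniform(0, 1, size=(40, 1))
basis = rng.normal(size=(1, 10))
minority = t @ basis + 0.01 * rng.normal(size=(40, 10))
synthetic = pca_manifold_oversample(minority, n_new=100, n_components=1)
print(synthetic.shape)   # (100, 10)
```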


On the pattern recognition and classification of stochastically episodic events
C. Bellinger and B.J. Oommen
Transactions on Computational Collective Intelligence VI, vol. 6, 2012
Researchers in the field of Pattern Recognition (PR) have traditionally presumed the availability of a representative set of data drawn from the classes of interest, say u1 and u2 in a 2-class problem. These samples are typically utilized in the development of the system's discriminant function. It is, however, widely recognized that there exists a particularly challenging class of PR problems for which a representative set is not available for the second class, which has motivated a great deal of research into the so-called domain of One Class (OC) classification. In this paper, we extend the frontiers of novelty detection by the introduction of a new field of problems open for analysis. In particular, we note that this new realm deviates from the standard set of OC problems based on the presence of three characteristics, which ultimately amplify the classification challenge. They involve the temporal nature of the appearance of the data, the fact that the data from the classes are “interwoven”, and that a labelling procedure is not merely impractical - it is almost, by definition, impossible. As a first attempt to tackle these problems, we present two specialized classification strategies denoted by Scenarios S1 and S2 respectively. In Scenario S1, the data is such that standard binary and one-class classifiers can be applied. Alternatively, in Scenario S2, the labelling challenge prevents the application of binary classifiers and instead dictates the novel application of one-class classifiers. The validity of these scenarios has been demonstrated for the exemplary domain involving the Comprehensive Nuclear-Test-Ban Treaty (CTBT), for which our research endeavour has also developed a simulation model. As far as we know, our research in this field is of a pioneering sort, and the results presented here are novel.








Best Paper Awards
Synthetic Oversampling for Advanced Radioactive Threat Detection
C. Bellinger, N. Japkowicz, C. Drummond
In Proceedings of the 2015 IEEE 14th International Conference on Machine Learning and Applications, pp. 825–830, 2015.
Gamma-ray spectral classification requires the automatic identification of a large background class and a small minority class composed of instances that may pose a risk to humans and the environment. Accurate classification of such instances is required in a variety of domains, spanning event and port security to national monitoring for failures at industrial nuclear facilities. This work proposes a novel form of synthetic oversampling based on an artificial neural network architecture and empirically demonstrates that it is superior to the state-of-the-art in synthetic oversampling on the target domain. In particular, we utilize gamma-ray spectral data collected for security purposes at the Vancouver 2010 Winter Olympics and on a node of Health Canada’s national monitoring networks.


Clustering based one-class classification for the comprehensive nuclear test-ban treaty
S. Sharma, C. Bellinger, N. Japkowicz
In Proceedings of the 2012 25th Canadian Conference on Advances in Artificial Intelligence, pp. 181–193, 2012.
Monitoring the levels of radioxenon isotopes in the atmosphere has been proposed as a means of verifying the Comprehensive Nuclear-Test-Ban Treaty (CTBT). This translates into a classification problem, whereby the measured concentrations either belong to an explosion class or a background class. Instances drawn from the explosion class are extremely rare, if not non-existent. Therefore, the resulting dataset is extremely imbalanced, and inherently suited for one-class classification. Further exacerbating the problem is the fact that the background distribution can be extremely complex, and thus, modelling it using one-class learning is difficult. In order to improve upon the previous classification results, we investigate the augmentation of one-class learning methods with clustering. The purpose of clustering is to convert a complex distribution into simpler distributions, the clusters, over which more effective models can be built. The resulting model, built from one-class learners trained over the clusters, performs more effectively than a model that is built over the original distribution. This hypothesis is empirically tested on three different data domains; in particular, a number of artificial datasets, datasets from the UCI repository, and data modelled after the extremely challenging CTBT. The results offer credence to the claim that there is an improvement in performance when clustering is used with one-class classification on complex distributions.
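The sketch below illustrates the cluster-then-one-class idea on a toy two-mode background: k-means splits the complex distribution into simpler clusters, a one-class model is fit per cluster, and a point is scored by the most favourable cluster model. KMeans and OneClassSVM are illustrative choices, not necessarily those used in the paper.

```python
# Sketch of clustering-based one-class classification: split a complex background
# distribution into simpler clusters and fit one one-class model per cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Background drawn from two well-separated modes (a "complex" distribution).
background = np.vstack([rng.normal(-3, 0.5, size=(200, 2)),
                        rng.normal(+3, 0.5, size=(200, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(background)
models = [OneClassSVM(nu=0.05, gamma="scale").fit(background[kmeans.labels_ == k])
          for k in range(kmeans.n_clusters)]

def score(x):
    """Highest decision value over the per-cluster models; larger means more normal."""
    x = np.atleast_2d(x)
    return np.max([m.decision_function(x) for m in models], axis=0)

print(score([[-3.0, -3.0]]))   # near a background mode -> relatively high score
print(score([[0.0, 10.0]]))    # far from both modes    -> low (anomalous) score
```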








Conference Publications
Balancing Information with Observation Costs in Deep Reinforcement Learning
C. Bellinger, A. Drozdyuk, M. Crowley, and I. Tamblyn
In Proceedings of the Canadian Conference on Advances in Artificial Intelligence, 2022
The use of reinforcement learning (RL) in scientific applications, such as materials design and automated chemistry, is increasing. A major challenge, however, lies in the fact that measuring the state of the system is often costly and time consuming in scientific applications, whereas policy learning with RL requires a measurement after each time step. In this work, we make the measurement costs explicit in the form of a costed reward and propose the active-measure with costs framework that enables off-the-shelf deep RL algorithms to learn a policy for both selecting actions and determining whether or not to measure the state of the system at each time step. In this way, the agents learn to balance the need for information with the cost of information. Our results show that when trained under this regime, the Dueling DQN and PPO agents can learn optimal action policies whilst making up to 50% fewer state measurements, and recurrent neural networks can produce a greater than 50% reduction in measurements. We postulate that these reductions can help to lower the barrier to applying RL to real-world scientific applications.
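The essence of the costed-reward formulation can be sketched with a toy environment in which every action carries an optional measure flag, and measuring returns a fresh observation at a fixed cost. The environment, cost value, and dynamics below are invented for illustration; they are not the environments or code used in the paper.

```python
# Sketch of the active-measure-with-costs idea: the agent chooses a control action
# plus a "measure or not" flag; measuring returns the current state but a fixed
# cost is subtracted from the reward. Toy 1-D environment, illustrative only.
import numpy as np

class CostlyMeasurementEnv:
    """Move a point toward a target; the reward is -|position - target|,
    minus a cost whenever the agent asks to observe the true position."""
    def __init__(self, observation_cost=0.2, target=5.0, rng=0):
        self.observation_cost = observation_cost
        self.target = target
        self.rng = np.random.default_rng(rng)

    def reset(self):
        self.position = 0.0
        self.last_observation = self.position
        return self.last_observation

    def step(self, move, measure):
        self.position += move + self.rng.normal(0.0, 0.05)    # noisy dynamics
        reward = -abs(self.position - self.target)
        if measure:
            self.last_observation = self.position              # fresh, costly observation
            reward -= self.observation_cost
        # Otherwise the agent must act on its stale observation / internal estimate.
        return self.last_observation, reward

env = CostlyMeasurementEnv()
obs = env.reset()
costed_return = 0.0
for t in range(10):
    obs, reward = env.step(move=1.0, measure=(t % 2 == 0))     # measure every other step
    costed_return += reward
print(f"costed return over 10 steps: {costed_return:.2f}")
```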


Active Measure Reinforcement Learning for Observation Cost Minimization
C. Bellinger, R. Coles, M. Crowley, and I. Tamblyn
In Proceedings of the Canadian Conference on Advances in Artificial Intelligence, 2021
Standard reinforcement learning (RL) algorithms assume that the observation of the next state comes instantaneously and at no cost. In a wide variety of sequential decision making tasks ranging from medical treatment to scientific discovery, however, multiple classes of state observations are possible, each of which has an associated cost. We propose the active measure RL framework (Amrl) as an initial solution to this problem where the agent learns to maximize the costed return, which we define as the discounted sum of rewards minus the sum of observation costs. Our empirical evaluation demonstrates that Amrl-Q agents are able to learn a policy and state estimator in parallel during online training. During training the agent naturally shifts from its reliance on costly measurements of the environment to its state estimator in order to increase its reward. It does this without harm to the learned policy. Our results show that the Amrl-Q agent learns at a rate similar to standard Q-learning and Dyna-Q. Critically, by utilizing an active strategy, Amrl-Q achieves a higher costed return.


Explainable image analysis for decision support in medical healthcare
R. Corizzo, Y. Dauphin, C. Bellinger, E. Zdravevski and N. Japkowicz
In Proceedings of the IEEE International Conference on Big Data (Big Data), 2021
Recent advances in medical imaging and deep learning have enabled the efficient analysis of large databases of images. Notable examples include the analysis of computed tomography (CT), magnetic resonance imaging (MRI), and X-ray. While the automatic classification of images has proven successful, adopting such a paradigm in the medical healthcare setting is unfeasible. Indeed, the physician in charge of the detailed medical assessment and diagnosis of patients cannot trust a deep learning model’s decisions without further explanations or insights about their classification outcome. In this study, rather than relying on classification, we propose a new method that leverages deep neural networks to extract a representation of images and further analyze them through clustering, dimensionality reduction for visualization, and class activation mapping. Thus, the system does not make decisions on behalf of physicians. Instead, it helps them make a diagnosis. Experimental results on lung images affected by Pneumonia and Covid-19 lesions show the potential of our method as a tool for decision support in a medical setting. It allows the physician to identify groups of similar images and highlight regions of the input that the model deemed important for its predictions.


Remix: Calibrated resampling for class imbalance in deep learning
C. Bellinger, R. Corizzo and N. Japkowicz
In Proceedings of the International Conference on Discovery Science, 2021
Long-tailed distributions and class imbalance are problems of significant importance in applied deep learning where trained models are exploited for decision support and decision automation in critical areas such as health and medicine, transportation and finance. The challenge of learning deep models from such data remains high, and the state-of-the-art solutions are typically data dependent and primarily focused on images. Important real-world problems, however, are much more diverse thus necessitating a general solution that can be applied to diverse data types. In this paper, we propose ReMix, a training technique that seamlessly leverages batch resampling, instance mixing and soft-labels to efficiently enable the induction of robust deep models from imbalanced and long-tailed datasets. Our results show that fully connected neural networks and Convolutional Neural Networks (CNNs) trained with ReMix generally outperform the alternatives according to the g-mean and are better calibrated according to the balanced Brier score.
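The sketch below combines class-balanced batch resampling with mixup-style instance mixing, which naturally produces soft labels. The batch size, Beta(alpha, alpha) mixing distribution, and sampling scheme are illustrative assumptions rather than the ReMix specification.

```python
# Sketch of the ReMix ingredients: balanced batch resampling + instance mixing + soft labels.
import numpy as np

def remix_style_batch(X, y, batch_size=32, alpha=0.2, rng=0):
    rng = np.random.default_rng(rng)
    classes = np.unique(y)
    per_class = batch_size // len(classes)

    # 1) Resample an equal number of instances from each class.
    idx = np.concatenate([rng.choice(np.where(y == c)[0], per_class, replace=True)
                          for c in classes])
    Xb = X[idx]
    Yb = np.eye(len(classes))[np.searchsorted(classes, y[idx])]   # one-hot labels

    # 2) Mix random pairs of instances and their labels (soft labels).
    lam = rng.beta(alpha, alpha, size=(len(idx), 1))
    perm = rng.permutation(len(idx))
    X_mix = lam * Xb + (1 - lam) * Xb[perm]
    Y_mix = lam * Yb + (1 - lam) * Yb[perm]
    return X_mix, Y_mix

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.array([0] * 950 + [1] * 50)          # 19:1 imbalance
Xb, Yb = remix_style_batch(X, y)
print(Xb.shape, Yb.shape, Yb.sum(axis=0))   # roughly balanced soft label mass
```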


Undersampling with support vectors for multi-class imbalanced data classification
B. Krawczyk, C. Bellinger, R. Corizzo and N. Japkowicz
In Proceedings of the International Joint Conference on Neural Networks (IJCNN), 2021
Learning from imbalanced data poses significant challenges for the classifier. This becomes even more difficult when dealing with multi-class problems: relationships among classes are no longer well-defined, and it is easy to lose performance on one class while gaining on another. In recent years this topic has gained increased interest from the machine learning community; however, there is still a need for new and efficient algorithms to handle this challenge. In this paper we propose a new approach for balancing multi-class imbalanced problems. It is based on a two-step undersampling methodology. In the first step, a one-class classifier is trained on each of the classes, achieving a skew-insensitive data description. Support vectors for each class are extracted and used as new class representatives, thus achieving a significant reduction in the number of instances used. In the second step, an evolutionary undersampling approach is applied to these support vectors in order to further balance the training set. By applying this technique to a set of support vectors and not to the full dataset, we achieve a significant reduction in computational time and increased accuracy. Finally, a standard multi-class classifier is trained on the balanced data set. A thorough experimental study proves the usefulness of the proposed approach in comparison with state-of-the-art approaches for handling multi-class imbalanced data.
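The first step of the approach can be sketched as below: fit a one-class model per class and keep only its support vectors as compact class representatives. The second, evolutionary undersampling step is omitted, and OneClassSVM with nu=0.1 is an assumed configuration, not the paper's.

```python
# Sketch of the first undersampling step: per-class support vectors as representatives.
import numpy as np
from sklearn.svm import OneClassSVM

def support_vector_representatives(X, y, nu=0.1):
    reps_X, reps_y = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        ocsvm = OneClassSVM(nu=nu, gamma="scale").fit(Xc)
        reps_X.append(ocsvm.support_vectors_)             # skew-insensitive description
        reps_y.append(np.full(len(ocsvm.support_vectors_), c))
    return np.vstack(reps_X), np.concatenate(reps_y)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (500, 2)),
               rng.normal(4, 1, (100, 2)),
               rng.normal(-4, 1, (25, 2))])
y = np.array([0] * 500 + [1] * 100 + [2] * 25)
Xr, yr = support_vector_representatives(X, y)
print({int(c): int((yr == c).sum()) for c in np.unique(yr)})   # far fewer instances per class
```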


On the combined effect of class imbalance and concept complexity in deep learning
K. Ghosh, C. Bellinger, R. Corizzo, B. Krawczyk and N. Japkowicz
In Proceedings of the IEEE International Conference on Big Data, 2021
Structural concept complexity, class overlap, and data scarcity are some of the most important factors influencing the performance of classifiers under class imbalance conditions. When these effects were uncovered in the early 2000s, understandably, the classifiers on which they were demonstrated belonged to the classical rather than Deep Learning categories of approaches. As Deep Learning is gaining ground over classical machine learning and is beginning to be used in critical applied settings, it is important to assess systematically how well they respond to the kind of challenges their classical counterparts have struggled with in the past two decades. The purpose of this paper is to study the behavior of deep learning systems in settings that have previously been deemed challenging to classical machine learning systems to find out whether the depth of the systems is an asset in such settings. The results in both artificial and real-world image datasets (MNIST Fashion, CIFAR-10) show that these settings remain mostly challenging for Deep Learning systems and that deeper architectures seem to help with structural concept complexity but not with overlap challenges in simple artificial domains. Data scarcity is not overcome by deeper layers, either. In the real-world image domains, where overfitting is a greater concern than in the artificial domains, the advantage of deeper architectures is less obvious: while it is observed in certain cases, it is quickly cancelled as models get deeper and perform worse than their shallower counterparts.


Reinforcement Learning in a Physics-Inspired Semi-Markov Environment
C. Bellinger, R. Cole, M. Crowley and I. Tamblyn
Canadian Conference on Artificial Intelligence, Ottawa, Canada. 2020
Reinforcement learning (RL) has been demonstrated to have great potential in many applications of scientific discovery and design. Recent work includes, for example, the design of new structures and compositions of molecules for therapeutic drugs. Much of the existing work related to the application of RL to scientific domains, however, assumes that the available state representation obeys the Markov property. For reasons associated with time, cost, sensor accuracy, and gaps in scientific knowledge, many scientific design and discovery problems do not satisfy the Markov property. Thus, something other than a Markov decision process (MDP) should be used to plan and find the optimal policy. In this paper, we present a physics-inspired semi-Markov RL environment, namely the phase change environment. In addition, we evaluate the performance of value-based RL algorithms for both MDPs and partially observable MDPs (POMDPs) on the proposed environment. Our results demonstrate that deep recurrent Q-networks (DRQNs) significantly outperform deep Q-networks (DQNs). In addition, we show that DRQNs benefit from training with hindsight experience replay. Implications for the use of semi-Markovian RL and POMDPs for scientific laboratories are also discussed.


The CURE for Class Imbalance
C. Bellinger, P. Branco and L. Torgo
International Conference on Discovery Science, Split, Croatia. 2019
Addressing the class imbalance problem is critical for several real world applications. The application of pre-processing methods is a popular way of dealing with this problem. These solutions increase the rare class examples and/or decrease the normal class cases. However, these procedures typically only take into account the characteristics of each individual class. This segmented view of the data can have a negative impact. We propose a new method that uses an integrated view of the data classes to generate new examples and remove cases. ClUstered REsampling (CURE) is a method based on a holistic view of the data that uses hierarchical clustering and a new distance measure to guide the sampling procedure. Clusters generated in this way take into account the structure of the data. This enables CURE to avoid common mistakes made by other resampling methods. In particular, CURE prevents the generation of synthetic examples in dangerous regions and undersamples safe, non-borderline, regions of the majority class. We show the effectiveness of CURE in an extensive set of experiments with benchmark domains. We also show that CURE is a user-friendly method that does not require extensive fine-tuning of hyper-parameters.


Synthetic oversampling with the majority class: A new perspective on handling extreme imbalance
Shiven Sharma, Colin Bellinger, Bartosz Krawczyk, Nathalie Japkowicz, and Osmar Zaïane
In IEEE International Conference on Data Mining, 2018
The class imbalance problem is a pervasive issue in many real-world domains. Oversampling methods that inflate the rare class by generating synthetic data are amongst the most popular techniques for resolving class imbalance. However, they concentrate on the characteristics of the minority class and use them to guide the oversampling process. By completely overlooking the majority class, they lose a global view on the classification problem and, while alleviating the class imbalance, may increase the learning difficulty by generating borderline or overlapping instances. This becomes even more critical when facing extreme class imbalance, where the minority class is strongly underrepresented and does not contain enough information to solely conduct the oversampling process. We propose a novel method for synthetic oversampling that uses the rich information inherent in the majority class to synthesize minority class data. This is done by generating synthetic data that is at the same Mahalanobis distance from the majority class as the known minority instances. We evaluate our method on 26 benchmark datasets, and show that our method offers a distinct performance improvement over the existing state-of-the-art in oversampling techniques.


Performance Assessment of Classification Methods for the Inductance within a VSI
Diego Aldana, Yamisleydi Salgueiro, Colin Bellinger, Marco Rivera and César Astudillo
In IEEE International Conference on Automation, 2018
The non-intrusive monitoring of electrical systems has gained relevance during the last years due to its lower costs and space requirements. Machine learning techniques have proved their ability to predict the parameters under monitoring and consequently improve the performance of power electronics systems. The present work seeks to determine the combination of machine learning and dimensionality reduction techniques that efficiently predicts the inductance value for a Voltage Source Inverter's Modulated Model Predictive Control (VSI_M2PC). The problem, modelled as a classification one (three classes), has a high dimensionality (5000 attributes). Consequently, dimensionality reduction is needed to make it tractable, at the cost of slightly sacrificing the accuracy of the model. Seven machine learning methods (Support Vector Machine, K-Nearest Neighbours, Naïve Bayes, Linear Discriminant Analysis, Classification and Regression Trees, C4.5 and Random Forest) and three dimensionality reduction methods (Correlation Elimination, Principal Component Analysis, and Boruta) were experimentally analyzed on VSI_M2PC Matlab simulations. It was found that Random Forest combined with Boruta provided the best results regarding classification efficiency.


Advantage of Integration in Big Data: Feature Generation in Multi-Relational Databases for Imbalanced Learning
F. Ahmed, M. Samorani, C. Bellinger, O. R. Zaïane
In Proceedings of the 2016 IEEE International Conference on Big Data, pp. 532–539, 2016.
Most real-world applications comprise databases having multiple tables. It becomes further complicated in the realm of Big Data where related information is spread over different data repositories. However, data mining techniques are usually applied on a single flat table. This work focuses on generating a mining table by aggregating information from multiple local tables and external data sources and automatically generating potentially discriminant features. It extends data aggregation techniques by navigating paths where a single table is traversed multiple times. Such paths are not considered by existing techniques, which results in the loss of several attributes. Our framework also prevents leakage of the class information by avoiding features built after the knowledge of the class label. Experiments are performed on transactional data of a U.S. consumer electronics retailer to predict causes of product returns. In addition, we augmented the dataset with Suppliers information and Reviews to show the value of data integration. The results show that our technique improves classification accuracy and generates discriminant features that mitigate the impact of class imbalance.


Beyond the Boundaries of SMOTE: A Framework for Manifold-Based Synthetically Oversampling
C. Bellinger, N. Japkowicz, C. Drummond
In Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, pp. 248–263, 2016.
Problems of class imbalance appear in diverse domains, ranging from gene function annotation to spectra and medical classification. On such problems, the classifier becomes biased in favour of the majority class. This leads to inaccuracy on the important minority classes, such as specific diseases and gene functions. Synthetic oversampling mitigates this by balancing the training set, whilst avoiding the pitfalls of random under- and oversampling. The existing methods are primarily based on the SMOTE algorithm, which employs a bias of randomly generating points between nearest neighbours. The relationship between the generative bias and the latent distribution has a significant impact on the performance of the induced classifier. Our research into gamma-ray spectra classification has shown that the generative bias applied by SMOTE is inappropriate for domains that conform to the manifold property, such as spectra, text, image and climate change classification. To this end, we propose a framework for manifold-based synthetic oversampling, and demonstrate its superiority in terms of robustness to the manifold with respect to the AUC on three spectra classification tasks and 16 UCI datasets.
[code]

Active Learning for One-Class Classification
V. Barnabe-Lortie, C. Bellinger, N. Japkowicz
In Proceedings of the 2015 IEEE 14th International Conference on Machine Learning and Applications, pp. 390–395, 2015.
Active learning is a common solution for reducing labeling costs and maximizing the impact of human labeling efforts in binary and multi-class classification settings. However, when we are faced with extreme levels of class imbalance, a situation in which it is not safe to assume that we have a representative sample of the minority class, it has been shown effective to replace the binary classifier with a one-class classifier. In such a setting, traditional active learning methods, and many previously proposed in the literature for one-class classifiers, prove to be inappropriate, as they rely on assumptions about the data that no longer stand. In this paper, we propose a novel approach to active learning designed for one-class classification. The proposed method does not rely on many of the inappropriate assumptions of its predecessors and leads to more robust classification performance. The gist of the method is to prioritize labelling the instances that previous iterations of the one-class classification model consider to fit the learned class the least. We provide empirical evidence for the merits of the proposed method compared to the available alternatives, and discuss how the method may have an impact in an applied setting.
[code]
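The querying rule described above (label first the instances that the current one-class model considers to fit the learned class the least) can be sketched as follows. The model choice, query budget, and the simplification that every queried point is added to the target set are assumptions made for the example.

```python
# Sketch of the core querying rule: query the lowest-scoring unlabelled instances.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
labelled = rng.normal(0.0, 1.0, size=(50, 2))             # known target-class data
unlabelled = rng.normal(0.0, 2.0, size=(500, 2))          # pool to query from

for round_ in range(3):
    model = OneClassSVM(nu=0.1, gamma="scale").fit(labelled)
    scores = model.decision_function(unlabelled)           # low score = fits the class least
    query_idx = np.argsort(scores)[:10]                    # 10 queries per round (assumed budget)
    # In practice an oracle labels these; here we simply add them to the target set.
    labelled = np.vstack([labelled, unlabelled[query_idx]])
    unlabelled = np.delete(unlabelled, query_idx, axis=0)
    print(f"round {round_}: labelled={len(labelled)}, pool={len(unlabelled)}")
```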

Multi-label Classification of Anemia Patients
C. Bellinger, A. Amid, N. Japkowicz, H. Victor
In Proceedings of the 2015 IEEE 14th International Conference on Machine Learning and Applications, pp. 825–830, 2015.
This work examines the application of machine learning to an important area of medicine which aims to diagnose paediatric patients with β-thalassemia minor, iron deficiency anemia or the co-occurrence of these ailments. Iron deficiency anemia is a major cause of microcytic anemia and is considered an important issue in global health. Whilst existing methods, based on linear equations, are proficient at distinguishing between the two classes of anemia, they fail to identify the co-occurrence of these conditions. Machine learning algorithms, however, can induce non-linear decision boundaries that enable accurate classification within complex domains. Through a multi-label classification technique, known as problem transformation, we convert the learning task to one that is appropriate for machine learning and examine the effectiveness of machine learning algorithms on this domain. Our results show that machine learning classifiers produce good overall accuracy and are able to identify instances of the co-occurrence class, unlike the existing methods.


Smoothing gamma ray spectra to improve outlier detection
V Barnabe-Lortie, C Bellinger, N Japkowicz
In Proceedings of the 2014 Seventh IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–8, 2014.
Rapid detection of radioisotopes in gamma-ray data can, in some situations, be an important security concern. The task of designing an automated system for this purpose is complex due to, amongst other factors, the noisy nature of the data. The method described herein consists of preprocessing the data by applying a smoothing method tailored to gamma-ray spectra, in the hope of decreasing their variance. Given that the number of counts at a given energy level in a spectrum should follow a Poisson distribution, smoothing may allow us to estimate the true photon arrival rate. Our experiments suggest that the added data preprocessing step can have a large impact on the performance of anomaly detection algorithms in this particular domain.
[code]
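The preprocessing idea can be sketched as below: channel counts are Poisson-distributed, so smoothing across neighbouring channels yields a lower-variance estimate of the underlying photon arrival rate. The Savitzky-Golay filter and synthetic spectrum are illustrative assumptions; the paper uses a smoother tailored specifically to gamma-ray spectra.

```python
# Sketch: smooth a noisy Poisson-count spectrum to estimate the underlying rate.
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
channels = np.arange(1024)
# A synthetic "background" rate with one photo-peak, then Poisson noise on top.
rate = 50 * np.exp(-channels / 300.0) + 30 * np.exp(-0.5 * ((channels - 400) / 8.0) ** 2)
spectrum = rng.poisson(rate)

smoothed = savgol_filter(spectrum.astype(float), window_length=31, polyorder=3)

residual_raw = np.mean((spectrum - rate) ** 2)
residual_smooth = np.mean((smoothed - rate) ** 2)
print(f"mean squared error vs true rate: raw={residual_raw:.1f}, smoothed={residual_smooth:.1f}")
```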

One-Class versus Binary Classification: Which and When?
C. Bellinger, S. Sharma, N. Japkowicz
In Proceedings of the 2012 11th International Conference on Machine Learning and Applications, pp. 102–106, 2012.
Binary classifiers have typically been the norm for building classification models in the Machine Learning community. However, an alternative to binary classification is one-class classification, which aims to build models using only a single class of data. This is particularly useful when there is an over-abundance of data of a particular class. In such imbalanced cases, binary classifiers may not perform very well, and one-class classifiers then become the viable option. In this paper, we are interested in investigating the performance of binary and one-class classifiers as the level of imbalance increases, and, thus, uncertainty in the second class. Our objective is to gain insight into which classification paradigm becomes more suitable as imbalance and uncertainty increase. To this end, we conduct experiments on various datasets, both artificial and from the UCI repository, and monitor the performance of the binary and one-class classifiers as the size of the second class gradually decreases, thus increasing the level of imbalance. The results show that as the level of imbalance increases, the performance of binary classifiers decreases, whereas one-class classifiers stay relatively stable.
[code]

Anomaly detection in gamma-ray spectra: a machine learning perspective
S. Sharma, C. Bellinger, N. Japkowicz, R. Berg and K. Ungar
In Proceedings of the 2012 IEEE Symposium on Computational Intelligence for Security and Defence Applications, pp. 1–8, 2012.
With Canadian security and the safety of the general public in mind, physicists at Health Canada (HC) have begun to develop techniques to identify persons concealing radioactive material that may represent a threat to attendees at public gatherings, such as political proceedings and sporting events. To this end, Health Canada has initiated field trials that include the deployment of gamma-ray spectrometers. In particular, a series of these detectors, which take measurements every minute and produce a 1,024-channel gamma-ray spectrum, were deployed during the Vancouver 2010 Olympics. Simple computerized statistics and human expertise were used as the primary line of defence. More specifically, if a measured spectrum deviated significantly from the background, an internal alarm was sounded and an HC physicist undertook further analysis into the nature of the alarming spectrum. This strategy, however, led to a significant number of costly and time consuming false positives. This research applies sophisticated machine learning algorithms to reduce the number of false positives to an acceptable level, the results of which are detailed in this paper. In addition, we emphasize the primary findings of our work and highlight avenues available to further improve upon our current results.
[code]

Motivating the inclusion of meteorological indicators in the CTBT feature-space
C. Bellinger and N. Japkowicz
In Proceedings of the 2011 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 2329–6267, 2011
Verification of the Comprehensive Test-Ban-Treaty (CTBT), as a Pattern Recognition (PR) problem, has been proposed based on four radioxenon features. It has been noted, however, that in many cases this limited feature set is insufficient to distinguish radioxenon levels affected by an explosion from those that are solely products of industrial activities. As a means of improving the detectability of low-yield clandestine nuclear explosions, this paper motivates the inclusion of meteorological indicators in the CTBT feature-space, promotes further research into which meteorological indicators are most informative, and how they may be acquired. In doing so, we present classification results from four simulated scenarios. These results demonstrate that the inclusion of a simple wind direction feature can significantly increase the prospect of classifying challenging detonation events, and suggest the predictive power of meteorological features in general.


A new frontier in novelty detection: pattern recognition of stochastically episodic events
C. Bellinger and B.J. Oommen
In Proceedings of the 2011 Asian Conference on Intelligent Information and Database Systems, pp. 435–444, 2011
A particularly challenging class of PR problems in which the, generally required, representative set of data drawn from the second class is unavailable, has recently received much consideration under the guise of One-Class (OC) classification. In this paper, we extend the frontiers of OC classification by the introduction of a new field of problems open for analysis. In particular, we note that this new realm deviates from the standard set of OC problems based on the following characteristics: The data contains a temporal nature, the instances of the classes are “interwoven”, and the labelling procedure is not merely impractical - it is almost, by definition, impossible, which results in a poorly defined training set. As a first attempt to tackle these problems, we present two specialized classification strategies denoted by Scenarios S1 and S2 respectively. In Scenario S1, the data is such that standard binary and one-class classifiers can be applied. Alternatively, in Scenario S2, the labelling challenge prevents the application of binary classifiers, and instead, dictates a novel application of OC classifiers. The validity of these scenarios has been demonstrated for the exemplary domain involving the Comprehensive Nuclear-Test-Ban Treaty (CTBT), for which our research endeavour has also developed a simulation model. As far as we know, our research in this field is of a pioneering sort, and the results presented here are novel.


On simulating episodic events against a background of noise-like non-episodic events
C. Bellinger and B.J. Oommen
In Proceedings of the 2010 Summer Computer Simulation Conference, pp. 452–460, 2010.
Simulation, as an art and a science, deals with the issue of allowing the practitioner to model events using their respective probability distributions. Thus, it is customary for simulations to model the behaviour of accidents, telephone calls, network failures etc. In this paper, we consider a relatively new field, namely that of modelling episodic events such as earthquakes, nuclear explosions etc. The difficulty with such a modelling process is that most of the observations appear as noise. However, when the episodic event does occur, its magnitude and features far overshadow the background, as one observes after a seismic event. In this paper, we demonstrate how the effect of a particular form of episodic event can be modelled as it propagates through the underlying background noise. Furthermore, we illustrate how the subsequent decay of the event can also be modelled and simulated. In demonstrating this concept, we utilize the exemplar scenario posed by the Comprehensive Nuclear-Test-Ban Treaty (CTBT), and model the propagation and decay of radionuclides, emitted from clandestine, subterranean nuclear detonations, through the background levels resulting from the global nuclear industry.
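As a rough illustration of the modelling idea described above (not the CTBT simulation model developed in the paper), the sketch below superimposes a single episodic event with exponential decay onto a noisy background series; the background distribution, event magnitude, and decay constant are assumptions.

# Minimal sketch (illustrative only): an episodic event decaying against
# background noise. The parameter values are assumptions, not those of the
# CTBT simulation model described in the paper.
import numpy as np

rng = np.random.default_rng(42)

n_steps = 365                                  # daily observations for one year
background = rng.lognormal(mean=0.0, sigma=0.3, size=n_steps)

event_day = 200                                # onset of the episodic event
peak = 50.0                                    # magnitude far above the background
half_life = 5.25                               # days; roughly Xe-133, purely for illustration
decay_rate = np.log(2.0) / half_life

t = np.arange(n_steps, dtype=float)
event = np.where(t >= event_day, peak * np.exp(-decay_rate * (t - event_day)), 0.0)

observed = background + event                  # what a monitoring station would record
print(observed[event_day - 2:event_day + 5])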




Back to top





Data Papers
Data for resistance and inductance estimation within a voltage source inverter
Diego Aldana, Yamisleydi Salgueiro, Colin Bellinger, Marco Rivera and César Astudillo
Data in Brief, 2018
Power converters are essential for the use of renewable energy resources. For example, a photovoltaic system produces DC energy that is transformed into AC by the voltage source inverter (VSI). This power is used by a motor drive that operates at different speeds, generating variable loads. Two parameters, namely resistance and inductance, are essential to correctly adjust the model predictive control (MPC) in a VSI. In this paper, we describe the data from a VSI that incorporates an MPC. We generate four datasets, each consisting of 399 cases or instances (rows). Two datasets comprise simulations varying the inductance (continuous and discrete versions), and the other two vary the resistance (continuous and discrete versions). The motivation behind this data is to support the design and development of nonintrusive models to predict the resistance and inductance of a VSI under different conditions.
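As a hypothetical example of the intended use of the data, the sketch below fits a nonintrusive regression model to one of the four datasets; the file name, column names, and choice of random forest are assumptions for illustration, not part of the published data description.

# Minimal sketch (illustrative only): predict the inductance of a VSI from one
# of the four datasets. The file name, column names, and model are assumptions;
# consult the Data in Brief article for the actual schema.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("vsi_inductance_continuous.csv")    # hypothetical file name

X = df.drop(columns=["inductance"])                  # measured VSI signals (assumed columns)
y = df["inductance"]                                 # target parameter for the MPC

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))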



Back to top





Abstract Publications
Validity of the Work Assessment Triage Tool for Selecting Rehabilitation Interventions for Workers’ Compensation Claimants with Musculoskeletal Conditions
Gross P.D., Steenstra A.I., Shaw W., Yousefi P., Bellinger C., Zaiane O.
5th Scientific Conference on Work Disability Prevention and Integration, Odense, Denmark, June 4-7, 2019
Background: Musculoskeletal disorders such as back and neck pain are leading causes of work disability. Effective interventions exist (e.g. functional restoration, multidisciplinary biopsychosocial rehabilitation, workplace-based interventions, etc.), but it is difficult to select the optimal intervention for specific patients. The Work Assessment Triage Tool (WATT) is a clinical decision support tool developed using machine learning to help select interventions. The WATT algorithm categorizes patients based on individual, occupational, and clinical characteristics according to likelihood of successful return-to-work following rehabilitation. The WATT was developed using clinical and administrative data from workers’ compensation claimants being considered for rehabilitation between December 2009 and January 2011. Internal validation showed acceptable classification accuracy, but the WATT has not been tested beyond the original development sample. Our purpose was to externally validate the WATT.


Conclusions: Accuracy of the WATT rules for selecting successful rehabilitation programs in the new dataset was modest. Algorithm revision and further validation are needed.


An Integrative and Collaborative Approach to Associating Adverse Birth Outcomes and Industrial Air Pollution
C. Nielsen, J. Serrano Lomelin, M.S.M. Jabbar, O. Wine, O. Zaiane, A.R. Osornio Vargas, C. Bellinger, and the DoMiNo Team
Canadian National Perinatal Research Meeting, 2018
Background: The relationship of many hazardous pollutants present in ambient air with adverse birth outcomes (ABO) is still unknown, particularly from those released by industry. The knowledge gap is even greater when considering the impact of mixtures of hazardous pollutants on pregnancy outcomes.

Objective: We aimed to collaboratively overcome methodological limitations to identify potential hazardous mixtures of industrial air pollutants spatially related to the occurrence of ABO in Alberta.

Method: We applied integrative data analyses on large existing databases (2006-2012) from (i) the National Pollutant Release Inventory on chemicals released into air by industry; (ii) the Alberta Perinatal Health Program on births, known maternal risk factors for ABO, and the maternal postal code at birth; (iii) an area-level socioeconomic status index (SES). We developed a novel spatial data mining (DM) algorithm to identify significant spatial colocation rules of combinations of emitted chemicals and ABO (i.e., preterm birth [PTB], small for gestational age [SGA], and low birth weight at term [LBWT]). We used a Geographic Information System (GIS) to assign exposures to residences within 10 km of emission sites. We used epidemiological analysis to calculate the corresponding odds ratios of exposures to chemical mixtures and the occurrence of ABO, adjusted by maternal covariates and SES.

Results: DM distinguished eleven rules combining three chemicals - gases, particulate matter and organics - as associated with ABO. GIS assigned exposure and epidemiological analysis identified seven rules with increased odds for induced PTB, SGA, and LBWT. One exclusively increased the odds for PTB (15%).

Conclusions: The combined interdisciplinary use of data mining, GIS, and epidemiology strategies enabled us to identify the risks associated with mixtures of hazardous chemicals and ABO - findings that proved elusive when applying methodologies individually. These discoveries enhance our understanding of the role of chemical mixtures in ABO.
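The epidemiological step in the method above, odds ratios for exposure to a chemical mixture adjusted for maternal covariates and SES, can be approximated with a standard logistic regression. The sketch below is a generic illustration on simulated records; the variable names and data are made up and do not reflect the study's model or results.

# Minimal sketch (illustrative only): adjusted odds ratio for exposure to a
# chemical mixture versus an adverse birth outcome. The data are simulated and
# the variable names are assumptions, not the study's actual analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5000
df = pd.DataFrame({
    "abo": rng.integers(0, 2, n),          # 1 = adverse birth outcome (e.g. PTB)
    "exposed": rng.integers(0, 2, n),      # 1 = residence within 10 km of the rule's emitters
    "maternal_age": rng.normal(30, 5, n),
    "ses_index": rng.normal(0, 1, n),      # area-level socioeconomic status
})

model = smf.logit("abo ~ exposed + maternal_age + ses_index", data=df).fit(disp=0)
odds_ratio = np.exp(model.params["exposed"])
ci_low, ci_high = np.exp(model.conf_int().loc["exposed"])
print(f"adjusted OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")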




Back to top





Workshop Publications
Scientific Discovery and the Cost of Measurement--Balancing Information and Cost in Reinforcement Learning
C. Bellinger, A. Drozdyuk, M. Crowley, and I. Tamblyn
In Proceedings of the AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE), 2022
The use of reinforcement learning (RL) in scientific applications, such as materials design and automated chemistry, is increasing. A major challenge, however, lies in the fact that measuring the state of the system is often costly and time-consuming in scientific applications, whereas policy learning with RL requires a measurement after each time step. In this work, we make the measurement costs explicit in the form of a costed reward and propose a framework that enables off-the-shelf deep RL algorithms to learn a policy for both selecting actions and determining whether or not to measure the current state of the system at each time step. In this way, the agents learn to balance the need for information with the cost of information. Our results show that when trained under this regime, the Dueling DQN and PPO agents can learn optimal action policies whilst making up to 50% fewer state measurements, and recurrent neural networks can produce a greater than 50% reduction in measurements. We postulate that these reductions can help to lower the barrier to applying RL to real-world scientific applications.
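A minimal sketch of the costed-reward idea, assuming a Gymnasium-style interface: the action space is augmented with a binary measure flag, the agent only receives a fresh observation when it pays to measure, and the measurement cost is subtracted from the reward. The wrapper design and cost value are assumptions, not the paper's exact formulation.

# Minimal sketch (illustrative only): augment an environment so the agent also
# decides whether to measure the state, paying a cost when it does. This follows
# the costed-reward idea in spirit, not the paper's exact implementation.
import gymnasium as gym


class CostedMeasurementWrapper(gym.Wrapper):
    def __init__(self, env, measurement_cost=0.1):
        super().__init__(env)
        self.measurement_cost = measurement_cost
        # Original action plus a binary flag: 0 = skip measurement, 1 = measure.
        self.action_space = gym.spaces.Tuple((env.action_space, gym.spaces.Discrete(2)))
        self._last_obs = None

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self._last_obs = obs
        return obs, info

    def step(self, action):
        env_action, measure = action
        obs, reward, terminated, truncated, info = self.env.step(env_action)
        if measure:
            reward -= self.measurement_cost    # pay for the information
            self._last_obs = obs
        # Without a measurement, the agent only sees its last measured state.
        return self._last_obs, reward, terminated, truncated, info


env = CostedMeasurementWrapper(gym.make("CartPole-v1"), measurement_cost=0.05)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step((env.action_space.spaces[0].sample(), 1))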


Sampling a Longer Life: Binary versus One-class classification Revisited
S. Sharma, C. Bellinger, O. Zaiane and N. Japkowicz
In Proceedings of the 1st International Workshop on Learning with Imbalanced Domains: Theory and Application at ECML, 2017
When faced with imbalanced domains, practitioners have one of two choices; if the imbalance is manageable, sampling or other corrective measures can be utilized in conjunction with binary classifiers (BCs). Beyond a certain point, however, the imbalance becomes too extreme and one-class classifiers (OCCs) are required. Whilst the literature offers many advances in terms of algorithms and understanding, there remains a need to connect our theoretical advances to the most practical of decisions. Specifically, given a dataset with some level of complexity and imbalance, which classification approach should be applied? In this paper, we establish a relationship between these facets in order to help guide the decision regarding when to apply OCC versus BC. Our results show that sampling provides an edge over OCCs on complex domains. Alternatively, OCCs are a good choice on less complex domains that exhibit unimodal properties. Class overlap, on the other hand, has a more uniform impact across all methods.
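The two choices described above can be contrasted on a toy imbalanced domain as follows; the data, models, and oversampler (SMOTE from imbalanced-learn) are illustrative stand-ins, not the exact experimental setup of the paper.

# Minimal sketch (illustrative only): (a) oversampling plus a binary classifier
# versus (b) a one-class classifier trained on the majority class alone, on a
# synthetic imbalanced problem. Not the paper's experimental setup.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM

X, y = make_classification(n_samples=5000, weights=[0.98, 0.02], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# (a) Sampling with a binary classifier (BC).
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
bc = RandomForestClassifier(random_state=0).fit(X_res, y_res)
print("BC + SMOTE F1:", f1_score(y_te, bc.predict(X_te)))

# (b) One-class classifier (OCC) trained on the majority class only.
occ = OneClassSVM(nu=0.05).fit(X_tr[y_tr == 0])
occ_pred = (occ.predict(X_te) == -1).astype(int)   # outliers mapped to the minority class
print("OCC F1:", f1_score(y_te, occ_pred))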


An Evaluation of the value added by informative metrics
C. Bellinger, J. Lalonde, M.W. Floyd, V. Mallur, E. Elkanzi, D. Ghazi, J. He, A. Mouttham, M. Scaiano, E. Wehbe and N. Japkowicz
In the 4th Workshop on Evaluation Methods for Machine Learning at the 26th International Conference on Machine Learning, 2009
We present an empirical study of the value added by four commonly employed classifier evaluation metrics and consider whether the informedness of a metric is correlated with its ability to predict the classifier’s performance on future datasets, in terms of the true and false positive rates. In doing so, a variety of UCI datasets are examined in order to test the metrics, in general and under specific domain conditions. Through the experiments, we show there is no direct correlation between the informedness of a metric and its ability to predict future performance.


Back to top





Poster Presentations
VizAR: A Software Tool for Epidemiological Hypothesis Generation with Geo-Spatial Data Mining
Colin Bellinger, Mohomed Shazan Mohomed Jabbar, Saeed Hojjati, Osmar R. Zaiane, Alvaro Osornio-Vargas, and the DoMiNo Team
Canadian National Perinatal Research Meeting, 2018
Introduction: This work is a collaboration between epidemiologists, neonatal doctors, public health officials and researchers at AMII and the University of Alberta. The objective of our work is to support the development of hypotheses about the relationship between combinations of airborne chemicals and adverse birth outcomes (ABOs). Chemical combinations are of interest as these are difficult to study with traditional methods. The desired outcome is achieved by building a software tool (VizAR) that enables users to explore associations discovered via geo-spatial data mining (DM), and identify valuable new research hypotheses.

Method: We acquired data on industrial chemical releases from the National Pollutant Inventory, ABO data from the Alberta Perinatal Health Program, and meteorological variables from Environment Canada. These are cleaned and integrated to form a table on which geo-spatial DM is performed to generate associations between chemicals and ABOs. Our interdisciplinary team established a set of functional and non-functional requirements for the software that would enable users to efficiently identify new hypotheses. An iterative cycle of software development and user feedback was applied to ensure VizAR met the requirements.

Results: A set of associations is identified from the integrated dataset using DM. VizAR, a web-based software program, is produced that enables users to explore associations, and generate new hypotheses. To facilitate this, VizAR includes functionality to sort associations according to standard DM importance metrics, such as lift, to filter them, and plot their occurrences on a Google map. The tool facilitated the discovery of twelve associations that may serve as hypotheses for new studies in public health.

Conclusions: The application of DM to integrated data enables the generation of associations not typically possible with standard epidemiological methods. VizAR enables efficient and effective knowledge transfer from the DM results to the users to facilitate hypothesis generation.
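The lift metric that VizAR uses to rank associations has a standard definition; the sketch below computes it for a single hypothetical rule from made-up counts, purely to show how the ranking score is derived, and the numbers are not results from the DoMiNo data.

# Minimal sketch (illustrative only): lift for a rule "chemical mixture C -> ABO A".
# The counts are invented; they are not findings from the DoMiNo data.
def lift(n_total, n_antecedent, n_consequent, n_both):
    """lift = P(A and C) / (P(A) * P(C)); values above 1 suggest a positive association."""
    support_both = n_both / n_total
    support_antecedent = n_antecedent / n_total
    support_consequent = n_consequent / n_total
    return support_both / (support_antecedent * support_consequent)

# Toy example: 10,000 birth records, 1,200 exposed to mixture C,
# 900 with outcome A, 150 with both (lift of roughly 1.39).
print(lift(10_000, 1_200, 900, 150))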




Back to top





Under Review

Back to top