
Ulrich Aïvodji

Associate Academic Member
Assistant Professor, École de technologie supérieure (ÉTS), Department of Software and Information Technology Engineering
École de technologie supérieure
Research Topics
Data Mining
Deep Learning
Optimization
Representation Learning

Biography

Ulrich Aïvodji is an assistant professor of computer science in the Software and Information Technology Engineering Department of the École de technologie supérieure (ÉTS) in Montréal. He also leads the Trustworthy Information Systems Lab (TISL).

Aïvodji’s research areas are computer security, data privacy, optimization and machine learning. His current research focuses on several aspects of trustworthy machine learning, such as fairness, privacy-preserving machine learning and explainability.

Before his current position, he was a postdoctoral researcher at Université du Québec à Montréal, where he worked with Sébastien Gambs on machine learning ethics and privacy.

He earned his PhD in computer science from Université Paul-Sabatier (Toulouse) under the supervision of Marie-José Huguet and Marc-Olivier Killijian. He was affiliated with two research groups at the Systems Analysis and Architecture Laboratory–CNRS, one on dependable computing, fault tolerance and operations research, and another on combinatorial optimization and constraints.

Current Students

PhD - École de technologie supérieure
Master's Research - École de technologie supérieure
Postdoctorate - École de technologie supérieure
Master's Research - École de technologie supérieure
Master's Research - École de technologie supérieure
Co-supervisor:
PhD - École de technologie supérieure
Co-supervisor:
Research Intern - École de technologie supérieure (ÉTS)
Research Intern - École de technologie supérieure (ÉTS)
PhD - École de technologie supérieure

Publications

GradTune: Last-layer Fine-tuning for Group Robustness Without Group Annotation
Patrik Joslin Kenfack
This work addresses the limitations of deep neural networks (DNNs) in generalizing beyond training data due to spurious correlations. Recent research has demonstrated that models trained with empirical risk minimization (ERM) learn both core and spurious features, often upweighting spurious ones in the final classification, which can frequently lead to poor performance on minority groups. Deep Feature Reweighting alleviates this issue by retraining the model's last classification layer using a group-balanced held-out validation set. However, relying on spurious feature labels during training or validation limits practical application, as spurious features are not always known or can be costly to annotate. Our preliminary experiments reveal that ERM-trained models exhibit higher gradient norms on minority-group samples in the held-out dataset. Leveraging these insights, we propose an alternative approach called GradTune, which fine-tunes the last classification layer using high-gradient-norm samples. Our results on four well-established benchmarks demonstrate that the proposed method achieves competitive performance compared to existing methods without requiring group labels during training or validation.
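The recipe described above lends itself to a compact sketch. The following PyTorch-style code is an illustrative reading of the abstract, not the authors' implementation: it assumes a pretrained ERM classifier whose final linear layer is exposed as `model.fc`, ranks held-out samples by the gradient norm of the loss with respect to that layer, and fine-tunes only the last layer on the highest-norm samples. The selection ratio and optimizer settings are placeholders.

```python
import torch
import torch.nn.functional as F

def last_layer_grad_norms(model, loader, device="cpu"):
    """Per-sample gradient norm of the loss w.r.t. the last linear layer."""
    model.eval()
    params = list(model.fc.parameters())
    norms, samples = [], []
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        for xi, yi in zip(x, y):                         # per-sample gradients
            loss = F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
            grads = torch.autograd.grad(loss, params)
            norms.append(torch.cat([g.flatten() for g in grads]).norm().item())
            samples.append((xi.cpu(), yi.cpu()))
    return norms, samples

def gradtune_finetune(model, heldout_loader, keep_ratio=0.2, epochs=10, lr=1e-3):
    norms, samples = last_layer_grad_norms(model, heldout_loader)
    k = max(1, int(keep_ratio * len(samples)))
    top_idx = sorted(range(len(norms)), key=lambda i: -norms[i])[:k]
    xs = torch.stack([samples[i][0] for i in top_idx])
    ys = torch.stack([samples[i][1] for i in top_idx])

    # Freeze everything except the last classification layer.
    for p in model.parameters():
        p.requires_grad_(False)
    for p in model.fc.parameters():
        p.requires_grad_(True)

    opt = torch.optim.SGD(model.fc.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(xs), ys).backward()
        opt.step()
    return model
```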
Towards personalized healthcare without harm via bias modulation
Frank Ngaha
Patrik Joslin Kenfack
Personalized machine learning models have gained significant importance in various domains, including healthcare. However, designing efficient personalized models remains a challenge. Traditional approaches often involve training multiple sub-models for different population sub-groups, which can be costly and does not always guarantee improved performance across all sub-groups. This paper presents a novel approach to improving model performance at the sub-group level by leveraging bias and training a joint model. Our method involves a two-step process: first, we train a model to predict group attributes, and then we use this model to learn data-dependent biases to modulate a second model for diagnosis prediction. Our results demonstrate that this joint architecture achieves consistent performance gains across all sub-groups in the Heart dataset. Furthermore, in the mortality dataset, it improves performance in two of the four sub-groups. A comparison of our method with the traditional decoupled personalization method demonstrated a greater performance gain in the sub-groups with less harm. This approach offers a more effective and scalable solution for model personalization, which could have a positive impact in healthcare and other areas that require predictive models that take sub-group information into account.
Towards personalized healthcare without harm via bias modulation
Frank Ngaha
Patrik Joslin Kenfack
Clinical prediction models are often personalized to target heterogeneous sub-groups by using demographic attributes such as race and gender to train the model. Traditional personalization approaches involve using demographic attributes as input features or training multiple sub-models for different population sub-groups (decoupled models). However, these methods often harm performance at the sub-group level compared to non-personalized models. This paper presents a novel personalization method to improve model performance at the sub-group level. Our method involves a two-step process: first, we train a model to predict group attributes, and then we use this model to learn data-dependent biases to modulate a second model for diagnosis prediction. Our results demonstrate that this joint architecture achieves consistent performance gains across all sub-groups in the Heart dataset. Furthermore, in the mortality dataset, it improves performance in two of the four sub-groups. A comparison of our method with the traditional decoupled personalization method demonstrated a greater performance gain in the sub-groups with less harm. This approach offers a more effective and scalable solution for personalized models, which could have a positive impact in healthcare and other areas that require predictive models that take sub-group information into account.
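The two-step architecture described above can be sketched compactly in PyTorch. The module below is an illustrative reading of the abstract rather than the paper's exact design: the layer sizes, the softmax over predicted group attributes, and the decision to inject the learned bias into a single hidden layer are all assumptions.

```python
import torch
import torch.nn as nn

class BiasModulatedDiagnosisModel(nn.Module):
    """Sketch: a group-attribute predictor whose (soft) output drives
    data-dependent biases that modulate the hidden layer of a shared
    diagnosis predictor."""

    def __init__(self, n_features, n_groups, hidden=64):
        super().__init__()
        # Step 1: predict group attributes from the input features.
        self.group_head = nn.Sequential(nn.Linear(n_features, hidden),
                                        nn.ReLU(),
                                        nn.Linear(hidden, n_groups))
        # Map predicted group probabilities to a bias vector for the hidden layer.
        self.bias_generator = nn.Linear(n_groups, hidden)
        # Step 2: shared diagnosis predictor, modulated by the learned bias.
        self.encoder = nn.Linear(n_features, hidden)
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, x):
        group_logits = self.group_head(x)                # auxiliary output
        group_probs = torch.softmax(group_logits, dim=-1)
        bias = self.bias_generator(group_probs)          # data-dependent bias
        h = torch.relu(self.encoder(x) + bias)           # bias modulation
        return self.classifier(h).squeeze(-1), group_logits
```

In a training loop, the first output can be supervised with a diagnosis loss (e.g., binary cross-entropy) while `group_logits` is supervised with the available group labels, or the group head can be pretrained separately as the abstract's first step suggests.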
Learning Hybrid Interpretable Models: Theory, Taxonomy, and Methods
Julien Ferry
Gabriel Laberge
A hybrid model involves the cooperation of an interpretable model and a complex black box. At inference, any input of the hybrid model is assigned to either its interpretable or complex component based on a gating mechanism. The advantages of such models over classical ones are two-fold: 1) they grant users precise control over the level of transparency of the system and 2) they can potentially perform better than a standalone black box, since redirecting some of the inputs to an interpretable model implicitly acts as regularization. Still, despite their high potential, hybrid models remain under-studied in the interpretability/explainability literature. In this paper, we remedy this fact by presenting a thorough investigation of such models from three perspectives: Theory, Taxonomy, and Methods. First, we explore the theory behind the generalization of hybrid models from the Probably-Approximately-Correct (PAC) perspective. A consequence of our PAC guarantee is the existence of a sweet spot for the optimal transparency of the system. When such a sweet spot is attained, a hybrid model can potentially perform better than a standalone black box. Secondly, we provide a general taxonomy for the different ways of training hybrid models: the Post-Black-Box and Pre-Black-Box paradigms. These approaches differ in the order in which the interpretable and complex components are trained. We show where the state-of-the-art hybrid models Hybrid-Rule-Set and Companion-Rule-List fall in this taxonomy. Thirdly, we implement the two paradigms in a single method, HybridCORELS, which extends the CORELS algorithm to hybrid modeling. By leveraging CORELS, HybridCORELS provides a certificate of optimality of its interpretable component and precise control over transparency. We finally show empirically that HybridCORELS is competitive with existing hybrid models and performs just as well as a standalone black box (or even better) while being partly transparent.
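The gating idea is easy to illustrate. The sketch below is a generic hybrid classifier for scikit-learn-style estimators (anything exposing `fit`, `predict` and `predict_proba`); it is not HybridCORELS, and the confidence-based gate with its `coverage` knob is an illustrative stand-in for the paper's transparency control.

```python
import numpy as np

class HybridClassifier:
    """Illustrative hybrid model: a gate decides, per input, whether the
    interpretable component or the black box makes the prediction."""

    def __init__(self, interpretable, black_box, coverage=0.5):
        self.interpretable = interpretable   # e.g., a shallow decision tree or rule list
        self.black_box = black_box           # e.g., a random forest or neural network
        self.coverage = coverage             # target fraction of inputs kept transparent
        self.threshold_ = None

    def fit(self, X, y):
        # Both components are fit independently here purely to illustrate the
        # gate; the paper's Pre-/Post-Black-Box paradigms order and couple
        # these training steps differently.
        self.black_box.fit(X, y)
        self.interpretable.fit(X, y)
        conf = self.interpretable.predict_proba(X).max(axis=1)
        self.threshold_ = np.quantile(conf, 1.0 - self.coverage)
        return self

    def predict(self, X):
        conf = self.interpretable.predict_proba(X).max(axis=1)
        use_interp = conf >= self.threshold_             # the gating mechanism
        preds = self.black_box.predict(X)
        preds[use_interp] = self.interpretable.predict(X[use_interp])
        return preds
```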
A Survey on Fairness Without Demographics
Patrik Joslin Kenfack
ÉTS Montréal
The issue of bias in Machine Learning (ML) models is a significant challenge for the machine learning community. Real-world biases can be embedded in the data used to train models, and prior studies have shown that ML models can learn and even amplify these biases. This can result in unfair treatment of individuals based on their inherent characteristics or sensitive attributes such as gender, race, or age. Ensuring fairness is crucial with the increasing use of ML models in high-stakes scenarios and has gained significant attention from researchers in recent years. However, the challenge of ensuring fairness becomes much greater when the assumption of full access to sensitive attributes does not hold. Settings where this assumption does not hold include cases where (1) only limited or noisy demographic information is available or (2) demographic information is entirely unobserved due to privacy restrictions. This survey reviews recent research efforts to enforce fairness when sensitive attributes are missing. We propose a taxonomy of existing works and, more importantly, highlight current challenges and future research directions to stimulate research in ML fairness in the setting of missing sensitive attributes.
Probabilistic Dataset Reconstruction from Interpretable Models
Julien Ferry
Sébastien Gambs
Marie-José Huguet
Mohamed Siala
Interpretability is often pointed out as a key requirement for trustworthy machine learning. However, learning and releasing models that are inherently interpretable leaks information regarding the underlying training data. As such disclosure may directly conflict with privacy, a precise quantification of the privacy impact of such a breach is a fundamental problem. For instance, previous work has shown that the structure of a decision tree can be leveraged to build a probabilistic reconstruction of its training dataset, with the uncertainty of the reconstruction being a relevant metric for the information leak. In this paper, we propose a novel framework generalizing these probabilistic reconstructions in the sense that it can handle other forms of interpretable models and more generic types of knowledge. In addition, we demonstrate that under realistic assumptions regarding the interpretable models' structure, the uncertainty of the reconstruction can be computed efficiently. Finally, we illustrate the applicability of our approach on both decision trees and rule lists, by comparing the theoretical information leak associated with either exact or heuristic learning algorithms. Our results suggest that, for a given accuracy level, optimal interpretable models are often more compact and leak less information regarding their training data than greedily built ones.
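As a toy illustration of why an interpretable model leaks information about its training data, the sketch below computes a simple uncertainty measure (in bits) for a probabilistic reconstruction of a scikit-learn decision tree's training set. It assumes binary 0/1 features and that the released tree exposes per-leaf sample counts (as `tree_.n_node_samples` does): every example in a leaf is known only up to the feature tests on the path to that leaf, and each unconstrained binary feature contributes one bit. This is a simplified stand-in for the paper's framework, not its actual measure.

```python
from sklearn.tree import DecisionTreeClassifier

def reconstruction_uncertainty_bits(tree: DecisionTreeClassifier, n_features: int):
    """Total uncertainty (bits) of a naive probabilistic reconstruction of the
    training set of a decision tree over binary 0/1 features."""
    t = tree.tree_
    total_bits = 0.0

    def walk(node, constrained):
        nonlocal total_bits
        if t.children_left[node] == -1:                  # leaf node
            n_examples = t.n_node_samples[node]
            free = n_features - len(constrained)         # features never tested on this path
            total_bits += n_examples * free              # 1 bit per free binary feature
            return
        f = t.feature[node]
        walk(t.children_left[node], constrained | {f})
        walk(t.children_right[node], constrained | {f})

    walk(0, frozenset())
    return total_bits
```

Calling this on two fitted trees of similar accuracy, e.g., an optimal one versus a greedily built one, gives a single number to compare how much of the training data each one constrains.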
Fairness Under Demographic Scarce Regime
Patrik Joslin Kenfack
Most existing works on fairness assume the model has full access to demographic information. However, there exist scenarios where demographic information is only partially available, because a record was not maintained throughout data collection or for privacy reasons. This setting is known as the demographic scarce regime. Prior research has shown that training an attribute classifier to replace the missing sensitive attributes (proxy) can still improve fairness. However, the use of proxy-sensitive attributes worsens fairness-accuracy trade-offs compared to true sensitive attributes. To address this limitation, we propose a framework to build attribute classifiers that achieve better fairness-accuracy trade-offs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness on samples whose demographic information is inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes is detrimental to fairness and accuracy. Our experiments on two datasets show that the proposed framework yields models with significantly better fairness-accuracy trade-offs than classic attribute classifiers. Surprisingly, our framework even outperforms models trained with constraints on the true sensitive attributes.
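A schematic rendering of the pipeline described above, with illustrative names and hyperparameters (`tau`, `lam`): fit an attribute classifier on the subset with known sensitive attributes, keep proxy attributes only where it is confident, and add a demographic-parity penalty computed on those confident samples when training the label model. The confidence threshold and the particular fairness penalty are assumptions; the paper's uncertainty measure and constraint may differ.

```python
import torch
import torch.nn.functional as F
from sklearn.linear_model import LogisticRegression

def fit_with_proxy_fairness(X, y, X_attr, s_attr, tau=0.9, lam=1.0, epochs=200):
    # Step 1: attribute classifier on the small subset with known sensitive attributes.
    attr_clf = LogisticRegression(max_iter=1000).fit(X_attr, s_attr)

    # Step 2: uncertainty-aware proxy attributes on the training data.
    probs = attr_clf.predict_proba(X)
    conf = probs.max(axis=1)
    proxy = probs.argmax(axis=1)
    confident = conf >= tau                              # low-uncertainty samples only

    # Step 3: logistic label model with a fairness penalty on confident samples.
    Xt = torch.tensor(X, dtype=torch.float32)
    yt = torch.tensor(y, dtype=torch.float32)
    st = torch.tensor(proxy, dtype=torch.long)
    mask = torch.tensor(confident, dtype=torch.bool)

    w = torch.zeros(Xt.shape[1], requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([w, b], lr=0.05)
    for _ in range(epochs):
        opt.zero_grad()
        scores = torch.sigmoid(Xt @ w + b)
        loss = F.binary_cross_entropy(scores, yt)
        g0 = scores[mask & (st == 0)]
        g1 = scores[mask & (st == 1)]
        if len(g0) > 0 and len(g1) > 0:
            loss = loss + lam * (g0.mean() - g1.mean()).abs()   # demographic parity gap
        loss.backward()
        opt.step()
    return (w.detach(), b.detach()), attr_clf
```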
Fooling SHAP with Stealthily Biased Sampling
Gabriel Laberge
Satoshi Hara
Mario Marchand
SHAP explanations aim at identifying which features contribute the most to the difference in model prediction at a specific input versus a background distribution. Recent studies have shown that they can be manipulated by malicious adversaries to produce arbitrary desired explanations. However, existing attacks focus solely on altering the black-box model itself. In this paper, we propose a complementary family of attacks that leave the model intact and manipulate SHAP explanations using stealthily biased sampling of the data points used to approximate expectations w.r.t. the background distribution. In the context of a fairness audit, we show that our attack can reduce the importance of a sensitive feature when explaining the difference in outcomes between groups, while remaining undetected. These results highlight the manipulability of SHAP explanations and encourage auditors to treat post-hoc explanations with skepticism.
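The effect of the background distribution on SHAP attributions can be seen with a deliberately crude version of the idea (the paper's attack selects the biased background far more carefully so that it remains undetected). The toy example below, built on the `shap` package's KernelExplainer and a linear model on synthetic data, compares the sensitive feature's attribution under an honest background versus a background whose sensitive feature happens to match the explained instance.

```python
import numpy as np
import shap
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Toy data: feature 0 plays the role of the sensitive feature and truly drives the output.
X = rng.normal(size=(500, 4))
y = 2.0 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=500)
model = LinearRegression().fit(X, y)

x0 = X[:1]                                               # the instance being explained
honest_bg = X[rng.choice(len(X), 100, replace=False)]

# Crude biased background: only points whose sensitive feature matches x0, so
# replacing that feature with background values barely changes the output and
# its attribution collapses toward zero.
biased_bg = X[np.abs(X[:, 0] - x0[0, 0]) < 0.25][:100]

for name, bg in [("honest", honest_bg), ("biased", biased_bg)]:
    sv = shap.KernelExplainer(model.predict, bg).shap_values(x0, nsamples=500)
    print(f"{name} background -> |SHAP| of the sensitive feature: {abs(sv[0, 0]):.3f}")
```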
Leveraging Integer Linear Programming to Learn Optimal Fair Rule Lists
Julien Ferry
Sébastien Gambs
Marie-José Huguet
Mohamed Siala
Washing The Unwashable : On The (Im)possibility of Fairwashing Detection
Ali Shahin Shamsabadi
Mohammad Yaghini
Natalie Dullerud
Sierra Wyllie
Aisha Alaagib Alryeh Mkean
Sébastien Gambs
Nicolas Papernot
The use of black-box models (e.g., deep neural networks), whose internal logic is complex, in high-stakes decision-making systems raises the need for providing explanations about their decisions. Model explanation techniques mitigate this problem by generating an interpretable and high-fidelity surrogate model (e.g., a logistic regressor or decision tree) to explain the logic of black-box models. In this work, we investigate the issue of fairwashing, in which model explanation techniques are manipulated to rationalize decisions taken by an unfair black-box model using deceptive surrogate models. More precisely, we theoretically characterize and analyze fairwashing, proving that this phenomenon is difficult to avoid due to an irreducible factor: the unfairness of the black-box model. Based on the theory developed, we propose a novel technique, called FRAUD-Detect (FaiRness AUDit Detection), to detect fairwashed models by measuring a divergence over subpopulation-wise fidelity measures of the interpretable model. We empirically demonstrate that this divergence is significantly larger in purposefully fairwashed interpretable models than in honest ones. Furthermore, we show that our detector is robust to an informed adversary trying to bypass it. The code implementing FRAUD-Detect is available at https://github.com/cleverhans-lab/FRAUD-Detect.
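In the spirit of the detector described above, a minimal sketch: compute the surrogate's fidelity to the black box separately on each subpopulation and report a divergence between the two fidelities. The choice of a Bernoulli KL divergence, the two-group assumption, and the absence of a decision threshold are simplifications; the exact statistic and decision rule are in the paper and the linked repository.

```python
import numpy as np

def bernoulli_kl(p, q, eps=1e-8):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def fairwashing_score(black_box_preds, surrogate_preds, groups):
    """Per-group fidelity of the surrogate to the black box, plus a divergence
    between the two group fidelities (assumes two subpopulations). A large
    divergence is the fairwashing signal; the threshold is left to the auditor."""
    fidelities = {}
    for g in np.unique(groups):
        idx = groups == g
        fidelities[g] = np.mean(black_box_preds[idx] == surrogate_preds[idx])
    g0, g1 = sorted(fidelities)
    return bernoulli_kl(fidelities[g0], fidelities[g1]), fidelities
```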