Publications

Piecewise Linear Parametrization of Policies: Towards Interpretable Deep Reinforcement Learning

Maxime Wabartha

Learning inherently interpretable policies is a central challenge in the path to developing autonomous agents that humans can trust. We argue for the use of policies that are piecewise-linear. We carefully study to what extent they can retain the interpretable properties of linear policies while performing competitively with neural baselines. In particular, we propose the HyperCombinator (HC), a piecewise-linear neural architecture expressing a policy with a controllably small number of sub-policies. Each sub-policy is linear with respect to interpretable features, shedding light on the agent’s decision process without needing an additional explanation model. We evaluate HC policies in control and navigation experiments, visualize the improved interpretability of the agent and highlight its trade-off with performance.

2024-01-01

International Conference on Learning Representations (published)

openreview.net

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

Prakash Panangaden

Sahand Rezaei-Shoshtari

Rosie Zhao

David Meger

Doina Precup

arxiv.org

Population Monte Carlo With Normalizing Flow

Soumyasundar Pal

Antonios Valkanas

Mark Coates

Adaptive importance sampling (AIS) methods provide a useful alternative to Markov Chain Monte Carlo (MCMC) algorithms for performing inference of intractable distributions. Population Monte Carlo (PMC) algorithms constitute a family of AIS approaches which adapt the proposal distributions iteratively to improve the approximation of the target distribution. Recent work in this area primarily focuses on ameliorating the proposal adaptation procedure for high-dimensional applications. However, most of the AIS algorithms use simple proposal distributions for sampling, which might be inadequate in exploring target distributions with intricate geometries. In this work, we construct expressive proposal distributions in the AIS framework using normalizing flow, an appealing approach for modeling complex distributions. We use an iterative parameter update rule to enhance the approximation of the target distribution. Numerical experiments show that in high-dimensional settings, the proposed algorithm offers significantly improved performance compared to the existing techniques.

2024-01-01

IEEE Signal Processing Letters (published)

doi.org

arxiv.org
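
The iterative proposal adaptation described in the abstract above can be illustrated with a toy sketch. This is not the paper's algorithm: the normalizing-flow proposal is replaced here by a fixed-variance Gaussian whose mean is updated by weighted moment matching, and the target density, the function names, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random

def log_target(x):
    # Unnormalized log-density of a toy target (assumed): Gaussian, mean 3, unit variance.
    return -0.5 * (x - 3.0) ** 2

def log_proposal(x, mu, sigma):
    # Log-density of the Gaussian proposal N(mu, sigma^2).
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def adapt_proposal_mean(n_iters=10, n_samples=2000, sigma=2.0, seed=0):
    """Adapt the proposal mean with self-normalized importance weights."""
    rng = random.Random(seed)
    mu = 0.0  # deliberately poor starting point
    for _ in range(n_iters):
        xs = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        log_w = [log_target(x) - log_proposal(x, mu, sigma) for x in xs]
        shift = max(log_w)  # subtract the max before exponentiating for stability
        w = [math.exp(lw - shift) for lw in log_w]
        total = sum(w)
        # Weighted moment matching: move the proposal mean to the weighted sample mean.
        mu = sum(wi * xi for wi, xi in zip(w, xs)) / total
    return mu

estimated_mean = adapt_proposal_mean()
```

With a proposal wider than the target, the weights stay well behaved and the adapted mean should settle near the target mean of 3. A single Gaussian like this is the kind of simple proposal the abstract describes as potentially inadequate for targets with intricate geometries.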

Precise Accuracy / Robustness Tradeoffs in Regression: Case of General Norms

Elvis Dohmatob

Meyer Scetbon

2024-01-01

International Conference on Machine Learning (published)

proceedings.mlr.press

Predicting Drug Effects from High-Dimensional, Asymmetric Drug Datasets by Using Graph Neural Networks: A Comprehensive Analysis of Multitarget Drug Effect Prediction

Joey Bose

Guojing Cong

Graph neural networks (GNNs) have emerged as one of the most effective ML techniques for drug effect prediction from drug molecular graphs. Despite having immense potential, GNN models lack performance when using datasets that contain high-dimensional, asymmetrically co-occurrent drug effects as targets with complex correlations between them. Training individual learning models for each drug effect and incorporating every prediction result for a wide spectrum of drug effects are impractical. Therefore, an opportunity exists to address this challenge as multitarget prediction problems and predict all drug effects at a time. We developed standard and hybrid GNNs to perform two separate tasks: multiregression for continuous values and multilabel classification for categorical values contained in our datasets. Because multilabel classification makes the target data even more sparse and introduces asymmetric label co-occurrence, learning these models becomes difficult and heavily impacts the GNN's performance. To address these challenges, we propose a new data oversampling technique to improve multilabel classification performances on all the given imbalanced molecular graph datasets. Using the technique, we improve the data imbalance ratio of the drug effects while protecting the datasets' integrity. Finally, we evaluate the multilabel classification performance of the best-performing hybrid GNN model on all the oversampled datasets obtained from the proposed oversampling technique. In all the evaluation metrics (i.e., precision, recall, and F1 score), this model significantly outperforms other ML models, including GNN models when they are trained on the original datasets or oversampled datasets with MLSMOTE, which is a well-known oversampling technique.

2024-01-01

ICMLA (published)

doi.org

arxiv.org

Preserving Privacy in GANs Against Membership Inference Attack

Mohammadhadi Shateri

Francisco Messina

Fabrice Labeau

Pablo Piantanida

Generative Adversarial Networks (GANs) have been widely used for generating synthetic data for cases where there is a limited size real-world data set or when data holders are unwilling to share their data samples. Recent works showed that GANs, due to overfitting and memorization, might leak information regarding their training data samples. This makes GANs vulnerable to Membership Inference Attacks (MIAs). Several defense strategies have been proposed in the literature to mitigate this privacy issue. Unfortunately, defense strategies based on differential privacy are proven to reduce extensively the quality of the synthetic data points. On the other hand, more recent frameworks such as PrivGAN and PAR-GAN are not suitable for small-size training data sets. In the present work, the overfitting in GANs is studied in terms of the discriminator, and a more general measure of overfitting based on the Bhattacharyya coefficient is defined. Then, inspired by Fano’s inequality, our first defense mechanism against MIAs is proposed. This framework, which requires only a simple modification in the loss function of GANs, is referred to as the maximum entropy GAN or MEGAN and significantly improves the robustness of GANs to MIAs. As a second defense strategy, a more heuristic model based on minimizing the information leaked from the generated samples about the training data points is presented. This approach is referred to as mutual information minimization GAN (MIMGAN) and uses a variational representation of the mutual information to minimize the information that a synthetic sample might leak about the whole training data set. Applying the proposed frameworks to some commonly used data sets against state-of-the-art MIAs reveals that the proposed methods can reduce the accuracy of the adversaries to the level of random guessing accuracy with a small reduction in the quality of the synthetic data samples.

2024-01-01

IEEE Transactions on Information Forensics and Security (published)

doi.org

arxiv.org
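
The abstract above mentions an overfitting measure based on the Bhattacharyya coefficient. As a reminder of the standard quantity only, the sketch below computes the coefficient for two discrete distributions; how the paper builds its measure from discriminator outputs is not given in this listing, and the function name and example inputs are assumptions for illustration.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Overlap between two discrete distributions given as equal-length probability lists.

    Returns 1.0 for identical distributions and 0.0 for distributions with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

# Two extreme cases on a two-point support.
identical = bhattacharyya_coefficient([0.5, 0.5], [0.5, 0.5])
disjoint = bhattacharyya_coefficient([1.0, 0.0], [0.0, 1.0])
```

Values close to 1 indicate that the two distributions are hard to tell apart, and values close to 0 indicate that they barely overlap.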

Probabilistic Dataset Reconstruction from Interpretable Models

Julien Ferry

Ulrich Aivodji

Sébastien Gambs

Marie-José Huguet

Mohamed Siala

Interpretability is often pointed out as a key requirement for trustworthy machine learning. However, learning and releasing models that are inherently interpretable leaks information regarding the underlying training data. As such disclosure may directly conflict with privacy, a precise quantification of the privacy impact of such breach is a fundamental problem. For instance, previous work has shown that the structure of a decision tree can be leveraged to build a probabilistic reconstruction of its training dataset, with the uncertainty of the reconstruction being a relevant metric for the information leak. In this paper, we propose a novel framework generalizing these probabilistic reconstructions in the sense that it can handle other forms of interpretable models and more generic types of knowledge. In addition, we demonstrate that under realistic assumptions regarding the interpretable models' structure, the uncertainty of the reconstruction can be computed efficiently. Finally, we illustrate the applicability of our approach on both decision trees and rule lists, by comparing the theoretical information leak associated with either exact or heuristic learning algorithms. Our results suggest that optimal interpretable models are often more compact and leak less information regarding their training data than greedily-built ones, for a given accuracy level.

2024-01-01

SaTML (published)

doi.org

openreview.net

Proving Linear Mode Connectivity of Neural Networks via Optimal Transport

Damien Ferbach

Baptiste Goujaud

Gauthier Gidel

Aymeric Dieuleveut

The energy landscape of high-dimensional non-convex optimization problems is crucial to understanding the effectiveness of modern deep neural network architectures. Recent works have experimentally shown that two different solutions found after two runs of a stochastic training are often connected by very simple continuous paths (e.g., linear) modulo a permutation of the weights. In this paper, we provide a framework theoretically explaining this empirical observation. Based on convergence rates in Wasserstein distance of empirical measures, we show that, with high probability, two wide enough two-layer neural networks trained with stochastic gradient descent are linearly connected. Additionally, we express upper and lower bounds on the width of each layer of two deep neural networks with independent neuron weights to be linearly connected. Finally, we empirically demonstrate the validity of our approach by showing how the dimension of the support of the weight distribution of neurons, which dictates Wasserstein convergence rates, is correlated with linear mode connectivity.

2024-01-01

AISTATS (published)

doi.org

arxiv.org

Quantifying learning-style adaptation in effectiveness of LLM teaching

Ruben Weijers

Gabrielle Fidelis de Castilho

Jean-François Godbout

Reihaneh Rabbany

Kellin Pelrine

This preliminary study aims to investigate whether AI, when prompted based on individual learning styles, can effectively improve comprehension and learning experiences in educational settings. It involves tailoring LLM baseline prompts and comparing the results of a control group receiving standard content and an experimental group receiving learning style-tailored content. Preliminary results suggest that GPT-4 can generate responses aligned with various learning styles, indicating the potential for enhanced engagement and comprehension. However, these results also reveal challenges, including the model’s tendency for sycophantic behavior and variability in responses. Our findings suggest that a more sophisticated prompt engineering approach is required for integrating AI into education (AIEd) to improve educational outcomes.

Reinforcement Learning for Blind Stair Climbing with Legged and Wheeled-Legged Robots

Simon Chamorro

Victor Klemm

Miguel de La Iglesia Valls

Chris Pal

Roland Siegwart

2024-01-01

ICRA (published)

doi.org

arxiv.org

Reinforcement Learning Informed Evolutionary Search for Autonomous Systems Testing

Dmytro Humeniuk

Foutse Khomh

Giuliano Antoniol

Evolutionary search-based techniques are commonly used for testing autonomous robotic systems. However, these approaches often rely on computationally expensive simulator-based models for test scenario evaluation. To improve the computational efficiency of the search-based testing, we propose augmenting the evolutionary search (ES) with a reinforcement learning (RL) agent trained using surrogate rewards derived from domain knowledge. In our approach, known as RIGAA (Reinforcement learning Informed Genetic Algorithm for Autonomous systems testing), we first train an RL agent to learn useful constraints of the problem and then use it to produce a certain part of the initial population of the search algorithm. By incorporating an RL agent into the search process, we aim to guide the algorithm towards promising regions of the search space from the start, enabling more efficient exploration of the solution space. We evaluate RIGAA on two case studies: maze generation for an autonomous ant robot and road topology generation for an autonomous vehicle lane keeping assist system. In both case studies, RIGAA converges faster to fitter solutions and produces a better test suite (in terms of average test scenario fitness and diversity). RIGAA also outperforms the state-of-the-art tools for vehicle lane keeping assist system testing, such as AmbieGen and Frenetic.

2024-01-01

ACM Trans. Softw. Eng. Methodol. (published)

doi.org

arxiv.org

Reproducible Spinal Cord Quantitative MRI Analysis with the Spinal Cord Toolbox.

Jan Valošek

Julien Cohen-Adad

2024-01-01

Magnetic Resonance in Medical Sciences (published)

doi.org
