Publications

Sample Compression for Self Certified Continual Learning

Continual learning algorithms aim to learn from a sequence of tasks, making the training distribution non-stationary. The majority of existing continual learning approaches in the literature rely on heuristics and do not provide learning guarantees. In this paper, we present a new method called Continual Pick-to-Learn (CoP2L), which is able to retain the most representative samples for each task in an efficient way. CoP2L combines the Pick-to-Learn algorithm (rooted in the sample compression theory) and the experience replay continual learning scheme. This allows us to provide non-vacuous upper bounds on the generalization loss of the learned predictors, numerically computable after each task. We empirically evaluate our approach on several standard continual learning benchmarks across Class-Incremental, Task-Incremental, and Domain-Incremental settings. Our results show that CoP2L is highly competitive across all setups, often outperforming existing baselines, and significantly mitigating catastrophic forgetting compared to vanilla experience replay in the Class-Incremental setting. It is possible to leverage the bounds provided by CoP2L in practical scenarios to certify the predictor reliability on previously learned tasks, in order to improve the trustworthiness of the continual learning algorithm.

2025-03-13

ArXiv (preprint)

arxiv.org

Exploiting Instruction-Following Retrievers for Malicious Information Retrieval

Parishad BehnamGhader

Nicholas Meade

Siva Reddy

2025-03-11

ArXiv (preprint)

arxiv.org

Learning Decision Trees as Amortized Structure Inference

Mohammed Mahfoud

Ghait Boukachab

Michał Koziarski

Alex Hernandez-Garcia

Stefan Bauer

Yoshua Bengio

Nikolay Malkin

Building predictive models for tabular data presents fundamental challenges, notably in scaling consistently, i.e., more resources translating to better performance, and generalizing systematically beyond the training data distribution. Designing decision tree models remains especially challenging given the intractably large search space, and most existing methods rely on greedy heuristics, while deep learning inductive biases expect a temporal or spatial structure not naturally present in tabular data. We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data, formulating decision tree construction as a sequential planning problem. We train a deep reinforcement learning (GFlowNet) policy to solve this problem, yielding a generative model that samples decision trees from the Bayesian posterior. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks derived from real-world data, in robustness to distribution shifts, and in anomaly detection, all while yielding interpretable models with shorter description lengths. Samples from the trained DT-GFN model can be ensembled to construct a random forest, and we further show that performance scales consistently with ensemble size, yielding ensembles of predictors that continue to generalize systematically.

2025-03-10

ArXiv (preprint)

doi.org

arxiv.org

Relative biological effectiveness of 31 meV thermal neutrons in peripheral blood lymphocytes

Laura C Paterson

Fawaz Ali

Mohsen Naseri

David Perez Loureiro

Amy Festarini

Marilyne Stuart

Chad Boyer

Ronald Rogge

Christie Costello

Norma Ybarra

John Kildea

Richard B Richardson

2025-03-10

Radiation Protection Dosimetry (published)

doi.org

SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection

Shamsuddeen Hassan Muhammad

Nedjma Ousidhoum

Idris Abdulmumin

Seid Muhie Yimam

Jan Philip Wahle

Terry Lima Ruas

Meriem Beloucif

Christine de Kock

Tadesse Belay

Ibrahim Ahmad

Nirmal Surange

Daniela Teodorescu

David Ifeoluwa Adelani

Alham Fikri Aji

Felermino Ali

Vladimir Araujo

Abinew Ayele

Oana Ignat

Alexander Panchenko

Yi Zhou … (see 1 more)

Saif M. Mohammad

2025-03-10

ArXiv (preprint)

arxiv.org

Understanding the impact of IoT security patterns on CPU usage and energy consumption: a dynamic approach for selecting patterns with deep reinforcement learning

Saeid Jamshidi

Amin Nikanjam

Kawser Wazed Nafi

Foutse Khomh

2025-03-10

International Journal of Information Security (published)

doi.org

Spectral State Space Model for Rotation-Invariant Visual Representation Learning

Sahar Dastani

Ali Bahri

Moslem Yazdanpanah

Mehrdad Noori

David Osowiechi

Gustavo Adolfo Vargas Hakim

Farzad Beizaee

Milad Cheraghalikhani

Arnab Kumar Mondal

Hervé Lombaert

Christian Desrosiers

2025-03-09

ArXiv (preprint)

arxiv.org

Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning

Samuel Garcin

Trevor McInroe

Pablo Samuel Castro

Prakash Panangaden

Christopher G. Lucas

David Abel

Stefano V Albrecht

Extracting relevant information from a stream of high-dimensional observations is a central challenge for deep reinforcement learning agents. Actor-critic algorithms add further complexity to this challenge, as it is often unclear whether the same information will be relevant to both the actor and the critic. To this end, we here explore the principles that underlie effective representations for the actor and for the critic in on-policy algorithms. We focus our study on understanding whether the actor and critic will benefit from separate, rather than shared, representations. Our primary finding is that when separated, the representations for the actor and critic systematically specialise in extracting different types of information from the environment -- the actor's representation tends to focus on action-relevant information, while the critic's representation specialises in encoding value and dynamics information. We conduct a rigorous empirical study to understand how different representation learning approaches affect the actor and critic's specialisations and their downstream performance, in terms of sample efficiency and generation capabilities. Finally, we discover that a separated critic plays an important role in exploration and data collection during training. Our code, trained models and data are accessible at https://github.com/francelico/deac-rep.

2025-03-08

ArXiv (preprint)

doi.org

arxiv.org

A Taxonomy of Inefficiencies in LLM-Generated Python Code

Altaf Allah Abbassi

Leuson Da Silva

Amin Nikanjam

Foutse Khomh

Large Language Models (LLMs) are widely adopted for automated code generation with promising results. Although prior research has assessed LLM-generated code and identified various quality issues -- such as redundancy, poor maintainability, and sub-optimal performance -- a systematic understanding and categorization of these inefficiencies remain unexplored. Without such knowledge, practitioners struggle to optimize LLM-generated code for real-world applications, limiting its adoption. This study can also guide improving code LLMs, enhancing the quality and efficiency of code generation. Therefore, in this study, we empirically investigate inefficiencies in LLM-generated code by state-of-the-art models, i.e., CodeLlama, DeepSeek-Coder, and CodeGemma. To do so, we analyze 492 generated code snippets in the HumanEval++ dataset. We then construct a taxonomy of inefficiencies in LLM-generated code that includes 5 categories (General Logic, Performance, Readability, Maintainability, and Errors) and 19 subcategories of inefficiencies. We then validate the proposed taxonomy through an online survey with 58 LLM practitioners and researchers. Our study indicates that logic and performance-related inefficiencies are the most popular and relevant, and that they frequently co-occur and impact overall code quality. Our taxonomy provides a structured basis for evaluating the quality of LLM-generated code and guiding future research to improve code generation efficiency.

2025-03-08

ArXiv (preprint)

arxiv.org

Unveiling Inefficiencies in LLM-Generated Code: Toward a Comprehensive Taxonomy

Altaf Allah Abbassi

Leuson Da Silva

Amin Nikanjam

Foutse Khomh

2025-03-08

ArXiv (preprint)

arxiv.org
