Publications

Bio-Mechanical Poet: An Immersive Audiovisual Playground for Brain Signals and Generative AI.

Philipp Thölke

Antoine Bellemare‐Pepin

Yann Harel

François Lespinasse

Karim Jerbi CoCo Lab

2023-12-31

ICCC (published)

dblp.uni-trier.de

Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers.

Xiuying Wei

Skander Moalla

Razvan Pascanu

Çağlar Gülçehre

2023-12-31

Advances in Neural Information Processing Systems 37 (published)

doi.org

CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization

Wenzheng Hu

Ning Liu

Zhengping Che

Mingyang Li

Jian Tang

Changshui Zhang

Jianqiang Wang

Deep convolutional neural networks are shown to be overkill with high parametric and computational redundancy in many application scenarios, and an increasing number of works have explored model pruning to obtain lightweight and efficient networks. However, most existing pruning approaches are driven by empirical heuristics and rarely consider the joint impact of channels, leading to unguaranteed and suboptimal performance. In this article, we propose a novel channel pruning method via class-aware trace ratio optimization (CATRO) to reduce the computational burden and accelerate the model inference. Utilizing class information from a few samples, CATRO measures the joint impact of multiple channels by feature space discriminations and consolidates the layerwise impact of preserved channels. By formulating channel pruning as a submodular set function maximization problem, CATRO solves it efficiently via a two-stage greedy iterative optimization procedure. More importantly, we present theoretical justifications on convergence of CATRO and performance of pruned networks. Experimental results demonstrate that CATRO achieves higher accuracy with similar computation cost or lower computation cost with similar accuracy than other state-of-the-art channel pruning algorithms. In addition, because of its class-aware property, CATRO is suitable to prune efficient networks adaptively for various classification subtasks, enhancing handy deployment and usage of deep networks in real-world applications.

2023-12-31

IEEE Trans. Neural Networks Learn. Syst. (published)

doi.org

arxiv.org

Causal Adversarial Perturbations for Individual Fairness and Robustness in Heterogeneous Data Spaces

Ahmad-reza Ehyaei

Kiarash Mohammadi

Amir-Hossein Karimi

Samira Samadi

Golnoosh Farnadi

As responsible AI gains importance in machine learning algorithms, properties such as fairness, adversarial robustness, and causality have received considerable attention in recent years. However, despite their individual significance, there remains a critical gap in simultaneously exploring and integrating these properties. In this paper, we propose a novel approach that examines the relationship between individual fairness, adversarial robustness, and structural causal models in heterogeneous data spaces, particularly when dealing with discrete sensitive attributes. We use causal structural models and sensitive attributes to create a fair metric and apply it to measure semantic similarity among individuals. By introducing a novel causal adversarial perturbation and applying adversarial training, we create a new regularizer that combines individual fairness, causality, and robustness in the classifier. Our method is evaluated on both real-world and synthetic datasets, demonstrating its effectiveness in achieving an accurate classifier that simultaneously exhibits fairness, adversarial robustness, and causal awareness.

2023-12-31

AAAI (published)

doi.org

arxiv.org

Caustics: A Python Package for Accelerated Strong Gravitational Lensing Simulations

Connor Stone

Alexandre Adam

Adam Coogan

M. J. Yantovski-Barth

Andreas Filipp

Landung Setiawan

Cordero Core

Ronan Legin

Charles Wilson

Gabriel Missael Barco

Yashar Hezaveh

Laurence Perreault-Levasseur

2023-12-31

Journal of Open Source Software (published)

doi.org

arxiv.org

ChainBuddy: An AI-assisted Agent System for Helping Users Set up LLM Pipelines

Jingyue Zhang

Ian Arawjo

2023-12-31

UIST (Adjunct Volume) (published)

doi.org

Challenges in multi-task learning for fMRI-based diagnosis: Benefits for psychiatric conditions and CNVs would likely require thousands of patients

Annabelle Harvey

Clara A. Moreau

Kuldeep Kumar

Guillaume Huguet

Sebastian G.W. Urchs

Hanad Sharmarke

Khadije Jizi

Charles-Olivier Martin

Nadine Younis

Petra Tamer

Jean-Louis Martineau

Pierre Orban

Ana Isabel Silva

Jeremy Hall

Marianne B.M. van den Bree

Michael J. Owen

David E.J. Linden

Sarah Lippé

Carrie E. Bearden

Guillaume Dumas

Sébastien Jacquemont

Pierre Bellec

There is a growing interest in using machine learning (ML) models to perform automatic diagnosis of psychiatric conditions; however, generalising the prediction of ML models to completely independent data can lead to a sharp decrease in performance. Patients with different psychiatric diagnoses have traditionally been studied independently, yet there is a growing recognition of neuroimaging signatures shared across them as well as rare genetic copy number variants (CNVs). In this work, we assess the potential of multi-task learning (MTL) to improve accuracy by characterising multiple related conditions with a single model, making use of information shared across diagnostic categories and exposing the model to a larger and more diverse dataset. As a proof of concept, we first established the efficacy of MTL in a context where there is clearly information shared across tasks: the same target (age or sex) is predicted at different sites of data collection in a large functional magnetic resonance imaging (fMRI) dataset compiled from multiple studies. MTL generally led to substantial gains relative to independent prediction at each site. Performing scaling experiments on the UK Biobank, we observed that performance was highly dependent on sample size: for large sample sizes (N > 6000) sex prediction was better using MTL across three sites (N = K per site) than prediction at a single site (N = 3K), but for small samples (N < 500) MTL was actually detrimental for age prediction. We then used established machine-learning methods to benchmark the diagnostic accuracy of each of the 7 CNVs (N = 19–103) and 4 psychiatric conditions (N = 44–472) independently, replicating the accuracy previously reported in the literature on psychiatric conditions. We observed that MTL hurt performance when applied across the full set of diagnoses, and complementary analyses failed to identify pairs of conditions which would benefit from MTL. Taken together, our results show that if a successful multi-task diagnostic model of psychiatric conditions were to be developed with resting-state fMRI, it would likely require datasets with thousands of patients across different diagnoses.

2023-12-31

Imaging Neuroscience (published)

doi.org

ChatGPT: What Every Pediatric Surgeon Should Know About Its Potential Uses and Pitfalls

Raquel González

Dan Poenaru

Russell Woo

A Francois Trappey

Stewart Carter

David Darcy

Ellen Encisco

Brian Gulack

Doug Miniati

Edzhem Tombash

Eunice Y. Huang

2023-12-31

Journal of Pediatric Surgery (published)

doi.org

CL-MASR: A Continual Learning Benchmark for Multilingual ASR

Luca Della Libera

Pooneh Mousavi

Salah Zaiem

Yusuf Cem Sübakan

Mirco Ravanelli

2023-12-31

IEEE ACM Trans. Audio Speech Lang. Process. (published)

doi.org

arxiv.org

Combine and Conquer: A Meta-Analysis on Data Shift and Out-of-Distribution Detection

Eduardo Dadalto Câmara Gomes

Florence Alberge

Pierre Duhamel

Pablo Piantanida

2023-12-31

Trans. Mach. Learn. Res. (published)

doi.org

openreview.net

Common Challenges of Deep Reinforcement Learning Applications Development: An Empirical Study

Mohammad Mehdi Morovati

Florian Tambon

Mina Taraghi

Amin Nikanjam

Foutse Khomh

Machine Learning (ML) is increasingly being adopted in different industries. Deep Reinforcement Learning (DRL) is a subdomain of ML used to produce intelligent agents. Despite recent developments in DRL technology, the main challenges that developers face in the development of DRL applications are still unknown. To fill this gap, in this paper, we conduct a large-scale empirical study of 927 DRL-related posts extracted from Stack Overflow, the most popular Q&A platform in the software community. Through the process of labeling and categorizing extracted posts, we created a taxonomy of common challenges encountered in the development of DRL applications, along with their corresponding popularity levels. This taxonomy has been validated through a survey involving 65 DRL developers. Results show that at least 45% of developers experienced 18 of the 21 challenges identified in the taxonomy. The most frequent sources of difficulty during the development of DRL applications are Comprehension, API usage, and Design problems, while Parallel processing and DRL libraries/frameworks are classified as the most difficult challenges to address, with respect to the time required to receive an accepted answer. We hope that the research community will leverage this taxonomy to develop efficient strategies to address the identified challenges and improve the quality of DRL applications.

2023-12-31

Empir. Softw. Eng. (published)

doi.org

arxiv.org

Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning

Tianyu Li

Doina Precup

Guillaume Rabusseau

In this paper, we present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks, which encompass a set of optimization techniques for high-order tensors used in quantum physics and numerical analysis. We first present an intrinsic relation between WFA and the tensor train decomposition, a particular form of tensor network. This relation allows us to exhibit a novel low rank structure of the Hankel matrix of a function computed by a WFA and to design an efficient spectral learning algorithm leveraging this structure to scale the algorithm up to very large Hankel matrices. We then unravel a fundamental connection between WFA and second-order recurrent neural networks (2-RNN): in the case of sequences of discrete symbols, WFA and 2-RNN with linear activation functions are expressively equivalent. Leveraging this equivalence result combined with the classical spectral learning algorithm for weighted automata, we introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors. This algorithm relies on estimating low rank sub-blocks of the Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performances of the proposed learning algorithm are assessed in a simulation study on both synthetic and real-world data.

2023-12-31

Mach. Learn. (published)

doi.org

arxiv.org
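
The equivalence claimed in the abstract of the entry above (WFA and linear 2-RNN over discrete symbols) can be illustrated with a small sketch. It is not the authors' code: the two-state automaton, its weights, and every function name below are hypothetical and chosen only for illustration. The sketch assumes the standard definitions, where a WFA computes a product of per-symbol transition matrices between an initial and a final vector, and a linear 2-RNN updates its hidden state by contracting a third-order tensor with the previous state and the current input.

```python
# Hypothetical two-state WFA over the alphabet {a, b}; all weights are made up.
ALPHA = [1.0, 0.0]   # initial weight vector
OMEGA = [0.0, 1.0]   # final weight vector
TRANSITIONS = {
    "a": [[0.5, 0.5], [0.0, 1.0]],
    "b": [[1.0, 0.0], [0.25, 0.75]],
}
SYMBOLS = sorted(TRANSITIONS)
N_STATES = len(ALPHA)


def wfa_value(word):
    """Compute alpha^T A_{x1} ... A_{xk} omega for a string of symbols."""
    state = ALPHA
    for symbol in word:
        matrix = TRANSITIONS[symbol]
        state = [
            sum(state[i] * matrix[i][j] for i in range(N_STATES))
            for j in range(N_STATES)
        ]
    return sum(s * w for s, w in zip(state, OMEGA))


# Stack the transition matrices into one tensor: TENSOR[i][s][j] = A_s[i][j].
TENSOR = [
    [[TRANSITIONS[sym][i][j] for j in range(N_STATES)] for sym in SYMBOLS]
    for i in range(N_STATES)
]


def one_hot(symbol):
    """Encode a symbol as a one-hot input vector."""
    return [1.0 if symbol == s else 0.0 for s in SYMBOLS]


def linear_2rnn_value(inputs):
    """Linear 2-RNN: h_t[j] = sum over i, s of TENSOR[i][s][j] * h_{t-1}[i] * x_t[s]."""
    hidden = ALPHA
    for x in inputs:
        hidden = [
            sum(
                TENSOR[i][s][j] * hidden[i] * x[s]
                for i in range(N_STATES)
                for s in range(len(x))
            )
            for j in range(N_STATES)
        ]
    return sum(h * w for h, w in zip(hidden, OMEGA))
```

With one-hot inputs, the tensor contraction selects exactly one transition matrix per step, so both functions return the same value on every string. Because `linear_2rnn_value` accepts arbitrary real-valued vectors, the same model also extends to continuous inputs, which is the setting the abstract's learning algorithm targets.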
