Publications

Implementing a Hierarchical Deep Learning Approach for Simulating Multilevel Auction Data
Marcelin Joanis
Igor Sadoune
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Oleksiy Ostapenko
Zhan Su
Edoardo Ponti
Matheus Pereira
Lucas Caccia
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such a library. We benchmark existing approaches to build this library and introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To reuse the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters for new inputs without the need for retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. We make steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
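A rough sketch of the model-based clustering (MBC) idea described in the abstract, not the authors' implementation: each task's adapter is flattened into a parameter vector and tasks are grouped by similarity with k-means. The adapter shapes, task names, and the choice of k-means are assumptions made purely for illustration.

```python
# Illustrative sketch of model-based clustering (MBC) of task adapters:
# tasks whose adapter parameters are similar are grouped together, and each
# cluster can then share a single adapter. Shapes and clustering method are
# assumptions, not the paper's exact procedure.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Pretend we trained one LoRA adapter (A, B matrices) per task.
tasks = [f"task_{i}" for i in range(8)]
adapters = {t: {"A": rng.normal(size=(16, 4)), "B": rng.normal(size=(4, 16))}
            for t in tasks}

# Represent each task by its flattened adapter parameters.
features = np.stack([np.concatenate([p.ravel() for p in adapters[t].values()])
                     for t in tasks])

# Cluster tasks by adapter-parameter similarity.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for task, cluster in zip(tasks, labels):
    print(f"{task} -> cluster {cluster}")
```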
Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans
Sébastien Quetin
Andrew Heschl
Mauricio Murillo
Rohit Murali
Farhad Maleki
Background and purpose: Deep Learning (DL) has been widely explored for Organs at Risk (OARs) segmentation; however, most studies have focused on a single modality, either CT or MRI, not both simultaneously. This study presents a high-performing DL pipeline for segmentation of 30 OARs from MRI and CT scans of Head and Neck (H&N) cancer patients. Materials and methods: Paired CT and MRI-T1 images from 42 H&N cancer patients, alongside annotations for 30 OARs from the H&N OAR CT&MR segmentation challenge dataset, were used to develop a segmentation pipeline. After cropping irrelevant regions, rigid followed by non-rigid registration of CT and MRI volumes was performed. Two versions of the CT volume, representing soft tissues and bone anatomy, were stacked with the MRI volume and used as input to an nnU-Net pipeline. Modality Dropout was used during training to force the model to learn from the different modalities. Segmentation masks were predicted with the trained model for an independent set of 14 new patients. The mean Dice Score (DS) and Hausdorff Distance (HD) were calculated for each OAR across these patients to evaluate the pipeline. Results: The pipeline achieved an overall mean DS and HD of 0.777 ± 0.118 and 3.455 ± 1.679, respectively, establishing the state-of-the-art (SOTA) for this challenge at the time of submission. Conclusion: The proposed pipeline achieved the best DS and HD among all participants of the H&N OAR CT and MR segmentation challenge and sets a new SOTA for automated segmentation of H&N OARs.
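The modality-dropout step mentioned above can be sketched as below; the channel layout (two CT-derived channels plus one MRI channel) and the dropout probability are assumptions for illustration, not the exact training code.

```python
# Illustrative sketch of modality dropout for a stacked CT+MRI input:
# during training, one modality is randomly zeroed out so the network
# cannot rely on any single modality. Channel layout and probability
# are assumed for this example.
import torch

def modality_dropout(x: torch.Tensor, p: float = 0.3) -> torch.Tensor:
    """x: (batch, 3, D, H, W) with channels [CT_soft_tissue, CT_bone, MRI]."""
    x = x.clone()
    for b in range(x.shape[0]):
        if torch.rand(1).item() < p:
            if torch.rand(1).item() < 0.5:
                x[b, :2] = 0.0   # drop the two CT-derived channels
            else:
                x[b, 2:] = 0.0   # drop the MRI channel
    return x

# Usage during training (random batch for illustration):
batch = torch.randn(2, 3, 32, 64, 64)
augmented = modality_dropout(batch)
```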
GFETM: Genome Foundation-based Embedded Topic Model for scATAC-seq Modeling
Yimin Fan
Adrien Osakwe
Yu Li
Supervised latent factor modeling isolates cell-type-specific transcriptomic modules that underlie Alzheimer’s disease progression
Liam Hodgson
Yasser Iturria-Medina
Jo Anne Stratton
Guy Wolf
Smita Krishnaswamy
David A. Bennett
Data Selection for Transfer Unlearning
Nazanin Mohammadi Sepahvand
Vincent Dumoulin
Eleni Triantafillou
Towards a framework selection for assessing the performance of photovoltaic solar power plants: criteria determination
Meryam Chafiq
Loubna Benabbou
Ismail Belhaj
Abdelali Djdiaa
Hicham Bouzekri
Abdelaziz Berrado
Mastery of Key Performance Indicators (KPIs) in the realm of photovoltaic solar power plants is pivotal for evaluating their effectiveness and fine-tuning their operational efficiency. The assessment of these plants' performance has consistently stood as a focal point in scientific research. Nevertheless, the investigation into the process of selecting a framework for classifying KPIs, particularly through their categorization based on criteria, sub-criteria, or aspects, has been relatively limited in research. This article addresses this gap by conducting a comprehensive literature review on various KPIs and, drawing upon both literature and practical experience, formulating a set of criteria to serve as the foundation for a Multi-Criteria Decision Analysis (MCDA) method. This intricate taxonomic framework enhances the understanding of infrastructure performance for stakeholders in the solar industry. By streamlining decision-making, it simplifies the selection of KPIs tailored to specific requirements, thus mitigating the complexity arising from the abundance of KPIs in the literature. As a result, decision-makers can make well-informed choices regarding the monitoring and evaluation framework that best suits the performance goals of their solar plant.
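As a toy illustration of how an MCDA-style ranking of KPIs against criteria could look, the sketch below uses a simple weighted sum; the criteria names, weights, candidate KPIs, and scores are all made up for illustration and do not come from the paper.

```python
# Toy weighted-sum MCDA sketch for ranking candidate PV plant KPIs against
# evaluation criteria. All names, weights, and scores are illustrative only.
criteria_weights = {"data_availability": 0.30, "interpretability": 0.25,
                    "sensitivity_to_faults": 0.25, "standardization": 0.20}

candidate_kpis = {
    "performance_ratio": {"data_availability": 0.90, "interpretability": 0.80,
                          "sensitivity_to_faults": 0.60, "standardization": 0.90},
    "specific_yield":    {"data_availability": 0.95, "interpretability": 0.90,
                          "sensitivity_to_faults": 0.40, "standardization": 0.80},
}

def mcda_score(scores: dict) -> float:
    """Weighted sum of criterion scores (weights sum to 1)."""
    return sum(criteria_weights[c] * scores[c] for c in criteria_weights)

ranked = sorted(candidate_kpis, key=lambda k: mcda_score(candidate_kpis[k]),
                reverse=True)
print(ranked)
```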
LLMs can learn self-restraint through iterative self-reflection
Alexandre Piché
Aristides Milios
Chris Pal
Unmasking Efficiency: Learning Salient Sparse Models in Non-IID Federated Learning
Riyasat Ohib
Bishal Thapaliya
Jingyu Liu
Vince D. Calhoun
Sergey M. Plis
In this work, we propose Salient Sparse Federated Learning (SSFL), a streamlined approach for sparse federated learning with efficient communication. SSFL identifies a sparse subnetwork prior to training, leveraging parameter saliency scores computed separately on local client data in non-IID scenarios, and then aggregated, to determine a global mask. Only the sparse model weights are communicated each round between the clients and the server. We validate SSFL's effectiveness using standard non-IID benchmarks, noting marked improvements in the sparsity-accuracy trade-offs. Finally, we deploy our method in a real-world federated learning framework and report improvement in communication time.
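A rough sketch of the saliency-based masking step described above; the saliency score (gradient-times-weight magnitude), the aggregation by averaging across clients, and the sparsity level are illustrative assumptions rather than the paper's exact formulation.

```python
# Rough sketch of building a global sparse mask from per-client saliency
# scores, in the spirit of saliency-guided sparse federated learning.
# The score |w * dL/dw|, mean aggregation, and 20% density are assumptions.
import torch
import torch.nn as nn

def client_saliency(model: nn.Module, data, target, loss_fn) -> dict:
    """Return |w * dL/dw| per parameter, computed on one client's local batch."""
    model.zero_grad()
    loss_fn(model(data), target).backward()
    return {n: (p * p.grad).abs().detach()
            for n, p in model.named_parameters() if p.grad is not None}

def global_mask(client_saliencies: list, density: float = 0.2) -> dict:
    """Average client saliencies and keep the top `density` fraction of weights."""
    names = client_saliencies[0].keys()
    avg = {n: torch.stack([s[n] for s in client_saliencies]).mean(0) for n in names}
    flat = torch.cat([v.flatten() for v in avg.values()])
    threshold = torch.quantile(flat, 1.0 - density)
    return {n: (v >= threshold).float() for n, v in avg.items()}
```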
Preface of UniReps: the First Workshop on Unifying Representations in Neural Models
Marco Fumero
Emanuele Rodolà
Clementine Domine
Francesco Locatello
Karolina Dziugaite
Mathilde Caron
Discover why, when and how distinct learning processes yield similar representations, and the degree to which these can be unified.
Protocol to perform integrative analysis of high-dimensional single-cell multimodal data using an interpretable deep learning technique
Manqi Zhou
Hao Zhang
Zilong Bai
Dylan Mann-Krzisnik
Fei Wang
What Mechanisms Does Knowledge Distillation Distill?
Cindy Wu
Ekdeep Singh Lubana
Bruno Mlodozeniec
Robert Kirk
Knowledge distillation is a commonly-used compression method in ML due to the popularity of increasingly large-scale models, but it is unclear if all the information a teacher model contains is distilled into the smaller student model. We aim to formalize the concept of ‘knowledge’ to investigate how knowledge is transferred during distillation, focusing on shared invariant outputs to counterfactual changes of dataset latent variables (we call these latents mechanisms). We define a student model to be a good stand-in model for a teacher if it shares the teacher’s learned mechanisms, and find that Jacobian matching and contrastive representation learning are viable methods by which to train such models. While these methods do not result in perfect transfer of mechanisms, we show they often improve student fidelity or mitigate simplicity bias (as measured by the teacher-to-student KL divergence and accuracy on various out-of-distribution test datasets), especially on datasets with spurious statistical correlations.
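Jacobian matching, one of the training methods mentioned in the abstract, can be sketched as a distillation loss that also aligns the teacher's and student's input-gradients; the networks, temperature, weighting, and the use of the top-logit gradient below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of a Jacobian-matching distillation loss: in addition to
# matching softened outputs, the student is encouraged to match the
# teacher's gradients with respect to the input. Architectures, temperature,
# and the alpha weighting are assumptions for illustration.
import torch
import torch.nn.functional as F

def jacobian_matching_loss(student, teacher, x, alpha=1.0, T=2.0):
    x = x.clone().requires_grad_(True)
    s_logits, t_logits = student(x), teacher(x)

    # Standard distillation term on temperature-softened outputs.
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1).detach(),
                  reduction="batchmean") * T * T

    # Match input-gradients of the top-class logits (a simple Jacobian proxy).
    s_grad = torch.autograd.grad(s_logits.max(dim=1).values.sum(), x,
                                 create_graph=True)[0]
    t_grad = torch.autograd.grad(t_logits.max(dim=1).values.sum(), x)[0]
    jac = F.mse_loss(s_grad, t_grad.detach())
    return kd + alpha * jac

# Usage with tiny stand-in networks (hypothetical shapes):
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
teacher = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
loss = jacobian_matching_loss(student, teacher, torch.randn(4, 1, 28, 28))
loss.backward()
```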