Publications

Implementing a Hierarchical Deep Learning Approach for Simulating Multilevel Auction Data
Marcelin Joanis
Igor Sadoune
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Oleksiy Ostapenko
Zhan Su
Edoardo Ponti
Matheus Pereira
Lucas Caccia
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such a library. We benchmark existing approaches to build this library and introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To reuse the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters for new inputs without the need for retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. We make steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
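A rough sketch of the model-based clustering (MBC) idea described in the abstract, not the authors' implementation: each task's adapter is flattened into a parameter vector and tasks are grouped by similarity with k-means. The adapter shapes, task names, and the choice of k-means are assumptions made purely for illustration.

```python
# Illustrative sketch of model-based clustering (MBC) of task adapters:
# tasks whose adapter parameters are similar are grouped together, and each
# cluster can then share a single adapter. Shapes and clustering method are
# assumptions, not the paper's exact procedure.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Pretend we trained one LoRA adapter (A, B matrices) per task.
tasks = [f"task_{i}" for i in range(8)]
adapters = {t: {"A": rng.normal(size=(16, 4)), "B": rng.normal(size=(4, 16))}
            for t in tasks}

# Represent each task by its flattened adapter parameters.
features = np.stack([np.concatenate([p.ravel() for p in adapters[t].values()])
                     for t in tasks])

# Cluster tasks by adapter-parameter similarity.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for task, cluster in zip(tasks, labels):
    print(f"{task} -> cluster {cluster}")
```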
Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans
Sébastien Quetin
Andrew Heschl
Mauricio Murillo
Rohit Murali
Farhad Maleki
Background and purpose: Deep Learning (DL) has been widely explored for Organs at Risk (OARs) segmentation; however, most studies have focused on a single modality, either CT or MRI, not both simultaneously. This study presents a high-performing DL pipeline for segmentation of 30 OARs from MRI and CT scans of Head and Neck (H&N) cancer patients. Materials and methods: Paired CT and MRI-T1 images from 42 H&N cancer patients, alongside annotations for 30 OARs from the H&N OAR CT&MR segmentation challenge dataset, were used to develop a segmentation pipeline. After cropping irrelevant regions, rigid followed by non-rigid registration of CT and MRI volumes was performed. Two versions of the CT volume, representing soft tissues and bone anatomy, were stacked with the MRI volume and used as input to an nnU-Net pipeline. Modality Dropout was used during training to force the model to learn from the different modalities. Segmentation masks were predicted with the trained model for an independent set of 14 new patients. The mean Dice Score (DS) and Hausdorff Distance (HD) were calculated for each OAR across these patients to evaluate the pipeline. Results: The pipeline achieved an overall mean DS and HD of 0.777 ± 0.118 and 3.455 ± 1.679, respectively, establishing the state-of-the-art (SOTA) for this challenge at the time of submission. Conclusion: The proposed pipeline achieved the best DS and HD among all participants of the H&N OAR CT and MR segmentation challenge and sets a new SOTA for automated segmentation of H&N OARs.
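The modality-dropout step mentioned above can be sketched as below; the channel layout (two CT-derived channels plus one MRI channel) and the dropout probability are assumptions for illustration, not the exact training code.

```python
# Illustrative sketch of modality dropout for a stacked CT+MRI input:
# during training, one modality is randomly zeroed out so the network
# cannot rely on any single modality. Channel layout and probability
# are assumed for this example.
import torch

def modality_dropout(x: torch.Tensor, p: float = 0.3) -> torch.Tensor:
    """x: (batch, 3, D, H, W) with channels [CT_soft_tissue, CT_bone, MRI]."""
    x = x.clone()
    for b in range(x.shape[0]):
        if torch.rand(1).item() < p:
            if torch.rand(1).item() < 0.5:
                x[b, :2] = 0.0   # drop the two CT-derived channels
            else:
                x[b, 2:] = 0.0   # drop the MRI channel
    return x

# Usage during training (random batch for illustration):
batch = torch.randn(2, 3, 32, 64, 64)
augmented = modality_dropout(batch)
```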
GFETM: Genome Foundation-based Embedded Topic Model for scATAC-seq Modeling
Yimin Fan
Adrien Osakwe
Yu Li
Supervised latent factor modeling isolates cell-type-specific transcriptomic modules that underlie Alzheimer’s disease progression
Liam Hodgson
Yasser Iturria-Medina
Jo Anne Stratton
Guy Wolf
Smita Krishnaswamy
David A. Bennett
Data Selection for Transfer Unlearning
Nazanin Mohammadi Sepahvand
Vincent Dumoulin
Eleni Triantafillou
Towards a framework selection for assessing the performance of photovoltaic solar power plants: criteria determination
Meryam Chafiq
Loubna Benabbou
Ismail Belhaj
Abdelali Djdiaa
Hicham Bouzekri
Abdelaziz Berrado
Mastery of Key Performance Indicators (KPIs) in the realm of photovoltaic solar power plants is pivotal for evaluating their effectiveness and fine-tuning their operational efficiency. The assessment of these plants' performance has consistently stood as a focal point in scientific research. Nevertheless, the investigation into the process of selecting a framework for classifying KPIs, particularly through their categorization based on criteria, sub-criteria, or aspects, has been relatively limited in research. This article addresses this gap by conducting a comprehensive literature review on various KPIs and, drawing upon both literature and practical experience, formulating a set of criteria to serve as the foundation for a Multi-Criteria Decision Analysis (MCDA) method. This intricate taxonomic framework enhances the understanding of infrastructure performance for stakeholders in the solar industry. By streamlining decision-making, it simplifies the selection of KPIs tailored to specific requirements, thus mitigating the complexity arising from the abundance of KPIs in the literature. As a result, decision-makers can make well-informed choices regarding the monitoring and evaluation framework that best suits the performance goals of their solar plant.
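As a toy illustration of how an MCDA-style ranking of KPIs against criteria could look, the sketch below uses a simple weighted sum; the criteria names, weights, candidate KPIs, and scores are all made up for illustration and do not come from the paper.

```python
# Toy weighted-sum MCDA sketch for ranking candidate PV plant KPIs against
# evaluation criteria. All names, weights, and scores are illustrative only.
criteria_weights = {"data_availability": 0.30, "interpretability": 0.25,
                    "sensitivity_to_faults": 0.25, "standardization": 0.20}

candidate_kpis = {
    "performance_ratio": {"data_availability": 0.90, "interpretability": 0.80,
                          "sensitivity_to_faults": 0.60, "standardization": 0.90},
    "specific_yield":    {"data_availability": 0.95, "interpretability": 0.90,
                          "sensitivity_to_faults": 0.40, "standardization": 0.80},
}

def mcda_score(scores: dict) -> float:
    """Weighted sum of criterion scores (weights sum to 1)."""
    return sum(criteria_weights[c] * scores[c] for c in criteria_weights)

ranked = sorted(candidate_kpis, key=lambda k: mcda_score(candidate_kpis[k]),
                reverse=True)
print(ranked)
```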
LLMs can learn self-restraint through iterative self-reflection
Alexandre Piché
Aristides Milios
Chris Pal
Unmasking Efficiency: Learning Salient Sparse Models in Non-IID Federated Learning
Riyasat Ohib
Bishal Thapaliya
Jingyu Liu
Vince D. Calhoun
Sergey M. Plis
In this work, we propose Salient Sparse Federated Learning (SSFL), a streamlined approach for sparse federated learning with efficient communication. SSFL identifies a sparse subnetwork prior to training, leveraging parameter saliency scores computed separately on local client data in non-IID scenarios, and then aggregated, to determine a global mask. Only the sparse model weights are communicated each round between the clients and the server. We validate SSFL's effectiveness using standard non-IID benchmarks, noting marked improvements in the sparsity-accuracy trade-offs. Finally, we deploy our method in a real-world federated learning framework and report improvement in communication time.
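A rough sketch of the saliency-based masking step described above; the saliency score (gradient-times-weight magnitude), the aggregation by averaging across clients, and the sparsity level are illustrative assumptions rather than the paper's exact formulation.

```python
# Rough sketch of building a global sparse mask from per-client saliency
# scores, in the spirit of saliency-guided sparse federated learning.
# The score |w * dL/dw|, mean aggregation, and 20% density are assumptions.
import torch
import torch.nn as nn

def client_saliency(model: nn.Module, data, target, loss_fn) -> dict:
    """Return |w * dL/dw| per parameter, computed on one client's local batch."""
    model.zero_grad()
    loss_fn(model(data), target).backward()
    return {n: (p * p.grad).abs().detach()
            for n, p in model.named_parameters() if p.grad is not None}

def global_mask(client_saliencies: list, density: float = 0.2) -> dict:
    """Average client saliencies and keep the top `density` fraction of weights."""
    names = client_saliencies[0].keys()
    avg = {n: torch.stack([s[n] for s in client_saliencies]).mean(0) for n in names}
    flat = torch.cat([v.flatten() for v in avg.values()])
    threshold = torch.quantile(flat, 1.0 - density)
    return {n: (v >= threshold).float() for n, v in avg.items()}
```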
Preface of UniReps: the First Workshop on Unifying Representations in Neural Models
Marco Fumero
Emanuele Rodolà
Clementine Domine
Francesco Locatello
Karolina Dziugaite
Mathilde Caron
Discover why, when and how distinct learning processes yield similar representations, and the degree to which these can be unified.
Protocol to perform integrative analysis of high-dimensional single-cell multimodal data using an interpretable deep learning technique
Manqi Zhou
Hao Zhang
Zilong Bai
Dylan Mann-Krzisnik
Fei Wang
What Mechanisms Does Knowledge Distillation Distill?
Cindy Wu
Ekdeep Singh Lubana
Bruno Mlodozeniec
Robert Kirk
Knowledge distillation is a commonly-used compression method in ML due to the popularity of increasingly large-scale models, but it is unclear if all the information a teacher model contains is distilled into the smaller student model. We aim to formalize the concept of ‘knowledge’ to investigate how knowledge is transferred during distillation, focusing on shared invariant outputs to counterfactual changes of dataset latent variables (we call these latents mechanisms). We define a student model to be a good stand-in model for a teacher if it shares the teacher’s learned mechanisms, and find that Jacobian matching and contrastive representation learning are viable methods by which to train such models. While these methods do not result in perfect transfer of mechanisms, we show they often improve student fidelity or mitigate simplicity bias (as measured by the teacher-to-student KL divergence and accuracy on various out-of-distribution test datasets), especially on datasets with spurious statistical correlations.
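Jacobian matching, one of the training methods mentioned in the abstract, can be sketched as a distillation loss that also aligns the teacher's and student's input-gradients; the networks, temperature, weighting, and the use of the top-logit gradient below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of a Jacobian-matching distillation loss: in addition to
# matching softened outputs, the student is encouraged to match the
# teacher's gradients with respect to the input. Architectures, temperature,
# and the alpha weighting are assumptions for illustration.
import torch
import torch.nn.functional as F

def jacobian_matching_loss(student, teacher, x, alpha=1.0, T=2.0):
    x = x.clone().requires_grad_(True)
    s_logits, t_logits = student(x), teacher(x)

    # Standard distillation term on temperature-softened outputs.
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1).detach(),
                  reduction="batchmean") * T * T

    # Match input-gradients of the top-class logits (a simple Jacobian proxy).
    s_grad = torch.autograd.grad(s_logits.max(dim=1).values.sum(), x,
                                 create_graph=True)[0]
    t_grad = torch.autograd.grad(t_logits.max(dim=1).values.sum(), x)[0]
    jac = F.mse_loss(s_grad, t_grad.detach())
    return kd + alpha * jac

# Usage with tiny stand-in networks (hypothetical shapes):
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
teacher = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
loss = jacobian_matching_loss(student, teacher, torch.randn(4, 1, 28, 28))
loss.backward()
```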