Publications

Towards Modular LLMs by Building and Reusing a Library of LoRAs

Oleksiy Ostapenko

Zhan Su

Edoardo Ponti

Laurent Charlin

Nicolas Le Roux

Matheus Pereira

Lucas Caccia

Alessandro Sordoni

The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trai… (voir plus)ned adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such library. We benchmark existing approaches to build this library and introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To re-use the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters for new inputs without the need for retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. We make steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.

2024-05-18

ArXiv (prépublication)

arxiv.org

Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans

Sébastien Quetin

Andrew Heschl

Mauricio Murillo

Rohit Murali

Shirin A. Enger

Farhad Maleki

Background and purpose: Deep Learning (DL) has been widely explored for Organs at Risk (OARs) segmentation; however, most studies have focus… (voir plus)ed on a single modality, either CT or MRI, not both simultaneously. This study presents a high-performing DL pipeline for segmentation of 30 OARs from MRI and CT scans of Head and Neck (H&N) cancer patients. Materials and methods: Paired CT and MRI-T1 images from 42 H&N cancer patients alongside annotation for 30 OARs from the H&N OAR CT&MR segmentation challenge dataset were used to develop a segmentation pipeline. After cropping irrelevant regions, rigid followed by non-rigid registration of CT and MRI volumes was performed. Two versions of the CT volume, representing soft tissues and bone anatomy, were stacked with the MRI volume and used as input to an nnU-Net pipeline. Modality Dropout was used during the training to force the model to learn from the different modalities. Segmentation masks were predicted with the trained model for an independent set of 14 new patients. The mean Dice Score (DS) and Hausdorff Distance (HD) were calculated for each OAR across these patients to evaluate the pipeline. Results: This resulted in an overall mean DS and HD of 0.777 +- 0.118 and 3.455 +- 1.679, respectively, establishing the state-of-the-art (SOTA) for this challenge at the time of submission. Conclusion: The proposed pipeline achieved the best DS and HD among all participants of the H&N OAR CT and MR segmentation challenge and sets a new SOTA for automated segmentation of H&N OARs.

2024-05-17

ArXiv (prépublication)

doi.org

arxiv.org

GFETM: Genome Foundation-based Embedded Topic Model for scATAC-seq Modeling

Yimin Fan

Adrien Osakwe

Shi Han

Yu Li

Jun Ding

Yue Li

2024-05-17

Lecture Notes in Computer Science (publié)

doi.org

Inferring Metabolic States from Single Cell Transcriptomic Data via Geometric Deep Learning

Holly Steach

Siddharth Viswanath

Yixuan He

Xitong Zhang

Natalia Ivanova

Matthew Hirn

Michael Perlmutter

Smita Krishnaswamy

2024-05-17

Lecture Notes in Computer Science (publié)

doi.org

Supervised latent factor modeling isolates cell-type-specific transcriptomic modules that underlie Alzheimer’s disease progression

Liam Hodgson

Yue Li

Yasser Iturria-Medina

Jo Anne Stratton

Guy Wolf

Smita Krishnaswamy

David A. Bennett

Danilo Bzdok

2024-05-17

Communications Biology (publié)

doi.org

Cognitive, interpersonal, and intrapersonal deeper learning domains: A systematic review of computational thinking

Hao-Yue Jin

Maria Cutumisu

2024-05-16

Education and Information Technologies : Official Journal of the IFIP technical committee on Education (publié)

doi.org

Data Selection for Transfer Unlearning

Nazanin Mohammadi Sepahvand

Vincent Dumoulin

Eleni Triantafillou

Gintare Karolina Dziugaite

2024-05-16

ArXiv (prépublication)

doi.org

arxiv.org

Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering

Vaibhav Adlakha

Parishad BehnamGhader

Xing Han Lu

Nicholas Meade

Siva Reddy

Retriever-augmented instruction-following models are attractive alternatives to fine-tuned approaches for information-seeking tasks such as … (voir plus)question answering (QA). By simply prepending retrieved documents in its input along with an instruction, these models can be adapted to various information domains and tasks without additional fine-tuning. While the model responses tend to be natural and fluent, the additional verbosity makes traditional QA evaluation metrics such as exact match (EM) and F1 unreliable for accurately quantifying model performance. In this work, we investigate the performance of instruction-following models across three information-seeking QA tasks. We use both automatic and human evaluation to evaluate these models along two dimensions: 1) how well they satisfy the user's information need (correctness), and 2) whether they produce a response based on the provided knowledge (faithfulness). Guided by human evaluation and analysis, we highlight the shortcomings of traditional metrics for both correctness and faithfulness. We then propose simple token-overlap based and model-based metrics that reflect the true performance of these models. Our analysis reveals that instruction-following models are competitive, and sometimes even outperform fine-tuned models for correctness. However, these models struggle to stick to the provided knowledge and often hallucinate in their responses. We hope our work encourages a more holistic evaluation of instruction-following models for QA. Our code and data is available at https://github.com/McGill-NLP/instruct-qa

2024-05-16

Transactions of the Association for Computational Linguistics (publié)

doi.org

arxiv.org

Sustained IFN signaling is associated with delayed development of SARS-CoV-2-specific immunity

Elsa Brunet-Ratnasingham

Sacha Morin

Haley E. Randolph

Marjorie Labrecque

Justin Bélair

Raphaël Lima-Barbosa

Amélie Pagliuzza

Lorie Marchitto

Michael Hultström

Julia Niessl

Rose Cloutier

Alina M. Sreng Flores

Nathalie Brassard

Mehdi Benlarbi

Jérémie Prévost

Shilei Ding

Sai Priya Anand

Gérémy Sannier

Amanda Marks

Dick Wågsäter … (voir 27 de plus)

Eric Bareke

Hugo Zeberg

Miklos Lipcsey

Robert Frithiof

Anders Larsson

Sirui Zhou

Tomoko Nakanishi

David R. Morrison

Dani Vezina

Catherine Bourassa

Gabrielle Gendron-Lepage

Halima Medjahed

Floriane Point

Jonathan Richard

Catherine Larochelle

Alexandre Prat

Janet L. Cunningham

Nathalie Arbour

Madeleine Durand

J. Brent Richards

Kevin R. Moon

Nicolas Chomont

Andrés Finzi

Martine Tétreault

Luis Barreiro

Guy Wolf

Daniel E. Kaufmann

Plasma RNAemia, delayed antibody responses and inflammation predict COVID-19 outcomes, but the mechanisms underlying these immunovirological… (voir plus) patterns are poorly understood. We profile 782 longitudinal plasma samples from 318 hospitalized COVID-19 patients. Integrated analysis using k-means reveal four patient clusters in a discovery cohort: mechanically ventilated critically-ill cases are subdivided into good prognosis and high-fatality clusters (reproduced in a validation cohort), while non-critical survivors are delineated by high and low antibody responses. Only the high-fatality cluster is enriched for transcriptomic signatures associated with COVID-19 severity, and each cluster has distinct RBD-specific antibody elicitation kinetics. Both critical and non-critical clusters with delayed antibody responses exhibit sustained IFN signatures, which negatively correlate with contemporaneous RBD-specific IgG levels and absolute SARS-CoV-2-specific B and CD4+ T cell frequencies. These data suggest that the Interferon paradox previously described in murine LCMV models is operative in COVID-19, with excessive IFN signaling delaying development of adaptive virus-specific immunity.

2024-05-16

Nature Communications (publié)

doi.org

Towards a framework selection for assessing the performance of photovoltaic solar power plants: criteria determination

Meryam Chafiq

Loubna Benabbou

Hanane Dagdougui

Ismail Belhaj

Abdelali Djdiaa

Hicham Bouzekri

Abdelaziz Berrado

Mastery of Key Performance Indicators (KPIs) in the realm of photovoltaic solar power plants is pivotal for evaluating their effectiveness a… (voir plus)nd fine-tuning their operational efficiency. The assessment of these plants' performance has con-sistently stood as a focal point in scientific research. Nevertheless, the investigation into the process of selecting a framework for classifying KPIs, particularly through their categorization based on criteria, sub-criteria, or aspects, has been relatively limited in research. This article addresses this gap by conducting a comprehensive literature review on various KPIs and, drawing upon both literature and practical experience, formulating a set of criteria to serve as the foundation for a Multi-Criteria Decision Analysis (MCDA) method. This intricate taxonomic framework enhances the understanding of infrastructure performance for stakeholders in the solar industry. By streamlining decision-making, it simplifies the selection of KPIs tailored to specific requirements, thus mitigating the complexity arising from the abundance of KPIs in the literature. As a result, decision-makers can make well-informed choices regarding the monitoring and evaluation framework that best suits the performance goals of their solar plant.

2024-05-16

2024 4th International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET) (publié)

doi.org

LLMs can learn self-restraint through iterative self-reflection

Alexandre Piché

Aristides Milios

Dzmitry Bahdanau

Chris Pal

2024-05-15

ArXiv (prépublication)

doi.org

arxiv.org

LLMs can learn self-restraint through iterative self-reflection

Alexandre Piché

Aristides Milios

Dzmitry Bahdanau

Chris Pal

In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their level of … (voir plus)knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-restraint, is non-trivial to teach since it depends on the internal knowledge of an LLM. By default, LLMs are trained to maximize the next token likelihood, which does not teach the model to modulate its answer based on its level of uncertainty. In order to learn self-restraint, we devise a utility function that can encourage the model to produce responses only when it is confident in them. This utility function can be used to score generation of different length and abstention. To optimize this function, we introduce ReSearch, a process of"self-reflection"consisting of iterative self-prompting and self-evaluation. We use the ReSearch algorithm to generate synthetic data on which we finetune our models. Compared to their original versions, our resulting models generate fewer \emph{hallucinations} overall at no additional inference cost, for both known and unknown topics, as the model learns to selectively restrain itself. In addition, our method elegantly incorporates the ability to abstain by augmenting the samples generated by the model during the search procedure with an answer expressing abstention.

2024-05-15

ArXiv (prépublication)

doi.org

arxiv.org

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Publications

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Mots-clés populaires:

Publications