Publications

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs

Mehran Shakerinava

2025-05-17

ArXiv (prépublication)

FedWeight: mitigating covariate shift of federated learning on electronic health records data through patients re-weighting

Mike He Zhu

Jun Bai

Na Li

Xiaoxiao Li

Dianbo Liu

David Buckeridge

Yue Li

2025-05-17

NPJ Digital Medicine (publié)

doi.org

Search-Based Correction of Reasoning Chains for Language Models

Minsu Kim

Jean-Pierre R. Falet

Oliver E. Richardson

Xiaoyin Chen

Moksh J. Jain

Sungjin Ahn

Sungsoo Ahn

Yoshua Bengio

2025-05-17

ArXiv (prépublication)

arxiv.org

Search-Based Correction of Reasoning Chains for Language Models

Minsu Kim

Jean-Pierre R. Falet

Oliver E. Richardson

Xiaoyin Chen

Moksh J. Jain

Sungjin Ahn

Sungsoo Ahn

Yoshua Bengio

2025-05-17

ArXiv (prépublication)

arxiv.org

A multi-ancestry genetic reference for the Quebec population

Peyton McClelland

Georgette Femerling

R. Laflamme

Alejandro Mejia-Garcia

Mohadese Sayahian Dehkordi

Hongyu Xiao

Alex Diaz-Papkovich

Justin Pelletier

Jean-Christophe Grenier

Ken Sin Lo

Luke Anderson-Trocmé

Justin Bellavance

Vincent Chapdelaine

Geneviève Gagnon

Annelie De Mori

Gerardo Martinez

Kristen Mohler

Thibault de Malliard

Catherine Labbé

Marjorie Labrecque … (voir 14 de plus)

Alexandre Montpetit

D. Spiegelman

Guy A. Rouleau

Jean-francois Théroux

Hufeng Zhou

Simon L. Girard

Julie Hussin

Anne-Marie Laberge

C. Bhérer

Martine Tétreault

Sarah A. Gagliano Taliun

Daniel Taliun

Simon Gravel

Guillaume Lettre

While international efforts have characterized genetic variation in millions of individuals, the interplay of environmental, social, cultura… (voir plus)l, and genetic factors is poorly understood for most worldwide populations. The province of Quebec in Canada has been the site of numerous genetic studies, often focusing on individual Mendelian diseases in founder sub-populations. Here, we profiled and analyzed genome-wide genotyped variation in 29,337 Quebec residents from the large population-based cohort CARTaGENE (CaG), including rich phenotype and environmental data. We also sequenced the whole-genome of 2,173 CaG participants, including 163 and 132 individuals with grandparents born in Haiti and Morocco, respectively. We use this genetic information to gain insight into Quebec's demography and to help interpret the potential significance of variants identified in clinically important genes. We built an imputation panel by phasing the CaG whole-genome sequence data and showed, using genome-wide association studies (GWAS), how it improves the discovery of phenotype-genotype associations in this population. We provide allele frequency information and GWAS results through dedicated and publicly available websites. The genetic data, paired with phenotypic and environmental information, is also available for research use upon scientific and ethical review.

2025-05-16

medRxiv (prépublication)

doi.org

A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment

Jean-Philippe Corbeil

Amin Dada

Jean-Michel Attendu

Asma Ben Abacha

Alessandro Sordoni

Lucas Caccia

Franccois Beaulieu

Thomas Lin

Jens Kleesiek

Paul Vozila

2025-05-15

ArXiv (prépublication)

arxiv.org

A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment

Jean-Philippe Corbeil

Amin Dada

Jean-Michel Attendu

Asma Ben Abacha

Alessandro Sordoni

Lucas Caccia

Franccois Beaulieu

Thomas Lin

Jens Kleesiek

Paul Vozila

High computation costs and latency of large language models such as GPT-4 have limited their deployment in clinical settings. Small language… (voir plus) models (SLMs) offer a cost-effective alternative, but their limited capacity requires biomedical domain adaptation, which remains challenging. An additional bottleneck is the unavailability and high sensitivity of clinical data. To address these challenges, we propose a novel framework for adapting SLMs into high-performing clinical models. We introduce the MediPhi collection of 3.8B-parameter SLMs developed with our novel framework: pre-instruction tuning of experts on relevant medical and clinical corpora (PMC, Medical Guideline, MedWiki, etc.), model merging, and clinical-tasks alignment. To cover most clinical tasks, we extended the CLUE benchmark to CLUE+, doubling its size. Our expert models deliver relative improvements on this benchmark over the base model without any task-specific fine-tuning: 64.3% on medical entities, 49.5% on radiology reports, and 44% on ICD-10 coding (outperforming GPT-4-0125 by 14%). We unify the expert models into MediPhi via model merging, preserving gains across benchmarks. Furthermore, we built the MediFlow collection, a synthetic dataset of 2.5 million high-quality instructions on 14 medical NLP tasks, 98 fine-grained document types, and JSON format support. Alignment of MediPhi using supervised fine-tuning and direct preference optimization achieves further gains of 18.9% on average.

2025-05-15

ArXiv (prépublication)

arxiv.org

Persistent signs of poisoning after massive drug ingestion: move the ultrasound probe to the stomach.

N. Lautrou-cabasson

H. Pirollet

C. Lombois

Guillaume Dumas

2025-05-15

Intensive Care Medicine (publié)

doi.org

Plasticity as the Mirror of Empowerment

David Abel

Michael Bowling

Andre Barreto

Will Dabney

Shi Dong

Steven Hansen

Anna Harutyunyan

Khimya Khetarpal

Clare Lyle

Razvan Pascanu

Georgios Piliouras

Doina Precup

Jonathan Richens

Mark Rowland

Tom Schaul

Satinder Singh

2025-05-15

ArXiv (prépublication)

arxiv.org

Plasticity as the Mirror of Empowerment

David Abel

Michael Bowling

Andre Barreto

Will Dabney

Shi Dong

Steven Hansen

Anna Harutyunyan

Khimya Khetarpal

Clare Lyle

Razvan Pascanu

Georgios Piliouras

Doina Precup

Jonathan Richens

Mark Rowland

Tom Schaul

Satinder Singh

2025-05-15

ArXiv (prépublication)

arxiv.org

Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?

Anthony GX-Chen

Dongyan Lin

Mandana Samiei

Doina Precup

Blake Richards

Rob Fergus

Kenneth Marino

2025-05-14

ArXiv (prépublication)

arxiv.org

Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?

Anthony GX-Chen

Dongyan Lin

Mandana Samiei

Doina Precup

Blake Richards

Rob Fergus

Kenneth Marino

Language model (LM) agents are increasingly used as autonomous decision-makers who need to actively gather information to guide their decisi… (voir plus)ons. A crucial cognitive skill for such agents is the efficient exploration and understanding of the causal structure of the world -- key to robust, scientifically grounded reasoning. Yet, it remains unclear whether LMs possess this capability or exhibit systematic biases leading to erroneous conclusions. In this work, we examine LMs' ability to explore and infer causal relationships, using the well-established"Blicket Test"paradigm from developmental psychology. We find that LMs reliably infer the common, intuitive disjunctive causal relationships but systematically struggle with the unusual, yet equally (or sometimes even more) evidenced conjunctive ones. This"disjunctive bias"persists across model families, sizes, and prompting strategies, and performance further declines as task complexity increases. Interestingly, an analogous bias appears in human adults, suggesting that LMs may have inherited deep-seated reasoning heuristics from their training data. To this end, we quantify similarities between LMs and humans, finding that LMs exhibit adult-like inference profiles (but not children-like). Finally, we propose a test-time sampling method which explicitly samples and eliminates hypotheses about causal relationships from the LM. This scalable approach significantly reduces the disjunctive bias and moves LMs closer to the goal of scientific, causally rigorous reasoning.

2025-05-14

ArXiv (prépublication)

arxiv.org

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Publications

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Mots-clés populaires:

Publications