Publications

A density estimation perspective on learning from pairwise human preferences

Vincent Dumoulin

Daniel D. Johnson

Yann Dauphin

Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research. Most recent works frame it as a reinforcement learning problem, where a reward function is learned from pairwise preference data and the LLM is treated as a policy which is adapted to maximize the rewards, often under additional regularization constraints. We propose an alternative interpretation which centers on the generative process for pairwise preferences and treats LHF as a density estimation problem. We provide theoretical and empirical results showing that for a family of generative processes defined via preference behavior distribution equations, training a reward function on pairwise preferences effectively models an annotator's implicit preference distribution. Finally, we discuss and present findings on "annotator misspecification" -- failure cases where wrong modeling assumptions are made about annotator behavior, resulting in poorly-adapted models -- suggesting that approaches that learn from pairwise human preferences could have trouble learning from a population of annotators with diverse viewpoints.

2024-02-27

TMLR (accepted)

doi.org

openreview.net

A Neural-Evolutionary Algorithm for Autonomous Transit Network Design

Andrew Holliday

Gregory Dudek

2024-02-27

ArXiv (preprint)

doi.org

arxiv.org

RAMEN Unveils Clinical Variable Networks for COVID-19 Severity and Long COVID Using Absorbing Random Walks and Genetic Algorithms

Yiwei Xiong

Jingtao Wang

Xiaoxiao Shang

Tingting Chen

Douglas D. Fraser

Gregory Fonseca

Simon Rousseau

Jun Ding

The COVID-19 pandemic has significantly altered global socioeconomic structures and individual lives. Understanding the disease mechanisms and facilitating diagnosis requires comprehending the complex interplay among clinical factors like demographics, symptoms, comorbidities, treatments, lab results, complications, and other metrics, and their relation to outcomes such as disease severity and long-term outcomes (e.g., post-COVID-19 condition/long COVID). Conventional correlational methods struggle with indirect and directional connections among these factors, while standard graphical methods like Bayesian networks are computationally demanding for extensive clinical variables. In response, we introduced RAMEN, a methodology that integrates Genetic Algorithms with random walks for efficient Bayesian network inference, designed to map the intricate relationships among clinical variables. Applying RAMEN to the Biobanque québécoise de la COVID-19 (BQC19) dataset, we identified critical markers for long COVID and varying disease severity. The Bayesian Network, corroborated by existing literature and supported through multi-omics analyses, highlights significant clinical variables linked to COVID-19 outcomes. RAMEN’s ability to accurately map these connections contributes substantially to developing early and effective diagnostics for severe COVID-19 and long COVID.

2024-02-27

bioRxiv (preprint)

doi.org

On the Societal Impact of Open Foundation Models

Sayash Kapoor

Rishi Bommasani

Kevin Klyman

Shayne Longpre

Ashwin Ramaswami

Peter Cihon

Aspen Hopkins

Kevin Bankston

Stella Biderman

Miranda Bogen

Rumman Chowdhury

Alex Engler

Peter Henderson

Yacine Jernite

Seth Lazar

Stefano Maffulli

Alondra Nelson

Joelle Pineau

Aviya Skowron

Dawn Song

Victor Storchan

Daniel Zhang

Daniel E. Ho

Percy Liang

Arvind Narayanan

2024-02-27

ArXiv (preprint)

doi.org

arxiv.org

Effective Latent Differential Equation Models via Attention and Multiple Shooting

Germán Abrevaya

Mahta Ramezanian-Panahi

Jean-Christophe Gagnon-Audet

Pablo Polosecki

Irina Rish

Silvina Ponce Dawson

Guillermo Cecchi

Guillaume Dumas

2024-02-26

TMLR (accepted)

openreview.net

SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning

Luca Zampierin

Ghouthi Boukli Hacene

Bac Nguyen

Mirco Ravanelli

2024-02-26

ArXiv (preprint)

doi.org

arxiv.org

Correction to: Multi-agent reinforcement learning for fast-timescale demand response of residential loads

Vincent Mai

Philippe Maisonneuve

Tianyu Zhang

Hadi Nekoei

Liam Paull

Antoine Lesage-Landry

2024-02-23

Machine-mediated learning (published)

doi.org

Intra-Host Evolution Analyses in an Immunosuppressed Patient Supports SARS-CoV-2 Viral Reservoir Hypothesis

Dominique Fournelle

Fatima Mostefai

Elsa Brunet-Ratnasingham

Raphael Poujol

Jean-Christophe Grenier

José Héctor Gálvez

Amélie Pagliuzza

Inès Levade

Sandrine Moreira

Mehdi Benlarbi

Guillaume Beaudoin-Bussières

Gabrielle Gendron-Lepage

Catherine Bourassa

Alexandra Tauzin

Simon Grandjean Lapierre

Nicolas Chomont

Andrés Finzi

Daniel E. Kaufmann

Morgan Craig

Julie Hussin

2024-02-23

Viruses (published)

doi.org

The Sample Average Approximation Method for Solving Two-Stage Stochastic Programs with Endogenous Uncertainty

Maria Bazotte

Margarida Carvalho

Thibaut Vidal

Real-world decision-making problems involve Type 1 decision-dependent uncertainty, where the probability distribution of the stochastic process depends on the model decisions. However, few studies focus on two-stage stochastic programs with this type of endogenous uncertainty, and those that do lack general methodologies. We thus propose herein a general method for solving a class of these programs based on the transformation of random variables, a technique widely employed in probability and statistics. The proposed method is tailored to large-scale problems with discrete or continuous endogenous random variables. The random variable transformation allows the use of the sample average approximation (SAA) method, which provides optimality convergence guarantees under certain conditions. We show that, for some classical distributions, the proposed method reduces to solving mixed-integer linear or convex programs. Finally, we validate this method by applying it to a network design and facility-protection problem, considering distinct decision-dependent distributions for the random variables. Whereas most distributions result in a nonlinear nonconvex deterministic equivalent program, the proposed method solves mixed-integer linear programs in all cases. In addition, it produces attractive performance estimators for the SAA method in a reasonable computational time and outperforms the case in which the endogenous distribution defines a mixed-integer deterministic equivalent.

2024-02-23

ArXiv (preprint)

arxiv.org

Posterior inference of Hi-C contact frequency through sampling

Yanlin Zhang

Christopher J. F. Cameron

Mathieu Blanchette

Hi-C is one of the most widely used approaches to study three-dimensional genome conformations. Contacts captured by a Hi-C experiment are represented in a contact frequency matrix. Due to the limited sequencing depth and other factors, Hi-C contact frequency matrices are only approximations of the true interaction frequencies and are further reported without any quantification of uncertainty. Hence, downstream analyses based on Hi-C contact maps (e.g., TAD and loop annotation) are themselves point estimations. Here, we present the Hi-C interaction frequency sampler (HiCSampler) that reliably infers the posterior distribution of the interaction frequency for a given Hi-C contact map by exploiting dependencies between neighboring loci. Posterior predictive checks demonstrate that HiCSampler can infer highly predictive chromosomal interaction frequency. Summary statistics calculated by HiCSampler provide a measurement of the uncertainty for Hi-C experiments, and samples inferred by HiCSampler are ready for use by most downstream analysis tools off the shelf and permit uncertainty measurements in these analyses without modifications.

2024-02-22

Frontiers in Bioinformatics (published)

doi.org

Reinforcement Learning with Elastic Time Steps

Dong Wang

Giovanni Beltrame

2024-02-22

ArXiv (preprint)

doi.org

arxiv.org

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Lucas Lehnert

Sainbayar Sukhbaatar

Paul McVay

Michael Rabbat

Yuandong Tian

While Transformers have enabled tremendous progress in various application settings, such architectures still lag behind traditional symbolic planners for solving complex decision-making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks and present Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard

2024-02-21

ArXiv (preprint)

doi.org

arxiv.org
