Publications

Crowdkeeping in Last-mile Delivery

Xin Wang

Okan Arslan

Erick Delage

2024-02-28

Transportation Science (published)

doi.org

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Soham De

Samuel L. Smith

Anushan Fernando

Aleksandar Botev

George Cristian-Muraru

Albert Gu

Ruba Haroun

Leonard Berrada

Yutian Chen

Srivatsan Srinivasan

Guillaume Desjardins

Arnaud Doucet

David Mark Budden

Yee Whye Teh

Razvan Pascanu

Nando de Freitas

Caglar Gulçehre

2024-02-28

ArXiv (preprint)

arxiv.org

StarCoder 2 and The Stack v2: The Next Generation

Anton Lozhkov

Raymond Li

Loubna Ben Allal

Federico Cassano

Joel Lamy-Poirier

Nouamane Tazi

Ao Tang

Dmytro Pykhtar

Jiawei Liu

Yuxiang Wei

Tianyang Liu

Max Tian

Denis Kocetkov

Arthur Zucker

Younes Belkada

Zijian Wang

Qian Liu

Dmitry Abulkhanov

Indraneil Paul

Zhuang Li

Wen-Ding Li

Megan L. Risdal

Jia Li

Jian Zhu

Terry Yue Zhuo

Evgenii Zheltonozhskii

Nii Osae Osae Dade

Wenhao Yu

Lucas Krauss

Naman Jain

Yixuan Su

Xuanli He

Manan Dey

Edoardo Abati

Yekun Chai

Niklas Muennighoff

Xiangru Tang

Muhtasham Oblokulov

Christopher Akiki

Marc Marone

Chenghao Mou

Mayank Mishra

Alex Gu

Binyuan Hui

Tri Dao

Armel Zebaze

Olivier Dehaene

Nicolas Patry

Canwen Xu

Julian McAuley

Han Hu

Torsten Scholak

Sebastien Paquet

Jennifer Robinson

Carolyn Jane Anderson

Nicolas Chapados

Md. Mostofa Ali Patwary

Nima Tajbakhsh

Yacine Jernite

Carlos Muñoz Ferrandis

Lingming Zhang

Sean Hughes

Thomas Wolf

Arjun Guha

Leandro von Werra

Harm de Vries

The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data sources, such as GitHub pull requests, Kaggle notebooks, and code documentation. This results in a training set that is 4x larger than the first StarCoder dataset. We train StarCoder2 models with 3B, 7B, and 15B parameters on 3.3 to 4.3 trillion tokens and thoroughly evaluate them on a comprehensive set of Code LLM benchmarks. We find that our small model, StarCoder2-3B, outperforms other Code LLMs of similar size on most benchmarks, and also outperforms StarCoderBase-15B. Our large model, StarCoder2-15B, significantly outperforms other models of comparable size. In addition, it matches or outperforms CodeLlama-34B, a model more than twice its size. Although DeepSeekCoder-33B is the best-performing model at code completion for high-resource languages, we find that StarCoder2-15B outperforms it on math and code reasoning benchmarks, as well as several low-resource languages. We make the model weights available under an OpenRAIL license and ensure full transparency regarding the training data by releasing the SoftWare Heritage persistent IDentifiers (SWHIDs) of the source code data.

2024-02-28

ArXiv (preprint)

doi.org

arxiv.org

The use of dose surface maps as a tool to investigate spatial dose delivery accuracy for the rectum during prostate radiotherapy

Haley Patrick

J. Kildea

2024-02-28

Journal of Applied Clinical Medical Physics (published)

doi.org

When does word order matter and when doesn't it?

Xuanda Chen

Timothy John O'Donnell

Siva Reddy

Language models (LMs) may appear insensitive to word order changes in natural language understanding (NLU) tasks. In this paper, we propose that linguistic redundancy can explain this phenomenon, whereby word order and other linguistic cues such as case markers provide overlapping and thus redundant information. Our hypothesis is that models exhibit insensitivity to word order when the order provides redundant information, and the degree of insensitivity varies across tasks. We quantify how informative word order is using mutual information (MI) between unscrambled and scrambled sentences. Our results show that the less informative word order is, the more consistent the model's predictions are between unscrambled and scrambled sentences. We also find that the effect varies across tasks: for some tasks, like SST-2, LMs' predictions are almost always consistent with the original ones even if the pointwise MI (PMI) changes, while for others, like RTE, the consistency is near random when the PMI gets lower, i.e., word order is really important.

2024-02-28

ArXiv (preprint)

doi.org

arxiv.org

Acoustic tactile sensing for mobile robot wheels

Wilfred Mason

David Brenken

Falcon Z. Dai

Ricardo Gonzalo Cruz Castillo

Olivier St-Martin Cormier

Audrey Sedal

2024-02-27

ArXiv (preprint)

doi.org

arxiv.org

ICE-SEARCH: A Language Model-Driven Feature Selection Approach

Tianze Yang

Tianyi Yang

Shaoshan Liu

Fuyuan Lyu

Xue Liu

This study unveils the In-Context Evolutionary Search (ICE-SEARCH) method, the first work that melds language models (LMs) with evolutionary algorithms for feature selection (FS) tasks and demonstrates its effectiveness in Medical Predictive Analytics (MPA) applications. ICE-SEARCH harnesses the crossover and mutation capabilities inherent in LMs within an evolutionary framework, significantly improving FS through the model's comprehensive world knowledge and its adaptability to a variety of roles. Our evaluation of this methodology spans three crucial MPA tasks: stroke, cardiovascular disease, and diabetes, where ICE-SEARCH outperforms traditional FS methods in pinpointing essential features for medical applications. ICE-SEARCH achieves State-of-the-Art (SOTA) performance in stroke prediction and diabetes prediction; the Decision-Randomized ICE-SEARCH ranks as SOTA in cardiovascular disease prediction. Our results not only demonstrate the efficacy of ICE-SEARCH in medical FS but also underscore the versatility, efficiency, and scalability of integrating LMs in FS tasks. The study emphasizes the critical role of incorporating domain-specific insights, illustrating ICE-SEARCH's robustness, generalizability, and swift convergence. This opens avenues for further research into comprehensive and intricate FS landscapes, marking a significant stride in the application of artificial intelligence in medical predictive analytics.

2024-02-27

ArXiv (preprint)

doi.org

arxiv.org

A density estimation perspective on learning from pairwise human preferences

Daniel D. Johnson

Learning from human feedback (LHF), and in particular learning from pairwise preferences, has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research. Most recent works frame it as a reinforcement learning problem, where a reward function is learned from pairwise preference data and the LLM is treated as a policy which is adapted to maximize the rewards, often under additional regularization constraints. We propose an alternative interpretation which centers on the generative process for pairwise preferences and treats LHF as a density estimation problem. We provide theoretical and empirical results showing that for a family of generative processes defined via preference behavior distribution equations, training a reward function on pairwise preferences effectively models an annotator's implicit preference distribution. Finally, we discuss and present findings on "annotator misspecification" (failure cases where wrong modeling assumptions are made about annotator behavior, resulting in poorly-adapted models), suggesting that approaches that learn from pairwise human preferences could have trouble learning from a population of annotators with diverse viewpoints.

2024-02-26

TMLR (accepted)

doi.org

openreview.net

RAMEN Unveils Clinical Variable Networks for COVID-19 Severity and Long COVID Using Absorbing Random Walks and Genetic Algorithms

Yiwei Xiong

Jingtao Wang

Xiaoxiao Shang

Tingting Chen

Douglas D. Fraser

Gregory Fonseca

Simon Rousseau

Jun Ding

The COVID-19 pandemic has significantly altered global socioeconomic structures and individual lives. Understanding the disease mechanisms and facilitating diagnosis requires comprehending the complex interplay among clinical factors like demographics, symptoms, comorbidities, treatments, lab results, complications, and other metrics, and their relation to outcomes such as disease severity and long-term outcomes (e.g., post-COVID-19 condition/long COVID). Conventional correlational methods struggle with indirect and directional connections among these factors, while standard graphical methods like Bayesian networks are computationally demanding for extensive clinical variables. In response, we introduced RAMEN, a methodology that integrates Genetic Algorithms with random walks for efficient Bayesian network inference, designed to map the intricate relationships among clinical variables. Applying RAMEN to the Biobanque québécoise de la COVID-19 (BQC19) dataset, we identified critical markers for long COVID and varying disease severity. The Bayesian Network, corroborated by existing literature and supported through multi-omics analyses, highlights significant clinical variables linked to COVID-19 outcomes. RAMEN’s ability to accurately map these connections contributes substantially to developing early and effective diagnostics for severe COVID-19 and long COVID.

2024-02-26

bioRxiv (preprint)

doi.org

Effective Latent Differential Equation Models via Attention and Multiple Shooting

Germán Abrevaya

Mahta Ramezanian-Panahi

Jean-Christophe Gagnon-Audet

Irina Rish

Pablo Polosecki

Silvina Ponce Dawson

Guillermo Cecchi

Guillaume Dumas

2024-02-25

TMLR (accepted)

openreview.net

Correction to: Multi-agent reinforcement learning for fast-timescale demand response of residential loads

Vincent Mai

Philippe Maisonneuve

Tianyu Zhang

Hadi Nekoei

Liam Paull

Antoine Lesage-Landry

2024-02-22

Machine-Mediated Learning (published)

doi.org

Intra-Host Evolution Analyses in an Immunosuppressed Patient Supports SARS-CoV-2 Viral Reservoir Hypothesis

Dominique Fournelle

Fatima Mostefai

Elsa Brunet-Ratnasingham

Raphaël Poujol

Jean-Christophe Grenier

José Héctor Gálvez

Amélie Pagliuzza

Inès Levade

Sandrine Moreira

Mehdi Benlarbi

Guillaume Beaudoin-Bussières

Gabrielle Gendron-Lepage

Catherine Bourassa

Alexandra Tauzin

Simon Grandjean Lapierre

Nicolas Chomont

Andrés Finzi

Daniel E. Kaufmann

Morgan Craig

Julie G. Hussin

Throughout the SARS-CoV-2 pandemic, several variants of concern (VOCs) have been identified, many of which share recurrent mutations in the spike glycoprotein’s receptor-binding domain (RBD). This region coincides with known epitopes and can therefore have an impact on immune escape. Protracted infections in immunosuppressed patients have been hypothesized to lead to an enrichment of such mutations and therefore drive evolution towards VOCs. Here, we present the case of an immunosuppressed patient who developed distinct populations with immune escape mutations throughout the course of their infection. Notably, by investigating the co-occurrence of substitutions on individual sequencing reads in the RBD, we found quasispecies harboring mutations that confer resistance to known monoclonal antibodies (mAbs) such as S:E484K and S:E484A. These mutations were acquired without the patient being treated with mAbs or convalescent sera and without them developing a detectable immune response to the virus. We also provide additional evidence for a viral reservoir based on intra-host phylogenetics, which led to a viral substrain that evolved elsewhere in the patient’s body, colonizing their upper respiratory tract (URT). The presence of SARS-CoV-2 viral reservoirs can shed light on protracted infections interspersed with periods where the virus is undetectable, and potential explanations for long-COVID cases.

2024-02-22

Viruses (published)

doi.org
