Yann Dauphin

Capturing Individual Human Preferences with Reward Features

Andre Barreto

Vincent Dumoulin

Yiran Mao

Nicolas Perez-Nieves

Mark Rowland

Bobak Shahriari

Yann Dauphin

Doina Precup

Hugo Larochelle

Reinforcement learning from human feedback usually models preferences using a reward model that does not distinguish between people. We argue that this is unlikely to be a good design choice in contexts with high potential for disagreement, like in the training of large language models. We propose a method to specialise a reward model to a person or group of people. Our approach builds on the observation that individual preferences can be captured as a linear combination of a set of general reward features. We show how to learn such features and subsequently use them to quickly adapt the reward model to a specific individual, even if their preferences are not reflected in the training data. We present experiments with large language models comparing the proposed architecture with a non-adaptive reward model and also adaptive counterparts, including models that do in-context personalisation. Depending on how much disagreement there is in the training data, our model either significantly outperforms the baselines or matches their performance with a simpler architecture and more stable training.
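The adaptation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the reward features are already learned and frozen, assumes a Bradley-Terry style likelihood for pairwise choices, and uses hypothetical names (`fit_user_weights`, `user_reward`) chosen only for this illustration.

```python
import numpy as np


def fit_user_weights(phi_chosen, phi_rejected, lr=0.5, steps=500):
    """Fit one user's weight vector w over fixed reward features.

    Assumption (not stated in the abstract): the probability that the
    chosen response is preferred is sigmoid(w @ (phi_chosen - phi_rejected)).
    Both inputs have shape (n_pairs, n_features).
    """
    diff = phi_chosen - phi_rejected
    w = np.zeros(diff.shape[1])
    for _ in range(steps):
        # Predicted probability that the chosen response wins each pair.
        p = 1.0 / (1.0 + np.exp(-(diff @ w)))
        # Gradient of the mean log-likelihood with respect to w.
        grad = diff.T @ (1.0 - p) / len(diff)
        # Plain gradient ascent; only w is updated, the features stay fixed.
        w += lr * grad
    return w


def user_reward(w, phi):
    """Personalised reward: a linear combination of the reward features."""
    return phi @ w
```

Because only a small weight vector is fitted per person, a handful of preference pairs may be enough to adapt the reward in this simplified setting; how the features themselves are learned is outside the scope of the sketch.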

2025-09-17

NeurIPS.cc/2025/Conference (poster)

doi.org

openreview.net

Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation

The widespread online communication in a modern multilingual world has provided opportunities to blend more than one language (aka code-mixed language) in a single utterance. This has resulted in a formidable challenge for computational models due to the scarcity of annotated data and the presence of noise. A potential solution to mitigate the data scarcity problem in a low-resource setup is to leverage existing data in resource-rich languages through translation. In this paper, we tackle the problem of code-mixed (Hinglish and Bengalish) to English machine translation. First, we synthetically develop HINMIX, a parallel corpus of Hinglish to English, with ~4.2M sentence pairs. Subsequently, we propose RCMT, a robust perturbation-based joint-training model that learns to handle noise in real-world code-mixed text by parameter sharing across clean and noisy words. Further, we show the adaptability of RCMT in a zero-shot setup for Bengalish to English translation. Our evaluation and comprehensive analyses qualitatively and quantitatively demonstrate the superiority of RCMT over state-of-the-art code-mixed and robust translation methods.

2024-03-24

ArXiv (preprint)

doi.org

arxiv.org

A density estimation perspective on learning from pairwise human preferences

Daniel D. Johnson

Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research. Most recent works frame it as a reinforcement learning problem, where a reward function is learned from pairwise preference data and the LLM is treated as a policy which is adapted to maximize the rewards, often under additional regularization constraints. We propose an alternative interpretation which centers on the generative process for pairwise preferences and treats LHF as a density estimation problem. We provide theoretical and empirical results showing that for a family of generative processes defined via preference behavior distribution equations, training a reward function on pairwise preferences effectively models an annotator's implicit preference distribution. Finally, we discuss and present findings on "annotator misspecification" -- failure cases where wrong modeling assumptions are made about annotator behavior, resulting in poorly-adapted models -- suggesting that approaches that learn from pairwise human preferences could have trouble learning from a population of annotators with diverse viewpoints.

2024-02-26

TMLR (accepted)

doi.org

openreview.net

JaxPruner: A concise library for sparsity research

Joo Hyung Lee

Wonpyo Park

Nicole Elyse Mitchell

Jonathan Pilault

Johan Samir Obando Ceron

Han-Byul Kim

Namhoon Lee

Elias Frantar

Yun Long

Amir Yazdanbakhsh

Shivani Agrawal

Suvinay Subramanian

Xin Wang

Sheng-Chun Kao

Xingyao Zhang

Trevor Gale

Aart J.C. Bik

Woohyun Han

Milen Ferev

Zhonglin Han … (5 more)

Hong-Seok Kim

Yann Dauphin

Gintare Karolina Dziugaite

Pablo Samuel Castro

Utku Evci

This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research. JaxPruner aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training algorithms with minimal memory and latency overhead. Algorithms implemented in JaxPruner use a common API and work seamlessly with the popular optimization library Optax, which, in turn, enables easy integration with existing JAX-based libraries. We demonstrate this ease of integration by providing examples in four different codebases (Scenic, t5x, Dopamine, and FedJAX) and provide baseline experiments on popular benchmarks.

2024-01-07

Conference on Parsimony and Learning (published)

doi.org

proceedings.mlr.press

Selective Brain Damage: Measuring the Disparate Impact of Model Pruning

Sara Hooker

Aaron Courville

Yann Dauphin

Andrea Frome

Neural network pruning techniques have demonstrated it is possible to remove the majority of weights in a network with surprisingly little degradation to test set accuracy. However, this measure of performance conceals significant differences in how different classes and images are impacted by pruning. We find that certain examples, which we term pruning identified exemplars (PIEs), and classes are systematically more impacted by the introduction of sparsity. Removing PIE images from the test-set greatly improves top-1 accuracy for both pruned and non-pruned models. These hard-to-generalize-to images tend to be mislabelled, of lower image quality, depict multiple objects or require fine-grained classification. These findings shed light on previously unknown trade-offs, and suggest that a high degree of caution should be exercised before pruning is used in sensitive domains.

2019-11-12

arXiv.org (preprint)

openreview.net

What Do Compressed Deep Neural Networks Forget

Sara Hooker

Aaron Courville

Gregory Clark

Yann Dauphin

Andrea Frome

Deep neural network pruning and quantization techniques have demonstrated it is possible to achieve high levels of compression with surprisingly little degradation to test set accuracy. However, this measure of performance conceals significant differences in how different classes and images are impacted by model compression techniques. We find that models with radically different numbers of weights have comparable top-line performance metrics but diverge considerably in behavior on a narrow subset of the dataset. This small subset of data points, which we term Pruning Identified Exemplars (PIEs), is systematically more impacted by the introduction of sparsity. Compression disproportionately impacts model performance on the underrepresented long-tail of the data distribution. PIEs over-index on atypical or noisy images that are far more challenging for both humans and algorithms to classify. Our work provides intuition into the role of capacity in deep neural networks and the trade-offs incurred by compression. An understanding of this disparate impact is critical given the widespread deployment of compressed models in the wild.

2019-11-12

ArXiv (preprint)

arxiv.org
