Hugo Larochelle

Towards Sustainable Investment Policies Informed by Opponent Shaping

Juan Agustin Duque

Razvan Ciuca

Ayoub Echchahed

Addressing climate change requires global coordination, yet rational economic actors often prioritize immediate gains over collective welfare, resulting in social dilemmas. InvestESG is a recently proposed multi-agent simulation that captures the dynamic interplay between investors and companies under climate risk. We provide a formal characterization of the conditions under which InvestESG exhibits an intertemporal social dilemma, deriving theoretical thresholds at which individual incentives diverge from collective welfare. Building on this, we apply Advantage Alignment, a scalable opponent shaping algorithm shown to be effective in general-sum games, to influence agent learning in InvestESG. We offer theoretical insights into why Advantage Alignment systematically favors socially beneficial equilibria by biasing learning dynamics toward cooperative outcomes. Our results demonstrate that strategically shaping the learning processes of economic agents can result in better outcomes that could inform policy mechanisms to better align market incentives with long-term sustainability goals.

2025-06-23

rl-conference.cc/RLC/2025/Workshop/CoCoMARL (poster)

openreview.net

Bringing SAM to new heights: Leveraging elevation data for tree crown segmentation from drone imagery

Mélisande Teng

Arthur Ouaknine

Etienne Laliberté

Yoshua Bengio

David Rolnick

Hugo Larochelle

Information on trees at the individual level is crucial for monitoring forest ecosystems and planning forest management. Current monitoring methods involve ground measurements, requiring extensive cost, time and labor. Advances in drone remote sensing and computer vision offer great potential for mapping individual trees from aerial imagery at broad scale. Large pre-trained vision models, such as the Segment Anything Model (SAM), represent a particularly compelling choice given limited labeled data. In this work, we compare methods leveraging SAM for the task of automatic tree crown instance segmentation in high-resolution drone imagery in three use cases: 1) boreal plantations, 2) temperate forests and 3) tropical forests. We also study the integration of elevation data into models, in the form of Digital Surface Model (DSM) information, which can readily be obtained at no additional cost from RGB drone imagery. We present BalSAM, a model leveraging SAM and DSM information, which shows potential over other methods, particularly in the context of plantations. We find that methods using SAM out-of-the-box do not outperform a custom Mask R-CNN, even with well-designed prompts. However, efficiently tuning SAM end-to-end and integrating DSM information are both promising avenues for tree crown instance segmentation models.

2025-06-05

ArXiv (preprint)

arxiv.org

The Search for Squawk: Agile Modeling in Bioacoustics

Vincent Dumoulin

Otilia Stretcu

Jenny Hamer

Lauren Harrell

Rob Laber

Hugo Larochelle

Bart van Merriënboer

Amanda Navine

Patrick Hart

Ben Williams

Timothy A. C. Lamont

Tries B. Rasak

Mars Coral Restoration Team

Sheryn Brodie

Brendan Doohan

Philip Eichinski

Paul Roe

Lin Schwarzkopf

Tom Denton

2025-05-05

ArXiv (preprint)

arxiv.org

Assessing SAM for Tree Crown Instance Segmentation from Drone Imagery

Mélisande Teng

Arthur Ouaknine

Etienne Laliberté

Yoshua Bengio

David Rolnick

Hugo Larochelle

2025-03-26

ArXiv (preprint)

doi.org

arxiv.org

Capturing Individual Human Preferences with Reward Features

André Barreto

Vincent Dumoulin

Yiran Mao

Nicolas Perez-Nieves

Bobak Shahriari

Yann Dauphin

Doina Precup

Hugo Larochelle

Reinforcement learning from human feedback usually models preferences using a reward model that does not distinguish between people. We argue that this is unlikely to be a good design choice in contexts with high potential for disagreement, like in the training of large language models. We propose a method to specialise a reward model to a person or group of people. Our approach builds on the observation that individual preferences can be captured as a linear combination of a set of general reward features. We show how to learn such features and subsequently use them to quickly adapt the reward model to a specific individual, even if their preferences are not reflected in the training data. We present experiments with large language models comparing the proposed architecture with a non-adaptive reward model and also adaptive counterparts, including models that do in-context personalisation. Depending on how much disagreement there is in the training data, our model either significantly outperforms the baselines or matches their performance with a simpler architecture and more stable training.
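
As an illustration of the idea in this abstract, the sketch below treats a user's reward as a linear combination of fixed reward features and fits only the per-user weights from a few preference pairs. It is a minimal sketch, not the authors' implementation: the function names (`reward`, `adapt_weights`), the Bradley-Terry style logistic loss, the hyperparameters, and the two-feature toy data are all assumptions made for this example.

```python
import math


def reward(features, weights):
    """Reward as a linear combination of (already computed) reward features."""
    return sum(w * f for w, f in zip(weights, features))


def adapt_weights(pairs, dim, lr=0.5, steps=200):
    """Fit per-user weights from (preferred, rejected) feature pairs.

    Assumption: a Bradley-Terry style logistic loss, minimised by plain
    gradient descent. The features are held fixed; only the weights change.
    """
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for preferred, rejected in pairs:
            diff = [p - r for p, r in zip(preferred, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient of -log(sigmoid(margin)) w.r.t. w is -(1 - sigmoid(margin)) * diff
            coeff = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                grad[i] -= coeff * diff[i]
        w = [wi - lr * gi / len(pairs) for wi, gi in zip(w, grad)]
    return w


# Hypothetical toy data: this user prefers responses that score high on feature 0.
pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.8, 0.2], [0.1, 0.9])]
user_w = adapt_weights(pairs, dim=2)
```

With these toy pairs, the fitted weights rank a response with features `[1.0, 0.0]` above one with features `[0.0, 1.0]`, which is the kind of fast per-user adaptation the abstract describes.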

2025-03-21

ArXiv (preprint)

doi.org

arxiv.org

Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL

Ghada Sokar

Johan Samir Obando Ceron

Aaron Courville

Hugo Larochelle

Pablo Samuel Castro

The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While soft mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reasons behind their effectiveness remain largely unknown. In this work we provide an in-depth analysis identifying the key factors driving this performance gain. We discover the surprising result that tokenizing the encoder output, rather than the use of multiple experts, is what is behind the efficacy of SoftMoEs. Indeed, we demonstrate that even with an appropriately scaled single expert, we are able to maintain the performance gains, largely thanks to tokenization.

2025-01-22

ICLR.cc/2025/Conference (spotlight)

doi.org

openreview.net

Selective Unlearning via Representation Erasure Using Domain Adversarial Training

Nazanin Mohammadi Sepahvand

Eleni Triantafillou

Hugo Larochelle

Doina Precup

James J. Clark

Daniel M. Roy

Gintare Karolina Dziugaite

When deploying machine learning models in the real world, we often face the challenge of “unlearning” specific data points or subsets after training. Inspired by Domain-Adversarial Training of Neural Networks (DANN), we propose a novel algorithm, SURE, for targeted unlearning. SURE treats the process as a domain adaptation problem, where the “forget set” (data to be removed) and a validation set from the same distribution form two distinct domains. We train a domain classifier to discriminate between representations from the forget and validation sets. Using a gradient reversal strategy similar to DANN, we perform gradient updates to the representations to “fool” the domain classifier and thus obfuscate representations belonging to the forget set. Simultaneously, gradient descent is applied to the retain set (original training data minus the forget set) to preserve its classification performance. Unlike other unlearning approaches whose training objectives are built based on model outputs, SURE directly manipulates the representations. This is key to ensure robustness against a set of more powerful attacks than currently considered in the literature, that aim to detect which examples were unlearned through access to learned embeddings. Our thorough experiments reveal that SURE has a better unlearning quality to utility trade-off compared to other standard unlearning techniques for deep neural networks.

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

Many-Shot In-Context Learning

Rishabh Agarwal

Avi Singh

Lei M Zhang

Bernd Bohnet

Stephanie C.Y. Chan

Luis Rosias

Biao Zhang

Ankesh Anand

Zaheer Abbas

Azade Nova

John D Co-Reyes

Eric Chu

Feryal Behbahani

Aleksandra Faust

Hugo Larochelle

Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative and discriminative tasks. While promising, many-shot ICL can be bottlenecked by the available amount of human-generated outputs. To mitigate this limitation, we explore two new settings: (1) "Reinforced ICL" that uses model-generated chain-of-thought rationales in place of human rationales, and (2) "Unsupervised ICL" where we remove rationales from the prompt altogether, and prompt the model only with domain-specific inputs. We find that both Reinforced and Unsupervised ICL can be quite effective in the many-shot regime, particularly on complex reasoning tasks. We demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases, can learn high-dimensional functions with numerical inputs, and performs comparably to supervised fine-tuning. Finally, we reveal the limitations of next-token prediction loss as an indicator of downstream ICL performance.

2024-09-25

NeurIPS.cc/2024/Conference (spotlight)

doi.org

openreview.net
