Clement Gehring

Alumni

Blog posts

August 7, 2024

Neural differential equations for temperature control in buildings under demand response programs

by

Vincent Taboga

Clement Gehring

Mathieu Le Cam

Hanane Dagdougui

Pierre-Luc Bacon


Publications

Neural differential equations for temperature control in buildings under demand response programs

Mathieu Le Cam

2024-07-31

Applied Energy (published)

doi.org

Do Transformer World Models Give Better Policy Gradients?

A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy. However, this method often becomes impractical for long horizons since typical world models induce hard-to-optimize loss landscapes. Transformers are known to efficiently propagate gradients over long horizons: could they be the solution to this problem? Surprisingly, we show that commonly-used transformer world models produce circuitous gradient paths, which can be detrimental to long-range policy gradients. To tackle this challenge, we propose a class of world models called Actions World Models (AWMs), designed to provide more direct routes for gradient propagation. We integrate such AWMs into a policy gradient framework that underscores the relationship between network architectures and the policy gradient updates they inherently represent. We demonstrate that AWMs can generate optimization landscapes that are easier to navigate even when compared to those from the simulator itself. This property allows transformer AWMs to produce better policies than competitive baselines in realistic long-horizon tasks.

2024-07-07

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

Bridging State and History Representations: Understanding Self-Predictive RL

Benjamin Eysenbach

Representations are at the core of all deep reinforcement learning (RL) methods for both Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). Many representation learning methods and theoretical frameworks have been developed to understand what constitutes an effective representation. However, the relationships between these methods and the shared properties among them remain unclear. In this paper, we show that many of these seemingly distinct methods and frameworks for state and history abstractions are, in fact, based on a common idea of self-predictive abstraction. Furthermore, we provide theoretical insights into the widely adopted objectives and optimization, such as the stop-gradient technique, in learning self-predictive representations. These findings together yield a minimalist algorithm to learn self-predictive representations for states and histories. We validate our theories by applying our algorithm to standard MDPs, MDPs with distractors, and POMDPs with sparse rewards. These findings culminate in a set of preliminary guidelines for RL practitioners.

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Course Correcting Koopman Representations

Koopman representations aim to learn features of nonlinear dynamical systems (NLDS) which lead to linear dynamics in the latent space. Theoretically, such features can be used to simplify many problems in modeling and control of NLDS. In this work we study autoencoder formulations of this problem, and different ways they can be used to model dynamics, specifically for future state prediction over long horizons. We discover several limitations of predicting future states in the latent space and propose an inference-time mechanism, which we refer to as Periodic Reencoding, for faithfully capturing long term dynamics. We justify this method both analytically and empirically via experiments in low and high dimensional NLDS.

2024-01-15

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

