Portrait de (Rex) Devon Hjelm

(Rex) Devon Hjelm

Membre affilié
Chercheur scientifique, Apple MLR

Étudiants actuels

Doctorat - Université de Montréal
Co-superviseur⋅e :

Publications

Generative Models for Decision Making
Bogdan Mazoure
Lisa Lee
Roberta Raileanu
Yilun Du
Walter Talbott
Katherine Metcalf
Alexander T Toshev
Generative Artificial Intelligence (AI) has made significant advancements in recent years, particularly with the development of large langua… (voir plus)ge and diffusion models. These generative models have demonstrated impressive capabilities in various tasks, such as text generation and image and audio synthesis. Concurrently, Reinforcement Learning (RL) has made significant strides in solving complex sequential decision-making problems with the help of external knowledge sources . However, there remains untapped potential in combining generative models with RL algorithms to tackle real-world challenges, particularly to improve sample efficiency of tabula rasa training by introducing priors from related domains such as visual question-answering, image captioning and image generation. This workshop aims to bring together researchers and practitioners from the fields of generative AI and reinforcement learning to explore the latest advances, methodologies, and applications. By fostering collaborations between these two domains, we intend to unlock new opportunities for addressing complex problems that lie at the intersection of both fields.
Large Language Models as Generalizable Policies for Embodied Tasks
Andrew Szot
Max Schwarzer
Harsh Agrawal
Bogdan Mazoure
Walter Talbott
Rin Metcalf
Natalie Mackraz
Alexander T Toshev
Poly-View Contrastive Learning
Amitis Shidani
Jason Ramapuram
Russell Webb
Eeshan Gunesh Dhekane
Dan Busbridge
Self-supervised multimodal learning for group inferences from MRI data: Discovering disorder-relevant brain regions and multimodal links
Alex Fedorov
Eloy Geenjaar
Lei Wu
Tristan Sylvain
Thomas P. DeRamus
Margaux Luck
Maria Misiura
Girish Mittapalle
Sergey M. Plis
Vince D. Calhoun
Value function estimation using conditional diffusion models for control
Bogdan Mazoure
Walter Talbott
Miguel Ángel Bautista
Alexander T Toshev
Joshua M. Susskind
PatchBlender: A Motion Prior for Video Transformers
Gabriele Prato
Yale Song
Janarthanan Rajendran
Neel Joshi
Robust Contrastive Learning against Noisy Views
Ching-Yao Chuang
Xin Wang
Vibhav Vineet
Neel Joshi
Antonio Torralba
Stefanie Jegelka
Yale Song
Contrastive learning relies on an assumption that positive pairs contain related views that share certain underlying information about an in… (voir plus)stance, e.g., patches of an image or co-occurring multimodal signals of a video. What if this assumption is violated? The literature suggests that contrastive learning produces suboptimal representations in the presence of noisy views, e.g., false positive pairs with no apparent shared information. In this work, we pro-pose a new contrastive loss function that is robust against noisy views. We provide rigorous theoretical justifications by showing connections to robust symmetric losses for noisy binary classification and by establishing a new contrastive bound for mutual information maximization based on the Wasserstein distance measure. The proposed loss is completely modality-agnostic and a simple drop-in replacement for the InfoNCE loss, which makes it easy to apply to ex-isting contrastive frameworks. We show that our approach provides consistent improvements over the state-of-the-art on image, video, and graph contrastive learning bench-marks that exhibit a variety of real-world noise patterns.
Robust Contrastive Learning against Noisy Views
Ching-Yao Chuang
Xin Wang
Vibhav Vineet
Neel Joshi
Antonio Torralba
Stefanie Jegelka
Ya-heng Song
Contrastive learning relies on an assumption that positive pairs contain related views that share certain underlying information about an in… (voir plus)stance, e.g., patches of an image or co-occurring multimodal signals of a video. What if this assumption is violated? The literature suggests that contrastive learning produces suboptimal representations in the presence of noisy views, e.g., false positive pairs with no apparent shared information. In this work, we pro-pose a new contrastive loss function that is robust against noisy views. We provide rigorous theoretical justifications by showing connections to robust symmetric losses for noisy binary classification and by establishing a new contrastive bound for mutual information maximization based on the Wasserstein distance measure. The proposed loss is completely modality-agnostic and a simple drop-in replacement for the InfoNCE loss, which makes it easy to apply to ex-isting contrastive frameworks. We show that our approach provides consistent improvements over the state-of-the-art on image, video, and graph contrastive learning bench-marks that exhibit a variety of real-world noise patterns.
Understanding by Understanding Not: Modeling Negation in Language Models
Negation is a core construction in natural language. Despite being very successful on many tasks, state-of-the-art pre-trained language mode… (voir plus)ls often handle negation incorrectly. To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus. By training BERT with the resulting combined objective we reduce the mean top 1 error rate to 4% on the negated LAMA dataset. We also see some improvements on the negated NLI benchmarks.
DATA-EFFICIENT REINFORCEMENT LEARNING
Nitarshan Rajkumar
Michael Noukhovitch
Ankesh Anand
Philip Bachman
Data efficiency poses a major challenge for deep reinforcement learning. We approach this issue from the perspective of self-supervised repr… (voir plus)esentation learning, leveraging reward-free exploratory data to pretrain encoder networks. We employ a novel combination of latent dynamics modelling and goal-reaching objectives, which exploit the inherent structure of data in reinforcement learning. We demonstrate that our method scales well with network capacity and pretraining data. When evaluated on the Atari 100k data-efficiency benchmark, our approach significantly outperforms previous methods combining unsupervised pretraining with task-specific finetuning, and approaches human-level performance.
Pretraining Representations for Data-Efficient Reinforcement Learning
Max Schwarzer
Nitarshan Rajkumar
Michael Noukhovitch
Ankesh Anand
Philip Bachman
Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder w… (voir plus)hich is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly surpasses prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data -- approaching human-level performance and data-efficiency on Atari in our best setting.
Implicit Regularization in Deep Learning: A View from Function Space
Aristide Baratin
Thomas George
César Laurent
We approach the problem of implicit regularization in deep learning from a geometrical viewpoint. We highlight a possible regularization eff… (voir plus)ect induced by a dynamical alignment of the neural tangent features introduced by Jacot et al, along a small number of task-relevant directions. By extrapolating a new analysis of Rademacher complexity bounds in linear models, we propose and study a new heuristic complexity measure for neural networks which captures this phenomenon, in terms of sequences of tangent kernel classes along in the learning trajectories.