Harry Zhao

Alumni Collaborator - McGill

Principal Supervisor

Doina Precup

Co-supervisor

Yoshua Bengio

Research Topics

Representation Learning

Reinforcement Learning

Computational Neuroscience

Reasoning

Website

Google Scholar

GitHub

Blog Posts

February 22, 2024

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Read the article

November 22, 2021

A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

by

Harry Mingde Zhao

Read the article

Publications

Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks

Mingde Zhao

Xiao-Wen Chang

The core operation of current Graph Neural Networks (GNNs) is the aggregation enabled by the graph Laplacian or message passing, which filters the neighborhood node information. Though effective for various tasks, in this paper, we show that aggregation operations are potentially a problematic factor underlying all GNN methods for learning on certain datasets, as they force the node representations to be similar, making the nodes gradually lose their identity and become indistinguishable. Hence, we augment the aggregation operations with their dual, i.e., diversification operators that make the nodes more distinct and preserve their identity. Such augmentation replaces the aggregation with a two-channel filtering process that, in theory, is beneficial for enriching the node representations. In practice, the proposed two-channel filters can be easily patched onto existing GNN methods with diverse training strategies, including spectral and spatial (message passing) methods. In the experiments, we observe the desired characteristics of the models and a significant performance boost over the baselines on 9 node classification tasks.

2022-11-21

NeurIPS.cc/2022/Workshop/GLFrontiers (accepted)

doi.org

openreview.net

Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL

Akram Erraqabi

Marlos C. Machado

Harry Zhao

Mingde Zhao

Sainbayar Sukhbaatar

Alessandro Lazaric

Ludovic Denoyer

Yoshua Bengio

In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward shaping. Recently, learning the Laplacian representation has been framed as the optimization of a temporally-contrastive objective to overcome its computational limitations in large (or continuous) state spaces. However, this approach requires uniform access to all states in the state space, overlooking the exploration problem that emerges during the representation learning process. In this work, we propose an alternative method that is able to recover, in a non-uniform-prior setting, the expressiveness and the desired properties of the Laplacian representation. We do so by combining the representation learning with a skill-based covering policy, which provides a better training distribution to extend and refine the representation. We also show that a simple augmentation of the representation objective with the learned temporal abstractions improves dynamics-awareness and helps exploration. We find that our method succeeds as an alternative to the Laplacian in the non-uniform setting and scales to challenging continuous control environments. Finally, even though our method is not optimized for skill discovery, the learned skills can successfully solve difficult continuous navigation tasks with sparse rewards, where standard skill discovery approaches are not so effective.

2022-05-19

auai.org/UAI/2022/Conference (poster)

doi.org

proceedings.mlr.press

Exploration-Driven Representation Learning in Reinforcement Learning

Akram Erraqabi

Harry Zhao

Mingde Zhao

Marlos C. Machado

Yoshua Bengio

Sainbayar Sukhbaatar

Ludovic Denoyer

Alessandro Lazaric

Learning reward-agnostic representations is an emerging paradigm in reinforcement learning. These representations can be leveraged for several purposes ranging from reward shaping to skill discovery. Nevertheless, in order to learn such representations, existing methods often rely on assuming uniform access to the state space. Without such a privilege, the agent’s coverage of the environment can be limited, which hurts the quality of the learned representations. In this work, we introduce a method that explicitly couples representation learning with exploration when the agent is not provided with a uniform prior over the state space. Our method learns representations that constantly drive exploration while the data generated by the agent’s exploratory behavior drives the learning of better representations. We empirically validate our approach in goal-achieving tasks, demonstrating that the learned representation captures the dynamics of the environment and leads to more accurate value estimation and faster credit assignment, both when used for control and for reward shaping. Finally, the exploratory policy that emerges from our approach proves to be successful at continuous navigation tasks with sparse rewards.

2021-07-21

ICML.cc/2021/Workshop/URL (poster)

openreview.net
