Doina Precup

Sumana Basu

PhD - McGill University

Co-supervisor :

Adriana Romero Soriano

Collaborating Alumni - McGill University

Lynn Cherif

Master's Research - McGill University

Co-supervisor :

PhD - McGill University

Co-supervisor :

PhD - McGill University

Principal supervisor :

David Meger

Jonathan Colaço Carr

Master's Research - McGill University

Principal supervisor :

Prakash Panangaden

Élodie Coté-Gauthier

Collaborating researcher - McGill University

Co-supervisor :

Isabeau Prémont-Schwarz

Franco Del Balso

Research Intern - Université de Montréal

Jesse Farebrother

PhD - McGill University

Principal supervisor :

Marc Gendron-Bellemare

PhD - McGill University

Principal supervisor :

PhD - McGill University

Haque Ishfaq

Collaborating Alumni - McGill University

Mohammad Sami Nur Islam Islam

Master's Research - McGill University

Arushi Jain

Collaborating Alumni - McGill University

PhD - Polytechnique Montréal

Flemming Kondrup

Postdoctorate - McGill University

Elaine Lau

Master's Research - McGill University

Jonathan Lebensold

Collaborating Alumni - McGill University

Undergraduate - McGill University

Ray Luo

PhD - McGill University

Principal supervisor :

G McCracken

PhD - McGill University

Nazanin Mohammadi Sepahvand

Collaborating Alumni - McGill University

Shahrad Mohammadzadeh

Master's Research - McGill University

Principal supervisor :

Gabriela Moisescu-Pareja

Collaborating researcher - McGill University

Co-supervisor :

Irina Rish

Padideh Nouri

PhD - Université de Montréal

Co-supervisor :

PhD - McGill University

Co-supervisor :

Nate Rahn

PhD - McGill University

Principal supervisor :

Marc Gendron-Bellemare

Sahand Rezaei-Shoshtari

PhD - McGill University

Co-supervisor :

PhD - McGill University

Co-supervisor :

PhD - McGill University

Co-supervisor :

PhD - McGill University

Nishanth Anand Vemgal

PhD - McGill University

PhD - McGill University

Co-supervisor :

Samira Ebrahimi Kahou

Zihan Wang

PhD - McGill University

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

Guangyuan Wang

Research Intern - McGill University

Steve Wen

Master's Research - McGill University

Co-supervisor :

Gregory Dudek

Zijing Wu

PhD - McGill University

Principal supervisor :

PhD - McGill University

Harry Zhao

Collaborating Alumni - McGill University

Co-supervisor :

Blog Posts

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Publications

Temporal Abstraction in Reinforcement Learning with the Successor Representation

Marlos C. Machado

Andre Barreto

Michael Bowling

arxiv.org

When Do Graph Neural Networks Help with Node Classification: Investigating the Homophily Principle on Node Distinguishability

Sitao Luan

Chenqing Hua

Minkai Xu

Qincheng Lu

Jiaqi Zhu

Xiao-Wen Chang

Jie Fu

Jure Leskovec

Homophily principle, i.e., nodes with the same labels are more likely to be connected, was believed to be the main reason for the performance superiority of Graph Neural Networks (GNNs) over Neural Networks (NNs) on Node Classification (NC) tasks. Recently, people have developed theoretical results arguing that, even though the homophily principle is broken, the advantage of GNNs can still hold as long as nodes from the same class share similar neighborhood patterns [29], which questions the validity of homophily. However, this argument only considers intra-class Node Distinguishability (ND) and ignores inter-class ND, which is insufficient to study the effect of homophily. In this paper, we first demonstrate the aforementioned insufficiency with examples and argue that an ideal situation for ND is to have smaller intra-class ND than inter-class ND. To formulate this idea and have a better understanding of homophily, we propose Contextual Stochastic Block Model for Homophily (CSBM-H) and define two metrics, Probabilistic Bayes Error (PBE) and Expected Negative KL-divergence (ENKL), to quantify ND, through which we can also find how intra- and inter-class ND influence ND together. We visualize the results and give detailed analysis. Through experiments, we verified that the superiority of GNNs is

Offline Policy Optimization in RL with Variance Regularizaton

Riashat Islam

Samarth Sinha

Homanga Bharadhwaj

Samin Yeasar Arnob

Zhuoran Yang

Animesh Garg

Zhaoran Wang

Lihong Li

2022-12-29

ArXiv (preprint)

arxiv.org

Towards Continual Reinforcement Learning: A Review and Perspectives

Khimya Khetarpal

Matthew D Riemer

Irina Rish

2022-12-22

Journal of Artificial Intelligence Research (published)

arxiv.org

Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks

Mingde Zhao

Xiao-Wen Chang

The core operation of current Graph Neural Networks (GNNs) is the aggregation enabled by the graph Laplacian or message passing, which filters the neighborhood node information. Though effective for various tasks, in this paper, we show that they are potentially a problematic factor underlying all GNN methods for learning on certain datasets, as they force the node representations similar, making the nodes gradually lose their identity and become indistinguishable. Hence, we augment the aggregation operations with their dual, i.e. diversification operators that make the node more distinct and preserve the identity. Such augmentation replaces the aggregation with a two-channel filtering process that, in theory, is beneficial for enriching the node representations. In practice, the proposed two-channel filters can be easily patched on existing GNN methods with diverse training strategies, including spectral and spatial (message passing) methods. In the experiments, we observe desired characteristics of the models and significant performance boost upon the baselines on 9 node classification tasks.

2022-11-22

NeurIPS.cc/2022/Workshop/GLFrontiers (published)
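
The two-channel idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that aggregation is a row-normalized adjacency with self-loops and that the "dual" diversification operator is the identity minus that aggregation (a common high-pass choice). The function names `normalized_adjacency` and `two_channel_filter` and the toy path graph are hypothetical, introduced only for illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Row-normalize A + I (assumed aggregation operator, with self-loops)."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)


def two_channel_filter(adj, features):
    """Return (aggregated, diversified) node features.

    aggregated  = A_hat @ X        (low-pass: averages each node with its neighbors)
    diversified = (I - A_hat) @ X  (assumed high-pass dual: difference from that average)
    """
    a_hat = normalized_adjacency(adj)
    aggregated = a_hat @ features
    diversified = features - aggregated
    return aggregated, diversified


# Toy example: a path graph on 4 nodes with alternating scalar features.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
features = np.array([[1.0], [0.0], [1.0], [0.0]])

aggregated, diversified = two_channel_filter(adj, features)
print("variance before aggregation:", features.var())
print("variance after aggregation: ", aggregated.var())
```

In this toy case the aggregated channel has lower variance across nodes than the input, which mirrors the loss of node identity the abstract describes, while the diversified channel keeps the part of each feature that differs from the neighborhood average. Under these assumptions the two channels sum back to the original features.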

When Do We Need GNN for Node Classification?

Sitao Luan

Chenqing Hua

Qincheng Lu

Jiaqi Zhu

Xiao-Wen Chang

2022-10-30

arXiv.org (preprint)

Low-Rank Representation of Reinforcement Learning Policies

Thang Doan

We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.

2022-10-27

Journal of Artificial Intelligence Research (published)

Simulating Human Gaze with Neural Visual Attention

Leo Schwinn

Bjoern Eskofier

Dario Zanca

2022-10-20

NeurIPS.cc/2022/Workshop/GMML (oral)

The Paradox of Choice: On the Role of Attention in Hierarchical Reinforcement Learning

Andrei Cristian Nica

Khimya Khetarpal

Decision-making AI agents are often faced with two important challenges: the depth of the planning horizon, and the branching factor due to having many choices. Hierarchical reinforcement learning methods aim to solve the first problem, by providing shortcuts that skip over multiple time steps. To cope with the breadth, it is desirable to restrict the agent's attention at each step to a reasonable number of possible choices. The concept of affordances (Gibson, 1977) suggests that only certain actions are feasible in certain states. In this work, we first characterize "affordances" as a "hard" attention mechanism that strictly limits the available choices of temporally extended options. We then investigate the role of hard versus soft attention in training data collection, abstract value learning in long-horizon tasks, and handling a growing number of choices. To this end, we present an online, model-free algorithm to learn affordances that can be used to further learn subgoal options. Finally, we identify and empirically demonstrate the settings in which the "paradox of choice" arises, i.e. when having fewer but more meaningful choices improves the learning speed and performance of a reinforcement learning agent.

2022-10-20

NeurIPS.cc/2022/Workshop/Attention (poster)