Publications

Hypernetworks for Zero-shot Transfer in Reinforcement Learning

Sahand Rezaei-Shoshtari

Charlotte Morissette

Francois Hogan

In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach views each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy, and seeks to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, learning this mapping can be treated as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.

2023-06-26

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Latent Space Evolution under Incremental Learning with Concept Drift (Student Abstract)

Charles Bourbeau

Audrey Durand

This work investigates the evolution of latent space when deep learning models are trained incrementally in non-stationary environments that stem from concept drift. We propose a methodology for visualizing the incurred change in latent representations. We further show that classes not targeted by concept drift can be negatively affected, suggesting that the observation of all classes during learning may regularize the latent space.

2023-06-26

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

Signed Laplacian Graph Neural Networks

Yu Li

Meng Qu

Jian Tang

Yi Chang

2023-06-26

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

The Effect of Diversity in Meta-Learning

Ramnath Kumar

Tristan Deleu

Yoshua Bengio

Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that task distribution plays a vital role in the performance of the model. Conventional wisdom is that task diversity should improve the performance of meta-learning. In this work, we find evidence to the contrary; we study different task distributions on a myriad of models and datasets to evaluate the effect of task diversity on meta-learning algorithms. For this experiment, we train on multiple datasets, and with three broad classes of meta-learning models - Metric-based (i.e., Protonet, Matching Networks), Optimization-based (i.e., MAML, Reptile, and MetaOptNet), and Bayesian meta-learning models (i.e., CNAPs). Our experiments demonstrate that the effect of task diversity on all these algorithms follows a similar trend, and task diversity does not seem to offer any benefits to the learning of the model. Furthermore, we also demonstrate that even a handful of tasks, repeated over multiple batches, would be sufficient to achieve a performance similar to uniform sampling and draws into question the need for additional tasks to create better models.

2023-06-26

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

openreview.net

Partial Ordered Statistics Decoding with Enhanced Error Patterns

Marwan Jalaleddine

Huayi Zhou

Jiajie Li

Warren Gross

Guessing Random Additive Noise Decoding (GRAND) excels at decoding high-rate codes but struggles to decode low-rate codes with reasonable complexity. Ordered Statistics Decoding (OSD) excels at decoding short codes irrespective of rate; however, OSD requires Gaussian elimination, which introduces additional time, space, and computational complexity. Partial Ordered Statistics Decoding (POSD) was proposed to reduce the time, space, and computational complexity of OSD; however, the current partition-based POSD has poor decoding performance since it does not generate test error patterns across partitions. In this paper, we propose to improve the decoding performance of POSD by incorporating test error patterns inspired by GRAND methods. This work offers a trade-off between performance and complexity compared to existing decoders such as GRAND and OSD. We enhance POSD by optimizing the scheduling of Test Error Patterns (TEPs) and show that our technique can be applied to any code in a standard form. At a target BER of 10⁻⁴ with eBCH (128,64), the enhanced error patterns achieve more than 0.6 dB gain in performance compared to POSD with partition-based error patterns. Moreover, at a target frame error rate of 10⁻⁵, POSD uses 10× fewer binary operations than GRAND when decoding eBCH (128,64) and RLC (128,64) codes. With BCH (127,29) and RLC (128,32), at a target frame error rate of 10⁻², POSD with enhanced error patterns and a maximum number of queries (MQ) of 10⁴ achieves up to a 2 dB gain over its GRAND equivalent, which uses a maximum of 10⁷ queries.

2023-06-25

2023 IEEE International Symposium on Information Theory (ISIT) (published)

doi.org

CeBed: A Benchmark for Deep Data-Driven OFDM Channel Estimation

Amal Feriani

Di Wu

Steve Liu

Gregory Dudek

2023-06-23

ArXiv (preprint)

doi.org

arxiv.org

Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design

Julien Roy

Pierre-Luc Bacon

Chris Pal

Emmanuel Bengio

In recent years, in-silico molecular design has received much attention from the machine learning community. When designing a new compound for pharmaceutical applications, there are usually multiple properties of such molecules that need to be optimised: binding energy to the target, synthesizability, toxicity, EC50, and so on. While previous approaches have employed a scalarization scheme to turn the multi-objective problem into a preference-conditioned single objective, it has been established that this kind of reduction may produce solutions that tend to slide towards the extreme points of the objective space when presented with a problem that exhibits a concave Pareto front. In this work we experiment with an alternative formulation of goal-conditioned molecular generation to obtain a more controllable conditional model that can uniformly explore solutions along the entire Pareto front.

2023-06-23

ICML.cc/2023/Workshop/DeployableGenerativeAI (published)

doi.org

openreview.net

Using modular connectome-based predictive modeling to reveal brain-behavior relationships of individual differences in working memory

Huayi Yang

Junjun Zhang

Zhenlan Jin

Pouya Bashivan

Ling Li

2023-06-22

Brain Structure and Function (published)

doi.org

Accelerating exploration and representation learning with offline pre-training

Bogdan Mazoure

Jake Bruce

Doina Precup

Rob Fergus

Ankit Anand

Sequential decision-making agents struggle with long horizon tasks, since solving them requires multi-step reasoning. Most reinforcement learning (RL) algorithms address this challenge by improved credit assignment, introducing memory capability, altering the agent's intrinsic motivation (i.e. exploration) or its worldview (i.e. knowledge representation). Many of these components could be learned from offline data. In this work, we follow the hypothesis that exploration and representation learning can be improved by separately learning two different models from a single offline dataset. We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward separately from a single collection of human demonstrations can significantly improve the sample efficiency on the challenging NetHack benchmark. We also ablate various components of our experimental setting and highlight crucial insights.

2023-06-20

ICML.cc/2023/Workshop/ILHF (published)

doi.org

openreview.net

Accelerating Generalized Random Forests with Fixed-Point Trees

David L. Fleischer

David A. Stephens

Archer Yang

2023-06-20

ArXiv (preprint)

doi.org

arxiv.org