Publications

Learning Successor Features the Simple Way

Raymond Chua

Arna Ghosh

Christos Kaplanis

In Deep Reinforcement Learning (RL), it is a challenge to learn representations that do not exhibit catastrophic forgetting or interference in non-stationary environments. Successor Features (SFs) offer a potential solution to this challenge. However, canonical techniques for learning SFs from pixel-level observations often lead to representation collapse, wherein representations degenerate and fail to capture meaningful variations in the data. More recent methods for learning SFs can avoid representation collapse, but they often involve complex losses and multiple learning phases, reducing their efficiency. We introduce a novel, simple method for learning SFs directly from pixels. Our approach uses a combination of a Temporal-difference (TD) loss and a reward prediction loss, which together capture the basic mathematical definition of SFs. We show that our approach matches or outperforms existing SF learning techniques in 2D (Minigrid) mazes, 3D (Miniworld) mazes, and MuJoCo, for both single and continual learning scenarios. Our technique is also efficient, and can reach higher levels of performance in less time than other approaches. Our work provides a new, streamlined technique for learning SFs directly from pixel observations, with no pretraining required.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Listenable Maps for Zero-Shot Audio Classifiers

Francesco Paissan

Luca Della Libera

Mirco Ravanelli

Cem Subakan

Interpreting the decisions of deep learning models, including audio classifiers, is crucial for ensuring the transparency and trustworthiness of this technology. In this paper, we introduce LMAC-ZS (Listenable Maps for Audio Classifiers in the Zero-Shot context), which, to the best of our knowledge, is the first decoder-based post-hoc interpretation method for explaining the decisions of zero-shot audio classifiers. The proposed method utilizes a novel loss function that maximizes the faithfulness to the original similarity between a given text-and-audio pair. We provide an extensive evaluation using the Contrastive Language-Audio Pretraining (CLAP) model to showcase that our interpreter remains faithful to the decisions in a zero-shot classification context. Moreover, we qualitatively show that our method produces meaningful explanations that correlate well with different text prompts.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Do LLMs Build World Representations? Probing Through the Lens of State Abstraction

Zichao Li

Yanshuai Cao

Jackie Cheung

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

Many-Shot In-Context Learning

Rishabh Agarwal

Avi Singh

Lei M Zhang

Bernd Bohnet

Stephanie C.Y. Chan

Luis Rosias

Biao Zhang

Ankesh Anand

Zaheer Abbas

Azade Nova

John D Co-Reyes

Eric Chu

Feryal Behbahani

Aleksandra Faust

Hugo Larochelle

Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples – the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative and discriminative tasks. While promising, many-shot ICL can be bottlenecked by the available amount of human-generated outputs. To mitigate this limitation, we explore two new settings: (1) "Reinforced ICL", which uses model-generated chain-of-thought rationales in place of human rationales, and (2) "Unsupervised ICL", where we remove rationales from the prompt altogether and prompt the model only with domain-specific inputs. We find that both Reinforced and Unsupervised ICL can be quite effective in the many-shot regime, particularly on complex reasoning tasks. We demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases, can learn high-dimensional functions with numerical inputs, and performs comparably to supervised fine-tuning. Finally, we reveal the limitations of next-token prediction loss as an indicator of downstream ICL performance.

2024-09-25

NeurIPS.cc/2024/Conference (spotlight)

doi.org

openreview.net

Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

Aniket Rajiv Didolkar

Anirudh Goyal

Nan Rosemary Ke

Siyuan Guo

Michal Valko

Timothy P Lillicrap

Danilo Jimenez Rezende

Yoshua Bengio

Michael Curtis Mozer

Sanjeev Arora

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Multi-Scale Representation Learning for Protein Fitness Prediction

Zuobai Zhang

Pascal Notin

Yining Huang

Aurelie Lozano

Vijil Chenthamarakshan

Debora Susan Marks

Payel Das

Jian Tang

Designing novel functional proteins crucially depends on accurately modeling their fitness landscape. Given the limited availability of functional annotations from wet-lab experiments, previous methods have primarily relied on self-supervised models trained on vast, unlabeled protein sequence or structure datasets. While initial protein representation learning studies solely focused on either sequence or structural features, recent hybrid architectures have sought to merge these modalities to harness their respective strengths. However, these sequence-structure models have so far achieved only incremental improvements when compared to the leading sequence-only approaches, highlighting unresolved challenges in effectively leveraging these modalities together. Moreover, the function of certain proteins is highly dependent on the granular aspects of their surface topology, which have been overlooked by prior models. To address these limitations, we introduce the Sequence-Structure-Surface Fitness (S3F) model, a novel multimodal representation learning framework that integrates protein features across several scales. Our approach combines sequence representations from a protein language model with Geometric Vector Perceptron networks encoding protein backbone and detailed surface topology. The proposed method achieves state-of-the-art fitness prediction on the ProteinGym benchmark encompassing 217 substitution deep mutational scanning assays, and provides insights into the determinants of protein function. Our code is at https://github.com/DeepGraphLearning/S3F.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

Normalization and effective learning rates in reinforcement learning

Clare Lyle

Zeyu Zheng

Khimya Khetarpal

James Martens

Hado van Hasselt

Razvan Pascanu

Will Dabney

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Offline Multitask Representation Learning for Reinforcement Learning

Haque Ishfaq

Thanh Nguyen-Tang

Songtao Feng

Raman Arora

Mengdi Wang

Ming Yin

Doina Precup

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Parseval Regularization for Continual Reinforcement Learning

Wesley Chung

Lynn Cherif

David Meger

Doina Precup

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net

Periodic agent-state based Q-learning for POMDPs

Amit Sinha

Matthieu Geist

Aditya Mahajan

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

Predicting Future Actions of Reinforcement Learning Agents

Stephen Chung

Scott Niekum

David Scott Krueger

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

QGFN: Controllable Greediness with Action Values

Elaine Lau

Stephen Zhewen Lu

Ling Pan

Doina Precup

Emmanuel Bengio

Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate,

2024-09-25

NeurIPS.cc/2024/Conference (poster)

doi.org

openreview.net
