Publications

Ghost on the Shell: An Expressive Representation of General 3D Shapes

Zhen Liu

Yao Feng

Yuliang Xiu

Weiyang Liu

Liam Paull

Michael J. Black

Bernhard Schölkopf

2024-01-16

ICLR.cc/2024/Conference (oral presentation)

doi.org

openreview.net

Hallucination Detection and Hallucination Mitigation: An Investigation

Junliang Luo

Tianyu Li

Di Wu

M. Jenkin

Steve Liu

Gregory Dudek

2024-01-16

ArXiv (preprint)

doi.org

arxiv.org

How connectivity structure shapes rich and lazy learning in neural circuits

Yuhan Helena Liu

Aristide Baratin

Jonathan Cornford

Stefan Mihalas

Eric Todd Shea-Brown

Guillaume Lajoie

In theoretical neuroscience, recent work leverages deep learning tools to explore how some network attributes critically influence its learning dynamics. Notably, initial weight distributions with small (resp. large) variance may yield a rich (resp. lazy) regime, where significant (resp. minor) changes to network states and representation are observed over the course of learning. However, in biology, neural circuit connectivity generally has a low-rank structure and therefore differs markedly from the random initializations generally used for these studies. As such, here we investigate how the structure of the initial weights, in particular their effective rank, influences the network learning regime. Through both empirical and theoretical analyses, we discover that high-rank initializations typically yield smaller network changes indicative of lazier learning, a finding we also confirm with experimentally-driven initial connectivity in recurrent neural networks. Conversely, low-rank initialization biases learning towards richer learning. Importantly, however, as an exception to this rule, we find lazier learning can still occur with a low-rank initialization that aligns with task and data statistics. Our research highlights the pivotal role of initial weight structures in shaping learning regimes, with implications for metabolic costs of plasticity and risks of catastrophic forgetting.

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net
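
The contrast between low-rank and high-rank initial weights in the abstract above can be illustrated with a small sketch. This is not the authors' code: the function names `low_rank_init` and `matrix_rank`, the Gaussian factors, and the `1/sqrt(rank)` scaling are all assumptions chosen for illustration, and the sketch checks exact rank by elimination rather than the effective rank the paper studies.

```python
import random


def low_rank_init(n, rank, rng, scale=1.0):
    """Build an n x n matrix W = scale / sqrt(rank) * U V^T (hypothetical scheme).

    U and V are n x rank matrices with standard Gaussian entries, so W has
    rank at most `rank` (and exactly `rank` with probability one).
    """
    u = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n)]
    v = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n)]
    norm = scale / rank ** 0.5
    return [
        [norm * sum(u[i][k] * v[j][k] for k in range(rank)) for j in range(n)]
        for i in range(n)
    ]


def matrix_rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in matrix]  # work on a copy
    n_rows, n_cols = len(rows), len(rows[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < tol:
            continue  # column is numerically zero below the current rank
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank
```

Calling `low_rank_init(6, 2, random.Random(0))` gives a 6 x 6 matrix whose rank can be confirmed with `matrix_rank`; a dense Gaussian matrix of the same size would be the high-rank counterpart.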

Improving Intrinsic Exploration by Creating Stationary Objectives

Roger Creus Castanyer

Joshua Romoff

Glen Berseth

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Intelligent Switching for Reset-Free RL

Darshan Patil

Janarthanan Rajendran

Glen Berseth

Sarath Chandar

In the real world, the strong episode resetting mechanisms that are needed to train agents in simulation are unavailable. The resetting assumption limits the potential of reinforcement learning in the real world, as providing resets to an agent usually requires the creation of additional handcrafted mechanisms or human interventions. Recent work aims to train agents (forward) with learned resets by constructing a second (backward) agent that returns the forward agent to the initial state. We find that the termination and timing of the transitions between these two agents are crucial for algorithm success. With this in mind, we create a new algorithm, Reset Free RL with Intelligently Switching Controller (RISC), which intelligently switches between the two agents based on the agent's confidence in achieving its current goal. Our new method achieves state-of-the-art performance on several challenging environments for reset-free RL.

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net
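
The idea of switching between a forward and a backward agent based on confidence, as described in the abstract above, might be sketched as follows. This is a toy illustration only: the function `next_controller`, the `confidence` input, and the rule "switch with probability equal to the confidence" are assumptions made here, not the switching rule defined in the RISC paper.

```python
import random


def next_controller(active, confidence, rng):
    """Decide which agent controls the next step (hypothetical rule).

    active:     "forward" or "backward", the agent currently in control.
    confidence: assumed estimate in [0, 1] that the active agent will
                reach its current goal.
    rng:        a random.Random instance, passed in for reproducibility.

    Assumption: the more confident the active agent is, the less there is
    to learn from continuing, so control switches with probability
    `confidence`.
    """
    if active not in ("forward", "backward"):
        raise ValueError("active must be 'forward' or 'backward'")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if rng.random() < confidence:
        return "backward" if active == "forward" else "forward"
    return active
```

With `confidence=0.0` the active agent always keeps control, and with `confidence=1.0` control always passes to the other agent; values in between switch stochastically.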

Jointly-Learned Exit and Inference for a Dynamic Neural Network

Florence Regol

Joud Chataoui

Mark Coates

2024-01-16

ICLR.cc/2024/Conference (poster)

openreview.net

Large Language Models as Generalizable Policies for Embodied Tasks

Andrew Szot

Max Schwarzer

Harsh Agrawal

Bogdan Mazoure

Walter Talbott

Rin Metcalf

Natalie Mackraz

(Rex) Devon Hjelm

Alexander T Toshev

2024-01-16

ICLR.cc/2024/Conference (poster)

doi.org

openreview.net

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency

Tianhong Li

Sangnie Bhardwaj

Yonglong Tian

Han Zhang

Jarred Barber

Dina Katabi

Guillaume Lajoie

Huiwen Chang

Dilip Krishnan

Current vision-language generative models rely on expansive corpora of paired image-text data to attain optimal performance and generalization capabilities. However, automatically collecting such data (e.g. via large-scale web scraping) leads to low quality and poor image-text correlation, while human annotation is more accurate but requires significant manual effort and expense. We introduce

2024-01-16

ICLR.cc/2024/Conference (spotlight)

doi.org

openreview.net

Local Search GFlowNets

Minsu Kim

Taeyoung Yun

Emmanuel Bengio

Dinghuai Zhang

Yoshua Bengio

Sungsoo Ahn

Jinkyoo Park

Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration of a wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting the high-reward regions of the sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.

2024-01-16

ICLR.cc/2024/Conference (spotlight)

openreview.net

LOQA: Learning with Opponent Q-Learning Awareness

Milad Aghajohari

Juan Agustin Duque

Tim Cooijmans

Aaron Courville

In various real-world scenarios, interactions among agents often resemble the dynamics of general-sum games, where each agent strives to optimize its own utility. Despite the ubiquitous relevance of such settings, decentralized machine learning algorithms have struggled to find equilibria that maximize individual utility while preserving social welfare. In this paper we introduce Learning with Opponent Q-Learning Awareness (LOQA), a novel reinforcement learning algorithm tailored to optimizing an agent's individual utility while fostering cooperation among adversaries in partially competitive environments. LOQA assumes that each agent samples actions proportionally to their action-value function Q. Experimental results demonstrate the effectiveness of LOQA at achieving state-of-the-art performance in benchmark scenarios such as the Iterated Prisoner's Dilemma and the Coin Game. LOQA achieves these outcomes with a significantly reduced computational footprint compared to previous works, making it a promising approach for practical multi-agent applications.

2024-01-16

ICLR.cc/2024/Conference (poster)

openreview.net
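
The assumption stated in the abstract above, that an agent samples actions in proportion to its action-value function Q, can be made concrete with a short sketch. The softmax form and the `temperature` parameter are assumptions introduced here (a softmax keeps probabilities valid when Q-values are negative); the function names are hypothetical and this is not the LOQA implementation.

```python
import math
import random


def q_to_policy(q_values, temperature=1.0):
    """Map Q-values to action probabilities with a softmax (assumed form)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum for numerical stability; the result is unchanged.
    top = max(q_values)
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(q_values, rng, temperature=1.0):
    """Draw one action index according to the Q-derived policy."""
    probs = q_to_policy(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

For example, in a two-action game such as the Iterated Prisoner's Dilemma, equal Q-values for "cooperate" and "defect" give each action probability 0.5, and raising one Q-value shifts probability mass toward that action.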

Mastering Memory Tasks with World Models

Mohammad Reza Samsami

Artem Zholus

Janarthanan Rajendran

Sarath Chandar

Current model-based reinforcement learning (MBRL) agents struggle with long-term dependencies. This limits their ability to effectively solve tasks involving extended time gaps between actions and outcomes, or tasks demanding the recalling of distant observations to inform current actions. To improve temporal coherence, we integrate a new family of state space models (SSMs) in world models of MBRL agents to present a new method, Recall to Imagine (R2I). This integration aims to enhance both long-term memory and long-horizon credit assignment. Through a diverse set of illustrative tasks, we systematically demonstrate that R2I not only establishes a new state-of-the-art for challenging memory and credit assignment RL tasks, such as BSuite and POPGym, but also showcases superhuman performance in the complex memory domain of Memory Maze. At the same time, it upholds comparable performance in classic RL tasks, such as Atari and DMC, suggesting the generality of our method. We also show that R2I is faster than the state-of-the-art MBRL method, DreamerV3, resulting in faster wall-time convergence.

2024-01-16

ICLR.cc/2024/Conference (oral presentation)

doi.org

openreview.net
