Publications

Discrete, compositional, and symbolic representations through attractor dynamics

Andrew Nam

Chen Sun

Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set. It serves as a useful abstraction for reasoning in both cognitive science and AI, yet the interface between continuous and symbolic processing is often imposed by fiat at the algorithmic level, such as by means of quantization or a softmax sampling step. In this work, we explore how discretization could be implemented in a more neurally plausible manner through the modeling of attractor dynamics that partition the continuous representation space into basins that correspond to sequences of symbols. Building on established work in attractor networks and introducing novel training methods, we show that imposing structure in the symbolic space can produce compositionality in the attractor-supported representation space of rich sensory inputs. Lastly, we argue that our model exhibits the process of an information bottleneck that is thought to play a role in conscious experience, decomposing the rich information of a sensory input into stable components encoding symbolic information.

2023-10-26

NeurIPS.cc/2023/Workshop/InfoCog (oral presentation)

doi.org

openreview.net

Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search

Abbas Mehrabian

Ankit Anand

Hyunjik Kim

Nicolas Sonnerat

Matej Balog

Gheorghe Comanici

Tudor Berariu

Andrew Lee

Anian Ruoss

Anna Bulanova

Daniel Toyama

Sam Blackwell

Bernardino Romera Paredes

Petar Veličković

Laurent Orseau

Joonkyung Lee

Anurag Murty Naredla

Doina Precup

Adam Zsolt Wagner

2023-10-26

NeurIPS.cc/2023/Workshop/MATH-AI (poster)

doi.org

openreview.net

Forecaster: Towards Temporally Abstract Tree-Search Planning from Pixels

The ability to plan at many different levels of abstraction enables agents to envision the long-term repercussions of their decisions and thus enables sample-efficient learning. This becomes particularly beneficial in complex environments with high-dimensional state spaces such as pixels, where the goal is distant and the reward sparse. We introduce Forecaster, a deep hierarchical reinforcement learning approach which plans over high-level goals leveraging a temporally abstract world model. Forecaster learns an abstract model of its environment by modelling the transition dynamics at an abstract level and training a world model on such transitions. It then uses this world model to choose optimal high-level goals through a tree-search planning procedure. It additionally trains a low-level policy that learns to reach those goals. Our method not only builds world models with longer horizons but also plans with such models in downstream tasks. We empirically demonstrate Forecaster's potential in both single-task learning and generalization to new tasks in the AntMaze domain.

2023-10-26

NeurIPS.cc/2023/Workshop/GenPlan (published)

doi.org

openreview.net

Improving Generalization in Reinforcement Learning Training Regimes for Social Robot Navigation

Adam Sigal

Hsiu-Chin Lin

AJung Moon

In order for autonomous mobile robots to navigate in human spaces, they must abide by our social norms. Reinforcement learning (RL) has emerged as an effective method to train robot sequential decision-making policies that are able to respect these norms. However, a large portion of existing work in the field conducts both RL training and testing in simplistic environments. This limits the generalization potential of these models to unseen environments, and undermines the meaningfulness of their reported results. We propose a method to improve the generalization performance of RL social navigation methods using curriculum learning. By employing multiple environment types and by modeling pedestrians using multiple dynamics models, we are able to progressively diversify and escalate difficulty in training. Our results show that curriculum learning in training can achieve better generalization performance than previous training methods. We also show that many existing state-of-the-art RL social navigation works do not evaluate their methods outside of their training environments, and thus their reported results do not reflect their policies' failure to adequately generalize to out-of-distribution scenarios. In response, we validate our training approach on larger and more crowded testing environments than those used in training, allowing for more meaningful measurements of model performance.

2023-10-26

NeurIPS.cc/2023/Workshop/GenPlan (published)

openreview.net

Latent Space Simulator for Unveiling Molecular Free Energy Landscapes and Predicting Transition Dynamics

Simon Dobers

Hannes Stärk

Xiang Fu

Dominique Beaini

Stephan Günnemann

Free Energy Surfaces (FES) and metastable transition rates are key elements in understanding the behavior of molecules within a system. However, the typical approaches require computing force fields across billions of time steps in a molecular dynamics (MD) simulation, which is often considered intractable when dealing with large systems or databases. In this work, we propose LaMoDy, a latent-space MD simulator, to effectively tackle the intractability with around 20-fold speed improvements compared to classical MD. The model leverages a chirality-aware SE(3)-invariant encoder-decoder architecture to generate a latent space coupled with a recurrent neural network to run the time-wise dynamics. We show that LaMoDy effectively recovers realistic trajectories and FES more accurately and faster than existing methods while capturing their major dynamical and conformational properties. Furthermore, the proposed approach can generalize to molecules outside the training distribution.

2023-10-26

NeurIPS.cc/2023/Workshop/AI4Science (poster)

openreview.net

Learning Macro Variables with Auto-encoders

Dhanya Sridhar

Eric Elmoznino

Maitreyi Swaroop

2023-10-26

NeurIPS.cc/2023/Workshop/CRL (poster)

openreview.net

Learning Optimizers for Local SGD

Charles-Etienne Joseph

2023-10-26

NeurIPS.cc/2023/Workshop/Federated_Learning (poster)

openreview.net

Learning Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

Max Schwarzer

Jesse Farebrother

Joshua Greaves

Kevin Roccapriore

Ekin Dogus Cubuk

Rishabh Agarwal

Aaron Courville

Bellemare Marc-Emmanuel

Sergei Kalinin

Igor Mordatch

Pablo Samuel Castro

We introduce a machine learning approach to determine the transition rates of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM). Our method is data-centric, leveraging data collected on a STEM. The data samples are processed and filtered to produce symbolic representations, which we use to train a neural network to predict transition rates. These rates are then applied to guide a single silicon atom throughout the lattice to pre-determined target destinations. We present empirical analyses that demonstrate the efficacy and generality of our approach.

2023-10-26

NeurIPS.cc/2023/Workshop/AI4Mat (spotlight)

openreview.net

Role of Structural and Conformational Diversity for Machine Learning Potentials

Nikhil Shenoy

Prudencio Tossou

Emmanuel Noutahi

Hadrien Mary

Dominique Beaini

Jiarui Ding

In the field of Machine Learning Interatomic Potentials (MLIPs), understanding the intricate relationship between data biases, specifically conformational and structural diversity, and model generalization is critical in improving the quality of Quantum Mechanics (QM) data generation efforts. We investigate these dynamics through two distinct experiments: a fixed budget one, where the dataset size remains constant, and a fixed molecular set one, which focuses on fixed structural diversity while varying conformational diversity. Our results reveal nuanced patterns in generalization metrics. Notably, for optimal structural and conformational generalization, a careful balance between structural and conformational diversity is required, but existing QM datasets do not meet that trade-off. Additionally, our results highlight the limitation of the MLIP models at generalizing beyond their training distribution, emphasizing the importance of defining an applicability domain during model deployment. These findings provide valuable insights and guidelines for QM data generation efforts.

2023-10-26

NeurIPS.cc/2023/Workshop/AI4Science (oral presentation)

doi.org

openreview.net

Self-Supervised Disentanglement by Leveraging Structure in Data Augmentations

Cian Eastwood

Julius von Kügelgen

Linus Ericsson

Diane Bouchacourt

P Vincent

Mark Ibrahim

Bernhard Schölkopf

Self-supervised representation learning often uses data augmentations to induce some invariance to "style" attributes of the data. However, with downstream tasks generally unknown at training time, it is difficult to deduce a priori which attributes of the data are indeed "style" and can be safely discarded. To address this, we introduce a more principled approach that seeks to disentangle style features rather than discard them. The key idea is to add multiple style embedding spaces where: (i) each is invariant to all-but-one augmentation; and (ii) joint entropy is maximized. We formalize our structured data-augmentation procedure from a causal latent-variable-model perspective, and prove identifiability of both content and (multiple blocks of) style variables. We empirically demonstrate the benefits of our approach on synthetic datasets and then present promising but limited results on ImageNet.

2023-10-26

NeurIPS.cc/2023/Workshop/CRL (poster)

openreview.net

SORBET: A Siamese Network for Ontology Embeddings Using a Distance-Based Regression Loss and BERT

Francis Gosselin

Amal Zouaq

2023-10-26

The Semantic Web – ISWC 2023 (published)

doi.org

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

Alvaro Carbonero

Alexandre Duval

Victor Schmidt

Santiago Miret

Alex Hernández-García

Yoshua Bengio

David Rolnick

The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to a catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.

2023-10-26

NeurIPS.cc/2023/Workshop/AI4Mat (poster)

doi.org

openreview.net
