Publications

Learning Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy
Max Schwarzer
Jesse Farebrother
Joshua Greaves
Kevin Roccapriore
Ekin Dogus Cubuk
Rishabh Agarwal
Sergei Kalinin
Igor Mordatch
We introduce a machine learning approach to determine the transition rates of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM). Our method is data-centric, leveraging data collected on a STEM. The data samples are processed and filtered to produce symbolic representations, which we use to train a neural network to predict transition rates. These rates are then applied to guide a single silicon atom throughout the lattice to pre-determined target destinations. We present empirical analyses that demonstrate the efficacy and generality of our approach.
Role of Structural and Conformational Diversity for Machine Learning Potentials
Nikhil Shenoy
Prudencio Tossou
Emmanuel Noutahi
Hadrien Mary
Jiarui Ding
In the field of Machine Learning Interatomic Potentials (MLIPs), understanding the intricate relationship between data biases, specifically conformational and structural diversity, and model generalization is critical in improving the quality of Quantum Mechanics (QM) data generation efforts. We investigate these dynamics through two distinct experiments: a fixed budget one, where the dataset size remains constant, and a fixed molecular set one, which focuses on fixed structural diversity while varying conformational diversity. Our results reveal nuanced patterns in generalization metrics. Notably, for optimal structural and conformational generalization, a careful balance between structural and conformational diversity is required, but existing QM datasets do not meet that trade-off. Additionally, our results highlight the limitation of the MLIP models at generalizing beyond their training distribution, emphasizing the importance of defining applicability domain during model deployment. These findings provide valuable insights and guidelines for QM data generation efforts.
Self-Supervised Disentanglement by Leveraging Structure in Data Augmentations
Cian Eastwood
Julius von Kügelgen
Linus Ericsson
Diane Bouchacourt
Mark Ibrahim
Bernhard Schölkopf
Self-supervised representation learning often uses data augmentations to induce some invariance to "style" attributes of the data. However, with downstream tasks generally unknown at training time, it is difficult to deduce a priori which attributes of the data are indeed "style" and can be safely discarded. To address this, we introduce a more principled approach that seeks to disentangle style features rather than discard them. The key idea is to add multiple style embedding spaces where: (i) each is invariant to all-but-one augmentation; and (ii) joint entropy is maximized. We formalize our structured data-augmentation procedure from a causal latent-variable-model perspective, and prove identifiability of both content and (multiple blocks of) style variables. We empirically demonstrate the benefits of our approach on synthetic datasets and then present promising but limited results on ImageNet.
SORBET: A Siamese Network for Ontology Embeddings Using a Distance-Based Regression Loss and BERT
Francis Gosselin
On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions
Alvaro Carbonero
Alexandre AGM Duval
Victor Schmidt
Santiago Miret
Alex Hernandez-Garcia
The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to a catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.
Towards equilibrium molecular conformation generation with GFlowNets
Alexandra Volokhova
Michał Koziarski
Alex Hernandez-Garcia
Cheng-Hao Liu
Santiago Miret
Pablo Lemos
Luca Thiede
Zichao Yan
Alan Aspuru-Guzik
Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this paper we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and discovers a diverse set of low-energy conformations for highly flexible drug-like molecules. We demonstrate that GFlowNet can reproduce molecular potential energy surfaces by sampling proportionally to the Boltzmann distribution.
Interacting Diffusion Processes for Event Sequence Forecasting
Mai Zeng
Florence Regol
Neural Temporal Point Processes (TPPs) have emerged as the primary framework for predicting sequences of events that occur at irregular time intervals, but their sequential nature can hamper performance for long-horizon forecasts. To address this, we introduce a novel approach that incorporates a diffusion generative model. The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences. In contrast to previous approaches, our model directly learns the joint probability distribution of types and inter-arrival times for multiple events. This allows us to fully leverage the high dimensional modeling capability of modern generative models. Our model is composed of two diffusion processes, one for the time intervals and one for the event types. These processes interact through their respective denoising functions, which can take as input intermediate representations from both processes, allowing the model to learn complex interactions. We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of TPP.
Managing AI Risks in an Era of Rapid Progress
Geoffrey Hinton
Andrew Yao
Dawn Song
Pieter Abbeel
Trevor Darrell
Yuval Noah Harari
Ya-Qin Zhang
Lan Xue
Shai Shalev-Shwartz
Gillian K. Hadfield
Jeff Clune
Frank Hutter
Atilim Güneş Baydin
Sheila McIlraith
Qiqi Gao
Ashwin Acharya
Anca Dragan
Philip Torr
Stuart Russell
Daniel Kahneman
Jan Brauner
Sören Mindermann
Surrogate Minimization: An Optimization Algorithm for Training Large Neural Networks with Model Parallelism
Reza Asad
Reza Babanezhad Harikandeh
Issam Hadj Laradji
Sharan Vaswani
The value of standards for health datasets in artificial intelligence-based applications
Anmol Arora
Joseph E. Alderman
Joanne Palmer
Shaswath Ganapathi
Elinor Laws
Melissa D. McCradden
Lauren Oakden-Rayner
Stephen R. Pfohl
Marzyeh Ghassemi
Francis McKay
Darren Treanor
Bilal Mateen
Jacqui Gath
Adewole O. Adebajo
Stephanie Kuku
Rubeta Matin
Katherine Heller
Elizabeth Sapey
Neil J. Sebire
Heather Cole-Lewis
Melanie Calvert
Alastair Denniston
Xiaoxuan Liu
PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design
Chuanrui Wang
Bozitao Zhong
Zuobai Zhang
Narendra Chaudhary
Sanchit Misra
Structure-based protein design has attracted increasing interest, with numerous methods being introduced in recent years. However, a universally accepted method for evaluation has not been established, since the wet-lab validation can be overly time-consuming for the development of new algorithms, and the