Publications

Self-Supervised Disentanglement by Leveraging Structure in Data Augmentations

Cian Eastwood

Julius von Kügelgen

Linus Ericsson

Diane Bouchacourt

Mark Ibrahim

Bernhard Schölkopf

Self-supervised representation learning often uses data augmentations to induce some invariance to "style" attributes of the data. However, … (voir plus)with downstream tasks generally unknown at training time, it is difficult to deduce a priori which attributes of the data are indeed "style" and can be safely discarded. To address this, we introduce a more principled approach that seeks to disentangle style features rather than discard them. The key idea is to add multiple style embedding spaces where: (i) each is invariant to all-but-one augmentation; and (ii) joint entropy is maximized. We formalize our structured data-augmentation procedure from a causal latent-variable-model perspective, and prove identifiability of both content and (multiple blocks of) style variables. We empirically demonstrate the benefits our approach on synthetic datasets and then present promising but limited results on ImageNet.

2023-10-27

NeurIPS.cc/2023/Workshop/CRL (poster)

openreview.net

SORBET: A Siamese Network for Ontology Embeddings Using a Distance-Based Regression Loss and BERT

Francis Gosselin

Amal Zouaq

2023-10-27

The Semantic Web – ISWC 2023 (publié)

doi.org

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

Alvaro Carbonero

Alexandre AGM Duval

Victor Schmidt

Santiago Miret

Alex Hernandez-Garcia

Yoshua Bengio

David Rolnick

The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorpor… (voir plus)ate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g.~when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Mat (poster)

doi.org

openreview.net

Towards equilibrium molecular conformation generation with GFlowNets

Alexandra Volokhova

Michał Koziarski

Alex Hernandez-Garcia

Cheng-Hao Liu

Santiago Miret

Pablo Lemos

Luca Thiede

Zichao Yan

Alán Aspuru-Guzik

Yoshua Bengio

Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this pa… (voir plus)per we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and discovers a diverse set of low-energy conformations for highly flexible drug-like molecules. We demonstrate that GFlowNet can reproduce molecular potential energy surfaces by sampling proportionally to the Boltzmann distribution.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Mat (poster)

doi.org

openreview.net

Interacting Diffusion Processes for Event Sequence Forecasting

Mai Zeng

Florence Regol

Mark Coates

Neural Temporal Point Processes (TPPs) have emerged as the primary framework for predicting sequences of events that occur at irregular time… (voir plus) intervals, but their sequential nature can hamper performance for long-horizon forecasts. To address this, we introduce a novel approach that incorporates a diffusion generative model. The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences. In contrast to previous approaches, our model directly learns the joint probability distribution of types and inter-arrival times for multiple events. This allows us to fully leverage the high dimensional modeling capability of modern generative models. Our model is composed of two diffusion processes, one for the time intervals and one for the event types. These processes interact through their respective denoising functions, which can take as input intermediate representations from both processes, allowing the model to learn complex interactions. We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of TPP.

2023-10-26

ArXiv (prépublication)

doi.org

arxiv.org

Surrogate Minimization: An Optimization Algorithm for Training Large Neural Networks with Model Parallelism

Reza Asad

Reza Babanezhad Harikandeh

Issam Hadj Laradji

Nicolas Le Roux

Sharan Vaswani

2023-10-26

NeurIPS.cc/2023/Workshop/OPT (poster)

openreview.net

The value of standards for health datasets in artificial intelligence-based applications

Anmol Arora

Joseph E. Alderman

Joanne Palmer

Shaswath Ganapathi

Elinor Laws

Melissa D. McCradden

Lauren Oakden-Rayner

Stephen R. Pfohl

Marzyeh Ghassemi

Francis McKay

Darren Treanor

Negar Rostamzadeh

Bilal Mateen

Jacqui Gath

Adewole O. Adebajo

Stephanie Kuku

Rubeta Matin

Katherine Heller

Elizabeth Sapey

Neil J. Sebire … (voir 4 de plus)

Heather Cole-Lewis

Melanie Calvert

Alastair Denniston

Xiaoxuan Liu

2023-10-26

Nature Medicine (publié)

doi.org

CL-MASR: A Continual Learning Benchmark for Multilingual ASR

Luca Della Libera

Pooneh Mousavi

Salah Zaiem

Cem (Yusuf) Subakan

Mirco Ravanelli

2023-10-25

ArXiv (prépublication)

doi.org

arxiv.org

PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design

Chuanrui Wang

Bozitao Zhong

Zuobai Zhang

Narendra Chaudhary

Sanchit Misra

Jian Tang

Structure-based protein design has attracted increasing interest, with numerous methods being introduced in recent years. However, a univers… (voir plus)ally accepted method for evaluation has not been established, since the wet-lab validation can be overly time-consuming for the development of new algorithms, and the

2023-10-25

NeurIPS.cc/2023/Workshop/AI4D3 (poster)

openreview.net

PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design

Chuanrui Wang

Bozitao Zhong

Zuobai Zhang

Narendra Chaudhary

Sanchit Misra

Jian Tang

Structure-based protein design has attracted increasing interest, with numerous methods being introduced in recent years. However, a univers… (voir plus)ally accepted method for evaluation has not been established, since the wet-lab validation can be overly time-consuming for the development of new algorithms, and the

2023-10-25

NeurIPS.cc/2023/Workshop/AI4D3 (poster)

openreview.net

Role of Structural and Conformational Diversity for Machine Learning Potentials

Nikhil Shenoy

Prudencio Tossou

Emmanuel Noutahi

Hadrien Mary

Dominique Beaini

Jiarui Ding

In the field of Machine Learning Interatomic Potentials (MLIPs), understanding the intricate relationship between data biases, specifically … (voir plus)conformational and structural diversity, and model generalization is critical in improving the quality of Quantum Mechanics (QM) data generation efforts. We investigate these dynamics through two distinct experiments: a fixed budget one, where the dataset size remains constant, and a fixed molecular set one, which focuses on fixed structural diversity while varying conformational diversity. Our results reveal nuanced patterns in generalization metrics. Notably, for optimal structural and conformational generalization, a careful balance between structural and conformational diversity is required, but existing QM datasets do not meet that trade-off. Additionally, our results highlight the limitation of the MLIP models at generalizing beyond their training distribution, emphasizing the importance of defining applicability domain during model deployment. These findings provide valuable insights and guidelines for QM data generation efforts.

2023-10-25

NeurIPS.cc/2023/Workshop/AI4D3 (poster)

doi.org

openreview.net

Understanding Graph Neural Networks with Generalized Geometric Scattering Transforms

Michael Perlmutter

Alexander Tong

Feng Gao

Guy Wolf

Matthew Hirn

The scattering transform is a multilayered wavelet-based deep learning architecture that acts as a model of convolutional neural networks. R… (voir plus)ecently, several works have introduced generalizations of the scattering transform for non-Euclidean settings such as graphs. Our work builds upon these constructions by introducing windowed and non-windowed geometric scattering transforms for graphs based upon a very general class of asymmetric wavelets. We show that these asymmetric graph scattering transforms have many of the same theoretical guarantees as their symmetric counterparts. As a result, the proposed construction unifies and extends known theoretical results for many of the existing graph scattering architectures. In doing so, this work helps bridge the gap between geometric scattering and other graph neural networks by introducing a large family of networks with provable stability and invariance guarantees. These results lay the groundwork for future deep learning architectures for graph-structured data that have learned filters and also provably have desirable theoretical properties.

2023-10-25

SIAM Journal on Mathematics of Data Science (publié)

doi.org

arxiv.org

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Publications

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Mots-clés populaires:

Publications