
Siamak Ravanbakhsh

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, School of Computer Science
Research Topics
Active Learning
AI Alignment
AI for Science
Bayesian Inference
Deep Learning
Generalization
Generative Models
Learning on Graphs
Probabilistic Models
Reasoning
Reinforcement Learning
Representation Learning
Symmetry

Biography

Siamak Ravanbakhsh is an associate professor at McGill University’s School of Computer Science and a core academic member of Mila – Quebec Artificial Intelligence Institute.

Before joining McGill and Mila, he held a similar position at the University of British Columbia. Prior to that, he was a postdoctoral fellow at the Machine Learning Department and Robotics Institute of Carnegie Mellon University. He completed his PhD at the University of Alberta.

Ravanbakhsh’s research centres on problems of representation learning, in particular the principled use of geometry, probabilistic inference and symmetry.

Current Students

PhD - McGill University
Master's Research - McGill University
Master's Research - McGill University
Professional Master's - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
Master's Research - McGill University
PhD - McGill University
Collaborating researcher - McGill University
Postdoctorate - McGill University
Master's Research - McGill University
Collaborating Alumni - McGill University

Publications

Lie Point Symmetry and Physics Informed Networks
Symmetries have been leveraged to improve the generalization of neural networks through different mechanisms from data augmentation to equivariant architectures. However, despite their potential, their integration into neural solvers for partial differential equations (PDEs) remains largely unexplored. We explore the integration of PDE symmetries, known as Lie point symmetries, in a major family of neural solvers known as physics-informed neural networks (PINNs). We propose a loss function that informs the network about Lie point symmetries in the same way that PINN models try to enforce the underlying PDE through a loss function. Intuitively, our symmetry loss ensures that the infinitesimal generators of the Lie group conserve the PDE solutions. Effectively, this means that once the network learns a solution, it also learns the neighbouring solutions generated by Lie point symmetries. Empirical evaluations indicate that the inductive bias introduced by the Lie point symmetries of the PDEs greatly boosts the sample efficiency of PINNs.
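The key property the symmetry loss exploits can be checked numerically. The following is a minimal, hypothetical sketch (not the paper's code): for the 1D heat equation u_t = u_xx, time translation t → t + s is a Lie point symmetry, so it maps solutions to solutions; a symmetry loss can therefore penalize the PDE residual of the transformed candidate solution, just as the standard PINN loss penalizes the residual of the solution itself.

```python
import numpy as np

def u(x, t):
    """Exact solution of the heat equation u_t = u_xx."""
    return np.exp(-t) * np.sin(x)

def pde_residual(f, x, t, h=1e-3):
    """|u_t - u_xx| estimated with central finite differences."""
    u_t = (f(x, t + h) - f(x, t - h)) / (2 * h)
    u_xx = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h**2
    return np.abs(u_t - u_xx)

x = np.linspace(0.1, 3.0, 50)
t = np.full_like(x, 0.5)

# Residual of the solution itself (analogue of the PINN loss term) ...
base_loss = pde_residual(u, x, t).max()
# ... and of its time-translated version (analogue of the symmetry loss):
shifted = lambda x_, t_: u(x_, t_ + 0.3)
sym_loss = pde_residual(shifted, x, t).max()
# Both residuals are near zero: the translated field is again a solution.
```

In a PINN both terms would be evaluated on the network's output rather than on an exact solution, and the symmetry would be applied through the group's infinitesimal generators rather than a finite shift.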
Using Multiple Vector Channels Improves E(n)-Equivariant Graph Neural Networks
Sékou-Oumar Kaba
Carmelo Gonzales
Santiago Miret
Equivariance with Learned Canonicalization Functions
Sékou-Oumar Kaba
Arnab Kumar Mondal
Symmetry-based neural networks often constrain the architecture in order to achieve invariance or equivariance to a group of transformations. In this paper, we propose an alternative that avoids this architectural constraint by learning to produce canonical representations of the data. These canonicalization functions can readily be plugged into non-equivariant backbone architectures. We offer explicit ways to implement them for some groups of interest. We show that this approach enjoys universality while providing interpretable insights. Our main hypothesis, supported by our empirical results, is that learning a small neural network to perform canonicalization is better than using predefined heuristics. Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks, including image classification.
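A toy sketch of the canonicalization idea for 2D rotations, using a fixed heuristic where the paper learns a small network instead (all names here are illustrative, not the paper's implementation): map every point cloud to a canonical orientation, then feed it to any non-equivariant backbone.

```python
import numpy as np

def canonicalize(points):
    """Rotate a 2D point cloud so the centroid-to-farthest-point
    direction lies on the +x axis (a predefined heuristic; the paper
    learns this function instead)."""
    centered = points - points.mean(axis=0)
    far = centered[np.argmax(np.linalg.norm(centered, axis=1))]
    theta = np.arctan2(far[1], far[0])
    rot = np.array([[np.cos(-theta), -np.sin(-theta)],
                    [np.sin(-theta),  np.cos(-theta)]])
    return centered @ rot.T

def backbone(points):
    # Any plain, non-equivariant function of the coordinates;
    # invariance comes solely from canonicalizing first.
    return points.sum()

rng = np.random.default_rng(0)
cloud = rng.normal(size=(8, 2))
phi = 1.234
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])

out1 = backbone(canonicalize(cloud))
out2 = backbone(canonicalize(cloud @ R.T))  # rotated copy
# out1 and out2 agree up to floating-point error: the composed map
# backbone(canonicalize(.)) is rotation invariant by construction.
```

The heuristic breaks down when the farthest point is not unique; one of the paper's motivations for learning the canonicalization function is to avoid such brittle, hand-picked rules.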
Characterization of inpaint residuals in interferometric measurements of the epoch of reionization
Michael Pagano
Jing Liu
Adrian Liu
Nicholas S. Kern
Aaron Ewall-Wice
Philip Bull
Robert Pascua
Zara Abdurashidova
Tyrone Adams
James E. Aguirre
Paul Alexander
Zaki S. Ali
Rushelle Baartman
Yanga Balfour
Adam P. Beardsley
Gianni Bernardi
Tashalee S. Billings
Judd D. Bowman
Richard F. Bradley
Jacob Burba
Steven Carey
Chris L. Carilli
Carina Cheng
David R. DeBoer
Eloy de Lera Acedo
Matt Dexter
Joshua S. Dillon
Nico Eksteen
John Ely
Nicolas Fagnoni
Randall Fritz
Steven R. Furlanetto
Kingsley Gale-Sides
Brian Glendenning
Deepthi Gorthi
Bradley Greig
Jasper Grobbelaar
Ziyaad Halday
Bryna J. Hazelton
Jacqueline N. Hewitt
Jack Hickish
Daniel C. Jacobs
Austin Julius
MacCalvin Kariseb
Joshua Kerrigan
Piyanat Kittiwisit
Saul A. Kohn
Matthew Kolopanis
Adam Lanman
Paul La Plante
Anita Loots
David Harold Edward MacMahon
Lourence Malan
Cresshim Malgas
Keith Malgas
Bradley Marero
Zachary E. Martinot
Andrei Mesinger
Mathakane Molewa
Miguel F. Morales
Tshegofalang Mosiane
Abraham R. Neben
Bojan Nikolic
Hans Nuwegeld
Aaron R. Parsons
Nipanjana Patra
Samantha Pieterse
Nima Razavi-Ghods
James Robnett
Kathryn Rosie
Peter Sims
Craig Smith
Hilton Swarts
Nithyanandan Thyagarajan
Pieter van Wyngaarden
Peter K. G. Williams
Haoxuan Zheng
Radio Frequency Interference (RFI) is one of the systematic challenges preventing 21cm interferometric instruments from detecting the Epoch of Reionization. To mitigate the effects of RFI on data analysis pipelines, numerous inpainting techniques have been developed to restore RFI-corrupted data. We examine the qualitative and quantitative errors introduced into the visibilities and power spectrum due to inpainting. We perform our analysis on simulated data as well as real data from the Hydrogen Epoch of Reionization Array (HERA) Phase 1 upper limits. We also introduce a convolutional neural network that is capable of inpainting RFI-corrupted data in interferometric instruments. We train our network on simulated data and show that it is capable of inpainting real data without retraining. We find that techniques that incorporate high wavenumbers in delay space in their modeling are best suited for inpainting over narrowband RFI. We also show that with our fiducial parameters Discrete Prolate Spheroidal Sequences (DPSS) and CLEAN provide the best performance for intermittent ``narrowband'' RFI, while Gaussian Process Regression (GPR) and Least Squares Spectral Analysis (LSSA) provide the best performance for larger RFI gaps. However, we caution that these qualitative conclusions are sensitive to the chosen hyperparameters of each inpainting technique. We find these results to be consistent in both simulated and real visibilities. We show that all inpainting techniques reliably reproduce foreground-dominated modes in the power spectrum. Since the inpainting techniques should not be capable of reproducing noise realizations, we find that the largest errors occur in the noise-dominated delay modes. We show that in the future, as the noise level of the data comes down, CLEAN and DPSS are most capable of reproducing the fine frequency structure in the visibilities of HERA data.
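The "low delay" intuition behind these inpainting methods can be illustrated with a simplified stand-in for DPSS/CLEAN (hypothetical toy code, not the paper's pipeline): fit a small set of low-delay Fourier modes to the unflagged frequency channels by least squares, then evaluate the model inside the flagged RFI gap.

```python
import numpy as np

n_freq = 128
freqs = np.arange(n_freq)
rng = np.random.default_rng(1)

# Smooth, foreground-like visibility spectrum: a few low-delay modes.
true_delays = np.array([0, 1, 2, 3])
amps = rng.normal(size=true_delays.size) + 1j * rng.normal(size=true_delays.size)
vis = np.exp(2j * np.pi * np.outer(freqs, true_delays) / n_freq) @ amps

# Flag a narrowband RFI gap.
flags = np.zeros(n_freq, dtype=bool)
flags[60:68] = True

# Low-delay design matrix; fit coefficients on unflagged channels only.
model_delays = np.arange(6)
A = np.exp(2j * np.pi * np.outer(freqs, model_delays) / n_freq)
coef, *_ = np.linalg.lstsq(A[~flags], vis[~flags], rcond=None)

inpainted = vis.copy()
inpainted[flags] = A[flags] @ coef  # fill the gap with the smooth model

gap_error = np.max(np.abs(inpainted[flags] - vis[flags]))
# gap_error is near zero because the truth lies in the model's span
```

Real DPSS inpainting uses prolate spheroidal sequences rather than complex exponentials, and real spectra contain noise that the smooth model cannot (and should not) reproduce, which is why the paper finds the largest errors in noise-dominated delay modes.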
Galaxies on graph neural networks: towards robust synthetic galaxy catalogs with deep generative models
Yesukhei Jagvaral
François Lanusse
Sukhdeep Singh
Rachel Mandelbaum
Duncan Campbell
Future astronomical imaging surveys are set to provide precise constraints on cosmological parameters, such as dark energy. However, production of synthetic data for these surveys, to test and validate analysis methods, suffers from a very high computational cost. In particular, generating mock galaxy catalogs at sufficiently large volume and high resolution will soon become computationally unreachable. In this paper, we address this problem with a Deep Generative Model to create robust mock galaxy catalogs that may be used to test and develop the analysis pipelines of future weak lensing surveys. We build our model on custom-built Graph Convolutional Networks, placing each galaxy on a graph node and then connecting the graphs within each gravitationally bound system. We train our model on a cosmological simulation with realistic galaxy populations to capture the 2D and 3D orientations of galaxies. The samples from the model exhibit comparable statistical properties to those in the simulations. To the best of our knowledge, this is the first instance of a generative model on graphs in an astrophysical/cosmological context.
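The graph construction described above can be sketched in a few lines (an illustrative toy, not the paper's pipeline): one graph per gravitationally bound system, formed by connecting all galaxies that share a halo id, which yields a block-diagonal adjacency over the catalog.

```python
import numpy as np

# Toy catalog: the halo id of each galaxy (real catalogs would carry
# positions, shapes and orientations as node features).
halo_id = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
n = halo_id.size

# Adjacency: an edge iff two galaxies share a halo, no self-loops,
# so each halo becomes a fully connected subgraph.
adj = (halo_id[:, None] == halo_id[None, :]).astype(float)
np.fill_diagonal(adj, 0.0)

# One round of mean-aggregation message passing over node features,
# the basic operation a graph convolutional layer builds on.
feats = np.arange(n, dtype=float)[:, None]
deg = adj.sum(axis=1, keepdims=True)
messages = adj @ feats / deg  # each node averages its neighbours
```

Because the adjacency is block-diagonal, information only flows within a halo, matching the physical assumption that alignments are driven by the local bound system.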
Galaxies and Halos on Graph Neural Networks: Deep Generative Modeling Scalar and Vector Quantities for Intrinsic Alignment
Yesukhei Jagvaral
François Lanusse
Sukhdeep Singh
Rachel Mandelbaum
Duncan Campbell
In order to prepare for the upcoming wide-field cosmological surveys, large simulations of the Universe with realistic galaxy populations are required. In particular, the tendency of galaxies to naturally align towards overdensities, an effect called intrinsic alignments (IA), can be a major source of systematics in weak lensing analysis. As the details of galaxy formation and evolution relevant to IA cannot be simulated in practice on such volumes, we propose as an alternative a Deep Generative Model. This model is trained on the IllustrisTNG-100 simulation and is capable of sampling the orientations of a population of galaxies so as to recover the correct alignments. In our approach, we model the cosmic web as a set of graphs, where the graphs are constructed for each halo, and galaxy orientations as a signal on those graphs. The generative model is implemented on a Generative Adversarial Network architecture and uses specifically designed Graph-Convolutional Networks sensitive to the relative 3D positions of the vertices. Given (sub)halo masses and tidal fields, the model is able to learn and predict scalar features such as galaxy and dark matter subhalo shapes, and, more importantly, vector features such as the 3D orientation of the major axis of the ellipsoid and the complex 2D ellipticities. For correlations of 3D orientations the model is in good quantitative agreement with the measured values from the simulation, except at very small and transition scales. For correlations of 2D ellipticities, the model is in good quantitative agreement with the measured values from the simulation on all scales. Additionally, the model is able to capture the dependence of IA on mass, morphological type and central/satellite type.
EqR: Equivariant Representations for Data-Efficient Reinforcement Learning
Utility Theory for Sequential Decision Making
The von Neumann-Morgenstern (VNM) utility theorem shows that under certain axioms of rationality, decision-making is reduced to maximizing the expectation of some utility function. We extend these axioms to increasingly structured sequential decision-making settings and identify the structure of the corresponding utility functions. In particular, we show that memoryless preferences lead to a utility in the form of a per-transition reward and a multiplicative factor on the future return. This result motivates a generalization of Markov Decision Processes (MDPs) with this structure on the agent's returns, which we call Affine-Reward MDPs. A stronger constraint on preferences is needed to recover the commonly used cumulative sum of scalar rewards in MDPs. A yet stronger constraint simplifies the utility function for goal-seeking agents in the form of a difference in some function of states that we call potential functions. Our necessary and sufficient conditions demystify the reward hypothesis that underlies the design of rational agents in reinforcement learning by adding an axiom to the VNM rationality axioms, and motivate new directions for AI research involving sequential decision making.
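The return structure described above can be made concrete with a short numeric sketch (hypothetical illustration based on the abstract): each transition contributes a reward r_t plus a multiplicative factor f_t on the future return, G_t = r_t + f_t * G_{t+1}, and the usual discounted MDP is the special case f_t = gamma for all t.

```python
def affine_return(rewards, factors):
    """Return under the affine-reward recursion G_t = r_t + f_t * G_{t+1},
    computed backwards from the end of the trajectory."""
    g = 0.0
    for r, f in zip(reversed(rewards), reversed(factors)):
        g = r + f * g
    return g

rewards = [1.0, 2.0, 3.0]
gamma = 0.9

# With constant factors we recover the ordinary discounted sum:
g_affine = affine_return(rewards, [gamma] * 3)
g_discounted = sum(gamma**t * r for t, r in enumerate(rewards))
# g_affine == g_discounted == 5.23
```

Letting f_t vary with the transition is what makes the class strictly more general than a fixed discount; a constant f_t is exactly the extra constraint on preferences mentioned in the abstract.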
Transformation Coding: Simple Objectives for Equivariant Representations
Equivariant Networks for Crystal Structures
Sékou-Oumar Kaba
Supervised learning with deep models has tremendous potential for applications in materials science. Recently, graph neural networks have been used in this context, drawing direct inspiration from models for molecules. However, materials are typically much more structured than molecules, a feature that these models do not leverage. In this work, we introduce a class of models that are equivariant with respect to crystalline symmetry groups. We do this by defining a generalization of the message passing operations that can be used with more general permutation groups, or that can alternatively be seen as defining an expressive convolution operation on the crystal graph. Empirically, these models achieve results competitive with the state of the art on property prediction tasks.
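The baseline property these models generalize, permutation equivariance of message passing, is easy to verify numerically. A minimal sketch (hypothetical, far simpler than the paper's crystal models): relabeling the graph's nodes relabels the layer's output the same way, i.e. layer(P A Pᵀ, P h) = P layer(A, h) for any permutation matrix P.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3
A = rng.integers(0, 2, size=(n, n)).astype(float)
A = np.maximum(A, A.T)            # symmetric toy adjacency
h = rng.normal(size=(n, d))       # node features
W_self = rng.normal(size=(d, d))
W_nbr = rng.normal(size=(d, d))

def layer(A, h):
    # Combine each node's own features with a sum over its neighbours.
    return np.tanh(h @ W_self + A @ h @ W_nbr)

perm = rng.permutation(n)
P = np.eye(n)[perm]

out = layer(A, h)
out_perm = layer(P @ A @ P.T, P @ h)
# out_perm equals P @ out up to floating-point error
```

The paper's contribution is to extend this beyond the full permutation group, so that the layer respects the specific crystallographic symmetry group of the material rather than arbitrary node relabelings.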
Structuring Representations Using Group Invariants
Equivariant Networks for Pixelized Spheres
Pixelizations of Platonic solids such as the cube and icosahedron have been widely used to represent spherical data, from climate records to Cosmic Microwave Background maps. Platonic solids have well-known global symmetries. Once we pixelize each face of the solid, each face also possesses its own local symmetries in the form of Euclidean isometries. One way to combine these symmetries is through a hierarchy. However, this approach does not adequately model the interplay between the two levels of symmetry transformations. We show how to model this interplay using ideas from group theory, identify the equivariant linear maps, and introduce equivariant padding that respects these symmetries. Deep networks that use these maps as their building blocks generalize gauge equivariant CNNs on pixelized spheres. These deep networks achieve state-of-the-art results on semantic segmentation for climate data and omnidirectional image processing. Code is available at https://git.io/JGiZA.