Siamak Ravanbakhsh

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Research Topics
Causality
Deep Learning
Dynamical Systems
Generative Models
Graph Neural Networks
Information Theory
Learning on Graphs
Machine Learning Theory
Molecular Modeling
Probabilistic Models
Reasoning
Reinforcement Learning
Representation Learning

Biography

Siamak Ravanbakhsh is an assistant professor at McGill University’s School of Computer Science and a core academic member of Mila – Quebec Artificial Intelligence Institute.

Before joining McGill and Mila, he held a similar position at the University of British Columbia. Prior to that, he was a postdoctoral fellow at the Machine Learning Department and Robotics Institute of Carnegie Mellon University. He completed his PhD at the University of Alberta.

Ravanbakhsh’s research centres on problems of representation learning, in particular the principled use of geometry, probabilistic inference and symmetry.

Current Students

PhD - McGill University (6)
Master's Research - McGill University (4)
Professional Master's - McGill University (2)
Research Intern - McGill University (2)
Postdoctorate - McGill University (1)
Collaborating Alumni - McGill University (2)
Collaborating researcher (1)
Independent visiting researcher (1)

Publications

Learning to Reach Goals via Diffusion
Vineet Jain
We present a novel perspective on goal-conditioned reinforcement learning by framing it within the context of denoising diffusion models. Analogous to the diffusion process, where Gaussian noise is used to create random trajectories that walk away from the data manifold, we construct trajectories that move away from potential goal states. We then learn a goal-conditioned policy to reverse these deviations, analogously to the score function. This approach, which we call Merlin, can reach specified goals from an arbitrary initial state without learning a separate value function. In contrast to recent works utilizing diffusion models in offline RL, Merlin stands out as the first method to perform diffusion in the state space, requiring only one "denoising" iteration per environment step. We experimentally validate our approach in various offline goal-reaching tasks, demonstrating substantial performance enhancements compared to state-of-the-art methods while improving computational efficiency over other diffusion-based RL methods by an order of magnitude. Our results suggest that this perspective on diffusion for RL is a simple, scalable, and practical direction for sequential decision making.
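The abstract describes the mechanism at a high level; the following is a minimal PyTorch sketch of that idea, not the authors' implementation. A hand-rolled forward process perturbs states away from a goal with Gaussian noise, and a goal-conditioned network is trained to reverse each perturbation. All names (GoalPolicy, forward_perturb) and the toy dimensions are illustrative assumptions.

```python
# Sketch of the idea only (not the paper's code): walk away from the goal
# with noise, then train a goal-conditioned policy to reverse each step.
import torch
import torch.nn as nn

class GoalPolicy(nn.Module):
    """Predicts the reverse ("denoising") step toward the goal."""
    def __init__(self, state_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )
    def forward(self, state, goal, t):
        return self.net(torch.cat([state, goal, t], dim=-1))

def forward_perturb(goal, steps, sigma=0.1):
    """Forward process: trajectory that drifts away from the goal state."""
    traj = [goal]
    for _ in range(steps):
        traj.append(traj[-1] + sigma * torch.randn_like(goal))
    return torch.stack(traj)              # shape: (steps + 1, state_dim)

state_dim, T = 4, 10
policy = GoalPolicy(state_dim)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for _ in range(200):                      # toy training loop
    goal = torch.randn(state_dim)
    traj = forward_perturb(goal, T)
    t = torch.randint(1, T + 1, (1,))
    s_t, s_prev = traj[t.item()], traj[t.item() - 1]
    # regress the step that moves s_t back toward the goal
    pred = policy(s_t, goal, t.float())
    loss = ((pred - (s_prev - s_t)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```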
Equivariant Adaptation of Large Pretrained Models
Arnab Kumar Mondal
Siba Smarak Panigrahi
Sékou-Oumar Kaba
Sai Rajeswar
Equivariant networks are specifically designed to ensure consistent behavior with respect to a set of input transformations, leading to higher sample efficiency and more accurate and robust predictions. However, redesigning each component of prevalent deep neural network architectures to achieve chosen equivariance is a difficult problem and can result in a computationally expensive network during both training and inference. A recently proposed alternative towards equivariance that removes the architectural constraints is to use a simple canonicalization network that transforms the input to a canonical form before feeding it to an unconstrained prediction network. We show here that this approach can effectively be used to make a large pretrained network equivariant. However, we observe that the produced canonical orientations can be misaligned with those of the training distribution, hindering performance. Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance. This significantly improves the robustness of these models to deterministic transformations of the data, such as rotations. We believe this equivariant adaptation of large pretrained models can help their domain-specific applications with known symmetry priors.
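As a rough illustration of this canonicalization setup (not the paper's code), the sketch below uses a small network to predict an orientation, rotates the input image back by that angle, and feeds the result to a frozen stand-in for a large pretrained model. For brevity the canonicalizer here is a plain CNN, whereas the paper uses an equivariant canonicalization network with a dataset-dependent prior; all module names and shapes are assumptions for the example.

```python
# Illustrative sketch: canonicalize the input before a frozen predictor.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Canonicalizer(nn.Module):
    """Predicts a rotation angle used to canonicalize the input image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
        )
    def forward(self, x):
        return self.features(x).squeeze(-1)      # angle in radians

def rotate(x, angle):
    """Differentiably rotate a batch of images by `angle` (radians)."""
    cos, sin = torch.cos(angle), torch.sin(angle)
    zeros = torch.zeros_like(angle)
    theta = torch.stack(
        [torch.stack([cos, -sin, zeros], -1),
         torch.stack([sin, cos, zeros], -1)], -2)   # (N, 2, 3)
    grid = F.affine_grid(theta, x.shape, align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

canon = Canonicalizer()
pretrained = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in
for p in pretrained.parameters():
    p.requires_grad = False                   # keep the large model frozen

x = torch.randn(8, 3, 32, 32)
angle = canon(x)                              # predicted orientation
logits = pretrained(rotate(x, -angle))        # predict on the canonical input
```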
Lie Point Symmetry and Physics-Informed Networks
Tara Akhound-Sadegh
Johannes Brandstetter
Max Welling
Symmetries have been leveraged to improve the generalization of neural networks through different mechanisms from data augmentation to equivariant architectures. However, despite their potential, their integration into neural solvers for partial differential equations (PDEs) remains largely unexplored. We explore the integration of PDE symmetries, known as Lie point symmetries, in a major family of neural solvers known as physics-informed neural networks (PINNs). We propose a loss function that informs the network about Lie point symmetries in the same way that PINN models try to enforce the underlying PDE through a loss function. Intuitively, our symmetry loss ensures that the infinitesimal generators of the Lie group conserve the PDE solutions. Effectively, this means that once the network learns a solution, it also learns the neighbouring solutions generated by Lie point symmetries. Empirical evaluations indicate that the inductive bias introduced by the Lie point symmetries of the PDEs greatly boosts the sample efficiency of PINNs.
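To make the loss construction concrete, here is a schematic PyTorch sketch for the 1D heat equation u_t = u_xx. It is not the paper's implementation: the paper builds the symmetry loss from prolonged infinitesimal generators, while this toy version only shows the general pattern of adding a second term (here, the space-translation generator applied to the PDE residual) next to the usual PINN residual loss.

```python
# Schematic PINN with an extra, illustrative symmetry term.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def grad(y, x):
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]

for _ in range(100):
    tx = torch.rand(256, 2, requires_grad=True)   # collocation points (t, x)
    u = net(tx)
    du = grad(u, tx)                              # (u_t, u_x)
    u_t, u_x = du[:, :1], du[:, 1:]
    u_xx = grad(u_x, tx)[:, 1:]
    residual = u_t - u_xx                         # heat-equation residual
    pde_loss = (residual ** 2).mean()
    # illustrative symmetry term: the space-translation generator applied
    # to the residual (d residual / dx) should also vanish on solutions
    sym_loss = (grad(residual, tx)[:, 1:] ** 2).mean()
    # initial/boundary losses omitted for brevity
    loss = pde_loss + 0.1 * sym_loss
    opt.zero_grad(); loss.backward(); opt.step()
```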
Using Multiple Vector Channels Improves E(n)-Equivariant Graph Neural Networks
Daniel Levy
Sékou-Oumar Kaba
Carmelo Gonzales
Santiago Miret
Equivariance With Learned Canonicalization Functions
Sékou-Oumar Kaba
Arnab Kumar Mondal
Yan Zhang
Symmetry-based neural networks often constrain the architecture in order to achieve invariance or equivariance to a group of transformations. In this paper, we propose an alternative that avoids this architectural constraint by learning to produce a canonical representation of the data. These canonicalization functions can readily be plugged into non-equivariant backbone architectures. We offer explicit ways to implement them for many groups of interest. We show that this approach enjoys universality while providing interpretable insights. Our main hypothesis is that learning a neural network to perform canonicalization is better than doing it using predefined heuristics. Our results show that learning the canonicalization function indeed leads to better results and that the approach achieves great performance in practice.
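A minimal sketch of the recipe for a 2D point cloud (illustrative code, not the paper's implementation): a direction predictor built to be rotation-equivariant (rotation-invariant weights applied to the points themselves) defines a canonical frame, the cloud is rotated into that frame, and an unconstrained backbone processes the canonicalized input, making the overall per-point prediction invariant to rotations of the input.

```python
# Sketch of a learned canonicalization function for 2D point clouds.
import torch
import torch.nn as nn

class Canonicalizer(nn.Module):
    """Predicts a rotation-equivariant reference direction for a cloud."""
    def __init__(self):
        super().__init__()
        self.weight_net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(),
                                        nn.Linear(32, 1))
    def forward(self, x):                          # x: (N, 2), zero-centered
        w = self.weight_net(x.norm(dim=-1, keepdim=True))  # invariant weights
        direction = (w * x).sum(dim=0)             # equivariant direction
        return direction / (direction.norm() + 1e-8)

def to_canonical(x, d):
    """Rotate the cloud so the predicted direction d aligns with the x-axis."""
    R = torch.stack([torch.stack([d[0], d[1]]),
                     torch.stack([-d[1], d[0]])])  # inverse rotation
    return x @ R.T

canon = Canonicalizer()
backbone = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 3))

x = torch.randn(100, 2)
x = x - x.mean(dim=0)                       # center to handle translations
out = backbone(to_canonical(x, canon(x)))   # invariant to input rotations
```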
Characterization Of Inpaint Residuals In Interferometric Measurements of the Epoch Of Reionization
Michael Pagano
Jing Liu
Adrian Liu
Nicholas S Kern
Aaron Ewall-Wice
Philip Bull
Robert Pascua
Zara Abdurashidova
Tyrone Adams
James E Aguirre
Paul Alexander
Zaki S Ali
Rushelle Baartman
Yanga Balfour
Adam P Beardsley
Gianni Bernardi
Tashalee S Billings
Judd D Bowman
Richard F Bradley
Jacob Burba
Steven Carey
Chris L Carilli
Carina Cheng
David R DeBoer
Eloy de Lera Acedo
Matt Dexter
Joshua S Dillon
Nico Eksteen
John Ely
Nicolas Fagnoni
Randall Fritz
Steven R Furlanetto
Kingsley Gale-Sides
Brian Glendenning
Deepthi Gorthi
Bradley Greig
Jasper Grobbelaar
Ziyaad Halday
Bryna J Hazelton
Jacqueline N Hewitt
Jack Hickish
Daniel C Jacobs
Austin Julius
MacCalvin Kariseb
Joshua Kerrigan
Piyanat Kittiwisit
Saul A Kohn
Matthew Kolopanis
Adam Lanman
Paul La Plante
Anita Loots
David Harold Edward MacMahon
Lourence Malan
Cresshim Malgas
Keith Malgas
Bradley Marero
Zachary E Martinot
Andrei Mesinger
Mathakane Molewa
Miguel F Morales
Tshegofalang Mosiane
Abraham R Neben
Bojan Nikolic
Hans Nuwegeld
Aaron R Parsons
Nipanjana Patra
Samantha Pieterse
Nima Razavi-Ghods
James Robnett
Kathryn Rosie
Peter Sims
Craig Smith
Hilton Swarts
Nithyanandan Thyagarajan
Pieter van Wyngaarden
Peter K G Williams
Haoxuan Zheng
To mitigate the effects of Radio Frequency Interference (RFI) on the data analysis pipelines of 21cm interferometric instruments, numerous inpainting techniques have been developed. In this paper we examine the qualitative and quantitative errors introduced into the visibilities and power spectrum due to inpainting. We perform our analysis on simulated data as well as real data from the Hydrogen Epoch of Reionization Array (HERA) Phase 1 upper limits. We also introduce a convolutional neural network that is capable of inpainting RFI-corrupted data. We train our network on simulated data and show that it is capable of inpainting real data without retraining. We find that techniques that incorporate high wavenumbers in delay space in their modeling are best suited for inpainting over narrowband RFI. We show that with our fiducial parameters, Discrete Prolate Spheroidal Sequences (DPSS) and CLEAN provide the best performance for intermittent RFI, while Gaussian Process Regression (GPR) and Least Squares Spectral Analysis (LSSA) provide the best performance for larger RFI gaps. However, we caution that these qualitative conclusions are sensitive to the chosen hyperparameters of each inpainting technique. We show that all inpainting techniques reliably reproduce foreground-dominated modes in the power spectrum. Since the inpainting techniques should not be capable of reproducing noise realizations, we find that the largest errors occur in the noise-dominated delay modes. We show that as the noise level of the data comes down, CLEAN and DPSS are most capable of reproducing the fine frequency structure in the visibilities.
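As a toy illustration of the learned-inpainting component only (this is not the HERA analysis code), the sketch below trains a small 1D CNN to fill flagged frequency channels of a simulated complex spectrum, with the RFI flag mask supplied as an extra input channel. The spectrum, mask, and network sizes are made-up stand-ins.

```python
# Toy masked-inpainting training loop for a 1D "visibility" spectrum.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv1d(3, 32, 5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 32, 5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 2, 5, padding=2),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for _ in range(200):
    # toy spectrum: a smooth fringe (real/imag channels) plus noise
    freqs = torch.linspace(0, 1, 128)
    vis = torch.stack([torch.cos(2 * torch.pi * 10 * freqs),
                       torch.sin(2 * torch.pi * 10 * freqs)])
    vis = vis + 0.05 * torch.randn(2, 128)
    mask = torch.ones(1, 128)
    mask[:, 40:50] = 0.0                      # simulated narrowband RFI flag
    corrupted = vis * mask
    pred = net(torch.cat([corrupted, mask]).unsqueeze(0)).squeeze(0)
    # supervise only inside the flagged gap, where the model must inpaint
    loss = (((pred - vis) * (1 - mask)) ** 2).sum() / (1 - mask).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```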
Physics-Informed Transformer Networks
F. Dos Santos
Tara Akhound-Sadegh
Physics-informed neural networks (PINNs) have been recognized as a viable alternative to conventional numerical solvers for Partial Differential Equations (PDEs). The main appeal of PINNs is that, since they directly enforce the PDE equation, one does not require access to costly ground truth solutions for training the model. However, a key challenge is their limited generalization across varied initial conditions. Addressing this, our study presents a novel Physics-Informed Transformer (PIT) model for learning the solution operator for PDEs. Using the attention mechanism, PIT learns to leverage the relationships between its initial condition and query points, resulting in a significant improvement in generalization. Moreover, in contrast to existing physics-informed networks, our model is invariant to the discretization of the input domain, providing great flexibility in problem specification and training. We validated our proposed method on the 1D Burgers’ and the 2D Heat equations, demonstrating notable improvement over standard PINN models for operator learning with negligible computational overhead.
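A schematic sketch of this operator-learning setup under assumed shapes and a 1D Burgers' residual (it is not the paper's architecture): query points attend over samples of the initial condition via standard multi-head attention, and the output is trained with a PINN-style PDE residual computed by automatic differentiation.

```python
# Sketch: initial-condition tokens attended by query points, PINN-style loss.
import torch
import torch.nn as nn

d, nu = 64, 0.01
embed_ic = nn.Linear(2, d)        # (x0, u0) samples -> key/value tokens
embed_q = nn.Linear(2, d)         # (t, x) query points -> query tokens
attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
head = nn.Sequential(nn.Linear(d, d), nn.Tanh(), nn.Linear(d, 1))
params = (list(embed_ic.parameters()) + list(embed_q.parameters())
          + list(attn.parameters()) + list(head.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

def grad(y, x):
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]

for _ in range(100):
    x0 = torch.linspace(0, 1, 64).unsqueeze(-1)
    u0 = torch.sin(2 * torch.pi * x0)                 # one initial condition
    ic_tokens = embed_ic(torch.cat([x0, u0], dim=-1)).unsqueeze(0)
    q = torch.rand(128, 2, requires_grad=True)        # collocation (t, x)
    out, _ = attn(embed_q(q).unsqueeze(0), ic_tokens, ic_tokens)
    u = head(out).squeeze(0)                          # (128, 1)
    du = grad(u, q)
    u_t, u_x = du[:, :1], du[:, 1:]
    u_xx = grad(u_x, q)[:, 1:]
    residual = u_t + u * u_x - nu * u_xx              # Burgers' residual
    # initial/boundary losses omitted for brevity
    loss = (residual ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```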