
Siamak Ravanbakhsh

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Research Topics
Representation Learning
Reinforcement Learning
Deep Learning
Learning on Graphs
Causality
Generative Models
Probabilistic Models
Molecular Modeling
Reasoning
Graph Neural Networks
Dynamical Systems
Machine Learning Theory
Information Theory

Biography

Siamak Ravanbakhsh has been an assistant professor at McGill University's School of Computer Science since August 2019. Before joining McGill and Mila – Quebec Artificial Intelligence Institute, he held a similar position at the University of British Columbia. From 2015 to 2017, he was a postdoctoral fellow in the Machine Learning Department and the Robotics Institute at Carnegie Mellon University, and he received his PhD from the University of Alberta. He is interested in problems of representation learning and inference in AI.

His current research focuses on the role of symmetry and invariance in deep representation learning.

Current Students

PhD - McGill
Co-supervisor:
Research Intern - McGill
Professional Master's - McGill
Research Intern - McGill
Independent Visiting Researcher
PhD - McGill
PhD - McGill
Co-supervisor:
Research Collaborator
PhD - McGill
Master's Research - McGill
Master's Research - McGill
Collaborating Alumni - McGill
Postdoctorate - McGill
Master's Research - McGill
Collaborating Alumni - McGill
Professional Master's - McGill

Publications

Learning to Reach Goals via Diffusion
Vineet Jain
We present a novel perspective on goal-conditioned reinforcement learning by framing it within the context of denoising diffusion models. Analogous to the diffusion process, where Gaussian noise is used to create random trajectories that walk away from the data manifold, we construct trajectories that move away from potential goal states. We then learn a goal-conditioned policy to reverse these deviations, analogously to the score function. This approach, which we call Merlin, can reach specified goals from an arbitrary initial state without learning a separate value function. In contrast to recent works utilizing diffusion models in offline RL, Merlin stands out as the first method to perform diffusion in the state space, requiring only one "denoising" iteration per environment step. We experimentally validate our approach in various offline goal-reaching tasks, demonstrating substantial performance enhancements compared to state-of-the-art methods while improving computational efficiency over other diffusion-based RL methods by an order of magnitude. Our results suggest that this perspective on diffusion for RL is a simple, scalable, and practical direction for sequential decision making.
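
A heavily simplified sketch of the reverse ("denoising") step described above, written in PyTorch: a goal-conditioned policy regresses the action that moves a state back toward a relabeled goal. The noise schedule and the construction of goal-departing trajectories from the paper are omitted, and all names here are illustrative rather than the authors' code.

import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    # A small MLP mapping (state, goal) to an action.
    def __init__(self, state_dim, goal_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1))

def training_step(policy, optimizer, batch):
    # batch holds (state, action, goal) tensors taken from offline trajectories,
    # with the goal relabeled as the trajectory's final state.
    state, action, goal = batch
    loss = ((policy(state, goal) - action) ** 2).mean()  # regress the "reversing" action
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()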
Equivariant Adaptation of Large Pretrained Models
Arnab Kumar Mondal
Siba Smarak Panigrahi
Sékou-Oumar Kaba
Sai Rajeswar
Equivariant networks are specifically designed to ensure consistent behavior with respect to a set of input transformations, leading to higher sample efficiency and more accurate and robust predictions. However, redesigning each component of prevalent deep neural network architectures to achieve chosen equivariance is a difficult problem and can result in a computationally expensive network during both training and inference. A recently proposed alternative towards equivariance that removes the architectural constraints is to use a simple canonicalization network that transforms the input to a canonical form before feeding it to an unconstrained prediction network. We show here that this approach can effectively be used to make a large pretrained network equivariant. However, we observe that the produced canonical orientations can be misaligned with those of the training distribution, hindering performance. Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance. This significantly improves the robustness of these models to deterministic transformations of the data, such as rotations. We believe this equivariant adaptation of large pretrained models can help their domain-specific applications with known symmetry priors.
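
The canonicalization recipe in this abstract can be sketched for planar rotations: a small network predicts an orientation for each input, the input is rotated back to that canonical orientation, and only then is it passed to the frozen pretrained backbone. A minimal sketch assuming image tensors and torchvision; the canonicalizer, its training, and the dataset-dependent prior mentioned above are placeholders rather than the authors' implementation.

import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

class CanonicalizedModel(nn.Module):
    # Wrap a frozen pretrained backbone with a learned canonicalization step.
    def __init__(self, backbone, canonicalizer):
        super().__init__()
        self.backbone = backbone            # large pretrained network, kept frozen
        self.canonicalizer = canonicalizer  # small network predicting an angle in degrees

    def forward(self, x):
        angles = self.canonicalizer(x)      # shape (B,), one predicted orientation per image
        canonical = torch.stack([
            TF.rotate(img, -float(a))       # undo the predicted rotation
            for img, a in zip(x, angles)
        ])
        return self.backbone(canonical)

For genuine equivariance the canonicalizer itself must respect the symmetry (for example by reading the angle off equivariant features), which is the subject of the learned-canonicalization papers listed further down.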
Lie Point Symmetry and Physics-Informed Networks
Tara Akhound-Sadegh
Johannes Brandstetter
Max Welling
Symmetries have been leveraged to improve the generalization of neural networks through different mechanisms, from data augmentation to equivariant architectures. However, despite their potential, their integration into neural solvers for partial differential equations (PDEs) remains largely unexplored. We explore the integration of PDE symmetries, known as Lie point symmetries, in a major family of neural solvers known as physics-informed neural networks (PINNs). We propose a loss function that informs the network about Lie point symmetries in the same way that PINN models try to enforce the underlying PDE through a loss function. Intuitively, our symmetry loss ensures that the infinitesimal generators of the Lie group conserve the PDE solutions. Effectively, this means that once the network learns a solution, it also learns the neighbouring solutions generated by Lie point symmetries. Empirical evaluations indicate that the inductive bias introduced by the Lie point symmetries of the PDEs greatly boosts the sample efficiency of PINNs.
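
Schematically, the proposed objective augments the usual PINN residual with a symmetry term. Writing the PDE as \Delta(x, u^{(n)}) = 0 and letting v be a Lie point symmetry generator with prolongation \mathrm{pr}^{(n)} v, the combined loss takes the form (notation simplified here):

\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{PDE}}(\theta) + \lambda\, \mathcal{L}_{\mathrm{sym}}(\theta), \qquad
\mathcal{L}_{\mathrm{PDE}}(\theta) = \mathbb{E}_{x}\!\left[ \left\| \Delta\big(x, u_\theta^{(n)}(x)\big) \right\|^2 \right], \qquad
\mathcal{L}_{\mathrm{sym}}(\theta) = \mathbb{E}_{x}\!\left[ \left\| \mathrm{pr}^{(n)} v\,[\Delta]\big(x, u_\theta^{(n)}(x)\big) \right\|^2 \right].

The symmetry term is small exactly when the prolonged generator annihilates the PDE on the learned solution, which is the infinitesimal condition for v to map solutions to neighbouring solutions.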
Using Multiple Vector Channels Improves E(n)-Equivariant Graph Neural Networks
Daniel Levy
Sékou-Oumar Kaba
Carmelo Gonzales
Santiago Miret
Equivariance With Learned Canonicalization Functions
Sékou-Oumar Kaba
Arnab Kumar Mondal
Yan Zhang
Symmetry-based neural networks often constrain the architecture in order to achieve invariance or equivariance to a group of transformations. In this paper, we propose an alternative that avoids this architectural constraint by learning to produce a canonical representation of the data. These canonicalization functions can readily be plugged into non-equivariant backbone architectures. We offer explicit ways to implement them for many groups of interest. We show that this approach enjoys universality while providing interpretable insights. Our main hypothesis is that learning a neural network to perform canonicalization is better than doing it using predefined heuristics. Our results show that learning the canonicalization function indeed leads to better results and that the approach achieves great performance in practice.
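
The mechanism can be summarized in one line. With a canonicalization function h that is itself equivariant, h(g \cdot x) = g\, h(x), and an unconstrained predictor p, the standard construction

f(x) = h(x) \cdot p\big(h(x)^{-1} \cdot x\big)

is equivariant, since f(g \cdot x) = g\, h(x) \cdot p\big(h(x)^{-1} g^{-1} g \cdot x\big) = g \cdot f(x); dropping the outer factor h(x)\cdot{} instead yields an invariant model. (The notation here is schematic; see the paper for the precise conditions on h.)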
Characterization Of Inpaint Residuals In Interferometric Measurements of the Epoch Of Reionization
Michael Pagano
Jing Liu
Adrian Liu
Nicholas S Kern
Aaron Ewall-Wice
Philip Bull
Robert Pascua
Zara Abdurashidova
Tyrone Adams
James E Aguirre
Paul Alexander
Zaki S Ali
Rushelle Baartman
Yanga Balfour
Adam P Beardsley
Gianni Bernardi
Tashalee S Billings
Judd D Bowman
Richard F Bradley
Jacob Burba
Steven Carey
Chris L Carilli
Carina Cheng
David R DeBoer
Eloy de Lera Acedo
Matt Dexter
Joshua S Dillon
Nico Eksteen
John Ely
Nicolas Fagnoni
Randall Fritz
Steven R Furlanetto
Kingsley Gale-Sides
Brian Glendenning
Deepthi Gorthi
Bradley Greig
Jasper Grobbelaar
Ziyaad Halday
Bryna J Hazelton
Jacqueline N Hewitt
Jack Hickish
Daniel C Jacobs
Austin Julius
MacCalvin Kariseb
Joshua Kerrigan
Piyanat Kittiwisit
Saul A Kohn
Matthew Kolopanis
Adam Lanman
Paul La Plante
Anita Loots
David Harold Edward MacMahon
Lourence Malan
Cresshim Malgas
Keith Malgas
Bradley Marero
Zachary E Martinot
Andrei Mesinger
Mathakane Molewa
Miguel F Morales
Tshegofalang Mosiane
Abraham R Neben
Bojan Nikolic
Hans Nuwegeld
Aaron R Parsons
Nipanjana Patra
Samantha Pieterse
Nima Razavi-Ghods
James Robnett
Kathryn Rosie
Peter Sims
Craig Smith
Hilton Swarts
Nithyanandan Thyagarajan
Pieter van Wyngaarden
Peter K G Williams
Haoxuan Zheng
To mitigate the effects of Radio Frequency Interference (RFI) on the data analysis pipelines of 21cm interferometric instruments, numerous inpainting techniques have been developed. In this paper we examine the qualitative and quantitative errors introduced into the visibilities and power spectrum due to inpainting. We perform our analysis on simulated data as well as real data from the Hydrogen Epoch of Reionization Array (HERA) Phase 1 upper limits. We also introduce a convolutional neural network that is capable of inpainting RFI-corrupted data. We train our network on simulated data and show that it is capable of inpainting real data without needing to be retrained. We find that techniques that incorporate high wavenumbers in delay space in their modeling are best suited for inpainting over narrowband RFI. We show that with our fiducial parameters Discrete Prolate Spheroidal Sequences (DPSS) and CLEAN provide the best performance for intermittent RFI, while Gaussian Process Regression (GPR) and Least Squares Spectral Analysis (LSSA) provide the best performance for larger RFI gaps. However, we caution that these qualitative conclusions are sensitive to the chosen hyperparameters of each inpainting technique. We show that all inpainting techniques reliably reproduce foreground-dominated modes in the power spectrum. Since the inpainting techniques should not be capable of reproducing noise realizations, we find that the largest errors occur in the noise-dominated delay modes. We show that as the noise level of the data comes down, CLEAN and DPSS are most capable of reproducing the fine frequency structure in the visibilities.
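
As a concrete illustration of the DPSS-style inpainting mentioned in the abstract, one can least-squares fit a few discrete prolate spheroidal sequences to the unflagged channels of a visibility spectrum and evaluate the fit over the flagged channels. A minimal sketch using scipy and numpy; the half-bandwidth and number of modes below are arbitrary illustrative choices, not the fiducial parameters of the paper.

import numpy as np
from scipy.signal.windows import dpss

def dpss_inpaint(spectrum, flags, half_bandwidth=4.0, n_modes=8):
    # spectrum: (N,) complex visibility values per frequency channel
    # flags:    (N,) boolean array, True where the channel is RFI-flagged
    n = spectrum.size
    basis = dpss(n, half_bandwidth, Kmax=n_modes).T   # (N, n_modes) design matrix
    good = ~flags
    # Fit the smooth spectral model using only unflagged channels.
    coeffs, *_ = np.linalg.lstsq(basis[good], spectrum[good], rcond=None)
    model = basis @ coeffs
    filled = spectrum.copy()
    filled[flags] = model[flags]                      # inpaint only the gaps
    return filled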
Physics-Informed Transformer Networks
F. Dos Santos
Tara Akhound-Sadegh
Physics-informed neural networks (PINNs) have been recognized as a viable alternative to conventional numerical solvers for Partial Differential Equations (PDEs). The main appeal of PINNs is that since they directly enforce the PDE equation, one does not require access to costly ground truth solutions for training the model. However, a key challenge is their limited generalization across varied initial conditions. Addressing this, our study presents a novel Physics-Informed Transformer (PIT) model for learning the solution operator for PDEs. Using the attention mechanism, PIT learns to leverage the relationships between its initial condition and query points, resulting in a significant improvement in generalization. Moreover, in contrast to existing physics-informed networks, our model is invariant to the discretization of the input domain, providing great flexibility in problem specification and training. We validated our proposed method on the 1D Burgers’ and the 2D Heat equations, demonstrating notable improvement over standard PINN models for operator learning with negligible computational overhead.
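
A compact sketch of the cross-attention idea described above, in PyTorch: query points (x, t) attend to samples of the initial condition, and the prediction can be trained with the same PDE-residual loss used by PINNs (shown here for Burgers' equation). This is an illustrative reading of the abstract, not the authors' architecture; dimensions and module names are placeholders.

import torch
import torch.nn as nn

class PhysicsInformedTransformer(nn.Module):
    # Query points attend to an encoded initial condition.
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.ic_encoder = nn.Linear(2, d_model)      # (x0, u0(x0)) -> token
        self.query_encoder = nn.Linear(2, d_model)   # (x, t)       -> query
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Sequential(nn.Linear(d_model, d_model), nn.Tanh(),
                                  nn.Linear(d_model, 1))

    def forward(self, ic_points, query_points):
        # ic_points: (B, N, 2) pairs (x0, u0(x0)); query_points: (B, M, 2) pairs (x, t)
        tokens = self.ic_encoder(ic_points)
        queries = self.query_encoder(query_points)
        mixed, _ = self.attn(queries, tokens, tokens)
        return self.head(mixed)                      # predicted u(x, t), shape (B, M, 1)

def burgers_residual(model, ic_points, xt, nu=0.01 / torch.pi):
    # PDE residual u_t + u u_x - nu u_xx evaluated at query points xt via autograd.
    xt = xt.clone().detach().requires_grad_(True)
    u = model(ic_points, xt)
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[..., 0:1], grads[..., 1:2]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][..., 0:1]
    return u_t + u * u_x - nu * u_xx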