
Siamak Ravanbakhsh

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, School of Computer Science, McGill University
Research Topics
AI Alignment
AI for Science
Active Learning
Bayesian Inference
Deep Learning
Generalization
Generative Models
Learning on Graphs
Probabilistic Models
Reasoning
Reinforcement Learning
Representation Learning
Symmetry

Biography

Siamak Ravanbakhsh has been an assistant professor at McGill University's School of Computer Science since August 2019. Before joining McGill and Mila – Quebec Artificial Intelligence Institute, he held a similar position at the University of British Columbia. From 2015 to 2017, he was a postdoctoral fellow at the Machine Learning Department and the Robotics Institute of Carnegie Mellon University, and he received his PhD from the University of Alberta. He is interested in problems of representation learning and inference in AI.

His current research focuses on the role of symmetry and invariance in deep representation learning.

Current Students

PhD - McGill
Co-supervisor:
Master's Research - McGill
Principal supervisor:
Master's Research - McGill
Professional Master's - McGill
PhD - McGill
PhD - McGill
Master's Research - McGill
Research Intern - McGill
PhD - McGill
Collaborating Researcher - McGill
Postdoctorate - McGill
Master's Research - McGill
Collaborating Alumni - McGill

Publications

Equivariant Adaptation of Large Pretrained Models
Arnab Kumar Mondal
Sékou-Oumar Kaba
Sai Rajeswar
Equivariant networks are specifically designed to ensure consistent behavior with respect to a set of input transformations, leading to higher sample efficiency and more accurate and robust predictions. However, redesigning each component of prevalent deep neural network architectures to achieve chosen equivariance is a difficult problem and can result in a computationally expensive network during both training and inference. A recently proposed alternative towards equivariance that removes the architectural constraints is to use a simple canonicalization network that transforms the input to a canonical form before feeding it to an unconstrained prediction network. We show here that this approach can effectively be used to make a large pretrained network equivariant. However, we observe that the produced canonical orientations can be misaligned with those of the training distribution, hindering performance. Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance. This significantly improves the robustness of these models to deterministic transformations of the data, such as rotations. We believe this equivariant adaptation of large pretrained models can help their domain-specific applications with known symmetry priors.
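The canonicalization idea is easy to state in code. Below is a minimal sketch, not the paper's implementation: all names and the 2D point-cloud setting are my assumptions. A small network predicts an orientation, the input is rotated into that canonical frame, and the frozen pretrained model runs on the result.

```python
import torch
import torch.nn as nn

def rot2d(theta):
    # Batch of 2D rotation matrices, shape (B, 2, 2).
    c, s = torch.cos(theta), torch.sin(theta)
    return torch.stack([torch.stack([c, -s], dim=-1),
                        torch.stack([s, c], dim=-1)], dim=-2)

class Canonicalized(nn.Module):
    """Hypothetical wrapper: a small canonicalization network makes a frozen,
    pretrained point-cloud model invariant to 2D rotations."""
    def __init__(self, pretrained: nn.Module, canonicalizer: nn.Module):
        super().__init__()
        self.pretrained = pretrained        # large model, weights frozen
        self.canonicalizer = canonicalizer  # maps (B, N, 2) points -> (B,) angles

    def forward(self, x):                   # x: (B, N, 2) point clouds
        theta = self.canonicalizer(x)       # estimated orientation of each input
        x_canon = x @ rot2d(theta)          # rotates each cloud by -theta
        return self.pretrained(x_canon)     # predict in the canonical frame
```

If the predicted angle shifts consistently under input rotations, the composed model is rotation-invariant by construction; the paper's point is that a dataset-dependent prior is needed so these canonical orientations match the ones the pretrained model saw during training.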
Lie Point Symmetry and Physics Informed Networks
Symmetries have been leveraged to improve the generalization of neural networks through different mechanisms from data augmentation to equivariant architectures. However, despite their potential, their integration into neural solvers for partial differential equations (PDEs) remains largely unexplored. We explore the integration of PDE symmetries, known as Lie point symmetries, in a major family of neural solvers known as physics-informed neural networks (PINNs). We propose a loss function that informs the network about Lie point symmetries in the same way that PINN models try to enforce the underlying PDE through a loss function. Intuitively, our symmetry loss ensures that the infinitesimal generators of the Lie group conserve the PDE solutions. Effectively, this means that once the network learns a solution, it also learns the neighbouring solutions generated by Lie point symmetries. Empirical evaluations indicate that the inductive bias introduced by the Lie point symmetries of the PDEs greatly boosts the sample efficiency of PINNs.
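Schematically (notation mine, following standard Lie-symmetry conventions rather than the paper's exact formulation), the idea is to add a symmetry term to the usual PINN objective that drives the prolonged generators, applied to the PDE residual, to zero at the collocation points:

$$\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{PINN}}(\theta) + \lambda \sum_{k=1}^{K} \frac{1}{N} \sum_{i=1}^{N} \big\| \operatorname{pr} v_k [\Delta]\big(x_i, u_\theta(x_i)\big) \big\|^2$$

Here $\Delta(x, u) = 0$ is the PDE, $v_1, \dots, v_K$ are its Lie point symmetry generators, and $\operatorname{pr} v_k$ is the prolongation of $v_k$; the classical criterion that $\operatorname{pr} v_k[\Delta]$ vanish on solutions is exactly what this loss penalizes.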
Using Multiple Vector Channels Improves E(n)-Equivariant Graph Neural Networks
Sékou-Oumar Kaba
Carmelo Gonzales
Santiago Miret
Equivariance with Learned Canonicalization Functions
Sékou-Oumar Kaba
Arnab Kumar Mondal
Symmetry-based neural networks often constrain the architecture in order to achieve invariance or equivariance to a group of transformations. In this paper, we propose an alternative that avoids this architectural constraint by learning to produce canonical representations of the data. These canonicalization functions can readily be plugged into non-equivariant backbone architectures. We offer explicit ways to implement them for some groups of interest. We show that this approach enjoys universality while providing interpretable insights. Our main hypothesis, supported by our empirical results, is that learning a small neural network to perform canonicalization is better than using predefined heuristics. Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks, including image classification.
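Schematically (notation mine), the construction factors an equivariant model into a canonicalization function $h : X \to G$ and an unconstrained predictor $p$:

$$f(x) = h(x) \cdot p\big(h(x)^{-1} \cdot x\big)$$

This $f$ is $G$-equivariant whenever $h(g \cdot x) = g \cdot h(x)$; dropping the outer action of $h(x)$ yields an invariant model instead, and $p$ can be any backbone, since the group structure lives entirely in $h$.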
Characterization of inpaint residuals in interferometric measurements of the epoch of reionization
Michael Pagano
Jing Liu
Adrian Liu
Nicholas S. Kern
Aaron Ewall-Wice
Philip Bull
Robert Pascua
Zara Abdurashidova
Tyrone Adams
James E. Aguirre
Paul Alexander
Zaki S. Ali
Rushelle Baartman
Yanga Balfour
Adam P. Beardsley
Gianni Bernardi
Tashalee S. Billings
Judd D. Bowman
Richard F. Bradley
Jacob Burba
Steven Carey
Chris L. Carilli
Carina Cheng
David R. DeBoer
Eloy de Lera Acedo
Matt Dexter
Joshua S. Dillon
Nico Eksteen
John Ely
Nicolas Fagnoni
Randall Fritz
Steven R. Furlanetto
Kingsley Gale-Sides
Brian Glendenning
Deepthi Gorthi
Bradley Greig
Jasper Grobbelaar
Ziyaad Halday
Bryna J. Hazelton
Jacqueline N. Hewitt
Jack Hickish
Daniel C. Jacobs
Austin Julius
MacCalvin Kariseb
Joshua Kerrigan
Piyanat Kittiwisit
Saul A. Kohn
Matthew Kolopanis
Adam Lanman
Paul La Plante
Anita Loots
David Harold Edward MacMahon
Lourence Malan
Cresshim Malgas
Keith Malgas
Bradley Marero
Zachary E. Martinot
Andrei Mesinger
Mathakane Molewa
Miguel F. Morales
Tshegofalang Mosiane
Abraham R. Neben
Bojan Nikolic
Hans Nuwegeld
Aaron R. Parsons
Nipanjana Patra
Samantha Pieterse
Nima Razavi-Ghods
James Robnett
Kathryn Rosie
Peter Sims
Craig Smith
Hilton Swarts
Nithyanandan Thyagarajan
Pieter van Wyngaarden
Peter K. G. Williams
Haoxuan Zheng
Radio Frequency Interference (RFI) is one of the systematic challenges preventing 21cm interferometric instruments from detecting the Epoch of Reionization. To mitigate the effects of RFI on data analysis pipelines, numerous inpainting techniques have been developed to restore RFI corrupted data. We examine the qualitative and quantitative errors introduced into the visibilities and power spectrum due to inpainting. We perform our analysis on simulated data as well as real data from the Hydrogen Epoch of Reionization Array (HERA) Phase 1 upper limits. We also introduce a convolutional neural network that is capable of inpainting RFI corrupted data in interferometric instruments. We train our network on simulated data and show that it is capable of inpainting real data without needing to be retrained. We find that techniques that incorporate high wavenumbers in delay space in their modeling are best suited for inpainting over narrowband RFI. We also show that with our fiducial parameters Discrete Prolate Spheroidal Sequences (DPSS) and CLEAN provide the best performance for intermittent "narrowband" RFI, while Gaussian Process Regression (GPR) and Least Squares Spectral Analysis (LSSA) provide the best performance for larger RFI gaps. However, we caution that these qualitative conclusions are sensitive to the chosen hyperparameters of each inpainting technique. We find these results to be consistent in both simulated and real visibilities. We show that all inpainting techniques reliably reproduce foreground dominated modes in the power spectrum. Since the inpainting techniques should not be capable of reproducing noise realizations, we find that the largest errors occur in the noise dominated delay modes. We show that in the future, as the noise level of the data comes down, CLEAN and DPSS are most capable of reproducing the fine frequency structure in the visibilities of HERA data.
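To make the inpainting setting concrete, here is a toy single-spectrum sketch of the DPSS approach the abstract refers to; the function name, the bandwidth and term choices, and the least-squares setup are my assumptions, and real HERA pipelines work per baseline with delay-dependent bases.

```python
import numpy as np
from scipy.signal.windows import dpss

def dpss_inpaint(vis, flags, half_bandwidth=4.0, n_terms=8):
    """Toy sketch (hypothetical interface): fit a low-order Slepian (DPSS)
    basis to the unflagged channels of one visibility spectrum by least
    squares, then evaluate the smooth model inside the flagged gaps."""
    n = len(vis)
    # (n_terms, n) tapers -> (n, n_terms) design matrix
    basis = dpss(n, half_bandwidth, Kmax=n_terms).T
    keep = ~flags
    coef, *_ = np.linalg.lstsq(basis[keep], vis[keep], rcond=None)
    out = vis.copy()
    out[flags] = (basis @ coef)[flags]   # replace only the flagged channels
    return out

# Example: 1024-channel complex spectrum with a 20-channel RFI gap
rng = np.random.default_rng(0)
freq = np.linspace(0.0, 1.0, 1024)
vis = np.exp(2j * np.pi * 3.0 * freq) + 0.05 * rng.standard_normal(1024)
flags = np.zeros(1024, dtype=bool)
flags[500:520] = True
filled = dpss_inpaint(vis, flags)
```

The other techniques compared in the paper follow the same fit-on-unflagged, predict-in-gap pattern but differ in the model used for the unflagged data, which is why their relative performance depends on the width of the RFI gap.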
Galaxies on graph neural networks: towards robust synthetic galaxy catalogs with deep generative models
Yesukhei Jagvaral
François Lanusse
Sukhdeep Singh
Rachel Mandelbaum
Duncan Campbell
Future astronomical imaging surveys are set to provide precise constraints on cosmological parameters, such as dark energy. However, production of synthetic data for these surveys, to test and validate analysis methods, suffers from a very high computational cost. In particular, generating mock galaxy catalogs at sufficiently large volume and high resolution will soon become computationally unreachable. In this paper, we address this problem with a Deep Generative Model to create robust mock galaxy catalogs that may be used to test and develop the analysis pipelines of future weak lensing surveys. We build our model on custom-built Graph Convolutional Networks, placing each galaxy on a graph node and then connecting the graphs within each gravitationally bound system. We train our model on a cosmological simulation with realistic galaxy populations to capture the 2D and 3D orientations of galaxies. The samples from the model exhibit comparable statistical properties to those in the simulations. To the best of our knowledge, this is the first instance of a generative model on graphs in an astrophysical/cosmological context.
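A minimal sketch of the graph construction described above (names mine; the paper's actual pipeline operates on simulation catalogs): galaxies become nodes, and all members of the same gravitationally bound system are connected to each other.

```python
import numpy as np

def halo_edges(halo_id):
    """Toy sketch: build the edge list of a graph in which every pair of
    galaxies belonging to the same bound system (same halo id) is connected.
    Returns an array of shape (2, E)."""
    edges = []
    for h in np.unique(halo_id):
        members = np.flatnonzero(halo_id == h)
        for i in members:
            for j in members:
                if i != j:
                    edges.append((i, j))
    return np.asarray(edges, dtype=np.int64).T

# Example: 6 galaxies in two bound systems -> two fully connected subgraphs
print(halo_edges(np.array([0, 0, 0, 1, 1, 1])))
```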
Galaxies and Halos on Graph Neural Networks: Deep Generative Modeling Scalar and Vector Quantities for Intrinsic Alignment
Yesukhei Jagvaral
François Lanusse
Sukhdeep Singh
Rachel Mandelbaum
Duncan Campbell
In order to prepare for the upcoming wide-field cosmological surveys, large simulations of the Universe with realistic galaxy populations are required. In particular, the tendency of galaxies to naturally align towards overdensities, an effect called intrinsic alignments (IA), can be a major source of systematics in the weak lensing analysis. As the details of galaxy formation and evolution relevant to IA cannot be simulated in practice on such volumes, we propose as an alternative a Deep Generative Model. This model is trained on the IllustrisTNG-100 simulation and is capable of sampling the orientations of a population of galaxies so as to recover the correct alignments. In our approach, we model the cosmic web as a set of graphs, where the graphs are constructed for each halo, and galaxy orientations as a signal on those graphs. The generative model is implemented on a Generative Adversarial Network architecture and uses specifically designed Graph-Convolutional Networks sensitive to the relative 3D positions of the vertices. Given (sub)halo masses and tidal fields, the model is able to learn and predict scalar features such as galaxy and dark matter subhalo shapes; and more importantly, vector features such as the 3D orientation of the major axis of the ellipsoid and the complex 2D ellipticities. For correlations of 3D orientations the model is in good quantitative agreement with the measured values from the simulation, except at very small and transition scales. For correlations of 2D ellipticities, the model is in good quantitative agreement with the measured values from the simulation on all scales. Additionally, the model is able to capture the dependence of IA on mass, morphological type and central/satellite type.
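For reference, the complex 2D ellipticity mentioned above is, in one standard weak-lensing convention (the paper may use a variant),

$$e = \frac{1 - q}{1 + q}\, e^{2 i \phi}$$

where $q = b/a$ is the projected axis ratio and $\phi$ the position angle; a model that samples 3D orientations correctly must therefore also reproduce the projected distribution of $e$.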
EqR: Equivariant Representations for Data-Efficient Reinforcement Learning
Utility Theory for Sequential Decision Making
The von Neumann-Morgenstern (VNM) utility theorem shows that under certain axioms of rationality, decision-making is reduced to maximizing the expectation of some utility function. We extend these axioms to increasingly structured sequential decision making settings and identify the structure of the corresponding utility functions. In particular, we show that memoryless preferences lead to a utility in the form of a per transition reward and multiplicative factor on the future return. This result motivates a generalization of Markov Decision Processes (MDPs) with this structure on the agent's returns, which we call Affine-Reward MDPs. A stronger constraint on preferences is needed to recover the commonly used cumulative sum of scalar rewards in MDPs. A yet stronger constraint simplifies the utility function for goal-seeking agents in the form of a difference in some function of states that we call potential functions. Our necessary and sufficient conditions demystify the reward hypothesis that underlies the design of rational agents in reinforcement learning by adding an axiom to the VNM rationality axioms and motivate new directions for AI research involving sequential decision making.
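The memoryless structure described above can be written as a recursion (notation mine): the utility of a trajectory $\tau_t = (s_t, a_t, s_{t+1}, \dots)$ satisfies

$$U(\tau_t) = r(s_t, a_t) + m(s_t, a_t)\, U(\tau_{t+1})$$

with a per-transition reward $r$ and a multiplicative factor $m$ on the future return. The standard discounted MDP is the special case $m \equiv \gamma$, under which the recursion unrolls to the familiar cumulative sum $\sum_k \gamma^k r_{t+k}$.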
Transformation Coding: Simple Objectives for Equivariant Representations
Equivariant Networks for Crystal Structures
Sékou-Oumar Kaba
Supervised learning with deep models has tremendous potential for applications in materials science. Recently, graph neural networks have been used in this context, drawing direct inspiration from models for molecules. However, materials are typically much more structured than molecules, which is a feature that these models do not leverage. In this work, we introduce a class of models that are equivariant with respect to crystalline symmetry groups. We do this by defining a generalization of the message passing operations that can be used with more general permutation groups, or that can alternatively be seen as defining an expressive convolution operation on the crystal graph. Empirically, these models achieve results competitive with the state of the art on property prediction tasks.
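For context, a plain message-passing layer on a crystal graph looks like the sketch below (PyTorch, names mine); the paper's contribution is a generalization of exactly this operation whose weight sharing respects crystalline symmetry groups rather than only node permutations.

```python
import torch
import torch.nn as nn

class CrystalMessagePassing(nn.Module):
    """Baseline message-passing layer on a crystal graph: nodes are atomic
    sites, edge_attr encodes relative-position features. This is the standard
    operation that the paper generalizes, not the paper's own layer."""
    def __init__(self, dim, edge_dim):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * dim + edge_dim, dim), nn.SiLU())
        self.upd = nn.Sequential(nn.Linear(2 * dim, dim), nn.SiLU())

    def forward(self, h, edge_index, edge_attr):
        # h: (N, dim) node features; edge_index: (2, E); edge_attr: (E, edge_dim)
        src, dst = edge_index
        m = self.msg(torch.cat([h[src], h[dst], edge_attr], dim=-1))
        agg = torch.zeros_like(h).index_add_(0, dst, m)   # sum messages per node
        return self.upd(torch.cat([h, agg], dim=-1))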
Structuring Representations Using Group Invariants