Portrait of Siamak Ravanbakhsh

Siamak Ravanbakhsh

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Research Topics
Causality
Deep Learning
Dynamical Systems
Generative Models
Graph Neural Networks
Information Theory
Learning on Graphs
Machine Learning Theory
Molecular Modeling
Probabilistic Models
Reasoning
Reinforcement Learning
Representation Learning

Biography

Siamak Ravanbakhsh is an assistant professor at McGill University’s School of Computer Science and a core academic member of Mila – Quebec Artificial Intelligence Institute.

Before joining McGill and Mila, he held a similar position at the University of British Columbia. Prior to that, he was a postdoctoral fellow at the Machine Learning Department and Robotics Institute of Carnegie Mellon University. He completed his PhD at the University of Alberta.

Ravanbakhsh’s research is centred around problems of representation learning, in particular the principled use of geometry, probabilistic inference and symmetry.

Current Students

PhD - McGill University
Professional Master's - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
Co-supervisor :
PhD - McGill University
Master's Research - McGill University
Master's Research - McGill University
Collaborating Alumni - McGill University
Postdoctorate - McGill University
Master's Research - McGill University
PhD - McGill University
Collaborating Alumni - McGill University
Professional Master's - McGill University

Publications

SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models
Daniel Levy
Siba Smarak Panigrahi
Sékou-Oumar Kaba
Qiang Zhu
Kin Long Kelvin Lee
Mikhail Galkin
Santiago Miret
On the Identifiability of Causal Abstractions
Xiusi Li
Sékou-Oumar Kaba
Causal representation learning methods seek to enhance machine learning models' robustness and generalization capabilities by learning laten… (see more)t representations and causal graphs aligned with the data generating process. In many systems, fully recovering the true causal structure is challenging because we cannot intervene on all latent variables individually. We introduce a theoretical framework that calculates the degree to which we can identify a causal structure in the more realistic setting of interventions on arbitrary subsets of latent variables. We find that in that case, we can only identify a causal model up to a \emph{causal abstraction}. These causal abstractions are still meaningful in that they describe the system at a higher level of granularity. Conversely, given a causal abstraction, our framework provides sufficient conditions for its identifiability. Our findings extend existing identifiability results in two areas: those that address abstractions of latent variables without considering graphical structures and those that focus on graphical structures without incorporating their abstractions.
On the Identifiability of Causal Abstractions
Xiusi Li
Sékou-Oumar Kaba
Causal representation learning (CRL) enhances machine learning models' robustness and generalizability by learning structural causal models … (see more)associated with data-generating processes. We focus on a family of CRL methods that uses contrastive data pairs in the observable space, generated before and after a random, unknown intervention, to identify the latent causal model. (Brehmer et al., 2022) showed that this is indeed possible, given that all latent variables can be intervened on individually. However, this is a highly restrictive assumption in many systems. In this work, we instead assume interventions on arbitrary subsets of latent variables, which is more realistic. We introduce a theoretical framework that calculates the degree to which we can identify a causal model, given a set of possible interventions, up to an abstraction that describes the system at a higher level of granularity.
Symmetry-Aware Generative Modeling through Learned Canonicalization
Kusha Sareen
Daniel Levy
Arnab Kumar Mondal
Sékou-Oumar Kaba
Tara Akhound-Sadegh
Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The ex… (see more)isting generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.
Symmetry-Aware Generative Modeling through Learned Canonicalization
Kusha Sareen
Daniel Levy
Arnab Kumar Mondal
Sékou-Oumar Kaba
Tara Akhound-Sadegh
Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The ex… (see more)isting generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.
Sampling from Energy-based Policies using Diffusion
Vineet Jain
Tara Akhound-Sadegh
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Tara Akhound-Sadegh
Jarrid Rector-Brooks
Sarthak Mittal
Pablo Lemos
Cheng-Hao Liu
Marcin Sendera
Nikolay Malkin
Alexander Tong
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-… (see more)body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient---and no data samples---to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective, is *simulation-free*, and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant
Weight-Sharing Regularization
Mehran Shakerinava
Motahareh Sohrabi
Scalable Hierarchical Self-Attention with Learnable Hierarchy for Long-Range Interactions
Thuan Nguyen Anh Trang
Khang Nhat Ngo
Hugo Sonnery
Thieu Vo
Truong Son Hy
Self-attention models have made great strides toward accurately modeling a wide array of data modalities, including, more recently, graph-st… (see more)ructured data. This paper demonstrates that adaptive hierarchical attention can go a long way toward successfully applying transformers to graphs. Our proposed model Sequoia provides a powerful inductive bias towards long-range interaction modeling, leading to better generalization. We propose an end-to-end mechanism for a data-dependent construction of a hierarchy which in turn guides the self-attention mechanism. Using adaptive hierarchy provides a natural pathway toward sparse attention by constraining node-to-node interactions with the immediate family of each node in the hierarchy (e.g., parent, children, and siblings). This in turn dramatically reduces the computational complexity of a self-attention layer from quadratic to log-linear in terms of the input size while maintaining or sometimes even surpassing the standard transformer's ability to model long-range dependencies across the entire input. Experimentally, we report state-of-the-art performance on long-range graph benchmarks while remaining computationally efficient. Moving beyond graphs, we also display competitive performance on long-range sequence modeling, point-clouds classification, and segmentation when using a fixed hierarchy. Our source code is publicly available at https://github.com/HySonLab/HierAttention
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Tara Akhound-Sadegh
Jarrid Rector-Brooks
Sarthak Mittal
Pablo Lemos
Cheng-Hao Liu
Marcin Sendera
Nikolay Malkin
Alexander Tong
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-… (see more)body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and no data samples -- to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective, is simulation-free, and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Tara Akhound-Sadegh
Jarrid Rector-Brooks
Sarthak Mittal
Pablo Lemos
Cheng-Hao Liu
Marcin Sendera
Nikolay Malkin
Alexander Tong
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-… (see more)body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and no data samples -- to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective, is simulation-free, and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant
E(3)-Equivariant Mesh Neural Networks
Thuan N.a. Trang
Nhat-Khang Ngô
Daniel Levy
Thieu N. Vo
Truong Son Hy
Triangular meshes are widely used to represent three-dimensional objects. As a result, many recent works have address the need for geometric… (see more) deep learning on 3D mesh. However, we observe that the complexities in many of these architectures does not translate to practical performance, and simple deep models for geometric graphs are competitive in practice. Motivated by this observation, we minimally extend the update equations of E(n)-Equivariant Graph Neural Networks (EGNNs) (Satorras et al., 2021) to incorporate mesh face information, and further improve it to account for long-range interactions through hierarchy. The resulting architecture, Equivariant Mesh Neural Network (EMNN), outperforms other, more complicated equivariant methods on mesh tasks, with a fast run-time and no expensive pre-processing. Our implementation is available at https://github.com/HySonLab/EquiMesh