Derek Nowrouzezahrai

Core Academic Member
Canada CIFAR AI Chair
Full Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Computational Photography
Computer Vision
Deep Learning
Dynamical Systems
Generative Models
Reinforcement Learning
Representation Learning

Biography

Derek Nowrouzezahrai is a full professor at McGill University, where he directs the Centre for Intelligent Machines and co-directs the Graphics Lab.

He is also a Canada CIFAR AI Chair and holds the Ubisoft–Mila Research Chair on Scaling Game Worlds with Responsible AI.

Nowrouzezahrai’s research tackles the simulation of physical phenomena, such as the dynamics of moving objects and light transport for realistic image synthesis, with applications in virtual reality, video games, fluid simulation and control, digital manufacturing, computationally augmented optics, and geometry processing. He is also interested in developing differentiable simulators of these dynamical systems and applying them to inverse problems in robotics and vision.

This work relies fundamentally on developing high-performance, sample-efficient (Markov chain) Monte Carlo methods, high-order statistics and computational methods for complex multi-dimensional integration problems, and differentiable physics-based simulators and numerical methods for dynamical systems, as well as on applying machine learning to 3D, visual and interactive media.
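The Monte Carlo integration machinery mentioned above can be illustrated with a minimal, self-contained sketch (not taken from any of the papers below): a plain Monte Carlo estimator over the unit hypercube, with the integrand, dimension, and sample count chosen purely for illustration.

```python
import random

def mc_estimate(f, dim, n_samples, seed=0):
    """Plain Monte Carlo estimate of the integral of f over the unit
    hypercube [0, 1]^dim, together with a simple variance estimate
    for the estimator itself."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        fx = f(x)
        total += fx
        total_sq += fx * fx
    mean = total / n_samples
    # Variance of the mean: sample variance divided by the sample count.
    var = max(total_sq / n_samples - mean * mean, 0.0) / n_samples
    return mean, var

# Example: integrate f(x, y, z) = x + y + z over [0, 1]^3 (exact value: 1.5).
estimate, variance = mc_estimate(lambda x: sum(x), dim=3, n_samples=20000)
```

The estimator's error shrinks as 1/sqrt(n_samples) regardless of dimension, which is why Monte Carlo (and its Markov chain variants, with importance sampling and other variance-reduction tools layered on top) is the workhorse for the high-dimensional integrals that arise in light transport.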

Current Students

PhD - McGill University
Collaborating researcher - McGill University
Master's Research - Université de Montréal
PhD - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
PhD - McGill University
Collaborating researcher - McGill University
Research Intern - McGill University
Master's Research - McGill University

Publications

Hierarchical Differentiable Fluid Simulation
Xiangyu Kong
Arnaud Schoentgen
Damien Rioux‐Lavoie
Paul G. Kry
Differentiable simulation is an emerging field that offers a powerful and flexible route to fluid control. In grid‐based settings, high memory consumption is a long‐standing bottleneck that constrains optimization resolution. We introduce a two‐step algorithm that significantly reduces memory usage: our method first optimizes for bulk forces at reduced resolution, then refines local details over sub‐domains while maintaining differentiability. In trading runtime for memory, it enables optimization at previously unattainable resolutions. We validate its effectiveness and memory savings on a series of fluid control problems.
Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play
Recent advances in Competitive Self-Play (CSP) have achieved, or even surpassed, human level performance in complex game environments such as Dota 2 and StarCraft II using Distributed Multi-Agent Reinforcement Learning (MARL). One core component of these methods relies on creating a pool of learning agents -- consisting of the Main Agent, past versions of this agent, and Exploiter Agents -- where Exploiter Agents learn counter-strategies to the Main Agents. A key drawback of these approaches is the large computational cost and physical time that is required to train the system, making them impractical to deploy in highly iterative real-life settings such as video game productions. In this paper, we propose the Minimax Exploiter, a game theoretic approach to exploiting Main Agents that leverages knowledge of its opponents, leading to significant increases in data efficiency. We validate our approach in a diversity of settings, including simple turn based games, the arcade learning environment, and For Honor, a modern video game. The Minimax Exploiter consistently outperforms strong baselines, demonstrating improved stability and data efficiency, leading to a robust CSP-MARL method that is both flexible and easy to deploy.
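The core game-theoretic idea behind an exploiter agent can be shown in a toy matrix-game setting. The sketch below is purely illustrative and not the paper's method: against a fixed mixed strategy of a "Main Agent" row player in a zero-sum game, an exploiter's best response is the pure strategy that maximizes its own expected payoff.

```python
def best_response(payoff, main_strategy):
    """Best response of a column-player 'exploiter' against a fixed
    mixed strategy of the row-playing 'Main Agent' in a zero-sum
    matrix game. payoff[i][j] is the row player's payoff; the
    exploiter receives its negation."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Exploiter's expected payoff for each of its pure strategies.
    values = [
        -sum(main_strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]
    best = max(range(n_cols), key=lambda j: values[j])
    return best, values[best]

# Matching pennies: the row player wins (+1) when the choices match.
pennies = [[1, -1], [-1, 1]]
# A biased Main Agent that plays 'heads' 80% of the time...
action, value = best_response(pennies, [0.8, 0.2])
# ...is exploited by always playing 'tails' (column 1), gaining 0.6 per round.
```

In deep CSP pipelines the exact best response is of course not computable in closed form; exploiters are trained with RL, and the paper's contribution concerns making that exploitation loop data-efficient.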
A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem
Training multiple agents to coordinate is an essential problem with applications in robotics, game theory, economics, and social sciences. However, most existing Multi-Agent Reinforcement Learning (MARL) methods are online and thus impractical for real-world applications in which collecting new interactions is costly or dangerous. While these algorithms should leverage offline data when available, doing so gives rise to what we call the offline coordination problem. Specifically, we identify and formalize the strategy agreement (SA) and the strategy fine-tuning (SFT) coordination challenges, two issues at which current offline MARL algorithms fail. Concretely, we reveal that the prevalent model-free methods are severely deficient and cannot handle coordination-intensive offline multi-agent tasks in either toy or MuJoCo domains. To address this setback, we emphasize the importance of inter-agent interactions and propose the very first model-based offline MARL method. Our resulting algorithm, Model-based Offline Multi-Agent Proximal Policy Optimization (MOMA-PPO), generates synthetic interaction data and enables agents to converge on a strategy while fine-tuning their policies accordingly. This simple model-based solution solves the coordination-intensive offline tasks, significantly outperforming the prevalent model-free methods even under severe partial observability and with learned world models.
Neural Implicit Reduced Fluid Simulation
Ivan Puhachov
Paul Kry
High-fidelity simulation of fluid dynamics is challenging because of the high dimensional state data needed to capture fine details and the large computational cost associated with advancing the system in time. We present neural implicit reduced fluid simulation (NIRFS), a reduced fluid simulation technique that combines an implicit neural representation of fluid shapes and a neural ordinary differential equation to model the dynamics of fluid in the reduced latent space. The latent trajectories are computed at very little cost in comparison to simulations for training, while preserving fine physical details. We show that this approach can work well, capturing the shapes and dynamics involved in a variety of scenarios with constrained initial conditions, e.g., droplet-droplet collisions, crown splashes, and fluid slosh in a container. In each scenario, we learn the latent implicit representation of fluid shapes with a deep-network signed distance function, as well as the energy function and parameters of a damped Hamiltonian system, which helps guarantee desirable properties of the latent dynamics. To ensure that latent shape representations form smooth and physically meaningful trajectories, we simultaneously learn the latent representation and dynamics. We evaluate novel simulations for conservation of volume and momentum, discuss design decisions, and demonstrate an application of our method to fluid control.
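The "damped Hamiltonian" structure mentioned in the abstract is what guarantees well-behaved latent dynamics: with positive damping, the system's energy cannot grow. A minimal toy stand-in (not NIRFS itself, where the Hamiltonian is a learned network) is a damped harmonic oscillator integrated with semi-implicit Euler:

```python
def simulate_damped_oscillator(q0, p0, k=1.0, m=1.0, damping=0.1,
                               dt=0.01, steps=1000):
    """Semi-implicit Euler integration of a damped harmonic oscillator
    with Hamiltonian H(q, p) = p^2 / (2m) + k q^2 / 2 plus a linear
    drag term -damping * (p / m). Returns the energy at each step."""
    q, p = q0, p0
    energies = []
    for _ in range(steps):
        p += (-k * q - damping * p / m) * dt  # spring force plus drag
        q += (p / m) * dt                      # update position with new momentum
        energies.append(p * p / (2 * m) + 0.5 * k * q * q)
    return energies

energies = simulate_damped_oscillator(q0=1.0, p0=0.0)
# With positive damping, the total energy decays toward zero over the run.
```

Parameterizing latent dynamics this way (a learned energy function plus a damping term) bakes dissipation into the model by construction, rather than hoping an unconstrained neural ODE happens to learn stable trajectories.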
MeshDiffusion: Score-Based Generative 3D Mesh Modeling
Yao Feng
Michael J. Black
Weiyang Liu
We consider the task of generating realistic 3D shapes, which is useful for a variety of applications such as automatic scene generation and physical simulation. Compared to other 3D representations like voxels and point clouds, meshes are more desirable in practice, because (1) they enable easy and arbitrary manipulation of shapes for relighting and simulation, and (2) they can fully leverage the power of modern graphics pipelines which are mostly optimized for meshes. Previous scalable methods for generating meshes typically rely on sub-optimal post-processing, and they tend to produce overly-smooth or noisy surfaces without fine-grained geometric details. To overcome these shortcomings, we take advantage of the graph structure of meshes and use a simple yet very effective generative modeling method to generate 3D meshes. Specifically, we represent meshes with deformable tetrahedral grids, and then train a diffusion model on this direct parametrization. We demonstrate the effectiveness of our model on multiple generative tasks.
Visual Question Answering From Another Perspective: CLEVR Mental Rotation Tests
Different types of mental rotation tests have been used extensively in psychology to understand human visual reasoning and perception. Understanding what an object or visual scene would look like from another viewpoint is a challenging problem that is made even harder if it must be performed from a single image. We explore a controlled setting whereby questions are posed about the properties of a scene if that scene was observed from another viewpoint. To do this we have created a new version of the CLEVR dataset that we call CLEVR Mental Rotation Tests (CLEVR-MRT). Using CLEVR-MRT we examine standard methods, show how they fall short, then explore novel neural architectures that involve inferring volumetric representations of a scene. These volumes can be manipulated via camera-conditioned transformations to answer the question. We examine the efficacy of different model variants through rigorous ablations and demonstrate the efficacy of volumetric representations.
Learning to Guide and to Be Guided in the Architect-Builder Problem
Tristan Karch
Clément Moulin-Frier
Christopher Pal
We are interested in interactive agents that learn to coordinate, namely, a …
Attention-based Neural Cellular Automata
Recent extensions of Cellular Automata (CA) have incorporated key ideas from modern deep learning, dramatically extending their capabilities and catalyzing a new family of Neural Cellular Automata (NCA) techniques. Inspired by Transformer-based architectures, our work presents a new class of …
Overcoming Challenges in Leveraging GANs for Few-Shot Data Augmentation
In this paper, we explore the use of GAN-based few-shot data augmentation as a method to improve few-shot classification performance. We perform an exploration into how a GAN can be fine-tuned for such a task (one of which is in a class-incremental manner), as well as a rigorous empirical investigation into how well these models can perform to improve few-shot classification. We identify issues related to the difficulty of training such generative models under a purely supervised regime with very few examples, as well as issues regarding the evaluation protocols of existing works. We also find that in this regime, classification accuracy is highly sensitive to how the classes of the dataset are randomly split. Therefore, we propose a semi-supervised fine-tuning approach as a more pragmatic way forward to address these problems.
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
Millions of blind and visually-impaired (BVI) people navigate urban environments every day, using smartphones for high-level path-planning and white canes or guide dogs for local information. However, many BVI people still struggle to travel to new places. In our endeavor to create a navigation assistant for the BVI, we found that existing Reinforcement Learning (RL) environments were unsuitable for the task. This work introduces SEVN, a sidewalk simulation environment and a neural network-based approach to creating a navigation agent. SEVN contains panoramic images with labels for house numbers, doors, and street name signs, and formulations for several navigation tasks. We study the performance of an RL algorithm (PPO) in this setting. Our policy model fuses multi-modal observations in the form of variable resolution images, visible text, and simulated GPS data to navigate to a goal door. We hope that this dataset, simulator, and experimental results will provide a foundation for further research into the creation of agents that can assist members of the BVI community with outdoor navigation.
Pix2Shape – Towards Unsupervised Learning of 3D Scenes from Images using a View-based Representation
We infer and generate three-dimensional (3D) scene information from a single input image and without supervision. This problem is under-explored, with most prior work relying on supervision from, e.g., 3D ground-truth, multiple images of a scene, image silhouettes or key-points. We propose Pix2Shape, an approach to solve this problem with four components: (i) an encoder that infers the latent 3D representation from an image, (ii) a decoder that generates an explicit 2.5D surfel-based reconstruction of a scene from the latent code, (iii) a differentiable renderer that synthesizes a 2D image from the surfel representation, and (iv) a critic network trained to discriminate between images generated by the decoder-renderer and those from a training distribution. Pix2Shape can generate complex 3D scenes that scale with the view-dependent on-screen resolution, unlike representations that capture world-space resolution, i.e., voxels or meshes. We show that Pix2Shape learns a consistent scene representation in its encoded latent space and that the decoder can then be applied to this latent representation in order to synthesize the scene from a novel viewpoint. We evaluate Pix2Shape with experiments on the ShapeNet dataset as well as on a novel benchmark we developed, called 3D-IQTT, to evaluate models based on their ability to enable 3D spatial reasoning. Qualitative and quantitative evaluation demonstrate Pix2Shape's ability to solve scene reconstruction, generation, and understanding tasks.
Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization
Adversarial Imitation Learning alternates between learning a discriminator -- which tells apart expert's demonstrations from generated ones -- and a generator's policy to produce trajectories that can fool this discriminator. This alternated optimization is known to be delicate in practice since it compounds unstable adversarial training with brittle and sample-inefficient reinforcement learning. We propose to remove the burden of the policy optimization steps by leveraging a novel discriminator formulation. Specifically, our discriminator is explicitly conditioned on two policies: the one from the previous generator's iteration and a learnable policy. When optimized, this discriminator directly learns the optimal generator's policy. Consequently, our discriminator's update solves the generator's optimization problem for free: learning a policy that imitates the expert does not require an additional optimization loop. This formulation effectively cuts by half the implementation and computational burden of Adversarial Imitation Learning algorithms by removing the Reinforcement Learning phase altogether. We show on a variety of tasks that our simpler approach is competitive to prevalent Imitation Learning methods.
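The key structural idea, a discriminator built directly from two policies' likelihoods, can be sketched in miniature. The snippet below is in the spirit of the abstract rather than the paper's exact formulation, and the tabular policies and state names are hypothetical:

```python
import math

def trajectory_discriminator(traj, learnable_policy, previous_policy):
    """Policy-structured discriminator over a trajectory:
    D(tau) = P_learn(tau) / (P_learn(tau) + P_prev(tau)),
    computed from per-step action log-probabilities for stability.
    Policies are given as {(state, action): probability} tables."""
    log_p_learn = sum(math.log(learnable_policy[(s, a)]) for s, a in traj)
    log_p_prev = sum(math.log(previous_policy[(s, a)]) for s, a in traj)
    # D = 1 / (1 + exp(log_p_prev - log_p_learn)): a sigmoid of the log-ratio.
    return 1.0 / (1.0 + math.exp(log_p_prev - log_p_learn))

# Hypothetical one-state, two-action policies.
learn = {("s0", "a0"): 0.9, ("s0", "a1"): 0.1}
prev = {("s0", "a0"): 0.5, ("s0", "a1"): 0.5}
traj = [("s0", "a0"), ("s0", "a0")]
d = trajectory_discriminator(traj, learn, prev)
# d > 0.5: this trajectory is more likely under the learnable policy.
```

Because the discriminator's parameters *are* the learnable policy's parameters, classifying expert versus generated trajectories simultaneously shapes the imitation policy, which is what lets the method drop the separate reinforcement learning loop.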