Derek Nowrouzezahrai

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Representation Learning
Reinforcement Learning
Deep Learning
Generative Models
Computational Photography
Dynamical Systems
Computer Vision

Biography

Derek Nowrouzezahrai is a full professor at McGill University, director of the Centre for Intelligent Machines, and co-director of the McGill Graphics and Imaging Lab (MGIL). He also holds a Canada CIFAR AI Chair and the Ubisoft-Mila Chair in Scaling Game Worlds with Responsible AI.

His research focuses on simulating physical phenomena, such as the dynamics of moving objects and lighting for realistic image synthesis, with applications in virtual reality, video games, fluid simulation and control, digital fabrication, computationally augmented optics, and geometry processing. Derek is also interested in developing differentiable simulators of these dynamical systems and applying them to inverse problems in robotics and vision.

His work builds on the development of high-performance, sample-efficient (Markov chain) Monte Carlo methods, higher-order statistics and computational methods for complex multidimensional integration problems, differentiable physics-based simulators and numerical methods for dynamical systems, and the application of machine learning to 3D, visual, and interactive media.

Current Students

PhD - McGill
Research Collaborator - McGill
Co-supervisor:
Research Master's - UdeM
Principal supervisor:
PhD - McGill
Research Master's - McGill
PhD - McGill
PhD - McGill
Principal supervisor:
PhD - McGill
PhD - McGill
PhD - McGill
Co-supervisor:
Research Master's - McGill

Publications

MeshDiffusion: Score-based Generative 3D Mesh Modeling
Zhen Liu
Yao Feng
Michael J. Black
Weiyang Liu
We consider the task of generating realistic 3D shapes, which is useful for a variety of applications such as automatic scene generation and physical simulation. Compared to other 3D representations like voxels and point clouds, meshes are more desirable in practice, because (1) they enable easy and arbitrary manipulation of shapes for relighting and simulation, and (2) they can fully leverage the power of modern graphics pipelines which are mostly optimized for meshes. Previous scalable methods for generating meshes typically rely on sub-optimal post-processing, and they tend to produce overly-smooth or noisy surfaces without fine-grained geometric details. To overcome these shortcomings, we take advantage of the graph structure of meshes and use a simple yet very effective generative modeling method to generate 3D meshes. Specifically, we represent meshes with deformable tetrahedral grids, and then train a diffusion model on this direct parameterization. We demonstrate the effectiveness of our model on multiple generative tasks.
From Words to Blocks: Building Objects by Grounding Language Models with Reinforcement Learning
Michael Ahn
Anthony Brohan
Noah Brown
liang Dai
Dan Su
Holy Lovenia Ziwei Bryan Wilie
Tiezheng Yu
Willy Chung
Quyet V. Do
Paul Barde
Tristan Karch
C. Bonial
Mitchell Abrams
David R. Traum
Hyung Won
Le Hou
Shayne Longpre
Yi Zoph
William Tay
Eric Fedus
Xuezhi Li
Lasse Espeholt
Hubert Soyer
Remi Munos
Karen Si-
Vlad Mnih
Tom Ward
Yotam Doron
Wenlong Huang
Pieter Abbeel
Deepak Pathak
Julia Kiseleva
Ziming Li
Mohammad Aliannejadi
Shrestha Mohanty
Maartje Ter Hoeve
Mikhail Burtsev
Alexey Skrynnik
Artem Zholus
A. Panov
Kavya Srinet
A. Szlam
Yuxuan Sun
Katja Hofmann
Marc-Alexandre Côté
Ahmed Hamid Awadallah
Linar Abdrazakov
Igor Churin
Putra Manggala
Kata Naszádi
Michiel Van Der Meer
Leveraging pre-trained language models to generate action plans for embodied agents is an emerging research direction. However, executing instructions in real or simulated environments necessitates verifying the feasibility of actions and their relevance in achieving a goal. We introduce a novel method that integrates a language model and reinforcement learning for constructing objects in a Minecraft-like environment, based on natural language instructions. Our method generates a set of consistently achievable sub-goals derived from the instructions and subsequently completes the associated sub-tasks using a pre-trained RL policy. We employ the IGLU competition, which is based on the Minecraft-like simulator, as our test environment, and compare our approach to the competition’s top-performing solutions. Our approach outperforms existing solutions in terms of both the quality of the language model and the quality of the structures built within the IGLU environment.
Learning Latent Structural Causal Models
Jithendaraa Subramanian
Yashas Annadani
Ivaxi Sheth
Nan Rosemary Ke
Tristan Deleu
Stefan Bauer
Causal learning has long concerned itself with the accurate recovery of underlying causal mechanisms. Such causal modelling enables better explanations of out-of-distribution data. Prior works on causal learning assume that the high-level causal variables are given. However, in machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. In such settings, the entire Structural Causal Model (SCM) -- structure, parameters, and high-level causal variables -- is unobserved and needs to be learnt from low-level data. We treat this problem as Bayesian inference of the latent SCM, given low-level data. For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions. Experiments are performed on synthetic datasets and a causally generated image dataset to demonstrate the efficacy of our approach. We also perform image generation from unseen interventions, thereby verifying out-of-distribution generalization for the proposed causal model.
Single‐pass stratified importance resampling
Ege Ciklabakkal
Adrien Gruson
Iliyan Georgiev
Toshiya Hachisuka
Resampling is the process of selecting from a set of candidate samples to achieve a distribution (approximately) proportional to a desired target. Recent work has revisited its application to Monte Carlo integration, yielding powerful and practical importance sampling methods. One drawback of existing resampling methods is that they cannot generate stratified samples. We propose two complementary techniques to achieve efficient stratified resampling. We first introduce bidirectional CDF sampling which yields the same result as conventional inverse CDF sampling but in a single pass over the candidates, without needing to store them, similarly to reservoir sampling. We then order the candidates along a space‐filling curve to ensure that stratified CDF sampling of candidate indices yields stratified samples in the integration domain. We showcase our method on various resampling‐based rendering problems.
Kubric: A scalable dataset generator
Klaus Greff
Francois Belletti
Lucas Beyer
Carl Doersch
Yilun Du
Daniel Duckworth
David J Fleet
Dan Gnanapragasam
Florian Golemo
Charles Herrmann
Thomas Kipf
Abhijit Kundu
Dmitry Lagun
Issam Hadj Laradji
Hsueh-Ti Liu
Henning Meyer
Yishu Miao
Cengiz Oztireli
Etienne Pot
Noha Radwan
Daniel Rebain
Sara Sabour
Mehdi S. M. Sajjadi
Matan Sela
Vincent Sitzmann
Austin Stone
Deqing Sun
Suhani Vora
Ziyu Wang
Tianhao Wu
Kwang Moo Yi
Fangcheng Zhong
Andrea Tagliasacchi
Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details. But collecting, processing and annotating real data at scale is difficult, expensive, and frequently raises additional privacy, fairness and legal concerns. Synthetic data is a powerful tool with the potential to address these shortcomings: 1) it is cheap 2) supports rich ground-truth annotations 3) offers full control over data and 4) can circumvent or mitigate problems regarding bias, privacy and licensing. Unfortunately, software tools for effective data generation are less mature than those for architecture design and training, which leads to fragmented generation efforts. To address these problems we introduce Kubric, an open-source Python framework that interfaces with PyBullet and Blender to generate photo-realistic scenes, with rich annotations, and seamlessly scales to large jobs distributed over thousands of machines, and generating TBs of data. We demonstrate the effectiveness of Kubric by presenting a series of 13 different generated datasets for tasks ranging from studying 3D NeRF models to optical flow estimation. We release Kubric, the used assets, all of the generation code, as well as the rendered datasets for reuse and modification.
Learning to Guide and to Be Guided in the Architect-Builder Problem
Paul Barde
Tristan Karch
Clément Moulin-Frier
Pierre-Yves Oudeyer
We are interested in interactive agents that learn to coordinate, namely, a …
Attention-based Neural Cellular Automata
Mattie Tesfaldet
Recent extensions of Cellular Automata (CA) have incorporated key ideas from modern deep learning, dramatically extending their capabilities and catalyzing a new family of Neural Cellular Automata (NCA) techniques. Inspired by Transformer-based architectures, our work presents a new class of _attention-based_ NCAs formed using a spatially localized, yet globally organized, self-attention scheme. We introduce an instance of this class named _Vision Transformer Cellular Automata (ViTCA)_. We present quantitative and qualitative results on denoising autoencoding across six benchmark datasets, comparing ViTCA to a U-Net, a U-Net-based CA baseline (UNetCA), and a Vision Transformer (ViT). When comparing across architectures configured to similar parameter complexity, ViTCA architectures yield superior performance across all benchmarks and for nearly every evaluation metric. We present an ablation study on various architectural configurations of ViTCA, an analysis of its effect on cell states, and an investigation on its inductive biases. Finally, we examine its learned representations via linear probes on its converged cell state hidden representations, yielding, on average, superior results when compared to our U-Net, ViT, and UNetCA baselines.
Challenges in leveraging GANs for few-shot data augmentation
Christopher Beckham
Issam Hadj Laradji
Pau Rodriguez
David Vazquez
Overcoming challenges in leveraging GANs for few-shot data augmentation
Christopher Beckham
Issam Hadj Laradji
Pau Rodriguez
David Vazquez
Robust motion in-betweening
Félix Harvey
Mike Yurick
In this work we present a novel, robust transition generation technique that can serve as a new tool for 3D animators, based on adversarial recurrent neural networks. The system synthesises high-quality motions that use temporally-sparse keyframes as animation constraints. This is reminiscent of the job of in-betweening in traditional animation pipelines, in which an animator draws motion frames between provided keyframes. We first show that a state-of-the-art motion prediction model cannot be easily converted into a robust transition generator when only adding conditioning information about future keyframes. To solve this problem, we then propose two novel additive embedding modifiers that are applied at each timestep to latent representations encoded inside the network's architecture. One modifier is a time-to-arrival embedding that allows variations of the transition length with a single model. The other is a scheduled target noise vector that allows the system to be robust to target distortions and to sample different transitions given fixed keyframes. To qualitatively evaluate our method, we present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios. To quantitatively evaluate performance on transitions and generalizations to longer time horizons, we present well-defined in-betweening benchmarks on a subset of the widely used Human3.6M dataset and on LaFAN1, a novel high quality motion capture dataset that is more appropriate for transition generation. We are releasing this new dataset along with this work, with accompanying code for reproducing our baseline results.
Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images Using a View-Based Representation
Sai Rajeswar
Fahim Mannan
Florian Golemo
Jérôme Parent-Lévesque
David Vazquez