Derek Nowrouzezahrai

Differentiable visual computing: moving from the second to the third dimension in machine learning applications

paul.b.barde@gmail.com

Research Master's - UdeM

Principal supervisor:

Daniel Kong

PhD - McGill

Github

Ken Ming Lee

Research Master's - McGill

Website

Github

Hossam Mohamed S M Ahmed

PhD - McGill

Github

Somjit Nath

PhD - McGill

Principal supervisor:

Samira Ebrahimi Kahou

Website

Google Scholar

Yasaman Sabbagh Ziarani

PhD - McGill

Yuanyuan Tao

PhD - McGill

Mattie Tesfaldet

PhD - McGill

Co-supervisor:

Junha Yoo

Research Master's - McGill

Blog posts

An illustration of the forward and backward processes of a differentiable renderer.

January 29, 2025

by

Jordan J. Bannister

Derek Nowrouzezahrai

Read the article

Publications

MeshDiffusion: Score-based Generative 3D Mesh Modeling

Zhen Liu

Yao Feng

Michael J. Black

Liam Paull

Weiyang Liu

We consider the task of generating realistic 3D shapes, which is useful for a variety of applications such as automatic scene generation and physical simulation. Compared to other 3D representations like voxels and point clouds, meshes are more desirable in practice, because (1) they enable easy and arbitrary manipulation of shapes for relighting and simulation, and (2) they can fully leverage the power of modern graphics pipelines which are mostly optimized for meshes. Previous scalable methods for generating meshes typically rely on sub-optimal post-processing, and they tend to produce overly-smooth or noisy surfaces without fine-grained geometric details. To overcome these shortcomings, we take advantage of the graph structure of meshes and use a simple yet very effective generative modeling method to generate 3D meshes. Specifically, we represent meshes with deformable tetrahedral grids, and then train a diffusion model on this direct parameterization. We demonstrate the effectiveness of our model on multiple generative tasks.

2023-02-01

ICLR.cc/2023/Conference (notable)

From Words to Blocks: Building Objects by Grounding Language Models with Reinforcement Learning

Michael Ahn

Anthony Brohan

Noah Brown

liang Dai

Dan Su

Holy Lovenia Ziwei Bryan Wilie

Tiezheng Yu

Willy Chung

Quyet V. Do

Paul Barde

Tristan Karch

C. Bonial

Mitchell Abrams

David R. Traum

Hyung Won

Le Hou

Shayne Longpre

Yi Zoph

William Tay

Eric Fedus

Xuezhi Li

Lasse Espeholt

Hubert Soyer

Remi Munos

Karen Si-801

Vlad Mnih

Tom Ward

Yotam Doron

Wenlong Huang

Pieter Abbeel

Deepak Pathak

Julia Kiseleva

Ziming Li

Mohammad Aliannejadi

Shrestha Mohanty

Maartje Ter Hoeve

Mikhail Burtsev

Alexey Skrynnik

Artem Zholus

A. Panov

Kavya Srinet

A. Szlam

Yuxuan Sun

Katja Hofmann

Marc-Alexandre Côté

Ahmed Hamid Awadallah

Linar Abdrazakov

Igor Churin

Putra Manggala

Kata Naszádi

Michiel Van Der Meer

Leveraging pre-trained language models to generate action plans for embodied agents is an emerging research direction. However, executing instructions in real or simulated environments necessitates verifying the feasibility of actions and their relevance in achieving a goal. We introduce a novel method that integrates a language model and reinforcement learning for constructing objects in a Minecraft-like environment, based on natural language instructions. Our method generates a set of consistently achievable sub-goals derived from the instructions and subsequently completes the associated sub-tasks using a pre-trained RL policy. We employ the IGLU competition, which is based on the Minecraft-like simulator, as our test environment, and compare our approach to the competition’s top-performing solutions. Our approach outperforms existing solutions in terms of both the quality of the language model and the quality of the structures built within the IGLU environment.

Learning Latent Structural Causal Models

Jithendaraa Subramanian

Yashas Annadani

Ivaxi Sheth

Nan Rosemary Ke

Tristan Deleu

Stefan Bauer

Samira Ebrahimi Kahou

Causal learning has long concerned itself with the accurate recovery of underlying causal mechanisms. Such causal modelling enables better explanations of out-of-distribution data. Prior works on causal learning assume that the high-level causal variables are given. However, in machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. In such settings, the entire Structural Causal Model (SCM) -- structure, parameters, and high-level causal variables -- is unobserved and needs to be learnt from low-level data. We treat this problem as Bayesian inference of the latent SCM, given low-level data. For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions. Experiments are performed on synthetic datasets and a causally generated image dataset to demonstrate the efficacy of our approach. We also perform image generation from unseen interventions, thereby verifying out of distribution generalization for the proposed causal model.

2022-10-24

ArXiv (preprint)

Single‐pass stratified importance resampling

Ege Ciklabakkal

Adrien Gruson

Iliyan Georgiev

Toshiya Hachisuka

Resampling is the process of selecting from a set of candidate samples to achieve a distribution (approximately) proportional to a desired target. Recent work has revisited its application to Monte Carlo integration, yielding powerful and practical importance sampling methods. One drawback of existing resampling methods is that they cannot generate stratified samples. We propose two complementary techniques to achieve efficient stratified resampling. We first introduce bidirectional CDF sampling which yields the same result as conventional inverse CDF sampling but in a single pass over the candidates, without needing to store them, similarly to reservoir sampling. We then order the candidates along a space‐filling curve to ensure that stratified CDF sampling of candidate indices yields stratified samples in the integration domain. We showcase our method on various resampling‐based rendering problems.

2022-07-30

Computer Graphics Forum (published)
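
The abstract above compares its single-pass technique to reservoir sampling. As background only, here is a minimal sketch of standard weighted reservoir sampling, which selects one candidate with probability proportional to its weight without storing the candidates. It is not the paper's bidirectional CDF sampling and produces no stratification; the function name `weighted_reservoir_sample` and its parameters are hypothetical, chosen for illustration.

```python
import random


def weighted_reservoir_sample(candidates, weight_fn, rng):
    """Pick one candidate with probability proportional to weight_fn(candidate).

    Standard weighted reservoir sampling (illustrative, hypothetical helper):
    a single pass over `candidates`, keeping only the current choice and the
    running weight sum. Returns (chosen, total_weight); chosen is None when
    no candidate has positive weight.
    """
    chosen = None
    total = 0.0
    for candidate in candidates:
        weight = weight_fn(candidate)
        if weight <= 0.0:
            continue  # zero-weight candidates can never be selected
        total += weight
        # Replace the current choice with probability weight / total.
        if rng.random() * total < weight:
            chosen = candidate
    return chosen, total


if __name__ == "__main__":
    rng = random.Random(0)
    # Candidates arrive as a stream; the weights here are arbitrary.
    stream = (x / 10.0 for x in range(1, 11))
    sample, weight_sum = weighted_reservoir_sample(stream, lambda x: x * x, rng)
    print(sample, weight_sum)
```

In a resampling-based estimator, the returned weight sum is typically what normalizes the contribution of the selected sample, which is why the sketch returns it alongside the choice.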

Kubric: A scalable dataset generator

Klaus Greff

Francois Belletti

Lucas Beyer

Carl Doersch

Yilun Du

Daniel Duckworth

David J Fleet

Dan Gnanapragasam

Florian Golemo

Charles Herrmann

Thomas Kipf

Abhijit Kundu

Dmitry Lagun

Issam Hadj Laradji

Hsueh-Ti Liu

Henning Meyer

Yishu Miao

Cengiz Oztireli

Etienne Pot

Noha Radwan

Daniel Rebain

Sara Sabour

Mehdi S. M. Sajjadi

Matan Sela

Vincent Sitzmann

Austin Stone

Deqing Sun

Suhani Vora

Ziyu Wang

Tianhao Wu

Kwang Moo Yi

Fangcheng Zhong

Andrea Tagliasacchi

Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details. But collecting, processing and annotating real data at scale is difficult, expensive, and frequently raises additional privacy, fairness and legal concerns. Synthetic data is a powerful tool with the potential to address these shortcomings: 1) it is cheap 2) supports rich ground-truth annotations 3) offers full control over data and 4) can circumvent or mitigate problems regarding bias, privacy and licensing. Unfortunately, software tools for effective data generation are less mature than those for architecture design and training, which leads to fragmented generation efforts. To address these problems we introduce Kubric, an open-source Python framework that interfaces with PyBullet and Blender to generate photo-realistic scenes, with rich annotations, and seamlessly scales to large jobs distributed over thousands of machines, and generating TBs of data. We demonstrate the effectiveness of Kubric by presenting a series of 13 different generated datasets for tasks ranging from studying 3D NeRF models to optical flow estimation. We release Kubric, the used assets, all of the generation code, as well as the rendered datasets for reuse and modification.

2022-06-18

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

Kubric: A scalable dataset generator

Klaus Greff

Francois Belletti

Lucas Beyer

Carl Doersch

Yilun Du

Daniel Duckworth

David J. Fleet

Dan Gnanapragasam

Florian Golemo

Charles Herrmann

Thomas N. Kipf

Abhijit Kundu

Dmitry Lagun

Issam Hadj Laradji

Hsueh-Ti Liu

H. Meyer

Yishu Miao

Cengiz Oztireli

Etienne Pot

Noha Radwan

Daniel Rebain

Sara Sabour

Mehdi S. M. Sajjadi

Matan Sela

Vincent Sitzmann

Austin Stone

Deqing Sun

Suhani Vora

Ziyu Wang

Tianhao Wu

Kwang Moo Yi

Fangcheng Zhong

Andrea Tagliasacchi

2022-03-07

ArXiv (preprint)

Learning to Guide and to Be Guided in the Architect-Builder Problem

Paul Barde

Tristan Karch

Clément Moulin-Frier

Pierre-Yves Oudeyer

We are interested in interactive agents that learn to coordinate, namely, a …

2022-01-28

ICLR.cc/2022/Conference (poster)

Attention-based Neural Cellular Automata

Mattie Tesfaldet

Recent extensions of Cellular Automata (CA) have incorporated key ideas from modern deep learning, dramatically extending their capabilities and catalyzing a new family of Neural Cellular Automata (NCA) techniques. Inspired by Transformer-based architectures, our work presents a new class of _attention-based_ NCAs formed using a spatially localized—yet globally organized—self-attention scheme. We introduce an instance of this class named _Vision Transformer Cellular Automata (ViTCA)_. We present quantitative and qualitative results on denoising autoencoding across six benchmark datasets, comparing ViTCA to a U-Net, a U-Net-based CA baseline (UNetCA), and a Vision Transformer (ViT). When comparing across architectures configured to similar parameter complexity, ViTCA architectures yield superior performance across all benchmarks and for nearly every evaluation metric. We present an ablation study on various architectural configurations of ViTCA, an analysis of its effect on cell states, and an investigation on its inductive biases. Finally, we examine its learned representations via linear probes on its converged cell state hidden representations, yielding, on average, superior results when compared to our U-Net, ViT, and UNetCA baselines.

Challenges in leveraging GANs for few-shot data augmentation

Christopher Beckham

Issam Hadj Laradji

Pau Rodriguez

David Vazquez

2022-01-01

arXiv.org (preprint)

Overcoming challenges in leveraging GANs for few-shot data augmentation

Christopher Beckham

Issam Hadj Laradji

Pau Rodriguez

David Vazquez

2022-01-01

CoLLAs (published)

proceedings.mlr.press

Robust motion in-betweening

Félix Harvey

Mike Yurick

In this work we present a novel, robust transition generation technique that can serve as a new tool for 3D animators, based on adversarial recurrent neural networks. The system synthesises high-quality motions that use temporally-sparse keyframes as animation constraints. This is reminiscent of the job of in-betweening in traditional animation pipelines, in which an animator draws motion frames between provided keyframes. We first show that a state-of-the-art motion prediction model cannot be easily converted into a robust transition generator when only adding conditioning information about future keyframes. To solve this problem, we then propose two novel additive embedding modifiers that are applied at each timestep to latent representations encoded inside the network's architecture. One modifier is a time-to-arrival embedding that allows variations of the transition length with a single model. The other is a scheduled target noise vector that allows the system to be robust to target distortions and to sample different transitions given fixed keyframes. To qualitatively evaluate our method, we present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios. To quantitatively evaluate performance on transitions and generalizations to longer time horizons, we present well-defined in-betweening benchmarks on a subset of the widely used Human3.6M dataset and on LaFAN1, a novel high quality motion capture dataset that is more appropriate for transition generation. We are releasing this new dataset along with this work, with accompanying code for reproducing our baseline results.

2020-08-12

ACM Transactions on Graphics (published)

Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images Using a View-Based Representation

Sai Rajeswar

Fahim Mannan

Florian Golemo

Jérôme Parent-Lévesque

David Vazquez

Aaron Courville

2020-03-20

International Journal of Computer Vision (published)