Yoshua Bengio

ahmad.ghawanmeh@mila.quebec

Biography

For media requests, please write to medias@mila.quebec.

For more information please contact Julie Mongeau, executive assistant at julie.mongeau@mila.quebec.

Yoshua Bengio is recognized worldwide as a leading expert in AI. He is best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of computing,” with Geoffrey Hinton and Yann LeCun.

Bengio is a full professor at Université de Montréal, and the founder and scientific director of Mila – Quebec Artificial Intelligence Institute. He is also a senior fellow at CIFAR and co-directs its Learning in Machines & Brains program, serves as scientific director of IVADO, and holds a Canada CIFAR AI Chair.

In 2019, Bengio was awarded the prestigious Killam Prize, and in 2022 he was the most cited computer scientist in the world by h-index. He is a Fellow of the Royal Society of London, Fellow of the Royal Society of Canada, Knight of the Legion of Honor of France, and Officer of the Order of Canada. In 2023, he was appointed to the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.

Concerned about the social impact of AI, Bengio helped draft the Montréal Declaration for the Responsible Development of Artificial Intelligence and continues to raise awareness about the importance of mitigating the potentially catastrophic risks associated with future AI systems.

Current Students

Aayush Bajaj

Professional Master's - Université de Montréal

Co-supervisor :

Samira Ebrahimi Kahou

aayush.bajaj@mila.quebec

Ahmad Ghawanmeh

Professional Master's - Université de Montréal

Akram Erraqabi

PhD - Université de Montréal

akram.erraqabi@mila.quebec

Alex Hernandez-Garcia

Postdoctorate - Université de Montréal

Co-supervisor :

Postdoctorate - Université de Montréal

alexander.tong@mila.quebec

Sasha Volokhova

PhD - Université de Montréal

alexandra.volokhova@mila.quebec

Alexandre Duval

Collaborating researcher - Université Paris-Saclay

Principal supervisor :

alexandre.duval@mila.quebec

andres.campero@mila.quebec

Aman Dalmia

Professional Master's - Université de Montréal

aman.dalmia@mila.quebec

Andrés Campero

Independent visiting researcher - MIT

aniket.didolkar@mila.quebec

Aniket Didolkar

PhD - Université de Montréal

ayoub.atanane@mila.quebec

Anja Surina

PhD - École Polytechnique Fédérale de Lausanne

anja.surina@mila.quebec

Ayoub Atanane

Research Intern - Université du Québec à Rimouski

Basile Terver

Collaborating researcher

Principal supervisor :

basile.terver@mila.quebec

Camille Rochefort-Boulanger

PhD - Université de Montréal

Principal supervisor :

Julie Hussin

rochefoc@mila.quebec

clemence.granade@mila.quebec

Chen Chen

Postdoctorate - Université de Montréal

Co-supervisor :

Collaborating Alumni

Professional Master's - Université de Montréal

Cristian Meo

Collaborating Alumni

cristian.meo@mila.quebec

cristian-dragos.manta@mila.quebec

Cristian Dragos Manta

PhD - Université de Montréal

Co-supervisor :

Dhanya Sridhar

damiano.fornasiere@mila.quebec

Damiano Fornasiere

PhD - Barcelona University

Dan Assouline

Collaborating Alumni

dan.assouline@mila.quebec

dinghuai.zhang@mila.quebec

Dinghuai Zhang

PhD - Université de Montréal

Principal supervisor :

Aaron Courville

Divya Sharma

Collaborating Alumni

divya.sharma@mila.quebec

Donna Vakalis

Postdoctorate - Université de Montréal

Co-supervisor :

donna.vakalis@mila.quebec

dragos.secrieru@mila.quebec

Dragos Secrieru

Master's Research - Université de Montréal

Edward Hu

PhD - Université de Montréal

edward.hu@mila.quebec

Elmimouni Zakaria

Research Intern - Université de Montréal

zakarya.elmimouni@mila.quebec

Eric Elmoznino

PhD - Université de Montréal

Co-supervisor :

eric.elmoznino@mila.quebec

Research Intern - UQAR

ghait.boukachab@mila.quebec

jack.richter-powell@mila.quebec

Hae-Beom Lee

Collaborating Alumni

hae-beom.lee@mila.quebec

Jessie Richter-Powell

Independent visiting researcher - Université de Montréal

hussein-mohamu.jama@mila.quebec

Jama Mohamud

PhD - Université de Montréal

Principal supervisor :

Mirco Ravanelli

Research Intern - McGill University

jamal.abouhaibeh@mila.quebec

james.requeima@mila.quebec

James Requeima

Independent visiting researcher - Université de Montréal

Jarrid Rector-Brooks

PhD - Université de Montréal

Co-supervisor :

Sarath Chandar Anbil Parthipan

jarrid.rector-brooks@mila.quebec

Jean-Pierre Falet

PhD - Université de Montréal

Co-supervisor :

jean-pierre.falet@mila.quebec

Professional Master's - Université de Montréal

jerome.francis@mila.quebec

katie-elizabeth.everett@mila.quebec

George Jiangyan Ma

Research Intern - Université de Montréal

jiangyan.ma@mila.quebec

PhD - Université de Montréal

madankan@mila.quebec

Katie Everett

PhD - Massachusetts Institute of Technology

Léna Ezzine

PhD - Université de Montréal

lena-nehale.ezzine@mila.quebec

Leo Feng

PhD - Université de Montréal

leo.feng@mila.quebec

Leon Hetzel

Independent visiting researcher - Technical University Munich (TUM)

leon.hetzel@mila.quebec

Ling Pan

Independent visiting researcher - Hong Kong University of Science and Technology (HKUST)

ling.pan@mila.quebec

loubna.benabbou@mila.quebec

Loic Mandine

DESS - Université de Montréal

loic.mandine@mila.quebec

Loubna Benabbou

Independent visiting researcher - UQAR

marcin.sendera@mila.quebec

Luca Scimeca

Postdoctorate - Université de Montréal

luca.scimeca@mila.quebec

PhD - Université de Montréal

korablym@mila.quebec

Marcin Sendera

Research Intern - Université de Montréal

Marco Stock

Independent visiting researcher - Technical University of Munich

marco.stock@mila.quebec

matthew.macdermott@mila.quebec

Matt MacDermott

Research Intern - Imperial College London

Mélisande Astrid Crystal Teng

PhD - Université de Montréal

Co-supervisor :

Postdoctorate - Université de Montréal

michal.koziarski@mila.quebec

Harry Zhao

PhD - McGill University

Principal supervisor :

Mingze Li

Professional Master's - Université de Montréal

mingze2.li@mila.quebec

Minsu Kim

Collaborating researcher - Université de Montréal

minsu.kim@mila.quebec

Research Intern - Université de Montréal

mohammed.mahfoud@mila.quebec

Mohammed Abukalam

Research Intern - Université de Montréal

mohammed.abukalam@mila.quebec

Mohsin Hasan

PhD - Université de Montréal

mohsin.hasan@mila.quebec

nikolay.malkin@mila.quebec

Moksh Jain

PhD - Université de Montréal

moksh.jain@mila.quebec

PhD - Max-Planck-Institute for Intelligent Systems

rahamann@mila.quebec

Nicole Zhang

PhD - McGill University

Principal supervisor :

Collaborating Alumni - Université de Montréal

PhD - Université de Montréal

oussama.boussif@mila.quebec

pierre-paul.de-breuck@mila.quebec

Param Raval

Professional Master's - Université de Montréal

param.raval@mila.quebec

Paul Bertin

PhD - Université de Montréal

bertinpa@mila.quebec

Phong Nguyen

Independent visiting researcher - Université de Montréal

nguyenph@mila.quebec

Pierre-Paul De Breuck

Collaborating Alumni - Université de Montréal

Collaborating researcher

pietro.greiner@mila.quebec

Priya Nama Venkatesh

Professional Master's - Université de Montréal

priya.nama@mila.quebec

Prudencio Tossou

Collaborating researcher - Valence

Principal supervisor :

Dominique Beaini

prudencio.tossou@mila.quebec

Rim Assouel

PhD - Université de Montréal

assouelr@mila.quebec

Ruixiang Zhang

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

lahlosal@mila.quebec

Sarthak Mittal

PhD - Université de Montréal

Principal supervisor :

Seanie Lee

Research Intern - Université de Montréal

seanie.lee@mila.quebec

Professional Master's

aasheesh.singh@mila.quebec

Collaborating researcher - Université de Montréal

soren.mindermann@mila.quebec

Stefan Bauer

Independent visiting researcher

Co-supervisor :

stefan.bauer@mila.quebec

stefano.massaroli@mila.quebec

Stefano Massaroli

Postdoctorate - Université de Montréal

Stephen Lu

Research Intern - McGill University

stephen.lu@mila.quebec

Professional Master's - Université de Montréal

subhrajyoti.dasgupta@mila.quebec

Theo Saulus

Collaborating researcher

Principal supervisor :

thomas.jiralerspong@mila.quebec

theo.saulus@mila.quebec

Thomas Jiralerspong

Master's Research - Université de Montréal

Co-supervisor :

Doina Precup

PhD - Université de Montréal

tianyu.zhang@mila.quebec

PhD - Université de Montréal

Vedant Shah

Master's Research - Université de Montréal

vedant.shah@mila.quebec

PhD - Université de Montréal

Viktor Todosijevic

Collaborating researcher - RWTH Aachen University (Rheinisch-Westfälische Technische Hochschule Aachen)

Principal supervisor :

viktor.todosijevic@mila.quebec

vincent.quirion@mila.quebec

Vincent Quirion

Undergraduate - Université de Montréal

Xiaoyin Chen

PhD - Université de Montréal

xiaoyin.chen@mila.quebec

Yashaswi Pupneja

Professional Master's - Université de Montréal

yashaswi.pupneja@mila.quebec

younesse.kaddar@mila.quebec

Yizhao Wang

Professional Master's - Université de Montréal

yizhao.wang@mila.quebec

Younesse Kaddar

Research Intern - Université de Montréal

Zhen Liu

PhD - Université de Montréal

Principal supervisor :

Liam Paull

liuzhen@mila.quebec

Zibo Shang

Professional Master's - Université de Montréal

zibo.shang@mila.quebec

Zichao Yan

Postdoctorate - Université de Montréal

yanzicha@mila.quebec

Blog Posts

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Scaling in the Service of Reasoning & Model-Based ML

April 4, 2023

Yoshua Bengio

Edward J. Hu

A collaboration between Mila and Relation Therapeutics to discover novel synergistic combinations of drugs in vitro

March 23, 2022

Paul Bertin

Jake P. Taylor-King

Yoshua Bengio

March 15, 2022

Generative Flow Networks

Yoshua Bengio

Publications

Discrete, compositional, and symbolic representations through attractor dynamics

Andrew Nam

Eric Elmoznino

Nikolay Malkin

Chen Sun

Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set. It serves as a useful abstraction for reasoning in both cognitive science and in AI, yet the interface between continuous and symbolic processing is often imposed by fiat at the algorithmic level, such as by means of quantization or a softmax sampling step. In this work, we explore how discretization could be implemented in a more neurally plausible manner through the modeling of attractor dynamics that partition the continuous representation space into basins that correspond to sequences of symbols. Building on established work in attractor networks and introducing novel training methods, we show that imposing structure in the symbolic space can produce compositionality in the attractor-supported representation space of rich sensory inputs. Lastly, we argue that our model exhibits the process of an information bottleneck that is thought to play a role in conscious experience, decomposing the rich information of a sensory input into stable components encoding symbolic information.

2023-10-27

NeurIPS.cc/2023/Workshop/InfoCog (oral)

Learning to Scale Logits for Temperature-Conditional GFlowNets

Minsu Kim

Joohwan Ko

Dinghuai Zhang

Ling Pan

Taeyoung Yun

Woo Chang Kim

Jinkyoo Park

GFlowNets are probabilistic models that learn a stochastic policy that sequentially generates compositional structures, such as molecular graphs. They are trained with the objective of sampling such objects with probability proportional to the object's reward. Among GFlowNets, the temperature-conditional GFlowNets represent a family of policies indexed by temperature, and each is associated with the correspondingly tempered reward function. The major benefit of temperature-conditional GFlowNets is the controllability of GFlowNets' exploration and exploitation through adjusting temperature. We propose Learning to Scale Logits for temperature-conditional GFlowNets (LSL-GFN), a novel architectural design that greatly accelerates the training of temperature-conditional GFlowNets. It is based on the idea that previously proposed temperature-conditioning approaches introduced numerical challenges in the training of the deep network because different temperatures may give rise to very different gradient profiles and ideal scales of the policy's logits. We find that the challenge is greatly reduced if a learned function of the temperature is used to scale the policy's logits directly. We empirically show that our strategy dramatically improves the performance of GFlowNets, outperforming other baselines, including reinforcement learning and sampling methods, in terms of discovering diverse modes in multiple biochemical tasks.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Science (poster)

Multi-Fidelity Active Learning with GFlowNets

Alex Hernandez-Garcia

Nikita Saxena

Moksh J. Jain

Cheng-Hao Liu

In the last decades, the capacity to generate large amounts of data in science and engineering applications has been growing steadily. Meanwhile, the progress in machine learning has turned it into a suitable tool to process and utilise the available data. Nonetheless, many relevant scientific and engineering problems present challenges where current machine learning methods cannot yet efficiently leverage the available data and resources. For example, in scientific discovery, we are often faced with the problem of exploring very large, high-dimensional spaces, where querying a high fidelity, black-box objective function is very expensive. Progress in machine learning methods that can efficiently tackle such problems would help accelerate currently crucial areas such as drug and materials discovery. In this paper, we propose the use of GFlowNets for multi-fidelity active learning, where multiple approximations of the black-box function are available at lower fidelity and cost. GFlowNets are recently proposed methods for amortised probabilistic inference that have proven efficient for exploring large, high-dimensional spaces and can hence be practical in the multi-fidelity setting too. Here, we describe our algorithm for multi-fidelity active learning with GFlowNets and evaluate its performance in both well-studied synthetic tasks and practically relevant applications of molecular discovery. Our results show that multi-fidelity active learning with GFlowNets can efficiently leverage the availability of multiple oracles with different costs and fidelities to accelerate scientific discovery and engineering design.

2023-10-27

NeurIPS.cc/2023/Workshop/ReALML (published)

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

Alvaro Carbonero

Alexandre AGM Duval

Victor Schmidt

Santiago Miret

Alex Hernandez-Garcia

The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g. when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Mat (poster)

Towards equilibrium molecular conformation generation with GFlowNets

Alexandra Volokhova

Michał Koziarski

Alex Hernandez-Garcia

Cheng-Hao Liu

Santiago Miret

Pablo Lemos

Luca Thiede

Zichao Yan

Alán Aspuru-Guzik

Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this paper we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and discovers a diverse set of low-energy conformations for highly flexible drug-like molecules. We demonstrate that GFlowNet can reproduce molecular potential energy surfaces by sampling proportionally to the Boltzmann distribution.

2023-10-27

NeurIPS.cc/2023/Workshop/AI4Mat (poster)

Causal machine learning for single-cell genomics

Alejandro Tejada-Lapuerta

Paul Bertin

Stefan Bauer

Hananeh Aliee

Fabian J. Theis

2023-10-23

ArXiv (preprint)

A community effort in SARS-CoV-2 drug discovery.

Johannes Schimunek

Philipp Seidl

Katarina Elez

Tim Hempel

Tuan Le

Frank Noé

Simon Olsson

Lluís Raich

Robin Winter

Hatice Gokcan

Filipp Gusev

Evgeny M. Gutkin

Olexandr Isayev

Maria G. Kurnikova

Chamali H. Narangoda

Roman Zubatyuk

Ivan P. Bosko

Konstantin V. Furs

Anna D. Karpenko

Yury V. Kornoushenko

Mikita Shuldau

Artsemi Yushkevich

Mohammed B. Benabderrahmane

Patrick Bousquet‐Melou

Ronan Bureau

Beatrice Charton

Bertrand C. Cirou

Gérard Gil

William J. Allen

Suman Sirimulla

Stanley Watowich

Nick Antonopoulos

Nikolaos Epitropakis

Agamemnon Krasoulis

Vassilis Pitsikalis

Stavros Theodorakis

Igor Kozlovskii

Anton Maliutin

Alexander Medvedev

Petr Popov

Mark Zaretckii

Hamid Eghbal‐Zadeh

Christina Halmich

Sepp Hochreiter

Andreas Mayr

Peter Ruch

Michael Widrich

Francois Berenger

Ashutosh Kumar

Yoshihiro Yamanishi

Kam Y. J. Zhang

Emmanuel Bengio

Moksh J. Jain

Maksym Korablyov

Cheng-Hao Liu

Gilles Marcou

Marcous Gilles

Enrico Glaab

Kelly Barnsley

Suhasini M. Iyengar

Mary Jo Ondrechen

V. Joachim Haupt

Florian Kaiser

Michael Schroeder

Luisa Pugliese

Simone Albani

Christina Athanasiou

Andrea Beccari

Paolo Carloni

Giulia D'Arrigo

Eleonora Gianquinto

Jonas Goßen

Anton Hanke

Benjamin P. Joseph

Daria B. Kokh

Sandra Kovachka

Candida Manelfi

Goutam Mukherjee

Abraham Muñiz‐Chicharro

Francesco Musiani

Ariane Nunes‐Alves

Giulia Paiardi

Giulia Rossetti

S. Kashif Sadiq

Francesca Spyrakis

Carmine Talarico

Alexandros Tsengenes

Rebecca C. Wade

Conner Copeland

Jeremiah Gaiser

Daniel R. Olson

Amitava Roy

Vishwesh Venkatraman

Travis J. Wheeler

Haribabu Arthanari

Klara Blaschitz

Marco Cespugli

Vedat Durmaz

Konstantin Fackeldey

Patrick D. Fischer

Christoph Gorgulla

Christian Gruber

Karl Gruber

Michael Hetmann

Jamie E. Kinney

Krishna M. Padmanabha Das

Shreya Pandita

Amit Singh

Georg Steinkellner

Guilhem Tesseyre

Gerhard Wagner

Zi‐Fu Wang

Ryan J. Yust

Dmitry S. Druzhilovskiy

Dmitry A. Filimonov

Pavel V. Pogodin

Vladimir Poroikov

Anastassia V. Rudik

Leonid A. Stolbov

Alexander V. Veselovsky

Maria De Rosa

Giada De Simone

Maria R. Gulotta

Jessica Lombino

Nedra Mekni

Ugo Perricone

Arturo Casini

Amanda Embree

D. Benjamin Gordon

David Lei

Katelin Pratt

Christopher A. Voigt

Kuang‐Yu Chen

Yves Jacob

Tim Krischuns

Pierre Lafaye

Agnès Zettor

M. Luis Rodríguez

Kris M. White

Daren Fearon

Frank Von Delft

Martin A. Walsh

Dragos Horvath

Charles L. Brooks

Babak Falsafi

Bryan Ford

Adolfo García‐Sastre

Sang Yup Lee

Nadia Naffakh

Alexandre Varnek

Günter Klambauer

Thomas M. Hermans

The COVID-19 pandemic continues to pose a substantial threat to human lives and is likely to do so for years to come. Despite the availability of vaccines, searching for efficient small-molecule drugs that are widely available, including in low- and middle-income countries, is an ongoing challenge. In this work, we report the results of an open science community effort, the "Billion molecules against Covid-19 challenge", to identify small-molecule inhibitors against SARS-CoV-2 or relevant human receptors. Participating teams used a wide variety of computational methods to screen a minimum of 1 billion virtual molecules against 6 protein targets. Overall, 31 teams participated, and they suggested a total of 639,024 molecules, which were subsequently ranked to find 'consensus compounds'. The organizing team coordinated with various contract research organizations (CROs) and collaborating institutions to synthesize and test 878 compounds for biological activity against proteases (Nsp5, Nsp3, TMPRSS2), nucleocapsid N, RdRP (only the Nsp12 domain), and (alpha) spike protein S. Overall, 27 compounds with weak inhibition/binding were experimentally identified by binding-, cleavage-, and/or viral suppression assays and are presented here. Open science approaches such as the one presented here contribute to the knowledge base of future drug discovery efforts in finding better SARS-CoV-2 treatments.

2023-10-13

Molecular Informatics (published)

A cry for help: Early detection of brain injury in newborns

Charles Onu

Samantha Latremouille

Arsenii Gorin

Junhao Wang

Uchenna Ekwochi

P. Ubuane

O. Kehinde

Muhammad A. Salisu

Datonye Briggs

Doina Precup

2023-10-12

ArXiv (preprint)

Crystal-GFN: sampling crystals with desirable properties and constraints

Alex Hernandez-Garcia

Alexandre AGM Duval

Alexandra Volokhova

Divya Sharma

Pierre Luc Carrier

Michał Koziarski

Victor Schmidt

Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state materials such as electrocatalysts, super-ionic conductors or photovoltaic materials can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFN, a generative model of crystal structures that sequentially samples structural properties of crystalline materials, namely the space group, composition and lattice parameters. This domain-inspired approach enables the flexible incorporation of physical and structural hard constraints, as well as the use of any available predictive model of a desired physicochemical property as an objective function. To design stable materials, one must target the candidates with the lowest formation energy. Here, we use as objective the formation energy per atom of a crystal structure predicted by a new proxy machine learning model trained on MatBench. The results demonstrate that Crystal-GFN is able to sample highly diverse crystals with low (median -3.1 eV/atom) predicted formation energy.

2023-10-07

ArXiv (preprint)

Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

Trang Nguyen

Alexander Tong

Kanika Madan

Dianbo Liu

Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.

2023-10-05

ArXiv (preprint)

Local Search GFlowNets

Minsu Kim

Taeyoung Yun

Emmanuel Bengio

Dinghuai Zhang

Sungsoo Ahn

Jinkyoo Park

Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-rewarded sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.

2023-10-04

ArXiv (preprint)

Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks

Luca Scimeca

Alexander Rubinstein

Armand Nicolicioiu

Damien Teney

Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to shortcut learning phenomena, where a model may rely on erroneous, easy-to-learn, cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs). We discover that DPMs have the inherent capability to represent multiple visual cues independently, even when they are largely correlated in the training data. We leverage this characteristic to encourage model diversity and empirically show the efficacy of the approach with respect to several diversification objectives. We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.

2023-10-03

ArXiv (preprint)