Emmanuel Bengio

Associate Industry Member
Staff Machine Learning Scientist, Recursion
Research Topics
Deep Learning
Generative Models
GFlowNets
Molecular Modeling
Reinforcement Learning

Biography

Emmanuel Bengio is an ML Scientist at Valence Labs/Recursion, working at the intersection of GFlowNets and drug discovery. He completed his PhD under Joelle Pineau and Doina Precup at McGill/Mila - Quebec Artificial Intelligence Institute, focusing on understanding generalization in deep RL.

Publications

Efficient Biological Data Acquisition through Inference Set Design
Ihor Neporozhnii
Julien Roy
Jason Hartford
In drug discovery, highly automated high-throughput laboratories are used to screen a large number of compounds in search of effective drugs. These experiments are expensive, so one might hope to reduce their cost by only experimenting on a subset of the compounds, and predicting the outcomes of the remaining experiments. In this work, we model this scenario as a sequential subset selection problem: we aim to select the smallest set of candidates in order to achieve some desired level of accuracy for the system as a whole. Our key observation is that, if there is heterogeneity in the difficulty of the prediction problem across the input space, selectively obtaining the labels for the hardest examples in the acquisition pool will leave only the relatively easy examples to remain in the inference set, leading to better overall system performance. We call this mechanism inference set design, and propose the use of a confidence-based active learning solution to prune out these challenging examples. Our algorithm includes an explicit stopping criterion that interrupts the acquisition loop when it is sufficiently confident that the system has reached the target performance. Our empirical studies on image and molecular datasets, as well as a real-world large-scale biological assay, show that active learning for inference set design leads to significant reduction in experimental cost while retaining high system performance.
Improved Off-policy Reinforcement Learning in Biological Sequence Design
Minsu Kim
Alex Hernández-García
Jinkyoo Park
Designing biological sequences with desired properties is a significant challenge due to the combinatorially vast search space and the high cost of evaluating each candidate sequence. To address these challenges, reinforcement learning (RL) methods, such as GFlowNets, utilize proxy models for rapid reward evaluation and annotated data for policy training. Although these approaches have shown promise in generating diverse and novel sequences, the limited training data relative to the vast search space often leads to the misspecification of proxy for out-of-distribution inputs. We introduce
Amortizing intractable inference in diffusion models for vision, language, and control
Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data,
QGFN: Controllable Greediness with Action Values
Stephen Zhewen Lu
Ling Pan
Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate,
GFlowNet Pretraining with Inexpensive Rewards
Generative Flow Networks (GFlowNets), a class of generative models, have recently emerged as a suitable framework for generating diverse and high-quality molecular structures by learning from unnormalized reward distributions. Previous works in this direction often restrict exploration by using predefined molecular fragments as building blocks, limiting the chemical space that can be accessed. In this work, we introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks to explore drug-like chemical space more comprehensively. We propose an unsupervised pre-training approach using offline drug-like molecule datasets, which conditions A-GFNs on inexpensive yet informative molecular descriptors such as drug-likeliness, topological polar surface area, and synthetic accessibility scores. These properties serve as proxy rewards, guiding A-GFNs towards regions of chemical space that exhibit desirable pharmacological properties. We further our method by implementing a goal-conditioned fine-tuning process, which adapts A-GFNs to optimize for specific target properties. In this work, we pretrain A-GFN on the ZINC15 offline dataset and employ robust evaluation metrics to show the effectiveness of our approach when compared to other relevant baseline methods in drug design.
Random Policy Evaluation Uncovers Policies of Generative Flow Networks
Haoran He
Qingpeng Cai
Ling Pan
The Generative Flow Network (GFlowNet) is a probabilistic framework in which an agent learns a stochastic policy and flow functions to sample objects with probability proportional to an unnormalized reward function. GFlowNets share a strong connection with reinforcement learning (RL) that typically aims to maximize reward. A number of recent works explored connections between GFlowNets and maximum entropy (MaxEnt) RL, which incorporates entropy regularization into the standard RL objective. However, the relationship between GFlowNets and standard RL remains largely unexplored, despite the inherent similarities in their sequential decision-making nature. While GFlowNets can discover diverse solutions through specialized flow-matching objectives, connecting them to standard RL can simplify their implementation through well-established RL principles and also improve RL's capabilities in diverse solution discovery (a critical requirement in many real-world applications), and bridging this gap can further unlock the potential of both fields. In this paper, we bridge this gap by revealing a fundamental connection between GFlowNets and one of the most basic components of RL -- policy evaluation. Surprisingly, we find that the value function obtained from evaluating a uniform policy is closely associated with the flow functions in GFlowNets. Building upon these insights, we introduce a rectified random policy evaluation (RPE) algorithm, which achieves the same reward-matching effect as GFlowNets based on simply evaluating a fixed random policy, offering a new perspective. Empirical results across extensive benchmarks demonstrate that RPE achieves competitive results compared to previous approaches, shedding light on the previously overlooked connection between (non-MaxEnt) RL and GFlowNets.