Taeyoung YUN

Improved Off-policy Reinforcement Learning in Biological Sequence Design

Minsu Kim

Jinkyoo Park

Designing biological sequences with desired properties is a significant challenge due to the combinatorially vast search space and the high cost of evaluating each candidate sequence. To address these challenges, reinforcement learning (RL) methods, such as GFlowNets, utilize proxy models for rapid reward evaluation and annotated data for policy training. Although these approaches have shown promise in generating diverse and novel sequences, the limited training data relative to the vast search space often leads to the misspecification of the proxy for out-of-distribution inputs. We introduce

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

doi.org

arxiv.org

Improved Off-policy Reinforcement Learning in Biological Sequence Design

Minsu Kim

Alex Hernandez-Garcia

Jinkyoo Park

Designing biological sequences with desired properties is challenging due to vast search spaces and limited evaluation budgets. Although reinforcement learning methods use proxy models for rapid reward evaluation, insufficient training data can cause proxy misspecification on out-of-distribution inputs. To address this, we propose a novel off-policy search,

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

Active Attacks: Red-teaming LLMs via Adaptive Environments

Taeyoung YUN

Pierre-Luc St-Charles

Jinkyoo Park

Yoshua Bengio

Minsu Kim

We address the challenge of generating diverse attack prompts for large language models (LLMs) that elicit harmful behaviors (e.g., insults, sexual content) and are used for safety fine-tuning. Rather than relying on manual prompt engineering, attacker LLMs can be trained with reinforcement learning (RL) to automatically generate such prompts using only a toxicity classifier as a reward. However, capturing a wide range of harmful behaviors is a significant challenge that requires explicit diversity objectives. Existing diversity-seeking RL methods often collapse to limited modes: once high-reward prompts are found, exploration of new regions is discouraged. Inspired by the active learning paradigm that encourages adaptive exploration, we introduce Active Attacks, a novel RL-based red-teaming algorithm that adapts its attacks as the victim evolves. By periodically safety fine-tuning the victim LLM with collected attack prompts, rewards in exploited regions diminish, which forces the attacker to seek unexplored vulnerabilities. This process naturally induces an easy-to-hard exploration curriculum, where the attacker progresses beyond easy modes toward increasingly difficult ones. As a result, Active Attacks uncovers a wide range of local attack modes step by step, and their combination achieves wide coverage of the multi-mode distribution. Active Attacks, a simple plug-and-play module that seamlessly integrates into existing RL objectives, unexpectedly outperformed prior RL-based methods -- including GFlowNets, PPO, and REINFORCE -- by improving cross-attack success rates against GFlowNets, the previous state-of-the-art, from 0.07% to 31.28% (a relative gain greater than

2025-09-26

ArXiv (preprint)

arxiv.org

Offline Model-Based Optimization: Comprehensive Review

Minsu Kim

Jiayao Gu

Ye Yuan

Taeyoung YUN

Zixuan Liu

Yoshua Bengio

Can Chen

Offline optimization is a fundamental challenge in science and engineering, where the goal is to optimize black-box functions using only offline datasets. This setting is particularly relevant when querying the objective function is prohibitively expensive or infeasible, with applications spanning protein engineering, material discovery, neural architecture search, and beyond. The main difficulty lies in accurately estimating the objective landscape beyond the available data, where extrapolations are fraught with significant epistemic uncertainty. This uncertainty can lead to objective hacking (reward hacking), exploiting model inaccuracies in unseen regions, or other spurious optimizations that yield misleadingly high performance estimates outside the training distribution. Recent advances in model-based optimization (MBO) have harnessed the generalization capabilities of deep neural networks to develop offline-specific surrogate and generative models. Trained with carefully designed strategies, these models are more robust against out-of-distribution issues, facilitating the discovery of improved designs. Despite its growing impact in accelerating scientific discovery, the field lacks a comprehensive review. To bridge this gap, we present the first thorough review of offline MBO. We begin by formalizing the problem for both single-objective and multi-objective settings and by reviewing recent benchmarks and evaluation metrics. We then categorize existing approaches into two key areas: surrogate modeling, which emphasizes accurate function approximation in out-of-distribution regions, and generative modeling, which explores high-dimensional design spaces to identify high-performing designs. Finally, we examine the key challenges and propose promising directions for advancement in this rapidly evolving field, including safe control of superintelligent systems.

2025-03-21

ArXiv (preprint)

doi.org

arxiv.org

Adaptive teachers for amortized samplers

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

Nikolay Malkin

Yoshua Bengio

Amortized inference is the task of training a parametric model, such as a neural network, to approximate a distribution with a given unnormalized density where exact sampling is intractable. When sampling is implemented as a sequential decision-making process, reinforcement learning (RL) methods, such as generative flow networks, can be used to train the sampling policy. Off-policy RL training facilitates the discovery of diverse, high-reward candidates, but existing methods still face challenges in efficient exploration. We propose to use an adaptive training distribution (the Teacher) to guide the training of the primary amortized sampler (the Student) by prioritizing high-loss regions. The Teacher, an auxiliary behavior model, is trained to sample high-error regions of the Student and can generalize across unexplored modes, thereby enhancing mode coverage by providing an efficient training curriculum. We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge, two diffusion-based sampling tasks, and four biochemical discovery tasks, demonstrating its ability to improve sample efficiency and mode coverage.

2025-01-22

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

Improved Off-policy Reinforcement Learning in Biological Sequence Design

Minsu Kim

Alex Hernández-García

Jinkyoo Park

Designing biological sequences with desired properties is a significant challenge due to the combinatorially vast search space and the high cost of evaluating each candidate sequence. To address these challenges, reinforcement learning (RL) methods, such as GFlowNets, utilize proxy models for rapid reward evaluation and annotated data for policy training. Although these approaches have shown promise in generating diverse and novel sequences, the limited training data relative to the vast search space often leads to the misspecification of the proxy for out-of-distribution inputs. We introduce

2024-10-06

ArXiv (preprint)

doi.org

arxiv.org

Adaptive teachers for amortized samplers

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

Nikolay Malkin

Yoshua Bengio

Amortized inference is the task of training a parametric model, such as a neural network, to approximate a distribution with a given unnormalized density where exact sampling is intractable. When sampling is implemented as a sequential decision-making process, reinforcement learning (RL) methods, such as generative flow networks, can be used to train the sampling policy. Off-policy RL training facilitates the discovery of diverse, high-reward candidates, but existing methods still face challenges in efficient exploration. We propose to use an adaptive training distribution (the Teacher) to guide the training of the primary amortized sampler (the Student). The Teacher, an auxiliary behavior model, is trained to sample high-loss regions of the Student and can generalize across unexplored modes, thereby enhancing mode coverage by providing an efficient training curriculum. We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge, two diffusion-based sampling tasks, and four biochemical discovery tasks, demonstrating its ability to improve sample efficiency and mode coverage. Source code is available at https://github.com/alstn12088/adaptive-teacher.

2024-10-02

ArXiv (preprint)

doi.org

arxiv.org
