Kirill Neklyudov

Efficient Evolutionary Search Over Chemical Space with Large Language Models

Haorui Wang

Marta Skreta

Cher Tian Ser

Wenhao Gao

Lingkai Kong

Felix Streith-Kalthoff

Chenru Duan

Yuchen Zhuang

Yue Yu

Yanqiao Zhu

Yuanqi Du

Alan Aspuru-Guzik

Kirill Neklyudov

Chao Zhang

Molecular discovery, when formulated as an optimization problem, presents significant computational challenges because optimization objectives can be non-differentiable. Evolutionary Algorithms (EAs), often used to optimize black-box objectives in molecular discovery, traverse chemical space by performing random mutations and crossovers, leading to a large number of expensive objective evaluations. In this work, we ameliorate this shortcoming by incorporating chemistry-aware Large Language Models (LLMs) into EAs. Namely, we redesign crossover and mutation operations in EAs using LLMs trained on large corpora of chemical information. We perform extensive empirical studies on both commercial and open-source models on multiple tasks involving property optimization, molecular rediscovery, and structure-based drug design, demonstrating that the joint usage of LLMs with EAs yields superior performance over all baseline models across single- and multi-objective settings. We demonstrate that our algorithm improves both the quality of the final solution and convergence speed, thereby reducing the number of required objective evaluations. Our code is available at http://github.com/zoom-wang112358/MOLLEO

2024-06-23

arXiv (preprint)

doi.org

arxiv.org

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

Lazar Atanackovic

Xi Zhang

Brandon Amos

Leo J Lee

Numerous biological and physical processes can be modeled as systems of interacting samples evolving continuously over time, e.g. the dynamics of communicating cells or physical particles. Flow-based models allow for learning these dynamics at the population level: they model the evolution of the entire distribution of samples. However, current flow-based models are limited to a single initial population and a set of predefined conditions which describe different dynamics. We propose

2024-06-17

ICML.cc/2024/Workshop/GRaM (published)

openreview.net

Efficient Evolutionary Search Over Chemical Space with Large Language Models

Haorui Wang

Marta Skreta

Cher Tian Ser

Wenhao Gao

Lingkai Kong

Felix Streith-Kalthoff

Chenru Duan

Yuchen Zhuang

Yue Yu

Yanqiao Zhu

Yuanqi Du

Alan Aspuru-Guzik

Kirill Neklyudov

Chao Zhang

Molecular discovery, when formulated as an optimization problem, presents significant computational challenges because optimization objectives can be non-differentiable. Evolutionary Algorithms (EAs), often used to optimize black-box objectives in molecular discovery, traverse chemical space by performing random mutations and crossovers, leading to a large number of expensive objective evaluations. In this work, we ameliorate this shortcoming by incorporating chemistry-aware Large Language Models (LLMs) into EAs. Namely, we redesign crossover and mutation operations in EAs using LLMs trained on large corpora of chemical information. We perform extensive empirical studies on both commercial and open-source models on multiple tasks involving property optimization, molecular rediscovery, and structure-based drug design, demonstrating that the joint usage of LLMs with EAs yields superior performance over all baseline models across single- and multi-objective settings. We demonstrate that our algorithm improves both the quality of the final solution and convergence speed, thereby reducing the number of required objective evaluations. Our code is available at http://github.com/zoom-wang112358/MOLLEO

2024-06-01

arXiv (published)

doi.org

arxiv.org

Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC

Wu Lin

Felix Dangel

Runa Eschenhagen

Kirill Neklyudov

Agustinus Kristiadi

Richard E. Turner

Alireza Makhzani

Second-order methods such as KFAC can be useful for neural net training. However, they are often memory-inefficient since their preconditioning Kronecker factors are dense, and numerically unstable in low precision as they require matrix inversion or decomposition. These limitations render such methods unpopular for modern mixed-precision training. We address them by (i) formulating an inverse-free KFAC update and (ii) imposing structures in the Kronecker factors, resulting in structured inverse-free natural gradient descent (SINGD). On modern neural networks, we show that SINGD is memory-efficient and numerically robust, in contrast to KFAC, and often outperforms AdamW even in half precision. Our work closes a gap between first- and second-order methods in modern low-precision training.

2024-01-01

ICML (published)

proceedings.mlr.press

arxiv.org
