Doina Precup

Sumana Basu

Alumni collaborator - McGill

Co-supervisor:

Adriana Romero Soriano

Alumni collaborator - McGill

Lynn Cherif

Alumni collaborator - McGill

Co-supervisor:

PhD - McGill

Co-supervisor:

PhD - McGill

Principal supervisor:

David Meger

Jonathan Colaço Carr

Research Master's - McGill

Principal supervisor:

Prakash Panangaden

Élodie Coté-Gauthier

Research collaborator - McGill

Co-supervisor:

Isabeau Prémont-Schwarz

Franco Del Balso

Research intern - UdeM

Jesse Farebrother

PhD - McGill

Principal supervisor:

Marc Gendron-Bellemare

PhD - McGill

Principal supervisor:

Research collaborator - Birla Institute of Technology

Howard Huang

PhD - McGill

Haque Ishfaq

Alumni collaborator - McGill

Mohammad Sami Nur Islam Islam

Research Master's - McGill

Arushi Jain

Alumni collaborator - McGill

PhD - Polytechnique

Flemming Kondrup

Postdoctorate - McGill

Jonathan Lebensold

Alumni collaborator - McGill

Alumni collaborator - McGill

Ray Luo

PhD - McGill

Principal supervisor:

G McCracken

PhD - McGill

Nazanin Mohammadi Sepahvand

Alumni collaborator - McGill

Shahrad Mohammadzadeh

Research Master's - McGill

Principal supervisor:

Gabriela Moisescu-Pareja

Research collaborator - McGill

Co-supervisor:

Irina Rish

Padideh Nouri

PhD - UdeM

Co-supervisor:

PhD - McGill

Co-supervisor:

Nate Rahn

PhD - McGill

Principal supervisor:

Marc Gendron-Bellemare

Sahand Rezaei-Shoshtari

PhD - McGill

Co-supervisor:

PhD - McGill

Co-supervisor:

PhD - McGill

Co-supervisor:

PhD - McGill

Nishanth Anand Vemgal

PhD - McGill

PhD - McGill

Co-supervisor:

Samira Ebrahimi Kahou

Zihan Wang

PhD - McGill

Skipper: Combining spatial and temporal abstraction to improve generalization

Guangyuan Wang

Research intern - McGill

Steve Wen

Research Master's - McGill

Co-supervisor:

Gregory Dudek

Zijing Wu

PhD - McGill

Principal supervisor:

PhD - McGill

Harry Zhao

Alumni collaborator - McGill

Co-supervisor:

Blog posts

February 22, 2024

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Read the article

Publications

MaestroMotif: Skill Design from Artificial Intelligence Feedback

Martin Klissarov

Mikael Henaff

Roberta Raileanu

Marlos C. Machado

Describing skills in natural language has the potential to provide an accessible way to inject human knowledge about decision-making into an AI system. We present MaestroMotif, a method for AI-assisted skill design, which yields high-performing and adaptable agents. MaestroMotif leverages the capabilities of Large Language Models (LLMs) to effectively create and reuse skills. It first uses an LLM's feedback to automatically design rewards corresponding to each skill, starting from their natural language description. Then, it employs an LLM's code generation abilities, together with reinforcement learning, for training the skills and combining them to implement complex behaviors specified in language. We evaluate MaestroMotif using a suite of complex tasks in the NetHack Learning Environment (NLE), demonstrating that it surpasses existing approaches in both performance and usability.

2024-12-11

ArXiv (preprint)

Towards AI-designed genomes using a variational autoencoder

Natasha K. Dudek

N.K. Dudek

Synthetic biology holds great promise for bioengineering applications such as environmental bioremediation, probiotic formulation, and production of renewable biofuels. Humans’ capacity to design biological systems from scratch is limited by their sheer size and complexity. We introduce a framework for training a machine learning model to learn the basic genetic principles underlying the gene composition of bacterial genomes. Our variational autoencoder model, DeepGenomeVector, was trained to take as input corrupted bacterial genetic blueprints (i.e. complete gene sets, henceforth ‘genome vectors’) in which most genes had been “removed”, and re-create the original. The resulting model effectively captures the complex dependencies in genomic networks, as evaluated by both qualitative and quantitative metrics. An in-depth functional analysis of a generated gene vector shows that its encoded pathways are interconnected and nearly complete. On the test set, where the model’s ability to re-generate the original, uncorrupted genome vector was evaluated, an AUC score of 0.98 and an F1 score of 0.82 provide support for the model’s ability to generate diverse, high-quality genome vectors. This work showcases the power of machine learning approaches for synthetic biology and highlights the possibility that just as humans can design an AI that animates a robot, AIs may one day be able to design a genomic blueprint that animates a carbon-based cell. Significance statement: Genomes serve as the blueprints for life, encoding complex networks of genes whose products must seamlessly interact to result in living organisms. In this work, we develop a framework for training a machine learning algorithm to learn the basic genetic principles that underlie genome composition. This innovation may eventually lead to improvements in the genome design process, increasing the speed and reliability of designs while decreasing cost. It further suggests that AI agents may one day have the potential to design blueprints for carbon-based life.

2024-12-11

Proceedings of the Royal Society B: Biological Sciences (published)

Parseval Regularization for Continual Reinforcement Learning

Loss of plasticity, trainability loss, and primacy bias have been identified as issues arising when training deep neural networks on sequences of tasks -- all referring to the increased difficulty in training on new tasks. We propose to use Parseval regularization, which maintains orthogonality of weight matrices, to preserve useful optimization properties and improve training in a continual reinforcement learning setting. We show that it provides significant benefits to RL agents on a suite of gridworld, CARL and MetaWorld tasks. We conduct comprehensive ablations to identify the source of its benefits and investigate the effect of certain metrics associated to network trainability including weight matrix rank, weight norms and policy entropy.

2024-12-10

ArXiv (preprint)
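
The abstract above says only that Parseval regularization keeps weight matrices orthogonal. As a rough illustration, one common way to express such a constraint is a penalty on the squared Frobenius norm of W Wᵀ − sI, which is zero exactly when the rows of W are orthogonal with squared norm s. The function name `parseval_penalty` and the `scale` parameter are hypothetical, and the exact form used in the paper is an assumption here.

```python
def parseval_penalty(weights, scale=1.0):
    """Squared Frobenius norm of (W W^T - scale * I).

    `weights` is a matrix given as a list of rows. This is an assumed,
    generic orthogonality penalty, not necessarily the paper's formula.
    """
    n_rows = len(weights)
    penalty = 0.0
    for i in range(n_rows):
        for j in range(n_rows):
            # Entry (i, j) of the Gram matrix W W^T is the dot product of rows i and j.
            gram_ij = sum(a * b for a, b in zip(weights[i], weights[j]))
            target = scale if i == j else 0.0
            penalty += (gram_ij - target) ** 2
    return penalty


orthogonal = [[1.0, 0.0], [0.0, 1.0]]  # rows are orthonormal
skewed = [[1.0, 1.0], [0.0, 1.0]]      # rows overlap, so the penalty is positive

print(parseval_penalty(orthogonal))  # 0.0
print(parseval_penalty(skewed))      # 3.0
```

In training, a term like this would typically be multiplied by a small coefficient and added to the task loss, so that gradient steps pull each layer back toward orthogonality.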

Towards AI-designed genomes using a variational autoencoder

N.K. Dudek

Genomes encode elaborate networks of genes whose products must seamlessly interact to support living organisms. Humans’ capacity to understand these biological systems is limited by their sheer size and complexity. In this work, we develop a proof of concept framework for training a machine learning algorithm to model bacterial genome composition. To achieve this, we create simplified representations of genomes in the form of binary vectors that indicate the encoded genes, henceforth referred to as genome vectors. A denoising variational autoencoder was trained to accept corrupted genome vectors, in which most genes had been masked, and reconstruct the original. The resulting model, DeepGenomeVector, effectively captures complex dependencies in genomic networks, as evaluated by both qualitative and quantitative metrics. An in-depth functional analysis of a generated genome vector shows that its encoded pathways are interconnected, near complete, and ecologically cohesive. On the test set, where the model’s ability to reconstruct uncorrupted genome vectors was evaluated, AUC and F1 scores of 0.98 and 0.83, respectively, support the model’s strong performance. This work showcases the power of machine learning approaches for synthetic biology and highlights the possibility that AI agents may one day be able to design genomes that animate carbon-based cells.

2024-11-17

bioRxiv (preprint)
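
The corruption step described in the abstract above (binary genome vectors in which most genes are masked before reconstruction) can be sketched as follows. This is a minimal illustration only: the function name `corrupt_genome_vector`, the `keep_fraction` parameter, and the choice to mask uniformly at random are assumptions, not details taken from the paper.

```python
import random


def corrupt_genome_vector(genome_vector, keep_fraction=0.1, seed=0):
    """Mask most present genes (1s) in a binary genome vector.

    Keeps roughly `keep_fraction` of the genes that are present and sets
    every other position to 0. Uniform random masking is an assumption.
    """
    rng = random.Random(seed)
    present = [i for i, bit in enumerate(genome_vector) if bit == 1]
    # Keep at least one gene whenever the genome is non-empty.
    n_keep = max(1, round(keep_fraction * len(present))) if present else 0
    kept = set(rng.sample(present, n_keep))
    return [1 if i in kept else 0 for i in range(len(genome_vector))]


original = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1]  # 9 genes present
corrupted = corrupt_genome_vector(original, keep_fraction=0.3)
print(sum(original), sum(corrupted))  # 9 3
```

A denoising model would then be trained on pairs of `corrupted` input and `original` target, so that it learns which genes tend to co-occur.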

Reaction-conditioned De Novo Enzyme Design with GENzyme

Yong Liu

Odin Zhang

Rex Ying

Wengong Jin

Shuangjia Zheng

The introduction of models like RFDiffusionAA, AlphaFold3, AlphaProteo, and Chai1 has revolutionized protein structure modeling and interaction prediction, primarily from a binding perspective, focusing on creating ideal lock-and-key models. However, these methods can fall short for enzyme-substrate interactions, where perfect binding models are rare, and induced fit states are more common. To address this, we shift to a functional perspective for enzyme design, where the enzyme function is defined by the reaction it catalyzes. Here, we introduce GENzyme, a de novo enzyme design model that takes a catalytic reaction as input and generates the catalytic pocket, full enzyme structure, and enzyme-substrate binding complex. GENzyme is an end-to-end, three-staged model that integrates (1) a catalytic pocket generation and sequence co-design module, (2) a pocket inpainting and enzyme inverse folding module, and (3) a binding and screening module to optimize and predict enzyme-substrate complexes. The entire design process is driven by the catalytic reaction being targeted. This reaction-first approach allows for more accurate and biologically relevant enzyme design, potentially surpassing structure-based and binding-focused models in creating enzymes capable of catalyzing specific reactions. We provide GENzyme code at https://github.com/WillHua127/GENzyme.

2024-11-10

ArXiv (preprint)

Soft Condorcet Optimization for Ranking of General Agents

Marc Lanctot

Kate Larson

Michael Kaisers

Quentin Berthet

Ian Gemp

Manfred Diaz

Roberto-Rafael Maura-Rivero

Yoram Bachrach

Anna Koop

A common way to drive progress of AI models and agents is to compare their performance on standardized benchmarks. Comparing the performance of general agents requires aggregating their individual performances across a potentially wide variety of different tasks. In this paper, we describe a novel ranking scheme inspired by social choice frameworks, called Soft Condorcet Optimization (SCO), to compute the optimal ranking of agents: the one that makes the fewest mistakes in predicting the agent comparisons in the evaluation data. This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground truth ranking, a solution to Condorcet's original voting system criteria. SCO ratings are maximal for Condorcet winners when they exist, which we show is not necessarily true for the classical rating system Elo. We propose three optimization algorithms to compute SCO ratings and evaluate their empirical performance. When serving as an approximation to the Kemeny-Young voting method, SCO rankings are on average 0 to 0.043 away from the optimal ranking in normalized Kendall-tau distance across 865 preference profiles from the PrefLib open ranking archive. In a simulated noisy tournament setting, SCO achieves accurate approximations to the ground truth ranking and the best among several baselines when 59% or more of the preference data is missing. Finally, SCO ranking provides the best approximation to the optimal ranking, measured on held-out test sets, in a problem containing 52,958 human players across 31,049 games of the classic seven-player game of Diplomacy.

2024-11-01

arXiv (published)
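
The abstract above reports ranking quality as a normalized Kendall-tau distance. For readers unfamiliar with the metric, here is a small sketch of the standard definition: the fraction of item pairs that two rankings order differently, so 0 means identical rankings and 1 means one is the reverse of the other. The function name `normalized_kendall_tau` is illustrative, and the sketch assumes both rankings contain the same items with no ties.

```python
from itertools import combinations


def normalized_kendall_tau(ranking_a, ranking_b):
    """Fraction of item pairs ordered differently by the two rankings.

    Assumes both rankings are lists of the same distinct items, best first.
    """
    if len(ranking_a) < 2:
        raise ValueError("need at least two items to compare rankings")
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    pairs = list(combinations(ranking_a, 2))
    # A pair is discordant when the two rankings disagree on its order.
    discordant = sum(
        1
        for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)


print(normalized_kendall_tau(["a", "b", "c"], ["a", "b", "c"]))  # 0.0
print(normalized_kendall_tau(["a", "b", "c"], ["c", "b", "a"]))  # 1.0
```

On this scale, an average distance of at most 0.043 means that fewer than one in twenty item pairs is ordered differently from the optimal ranking.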