Doina Precup

Mohammad Sami Nur Islam Islam

Jesse Farebrother

PhD - McGill

Principal supervisor:

Marc Gendron-Bellemare

PhD - McGill

Principal supervisor:

Eilif Benjamin Muller

PhD - McGill

PhD - McGill

Research Master's - McGill

Arushi Jain

PhD - McGill

PhD - McGill

Postdoctorate - McGill

Elaine Lau

Research Master's - McGill

Jonathan Lebensold

Alumni collaborator - McGill

Undergraduate - McGill

Ray Luo

PhD - McGill

Principal supervisor:

G McCracken

PhD - McGill

Nazanin Mohammadi Sepahvand

PhD - McGill

Shahrad Mohammadzadeh

Research Master's - McGill

Principal supervisor:

Gabriela Moisescu-Pareja

Research collaborator - McGill

Co-supervisor:

Irina Rish

Padideh Nouri

PhD - UdeM

Co-supervisor:

Charles Onu

PhD - McGill

PhD - McGill

Co-supervisor:

Nate Rahn

PhD - McGill

Principal supervisor:

Marc Gendron-Bellemare

Sahand Rezaei-Shoshtari

PhD - McGill

Co-supervisor:

PhD - McGill

Co-supervisor:

PhD - McGill

Co-supervisor:

Blake Richards

samiemandana@gmail.com

PhD - McGill

Nishanth Anand Vemgal

PhD - McGill

PhD - McGill

PhD - McGill

Co-supervisor:

Samira Ebrahimi Kahou

Website

Zihan Wang

PhD - McGill

Website

Guangyuan Wang

Research intern - McGill

Steve Wen

Research Master's - McGill

Co-supervisor:

Gregory Dudek

Skipper: Combining Spatial and Temporal Abstraction to Improve Generalization

Zijing Wu

PhD - McGill

Co-supervisor:

PhD - McGill

Harry Zhao

Alumni collaborator - McGill

Co-supervisor:

Blog posts

February 22, 2024

by

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Read the article

Publications

Parseval Regularization for Continual Reinforcement Learning

Loss of plasticity, trainability loss, and primacy bias have been identified as issues arising when training deep neural networks on sequences of tasks, all referring to the increased difficulty in training on new tasks. We propose to use Parseval regularization, which maintains orthogonality of weight matrices, to preserve useful optimization properties and improve training in a continual reinforcement learning setting. We show that it provides significant benefits to RL agents on a suite of gridworld, CARL and MetaWorld tasks. We conduct comprehensive ablations to identify the source of its benefits and investigate the effect of certain metrics associated with network trainability, including weight matrix rank, weight norms and policy entropy.

2024-12-10

arXiv (preprint)
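
The abstract above describes Parseval regularization only as maintaining orthogonality of weight matrices. As a rough illustration, one common way to encourage this is to penalize the squared Frobenius distance between W Wᵀ and a scaled identity. The sketch below assumes that form; the function names, the `scale` parameter, and the use of NumPy are illustrative choices and are not taken from the paper.

```python
import numpy as np


def parseval_penalty(weight: np.ndarray, scale: float = 1.0) -> float:
    """Squared Frobenius distance between W W^T and scale * I.

    Assumed penalty form; it is zero exactly when the rows of `weight`
    are mutually orthogonal with squared norm `scale`.
    """
    gram = weight @ weight.T
    target = scale * np.eye(weight.shape[0])
    return float(np.sum((gram - target) ** 2))


def parseval_penalty_grad(weight: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Gradient of the penalty with respect to W: 4 (W W^T - scale * I) W."""
    gram = weight @ weight.T
    residual = gram - scale * np.eye(weight.shape[0])
    return 4.0 * residual @ weight


def regularized_step(weight, task_grad, lr=0.01, reg_coef=1e-3):
    """One gradient step on task loss plus the weighted orthogonality penalty.

    `task_grad` stands in for the gradient of whatever RL loss is in use
    (hypothetical here); `lr` and `reg_coef` are placeholder values.
    """
    return weight - lr * (task_grad + reg_coef * parseval_penalty_grad(weight))
```

In use, the penalty would be added to the agent's loss for each regularized layer, with the coefficient controlling how strongly the weights are pulled toward orthogonality.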

Towards AI-designed genomes using a variational autoencoder

N.K. Dudek

Genomes encode elaborate networks of genes whose products must seamlessly interact to support living organisms. Humans’ capacity to understand these biological systems is limited by their sheer size and complexity. In this work, we develop a proof of concept framework for training a machine learning algorithm to model bacterial genome composition. To achieve this, we create simplified representations of genomes in the form of binary vectors that indicate the encoded genes, henceforth referred to as genome vectors. A denoising variational autoencoder was trained to accept corrupted genome vectors, in which most genes had been masked, and reconstruct the original. The resulting model, DeepGenomeVector, effectively captures complex dependencies in genomic networks, as evaluated by both qualitative and quantitative metrics. An in-depth functional analysis of a generated genome vector shows that its encoded pathways are interconnected, near complete, and ecologically cohesive. On the test set, where the model’s ability to reconstruct uncorrupted genome vectors was evaluated, AUC and F1 scores of 0.98 and 0.83, respectively, support the model’s strong performance. This work showcases the power of machine learning approaches for synthetic biology and highlights the possibility that AI agents may one day be able to design genomes that animate carbon-based cells.

2024-11-17

bioRxiv (preprint)

Reaction-conditioned De Novo Enzyme Design with GENzyme

Chenqing Hua

Yong Liu

Odin Zhang

Rex Ying

Wengong Jin

Shuangjia Zheng

The introduction of models like RFDiffusionAA, AlphaFold3, AlphaProteo, and Chai1 has revolutionized protein structure modeling and interaction prediction, primarily from a binding perspective, focusing on creating ideal lock-and-key models. However, these methods can fall short for enzyme-substrate interactions, where perfect binding models are rare, and induced fit states are more common. To address this, we shift to a functional perspective for enzyme design, where the enzyme function is defined by the reaction it catalyzes. Here, we introduce GENzyme, a de novo enzyme design model that takes a catalytic reaction as input and generates the catalytic pocket, full enzyme structure, and enzyme-substrate binding complex. GENzyme is an end-to-end, three-staged model that integrates (1) a catalytic pocket generation and sequence co-design module, (2) a pocket inpainting and enzyme inverse folding module, and (3) a binding and screening module to optimize and predict enzyme-substrate complexes. The entire design process is driven by the catalytic reaction being targeted. This reaction-first approach allows for more accurate and biologically relevant enzyme design, potentially surpassing structure-based and binding-focused models in creating enzymes capable of catalyzing specific reactions. We provide GENzyme code at https://github.com/WillHua127/GENzyme.

2024-11-10

arXiv (preprint)

Soft Condorcet Optimization for Ranking of General Agents

Marc Lanctot

Kate Larson

Michael Kaisers

Quentin Berthet

Ian Gemp

Manfred Diaz

Roberto-Rafael Maura-Rivero

Yoram Bachrach

Anna Koop

A common way to drive progress of AI models and agents is to compare their performance on standardized benchmarks. Comparing the performance of general agents requires aggregating their individual performances across a potentially wide variety of different tasks. In this paper, we describe a novel ranking scheme inspired by social choice frameworks, called Soft Condorcet Optimization (SCO), to compute the optimal ranking of agents: the one that makes the fewest mistakes in predicting the agent comparisons in the evaluation data. This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground truth ranking, a solution to Condorcet's original voting system criteria. SCO ratings are maximal for Condorcet winners when they exist, which we show is not necessarily true for the classical rating system Elo. We propose three optimization algorithms to compute SCO ratings and evaluate their empirical performance. When serving as an approximation to the Kemeny-Young voting method, SCO rankings are on average 0 to 0.043 away from the optimal ranking in normalized Kendall-tau distance across 865 preference profiles from the PrefLib open ranking archive. In a simulated noisy tournament setting, SCO achieves accurate approximations to the ground truth ranking and the best among several baselines when 59% or more of the preference data is missing. Finally, SCO ranking provides the best approximation to the optimal ranking, measured on held-out test sets, in a problem containing 52,958 human players across 31,049 games of the classic seven-player game of Diplomacy.

2024-11-01

arXiv (published)
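
The abstract above reports results in normalized Kendall-tau distance without defining it. For orientation, the sketch below uses the standard definition for two complete rankings without ties: the fraction of item pairs that the two rankings order differently. The function name and the input format (lists ordered best to worst) are assumptions for illustration and may differ from the paper's evaluation code.

```python
from itertools import combinations


def normalized_kendall_tau(ranking_a, ranking_b):
    """Fraction of item pairs ordered differently by the two rankings.

    Both arguments are sequences of the same distinct items, best first.
    Returns 0.0 for identical rankings and 1.0 for exactly reversed ones.
    """
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("rankings must contain the same items")

    pairs = list(combinations(ranking_a, 2))
    if not pairs:
        return 0.0  # fewer than two items: nothing to disagree on

    # A pair is discordant when the two rankings order it in opposite ways.
    discordant = sum(
        1
        for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)
```

Under this definition, a reported distance of 0.043 would mean that about 4.3% of agent pairs are ordered differently from the optimal ranking.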