
Chence Shi

PhD - Université de Montréal
Supervisor
Research Topics
Deep Learning
Generative Models
Graph Neural Networks
Molecular Modeling

Publications

Design of Ligand-Binding Proteins with Atomic Flow Matching
Junqi Liu
Shaoning Li
Zhi Yang
Structure Language Models for Protein Conformation Generation
Fusing Neural and Physical: Augment Protein Conformation Sampling with Tractable Simulations
Protein dynamics are common and important for the biological functions and properties of proteins, and studying them usually involves time-consuming molecular dynamics (MD) simulations *in silico*. Recently, generative models have been leveraged as surrogate samplers to obtain conformation ensembles orders of magnitude faster and without requiring any simulation data ("zero-shot" inference). However, because such generative models are agnostic of the underlying energy landscape, their accuracy may still be limited. In this work, we explore the few-shot setting for such a pre-trained generative sampler, which incorporates MD simulations in a tractable manner. Specifically, given a target protein of interest, we first acquire seeding conformations from the pre-trained sampler, then run a number of physical simulations in parallel starting from these seeds. We then fine-tune the generative model on the resulting simulation trajectories to obtain a target-specific sampler. Experimental results demonstrate the superior performance of this few-shot conformation sampler at a tractable computational cost.
E3Bind: An End-to-End Equivariant Network for Protein-Ligand Docking
Yang Zhang
Bozitao Zhong
In silico prediction of the ligand binding pose to a given protein target is a crucial but challenging task in drug discovery. This work focuses on blind flexible self-docking, where we aim to predict the positions, orientations and conformations of docked molecules. Traditional physics-based methods usually suffer from inaccurate scoring functions and high inference cost. Recently, data-driven methods based on deep learning techniques are attracting growing interest thanks to their efficiency during inference and promising performance. These methods usually either adopt a two-stage approach, first predicting the distances between proteins and ligands and then generating the final coordinates based on the predicted distances, or directly predict the global roto-translation of ligands. In this paper, we take a different route. Inspired by the resounding success of AlphaFold2 for protein structure prediction, we propose E3Bind, an end-to-end equivariant network that iteratively updates the ligand pose. E3Bind models the protein-ligand interaction through careful consideration of the geometric constraints in docking and the local context of the binding site. Experiments on standard benchmark datasets demonstrate the superior performance of our end-to-end trainable model compared to traditional and recently proposed deep learning methods.
Protein Sequence and Structure Co-Design with Equivariant Translation
Chuanrui Wang
Bozitao Zhong
Proteins are macromolecules that perform essential functions in all living organisms. Designing novel proteins with specific structures and desired functions has been a long-standing challenge in the field of bioengineering. Existing approaches generate both protein sequence and structure using either autoregressive models or diffusion models, both of which suffer from high inference costs. In this paper, we propose a new approach capable of protein sequence and structure co-design, which iteratively translates both protein sequence and structure into the desired state from random initialization, based on context features given a priori. Our model consists of a trigonometry-aware encoder that reasons over geometric constraints and interactions from context features, and a roto-translation equivariant decoder that translates protein sequence and structure interdependently. Notably, all protein amino acids are updated in one shot in each translation step, which significantly accelerates the inference process. Experimental results across multiple tasks show that our model outperforms previous state-of-the-art baselines by a large margin, and is able to design proteins of high fidelity in both sequence and structure, with running time orders of magnitude lower than that of sampling-based methods.
An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming
Minkai Xu
Wujie Wang
Shitong Luo
Rafael Gómez-Bombarelli
Predicting molecular conformations (or 3D structures) from molecular graphs is a fundamental problem in many applications. Most existing approaches proceed in two steps, first predicting the distances between atoms and then generating a 3D structure by solving a distance geometry optimization problem. However, the distances predicted with such two-stage approaches may not consistently preserve the geometry of local atomic neighborhoods, making the generated structures unsatisfactory. In this paper, we propose an end-to-end solution for molecular conformation prediction called ConfVAE, based on the conditional variational autoencoder framework. Specifically, the molecular graph is first encoded in a latent space, and then the 3D structures are generated by solving a principled bilevel optimization program. Extensive experiments on several benchmark data sets demonstrate the effectiveness of our proposed approach over existing state-of-the-art approaches. Code is available at https://github.com/MinkaiXu/ConfVAE-ICML21.
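To make the second stage of the two-stage pipeline described above concrete, here is a minimal sketch of recovering 3D coordinates from a matrix of pairwise distances. It uses classical multidimensional scaling, a standard closed-form technique chosen here purely for illustration; it is not the ConfVAE method and not necessarily the distance geometry solver used by the approaches the abstract refers to. The function names `coords_from_distances` and `pairwise_distances` are hypothetical, and the input distances are assumed to be exact and Euclidean.

```python
import numpy as np


def pairwise_distances(X):
    """Return the (n, n) matrix of Euclidean distances between rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def coords_from_distances(D, dim=3):
    """Embed points in `dim` dimensions from pairwise distances (classical MDS).

    Illustrative assumption: D holds exact Euclidean distances. Predicted
    (noisy) distances would only be reproduced approximately.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # Keep the `dim` largest eigenpairs of the symmetric Gram matrix.
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:dim]
    top_vals = np.clip(eigvals[idx], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, idx] * np.sqrt(top_vals)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    atoms = rng.normal(size=(6, 3))      # toy "molecule" with 6 atoms
    D = pairwise_distances(atoms)
    recovered = coords_from_distances(D)
    # Coordinates are recovered only up to rotation, reflection and
    # translation, so compare distances rather than raw positions.
    print(np.allclose(pairwise_distances(recovered), D))
```

With exact distances the embedding reproduces the input geometry up to a rigid motion. With predicted distances, small errors can distort local atomic neighborhoods, which is the weakness of two-stage approaches that the abstract points out.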