Jian Tang

Biography

Jian Tang is an associate professor in the Department of Decision Sciences at HEC. He is also an adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal and a core academic member at Mila (Quebec Artificial Intelligence Institute). He holds a Canada CIFAR AI Chair and is the founder of BioGeometry, a startup specializing in generative AI for antibody discovery. His main research interests are deep generative models, graph machine learning, and their applications to drug discovery. He is an international leader in graph machine learning, and his representative work on node representation learning, LINE, has been widely recognized and cited more than 5,000 times. He has also done pioneering work on AI for drug discovery, including the first open-source machine learning frameworks for drug discovery, TorchDrug and TorchProtein.

Current Students

Huiyu Cai

PhD - UdeM

Principal supervisor:

Xixian Liu

PhD - Université de Montréal

Website

Jiarui Lu

PhD - UdeM

Principal supervisor:

Gauthier Gidel

Jinxi Yang

Research collaborator - Wuhan University

Xinyu Yuan

PhD - UdeM

GitHub

Zhihao Zhan

PhD - UdeM

PhD - HEC

Jianan Zhao

PhD - UdeM

Website

GitHub

Publications

Enhancing Protein Language Model with Structure-based Encoder and Pre-training

Aurelie Lozano

Vijil Chenthamarakshan

Payel Das

Protein language models (PLMs) pre-trained on large-scale protein sequence corpora have achieved impressive performance on various downstream protein understanding tasks. Despite the ability to implicitly capture inter-residue contact information, transformer-based PLMs cannot encode protein structures explicitly for better structure-aware protein representations. Besides, the power of pre-training on available protein structures has not been explored for improving these PLMs, though structures are important to determine functions. To tackle these limitations, in this work, we enhance the PLM with structure-based encoder and pre-training. We first explore feasible model architectures to combine the advantages of a state-of-the-art PLM (i.e., ESM-1b) and a state-of-the-art protein structure encoder (i.e., GearNet). We empirically verify the ESM-GearNet that connects two encoders in a series way as the most effective combination model. To further improve the effectiveness of ESM-GearNet, we pre-train it on massive unlabeled protein structures with contrastive learning, which aligns representations of co-occurring subsequences so as to capture their biological correlation. Extensive experiments on EC and GO protein function prediction benchmarks demonstrate the superiority of ESM-GearNet over previous PLMs and structure encoders, and clear performance gains are further achieved by structure-based pre-training upon ESM-GearNet. The source code will be made public upon acceptance.

2023-03-05

ICLR.cc/2023/Workshop/MLDD (poster)

EurNet: Efficient Multi-Range Relational Modeling of Protein Structure

Yuanfan Guo

Yi Xu

Xinlei Chen

Yuandong Tian

Modeling the 3D structures of proteins is critical for obtaining effective protein structure representations, which further boosts protein function understanding. Existing protein structure encoders mainly focus on modeling short-range interactions within protein structures, while they neglect modeling the interactions at multiple length scales that are actually complete interactive patterns in protein structures. To attain complete interaction modeling with efficient computation, we introduce the EurNet for Efficient multi-range relational modeling. In EurNet, we represent the protein structure as a multi-relational residue-level graph with different types of edges for modeling short-range, medium-range and long-range interactions. To efficiently process these different interactive relations, we propose a novel modeling layer, called Gated Relational Message Passing (GRMP), as the basic building block of EurNet. GRMP can capture multiple interactive relations in protein structures with little extra computational cost. We verify the state-of-the-art performance of EurNet on EC and GO protein function prediction benchmarks, and the proposed GRMP layer is proved to achieve better efficiency-performance trade-off than the widely-used relational graph convolution.

2023-03-05

ICLR.cc/2023/Workshop/MLDD (poster)

E3Bind: An End-to-End Equivariant Network for Protein-Ligand Docking

Bozitao Zhong

In silico prediction of the ligand binding pose to a given protein target is a crucial but challenging task in drug discovery. This work focuses on blind flexible self-docking, where we aim to predict the positions, orientations and conformations of docked molecules. Traditional physics-based methods usually suffer from inaccurate scoring functions and high inference cost. Recently, data-driven methods based on deep learning techniques are attracting growing interest thanks to their efficiency during inference and promising performance. These methods usually either adopt a two-stage approach by first predicting the distances between proteins and ligands and then generating the final coordinates based on the predicted distances, or directly predicting the global roto-translation of ligands. In this paper, we take a different route. Inspired by the resounding success of AlphaFold2 for protein structure prediction, we propose E3Bind, an end-to-end equivariant network that iteratively updates the ligand pose. E3Bind models the protein-ligand interaction through careful consideration of the geometric constraints in docking and the local context of the binding site. Experiments on standard benchmark datasets demonstrate the superior performance of our end-to-end trainable model compared to traditional and recently-proposed deep learning methods.

2023-01-31

ICLR.cc/2023/Conference (poster)

Learning on Large-Scale Text-Attributed Graphs via Variational Inference

Jianan Zhao

Meng Qu

Chaozhuo Li

Hao Yan

Qian Liu

Rui Li

Xing Xie

This paper studies learning on text-attributed graphs (TAGs), where each node is associated with a text description. An ideal solution for such a problem would be integrating both the text and graph structure information with large language models and graph neural networks (GNNs). However, the problem becomes very challenging when graphs are large due to the high computational complexity brought by training large language models and GNNs together. In this paper, we propose an efficient and effective solution to learning on large text-attributed graphs by fusing graph structure and language learning with a variational Expectation-Maximization (EM) framework, called GLEM. Instead of simultaneously training large language models and GNNs on big graphs, GLEM proposes to alternatively update the two modules in the E-step and M-step. Such a procedure allows training the two modules separately while simultaneously allowing the two modules to interact and mutually enhance each other. Extensive experiments on multiple data sets demonstrate the efficiency and effectiveness of the proposed approach.

2023-01-31

ICLR.cc/2023/Conference (notable)

Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance Matching

Shengchao Liu

Hongyu Guo

Molecular representation pretraining is critical in various applications for drug and material discovery due to the limited number of labeled molecules, and most existing work focuses on pretraining on 2D molecular graphs. However, the power of pretraining on 3D geometric structures has been less explored. This is owing to the difficulty of finding a sufficient proxy task that can empower the pretraining to effectively extract essential features from the geometric structures. Motivated by the dynamic nature of 3D molecules, where the continuous motion of a molecule in the 3D Euclidean space forms a smooth potential energy surface, we propose GeoSSL, a 3D coordinate denoising pretraining framework to model such an energy landscape. Further by leveraging an SE(3)-invariant score matching method, we propose GeoSSL-DDM in which the coordinate denoising proxy task is effectively boiled down to denoising the pairwise atomic distances in a molecule. Our comprehensive experiments confirm the effectiveness and robustness of our proposed method.

2023-01-31

ICLR.cc/2023/Conference (poster)
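
The Molecular Geometry Pretraining abstract above reduces coordinate denoising to denoising pairwise atomic distances. One reason distances are a convenient target is that they do not change when a molecule is rotated or translated. The sketch below only illustrates that invariance on made-up toy coordinates; it is not the GeoSSL-DDM implementation, and every function name in it is hypothetical.

```python
import math

def pairwise_distances(coords):
    """Return the matrix of Euclidean distances between all pairs of points."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def rotate_z_and_translate(coords, angle, shift):
    """Rotate every point about the z-axis by `angle`, then translate by `shift`.

    This is one particular rigid motion (an element of SE(3)), chosen for brevity.
    """
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]

# Toy 3-atom "molecule"; the coordinates are invented for illustration only.
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (1.6, 1.2, 0.3)]
moved = rotate_z_and_translate(coords, angle=0.7, shift=(2.0, -1.0, 0.5))

d_original = pairwise_distances(coords)
d_moved = pairwise_distances(moved)

# Largest entrywise difference between the two distance matrices.
max_gap = max(
    abs(a - b)
    for row_a, row_b in zip(d_original, d_moved)
    for a, b in zip(row_a, row_b)
)
print(max_gap < 1e-9)
```

Because the distance matrix is unchanged by the rigid motion, a model trained to denoise distances does not need to account for the molecule's absolute pose.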

Protein Representation Learning by Geometric Structure Pretraining

Arian Jamasb

Vijil Chenthamarakshan

Aurelie Lozano

Payel Das

Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein function or structure. Existing approaches usually pretrain protein language models on a large number of unlabeled amino acid sequences and then finetune the models with some labeled data in downstream tasks. Despite the effectiveness of sequence-based approaches, the power of pretraining on known protein structures, which are available in smaller numbers only, has not been explored for protein property prediction, though protein structures are known to be determinants of protein function. In this paper, we propose to pretrain protein representations according to their 3D structures. We first present a simple yet effective encoder to learn the geometric features of a protein. We pretrain the protein graph encoder by leveraging multiview contrastive learning and different self-prediction tasks. Experimental results on both function prediction and fold classification tasks show that our proposed pretraining methods outperform or are on par with the state-of-the-art sequence-based methods, while using much less pretraining data. Our implementation is available at https://github.com/DeepGraphLearning/GearNet.

2023-01-31

ICLR.cc/2023/Conference (poster)

Protein Sequence and Structure Co-design with Equivariant Translation

Chence Shi

Chuanrui Wang

Jiarui Lu

Bozitao Zhong

Proteins are macromolecules that perform essential functions in all living organisms. Designing novel proteins with specific structures and desired functions has been a long-standing challenge in the field of bioengineering. Existing approaches generate both protein sequence and structure using either autoregressive models or diffusion models, both of which suffer from high inference costs. In this paper, we propose a new approach capable of protein sequence and structure co-design, which iteratively translates both protein sequence and structure into the desired state from random initialization, based on context features given a priori. Our model consists of a trigonometry-aware encoder that reasons geometrical constraints and interactions from context features, and a roto-translation equivariant decoder that translates protein sequence and structure interdependently. Notably, all protein amino acids are updated in one shot in each translation step, which significantly accelerates the inference process. Experimental results across multiple tasks show that our model outperforms previous state-of-the-art baselines by a large margin, and is able to design proteins of high fidelity as regards both sequence and structure, with running time orders of magnitude less than sampling-based methods.

2023-01-31

ICLR.cc/2023/Conference (poster)

Design and Application of Adaptive Sparse Deep Echo State Network

Cuili Yang

Sheng Yang

Bing Li

The prediction of appliances energy consumption in building belongs to time series forecasting problem, which can be solved by echo state network (ESN). However, due to the randomly initialized inputs and reservoir, some redundant or irrelevant components are inevitably generated in original ESN. To solve this problem, the adaptive sparse deep echo state network (ASDESN) is proposed, in which the information is processed layer by layer. Firstly, the principal component analysis (PCA) layer is inserted to penalize the redundant projection transmitted between sub-reservoirs. Secondly, the coordinate descent based adaptive sparse learning method is proposed to generate the sparse output weights. Particularly, the designed adaptive threshold strategy is able to enlarge the sparsity of output weights as network depth increases. Moreover, the echo state property (ESP) of ASDESN is given to ensure its applications. The experiment results in both simulated benchmark and real appliances energy datasets illustrate that the proposed ASDESN outperforms other ESNs with higher prediction accuracy and stability.

2022-12-31

IEEE Transactions on Consumer Electronics (published)

FusionRetro: Molecule Representation Fusion via Reaction Graph for Retrosynthetic Planning

Songtao Liu

Zhengkai Tu

Minkai Xu

Peilin Zhao

Rex Ying

Lu Lin

Dinghao Wu

Retrosynthetic planning is a fundamental problem in drug discovery and organic chemistry, which aims to find a complete multi-step synthetic route from a set of starting materials to the target molecule, determining crucial process flow in chemical production. Existing approaches combine single-step retrosynthesis models and search algorithms to find synthetic routes. However, these approaches generally consider the two pieces in a decoupled manner, taking only the product as the input to predict the reactants per planning step and largely ignoring the important context information from other intermediates along the synthetic route. In this work, we perform a series of experiments to identify the limitations of this decoupled view and propose a novel retrosynthesis framework that also exploits context information for retrosynthetic planning. We view synthetic routes as reaction graphs, and propose to incorporate the context by three principled steps: encode molecules into embeddings, aggregate information over routes, and readout to predict reactants. The whole framework can be efficiently optimized in an end-to-end fashion. Comprehensive experiments show that by fusing in context information over routes, our model significantly improves the performance of retrosynthetic planning over baselines that are not context-aware, especially for long synthetic routes.

2022-12-31

(published)

www.semanticscholar.org

Pre-Training Protein Encoder via Siamese Sequence-Structure Diffusion Trajectory Prediction

Aurelie Lozano

Vijil Chenthamarakshan

Payel Das

Self-supervised pre-training methods on proteins have recently gained attention, with most approaches focusing on either protein sequences or structures, neglecting the exploration of their joint distribution, which is crucial for a comprehensive understanding of protein functions by integrating co-evolutionary information and structural characteristics. In this work, inspired by the success of denoising diffusion models in generative tasks, we propose the DiffPreT approach to pre-train a protein encoder by sequence-structure joint diffusion modeling. DiffPreT guides the encoder to recover the native protein sequences and structures from the perturbed ones along the joint diffusion trajectory, which acquires the joint distribution of sequences and structures. Considering the essential protein conformational variations, we enhance DiffPreT by a method called Siamese Diffusion Trajectory Prediction (SiamDiff) to capture the correlation between different conformers of a protein. SiamDiff attains this goal by maximizing the mutual information between representations of diffusion trajectories of structurally-correlated conformers. We study the effectiveness of DiffPreT and SiamDiff on both atom- and residue-level structure-based protein understanding tasks. Experimental results show that the performance of DiffPreT is consistently competitive on all tasks, and SiamDiff achieves new state-of-the-art performance, considering the mean ranks on all tasks. Our implementation is available at https://github.com/DeepGraphLearning/SiamDiff.

2022-12-31

arXiv.org (preprint)

Debiasing Graph Neural Networks via Learning Disentangled Causal Substructure

Shaohua Fan

Xiao Wang

Yanhu Mo

Chuan Shi

Most Graph Neural Networks (GNNs) predict the labels of unseen graphs by learning the correlation between the input graphs and labels. However, by presenting a graph classification investigation on the training graphs with severe bias, surprisingly, we discover that GNNs always tend to explore the spurious correlations to make decisions, even if the causal correlation always exists. This implies that existing GNNs trained on such biased datasets will suffer from poor generalization capability. By analyzing this problem in a causal view, we find that disentangling and decorrelating the causal and bias latent variables from the biased graphs are both crucial for debiasing. Inspired by this, we propose a general disentangled GNN framework to learn the causal substructure and bias substructure, respectively. Particularly, we design a parameterized edge mask generator to explicitly split the input graph into causal and bias subgraphs. Then two GNN modules supervised by causal/bias-aware loss functions respectively are trained to encode causal and bias subgraphs into their corresponding representations. With the disentangled representations, we synthesize the counterfactual unbiased training samples to further decorrelate causal and bias variables. Moreover, to better benchmark the severe bias problem, we construct three new graph datasets, which have controllable bias degrees and are easier to visualize and explain. Experimental results well demonstrate that our approach achieves superior generalization performance over existing baselines. Furthermore, owing to the learned edge mask, the proposed model has appealing interpretability and transferability. Code and data are available at: https://github.com/googlebaba/DisC.

2022-11-28

Conference on Neural Information Processing Systems (Accept)
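
The debiasing abstract above describes an edge mask that splits an input graph into a causal subgraph and a bias subgraph. As a rough illustration of what such a split can look like, the sketch below assumes a soft mask value in [0, 1] per edge, with the causal subgraph weighted by the mask and the bias subgraph by its complement. The complement rule, the fixed mask values, and the function name are assumptions made for this example; in the paper the mask comes from a learned, parameterized generator, and the authors' repository should be consulted for the actual method.

```python
def split_by_edge_mask(edges, mask):
    """Split weighted edges into two complementary weighted subgraphs.

    edges: dict mapping (u, v) -> edge weight
    mask:  dict mapping (u, v) -> value in [0, 1] (hand-picked here, not learned)
    Returns (causal, bias), where each edge's two weights sum to the original.
    """
    causal = {e: w * mask[e] for e, w in edges.items()}
    bias = {e: w * (1.0 - mask[e]) for e, w in edges.items()}
    return causal, bias

# Toy graph with three unit-weight edges and an invented mask.
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 0): 1.0}
mask = {(0, 1): 0.9, (1, 2): 0.2, (2, 0): 0.5}

causal, bias = split_by_edge_mask(edges, mask)

# Every edge's weight is fully accounted for across the two subgraphs.
recombined_ok = all(
    abs(causal[e] + bias[e] - edges[e]) < 1e-12 for e in edges
)
print(recombined_ok)
```

Under this assumed rule, an edge with a mask value near 1 contributes mostly to the causal subgraph and one near 0 mostly to the bias subgraph, so each of the two downstream encoders sees a differently weighted view of the same graph.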

Inductive Logical Query Answering in Knowledge Graphs

Mikhail Galkin

Zhaocheng Zhu

Hongyu Ren

Formulating and answering logical queries is a standard communication interface for knowledge graphs (KGs). Alleviating the notorious incompleteness of real-world KGs, neural methods achieved impressive results in link prediction and complex query answering tasks by learning representations of entities, relations, and queries. Still, most existing query answering methods rely on transductive entity embeddings and cannot generalize to KGs containing new entities without retraining the entity embeddings. In this work, we study the inductive query answering task where inference is performed on a graph containing new entities with queries over both seen and unseen entities. To this end, we devise two mechanisms leveraging inductive node and relational structure representations powered by graph neural networks (GNNs). Experimentally, we show that inductive models are able to perform logical reasoning at inference time over unseen nodes generalizing to graphs up to 500% larger than training ones. Exploring the efficiency-effectiveness trade-off, we find the inductive relational structure representation method generally achieves higher performance, while the inductive node representation method is able to answer complex queries in the inference-only regime without any training on queries and scales to graphs of millions of nodes. Code is available at https://github.com/DeepGraphLearning/InductiveQE.

2022-11-28

Conference on Neural Information Processing Systems (Accept)