Portrait de Jian Tang

Jian Tang

Membre académique principal
Chaire en IA Canada-CIFAR
Professeur agrégé, HEC Montréal, Département de sciences de la décision
Professeur associé, Université de Montréal, Département d'informatique et de recherche opérationnelle (DIRO)
Fondateur, BioGeometry
Sujets de recherche
Apprentissage profond
Biologie computationnelle
Modèles génératifs
Modélisation moléculaire
Réseaux de neurones en graphes

Biographie

Jian Tang est professeur agrégé au département de sciences de la décision de HEC. Il est aussi professeur associé au département informatique et recherche opérationnelle (DIRO) de l'Université de Montréal et un membre académique principal à Mila – Institut québécois d’intelligence artificielle. Il est titulaire d'une chaire de recherche en IA Canada-CIFAR et le fondateur de BioGeometry, une entreprise en démarrage spécialisée dans l'IA générative pour la découverte d'anticorps. Ses principaux domaines de recherche sont les modèles génératifs profonds, l'apprentissage automatique des graphes et leurs applications à la découverte de médicaments. Il est un leader international dans le domaine de l'apprentissage automatique des graphes, et son travail représentatif sur l'apprentissage de la représentation des nœuds, LINE, a été largement reconnu et cité plus de 5 000 fois. Il a également réalisé de nombreux travaux pionniers sur l'IA pour la découverte de médicaments, notamment le premier cadre d'apprentissage automatique à source ouverte pour la découverte de médicaments, TorchDrug et TorchProtein.

Étudiants actuels

Stagiaire de recherche - Beijing Institute of Technology
Stagiaire de recherche - HEC
Collaborateur·rice de recherche
Doctorat - UdeM
Superviseur⋅e principal⋅e :
Doctorat - UdeM
Collaborateur·rice de recherche
Maîtrise recherche - UdeM
Doctorat - UdeM
Superviseur⋅e principal⋅e :
Collaborateur·rice de recherche
Doctorat - UdeM
Doctorat - UdeM

Publications

Zero-shot Logical Query Reasoning on any Knowledge Graph
Mikhail Galkin
Jincheng Zhou
Bruno Ribeiro
Zhaocheng Zhu
Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional querie… (voir plus)s comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on which requires substantial training time before being deployed on a new graph. Here we present UltraQuery, an inductive reasoning model that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG reasoning model, UltraQuery can solve CLQA on any KG even if it is only finetuned on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than best available baselines and sets a new state of the art on 14 of them.
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science
Xiangru Tang
Qiao Jin
Kunlun Zhu
Tongxin Yuan
Yichi Zhang
Wangchunshu Zhou
Meng Qu
Yilun Zhao
Zhuosheng Zhang
Arman Cohan
Zhiyong Lu
Mark Gerstein
F$^3$low: Frame-to-Frame Coarse-grained Molecular Dynamics with SE(3) Guided Flow Matching
Shaoning Li
Yusong Wang
Mingyu Li
Bin Shao
Nanning Zheng
Zhang Jian
Fusing Neural and Physical: Augment Protein Conformation Sampling with Tractable Simulations
Jiarui Lu
Zuobai Zhang
Bozitao Zhong
Chence Shi
The protein dynamics are common and important for their biological functions and properties, the study of which usually involves time-consum… (voir plus)ing molecular dynamics (MD) simulations *in silico*. Recently, generative models has been leveraged as a surrogate sampler to obtain conformation ensembles with orders of magnitude faster and without requiring any simulation data (a "zero-shot" inference). However, being agnostic of the underlying energy landscape, the accuracy of such generative model may still be limited. In this work, we explore the few-shot setting of such pre-trained generative sampler which incorporates MD simulations in a tractable manner. Specifically, given a target protein of interest, we first acquire some seeding conformations from the pre-trained sampler followed by a number of physical simulations in parallel starting from these seeding samples. Then we fine-tuned the generative model using the simulation trajectories above to become a target-specific sampler. Experimental results demonstrated the superior performance of such few-shot conformation sampler at a tractable computational cost.
Structure-Informed Protein Language Model
Zuobai Zhang
Jiarui Lu
Vijil Chenthamarakshan
Aurelie Lozano
Payel Das
Protein language models are a powerful tool for learning protein representations through pre-training on vast protein sequence datasets. Ho… (voir plus)wever, traditional protein language models lack explicit structural supervision, despite its relevance to protein function. To address this issue, we introduce the integration of remote homology detection to distill structural information into protein language models without requiring explicit protein structures as input. We evaluate the impact of this structure-informed training on downstream protein function prediction tasks. Experimental results reveal consistent improvements in function annotation accuracy for EC number and GO term prediction. Performance on mutant datasets, however, varies based on the relationship between targeted properties and protein structures. This underscores the importance of considering this relationship when applying structure-aware training to protein function prediction tasks. Code and model weights will be made available upon acceptance.
Heterogeneous ensemble prediction model of CO emission concentration in municipal solid waste incineration process using virtual data and real data hybrid-driven
Runyu Zhang
Heng Xia
Jiakun Chen
Wen Yu
JunFei Qiao
Iterative Graph Self-Distillation
Hanlin Zhang
Shuai Lin
Weiyang Liu
Pan Zhou
Xiaodan Liang
Eric P. Xing
Recently, there has been increasing interest in the challenge of how to discriminatively vectorize graphs. To address this, we propose a met… (voir plus)hod called Iterative Graph Self-Distillation (IGSD) which learns graph-level representation in an unsupervised manner through instance discrimination using a self-supervised contrastive learning approach. IGSD involves a teacher-student distillation process that uses graph diffusion augmentations and constructs the teacher model using an exponential moving average of the student model. The intuition behind IGSD is to predict the teacher network representation of the graph pairs under different augmented views. As a natural extension, we also apply IGSD to semi-supervised scenarios by jointly regularizing the network with both supervised and self-supervised contrastive loss. Finally, we show that fine-tuning the IGSD-trained models with self-training can further improve graph representation learning. Empirically, we achieve significant and consistent performance gain on various graph datasets in both unsupervised and semi-supervised settings, which well validates the superiority of IGSD.
Deep Equilibrium Models For Algorithmic Reasoning
Sophie Xhonneux
Yu He
Andreea Deac
In this blogpost we discuss the idea of teaching neural networks to reach fixed points when reasoning. Specifically, on the algorithmic reas… (voir plus)oning benchmark CLRS the current neural networks are told the number of reasoning steps they need. While a quick fix is to add a termination network that predicts when to stop, a much more salient inductive bias is that the neural network shouldn't change it's answer any further once the answer is correct, i.e. it should reach a fixed point. This is supported by denotational semantics, which tells us that while loops that terminate are the minimum fixed points of a function. We implement this idea with the help of deep equilibrium models and discuss several hurdles one encounters along the way. We show on several algorithms from the CLRS benchmark the partial success of this approach and the difficulty in making it work robustly across all algorithms.
Deep Equilibrium Models For Algorithmic Reasoning
Sophie Xhonneux
Yu He
Andreea Deac
In this blogpost we discuss the idea of teaching neural networks to reach fixed points when reasoning. Specifically, on the algorithmic reas… (voir plus)oning benchmark CLRS the current neural networks are told the number of reasoning steps they need. While a quick fix is to add a termination network that predicts when to stop, a much more salient inductive bias is that the neural network shouldn't change it's answer any further once the answer is correct, i.e. it should reach a fixed point. This is supported by denotational semantics, which tells us that while loops that terminate are the minimum fixed points of a function. We implement this idea with the help of deep equilibrium models and discuss several hurdles one encounters along the way. We show on several algorithms from the CLRS benchmark the partial success of this approach and the difficulty in making it work robustly across all algorithms.
In-Context Learning Can Re-learn Forbidden Tasks
Sophie Xhonneux
David Dobre
Despite significant investment into safety training, large language models (LLMs) deployed in the real world still suffer from numerous vuln… (voir plus)erabilities. One perspective on LLM safety training is that it algorithmically forbids the model from answering toxic or harmful queries. To assess the effectiveness of safety training, in this work, we study forbidden tasks, i.e., tasks the model is designed to refuse to answer. Specifically, we investigate whether in-context learning (ICL) can be used to re-learn forbidden tasks despite the explicit fine-tuning of the model to refuse them. We first examine a toy example of refusing sentiment classification to demonstrate the problem. Then, we use ICL on a model fine-tuned to refuse to summarise made-up news articles. Finally, we investigate whether ICL can undo safety training, which could represent a major security risk. For the safety task, we look at Vicuna-7B, Starling-7B, and Llama2-7B. We show that the attack works out-of-the-box on Starling-7B and Vicuna-7B but fails on Llama2-7B. Finally, we propose an ICL attack that uses the chat template tokens like a prompt injection attack to achieve a better attack success rate on Vicuna-7B and Starling-7B. Trigger Warning: the appendix contains LLM-generated text with violence, suicide, and misinformation.
Machine Learning Informed Diagnosis for Congenital Heart Disease in Large Claims Data Source
Ariane Marelli
Chao Li
Aihua Liu
Hanh Nguyen
Harry Moroz
James M. Brophy
Liming Guo
Unsupervised Discovery of Steerable Factors When Graph Deep Generative Models Are Entangled
Shengchao Liu
Chengpeng Wang
Jiarui Lu
Weili Nie
Hanchen Wang
Zhuoxinran Li
Bolei Zhou