Jian Tang

Biography

Jian Tang is an associate professor in the Department of Decision Sciences at HEC. He is also an adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal and a core academic member of Mila – Quebec AI Institute. He holds a Canada CIFAR AI Chair and is the founder of BioGeometry, a startup specializing in generative AI for antibody discovery. His main research interests are deep generative models, graph machine learning, and their applications to drug discovery. He is an international leader in graph machine learning, and his representative work on node representation learning, LINE, has been widely recognized and cited more than 5,000 times. He has also done pioneering work on AI for drug discovery, including the first open-source machine learning frameworks for drug discovery, TorchDrug and TorchProtein.

Current Students

Huiyu Cai

PhD - UdeM

Principal supervisor:

Xixian Liu

PhD - Université de Montréal

Jiarui Lu

PhD - UdeM

Principal supervisor: Gauthier Gidel

Xinyu Yuan

PhD - UdeM

Zhihao Zhan

PhD - UdeM

PhD - HEC

Jianan Zhao

PhD - UdeM

Publications

Multi-objective PSO semi-supervised random forest method for dioxin soft sensor

Wen Xu

Heng Xia

Wen Yu

JunFei Qiao

2023-12-31

Eng. Appl. Artif. Intell. (published)

Multi-reservoir ESN-based prediction strategy for dynamic multi-objective optimization

Cuili Yang

Danlei Wang

JunFei Qiao

Wen Yu

2023-12-31

Information Sciences (published)

NOx emissions prediction for MSWI process based on dynamic modular neural network

Haoshan Duan

Xi Meng

JunFei Qiao

2023-12-31

Expert Systems with Applications (published)

Online Measurement of Dioxin Emission in Solid Waste Incineration Using Fuzzy Broad Learning

Heng Xia

Wen Yu

JunFei Qiao

Dioxin (DXN) is a persistent organic pollutant produced from municipal solid waste incineration (MSWI) processes. It is a crucial environmental indicator for minimizing emission concentration through optimization control, but it is difficult to monitor in real time. Aiming at online soft-sensing of DXN emission, a novel fuzzy tree broad learning system (FTBLS) is proposed, which includes offline training and online measurement. In the offline training part, weighted k-means is presented to construct a typical sample pool that reduces the learning costs of the offline and online phases. Moreover, the novel FTBLS, which contains a feature mapping layer, an enhance layer, and an increment layer and is built by replacing the fuzzy decision tree with neurons, is applied to construct the offline model. In the online measurement part, recursive principal component analysis is used to monitor the time-varying characteristics of the MSWI process. To measure DXN emission, the offline FTBLS is reused for normal samples; for drift samples, fast incremental learning is used for online updates. DXN data from an actual MSWI process are employed to demonstrate the usefulness of FTBLS, where the RMSEs on the training and testing data are 0.0099 and 0.0216, respectively. This result shows that FTBLS can effectively realize DXN online prediction.

2023-12-31

IEEE Transactions on Industrial Informatics (published)

The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges

Sitao Luan

Chenqing Hua

Qincheng Lu

Liheng Ma

Lirong Wu

Xinyu Wang

Minkai Xu

Xiao-Wen Chang

Doina Precup

Rex Ying

Stan Z. Li

Guy Wolf

Stefanie Jegelka

The homophily principle, i.e., that nodes with the same labels or similar attributes are more likely to be connected, has been commonly believed to be the main reason for the superiority of Graph Neural Networks (GNNs) over traditional Neural Networks (NNs) on graph-structured data, especially on node-level tasks. However, recent work has identified a non-trivial set of datasets where the performance of GNNs compared to NNs is not satisfactory. Heterophily, i.e., low homophily, has been considered the main cause of this empirical observation. People have begun to revisit and re-evaluate most existing graph models, including graph transformer and its variants, in the heterophily scenario across various kinds of graphs, e.g., heterogeneous graphs, temporal graphs and hypergraphs. Moreover, numerous graph-related applications are found to be closely related to the heterophily problem. In the past few years, considerable effort has been devoted to studying and addressing the heterophily issue. In this survey, we provide a comprehensive review of the latest progress on heterophilic graph learning, including an extensive summary of benchmark datasets and evaluation of homophily metrics on synthetic graphs, meticulous classification of the most updated supervised and unsupervised learning methods, thorough digestion of the theoretical analysis on homophily/heterophily, and broad exploration of the heterophily-related applications. Notably, through detailed experiments, we are the first to categorize benchmark heterophilic datasets into three sub-categories: malignant, benign and ambiguous heterophily. Malignant and ambiguous datasets are identified as the real challenging datasets to test the effectiveness of new models on the heterophily challenge. Finally, we propose several challenges and future directions for heterophilic graph representation learning.

2023-12-31

arXiv (preprint)

Tree Broad Learning System for Small Data Modeling

Heng Xia

Wen Yu

JunFei Qiao

Broad learning system based on neural network (BLS-NN) has poor efficiency for small data modeling with various dimensions. Tree-based BLS (TBLS) is designed for small data modeling by introducing nondifferentiable modules and an ensemble strategy to the traditional broad learning system (BLS). TBLS replaces the neurons of BLS with the tree modules to map the input data. Moreover, we present three new TBLS variant methods and their incremental learning implementations, which are motivated by deep, broad, and ensemble learning. Their major distinction is reflected in the incremental learning strategies based on: 1) mean square error (mse); 2) pseudo-inverse; and 3) pseudo-inverse theory and stack representation. Therefore, this study further explores the domain of BLS based on the nondifferentiable modules. The simulations are compared with some state-of-the-art (SOTA) BLS-NN and tree methods under high-, medium-, and low-dimensional benchmark datasets. Results show that the proposed method outperforms the BLS-NN, and the modeling accuracy is remarkably improved with the small training data of the proposed TBLS.

2023-12-31

IEEE Trans. Neural Networks Learn. Syst. (published)

Giant Correlated Gap and Possible Room-Temperature Correlated States in Twisted Bilayer MoS₂

Fanfan Wu

Qiaoling Xu

Qinqin Wang

Yanbang Chu

Li Li

Jieying Liu

Jinpeng Tian

Yiru Ji

Le Liu

Yalong Yuan

Zhiheng Huang

Jiaojiao Zhao

Xiaozhou Zan

Kenji Watanabe

Takashi Taniguchi

Dongxia Shi

Gangxu Gu

Yang Xu

Lede Xian … (3 more)

Wei Yang

Luojun Du

Guangyu Zhang

Moiré superlattices have emerged as an exciting condensed-matter quantum simulator for exploring the exotic physics of strong electronic correlations. Notable progress has been witnessed, but such correlated states are achievable usually at low temperatures. Here, we report evidence of possible room-temperature correlated electronic states and layer-hybridized SU(4) model simulator in AB-stacked MoS₂ homobilayer moiré superlattices. Correlated insulating states at moiré band filling factors v=1, 2, 3 are unambiguously established in twisted bilayer MoS₂. Remarkably, the correlated electronic state at v=1 shows a giant correlated gap of ∼126 meV and may persist up to a record-high critical temperature over 285 K. The realization of a possible room-temperature correlated state with a large correlated gap in twisted bilayer MoS₂ can be understood as the cooperation effects of the stacking-specific atomic reconstruction and the resonantly enhanced interlayer hybridization, which largely amplify the moiré superlattice effects on electronic correlations. Furthermore, extremely large nonlinear Hall responses up to room temperature are uncovered near correlated electronic states, demonstrating the quantum geometry of moiré flat conduction band.

2023-12-17

Physical Review Letters (published)

Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing

Shengchao Liu

Weili Nie

Chengpeng Wang

Jiarui Lu

Zhuoran Qiao

Ling Liu

Chaowei Xiao

Animashree Anandkumar

There is increasing adoption of artificial intelligence in drug discovery. However, existing studies use machine learning to mainly utilize the chemical structures of molecules but ignore the vast textual knowledge available in chemistry. Incorporating textual knowledge enables us to realize new drug design objectives, adapt to text-based instructions and predict complex biological activities. Here we present a multi-modal molecule structure–text model, MoleculeSTM, by jointly learning molecules’ chemical structures and textual descriptions via a contrastive learning strategy. To train MoleculeSTM, we construct a large multi-modal dataset, namely, PubChemSTM, with over 280,000 chemical structure–text pairs. To demonstrate the effectiveness and utility of MoleculeSTM, we design two challenging zero-shot tasks based on text instructions, including structure–text retrieval and molecule editing. MoleculeSTM has two main properties: open vocabulary and compositionality via natural language. In experiments, MoleculeSTM obtains the state-of-the-art generalization ability to novel biochemical concepts across various benchmarks. Machine learning methods in cheminformatics have made great progress in using chemical structures of molecules, but a large portion of textual information remains scarcely explored. Liu and colleagues trained MoleculeSTM, a foundation model that aligns the structure and text modalities through contrastive learning, and show its utility on the downstream tasks of structure–text retrieval, text-guided editing and molecular property prediction.

2023-12-17

Nature Machine Intelligence (unknown)

Room-temperature correlated states in twisted bilayer MoS₂

Fanfan Wu

Qiaoling Xu

Qinqin Wang

Yanbang Chu

Li Li

Jieying Liu

Jinpeng Tian

Yiru Ji

Le Liu

Yalong Yuan

Zhiheng Huang

Jiaojiao Zhao

Xiaozhou Zan

Kenji Watanabe

Takashi Taniguchi

Dongxia Shi

Gangxu Gu

Yang Xu

Lede Xian … (3 more)

Wei Yang

Luojun Du

Guangyu Zhang

2023-11-27

arXiv (preprint)

PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design

Chuanrui Wang

Bozitao Zhong

Zuobai Zhang

Narendra Chaudhary

Sanchit Misra

Structure-based protein design has attracted increasing interest, with numerous methods being introduced in recent years. However, a universally accepted method for evaluation has not been established, since the wet-lab validation can be overly time-consuming for the development of new algorithms, and the

2023-10-24

NeurIPS.cc/2023/Workshop/AI4D3 (poster)

openreview.net

Large Language Models can Learn Rules

Zhaocheng Zhu

Yuan Xue

Xinyun Chen

Denny Zhou

Dale Schuurmans

Hanjun Dai

2023-10-09

arXiv (preprint)

openreview.net

An Empirical Study of Retrieval-Enhanced Graph Neural Networks

Dingmin Wang

Shengchao Liu

Hanchen Wang

Bernardo Cuenca Grau

Linfeng Song

Le Song

Qi Liu

Graph Neural Networks (GNNs) are effective tools for graph representation learning. Most GNNs rely on a recursive neighborhood aggregation scheme, named message passing, thereby their theoretical expressive power is limited to the first-order Weisfeiler-Lehman test (1-WL). An effective approach to this challenge is to explicitly retrieve some annotated examples used to enhance GNN models. While retrieval-enhanced models have been proved to be effective in many language and vision domains, it remains an open question how effective retrieval-enhanced GNNs are when applied to graph datasets. Motivated by this, we want to explore how the retrieval idea can help augment the useful information learned in the graph neural networks, and we design a retrieval-enhanced scheme called GRAPHRETRIEVAL, which is agnostic to the choice of graph neural network models. In GRAPHRETRIEVAL, for each input graph, similar graphs together with their ground-truth labels are retrieved from an existing database. Thus they can act as a potential enhancement to complete various graph property predictive tasks. We conduct comprehensive experiments over 13 datasets, and we observe that GRAPHRETRIEVAL is able to reach substantial improvements over existing GNNs. Moreover, our empirical study also illustrates that retrieval enhancement is a promising remedy for alleviating the long-tailed label distribution problem.

2023-09-27

Frontiers in Artificial Intelligence and Applications (published)