Portrait de Sitao Luan n'est pas disponible

Sitao Luan

Postdoctorat - McGill
Superviseur⋅e principal⋅e
Co-supervisor
Sujets de recherche
Exploration des données
IA pour la science
Raisonnement
Réseaux de neurones en graphes

Publications

MUDiff: Unified Diffusion for Complete Molecule Generation
Zhitao Ying
Rex Ying
Stefano Ermon
When Do Graph Neural Networks Help with Node Classification? Investigating the Impact of Homophily Principle on Node Distinguishability
Qincheng Lu
Jiaqi Zhu
Xiao-Wen Chang
Jure Leskovec
When Do Graph Neural Networks Help with Node Classification: Investigating the Homophily Principle on Node Distinguishability
Qincheng Lu
Jiaqi Zhu
Xiao-Wen Chang
Jure Leskovec
Homophily principle, i.e., nodes with the same labels are more likely to be connected, was believed to be the main reason for the performanc… (voir plus)e superiority of Graph Neural Networks (GNNs) over Neural Networks (NNs) on Node Classification (NC) tasks. Recently, people have developed theoretical results arguing that, even though the homophily principle is broken, the advantage of GNNs can still hold as long as nodes from the same class share similar neighborhood patterns [29], which questions the validity of homophily. However, this argument only considers intra-class Node Distinguishability (ND) and ignores inter-class ND, which is insufficient to study the effect of homophily. In this paper, we first demonstrate the aforementioned insufficiency with examples and argue that an ideal situation for ND is to have smaller intra-class ND than inter-class ND. To formulate this idea and have a better understanding of homophily, we propose Contextual Stochastic Block Model for Homophily (CSBM-H) and define two metrics, Probabilistic Bayes Error (PBE) and Expected Negative KL-divergence (ENKL), to quantify ND, through which we can also find how intra- and inter-class ND influence ND together. We visualize the results and give detailed analysis. Through experiments, we verified that the superiority of GNNs is
Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks
Mingde Zhao
Xiao-Wen Chang
The core operation of current Graph Neural Networks (GNNs) is the aggregation enabled by the graph Laplacian or message passing, which filte… (voir plus)rs the neighborhood node information. Though effective for various tasks, in this paper, we show that they are potentially a problematic factor underlying all GNN methods for learning on certain datasets, as they force the node representations similar, making the nodes gradually lose their identity and become indistinguishable. Hence, we augment the aggregation operations with their dual, i.e. diversification operators that make the node more distinct and preserve the identity. Such augmentation replaces the aggregation with a two-channel filtering process that, in theory, is beneficial for enriching the node representations. In practice, the proposed two-channel filters can be easily patched on existing GNN methods with diverse training strategies, including spectral and spatial (message passing) methods. In the experiments, we observe desired characteristics of the models and significant performance boost upon the baselines on 9 node classification tasks.
When Do We Need GNN for Node Classification?
Qincheng Lu
Jiaqi Zhu
Xiao-Wen Chang
When Do We Need GNN for Node Classification?
Qincheng Lu
Jiaqi Zhu
Xiao-Wen Chang
Revisiting Heterophily For Graph Neural Networks
Qincheng Lu
Jiaqi Zhu
Mingde Zhao
Xiao-Wen Chang
Graph Neural Networks (GNNs) extend basic Neural Networks (NNs) by using graph structures based on the relational inductive bias (homophily … (voir plus)assumption). While GNNs have been commonly believed to outperform NNs in real-world tasks, recent work has identified a non-trivial set of datasets where their performance compared to NNs is not satisfactory. Heterophily has been considered the main cause of this empirical observation and numerous works have been put forward to address it. In this paper, we first revisit the widely used homophily metrics and point out that their consideration of only graph-label consistency is a shortcoming. Then, we study heterophily from the perspective of post-aggregation node similarity and define new homophily metrics, which are potentially advantageous compared to existing ones. Based on this investigation, we prove that some harmful cases of heterophily can be effectively addressed by local diversification operation. Then, we propose the Adaptive Channel Mixing (ACM), a framework to adaptively exploit aggregation, diversification and identity channels node-wisely to extract richer localized information for diverse node heterophily situations. ACM is more powerful than the commonly used uni-channel framework for node classification tasks on heterophilic graphs and is easy to be implemented in baseline GNN layers. When evaluated on 10 benchmark node classification tasks, ACM-augmented baselines consistently achieve significant performance gain, exceeding state-of-the-art GNNs on most tasks without incurring significant computational burden.
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state during plan… (voir plus)ning. The agent uses a bottleneck mechanism over a set-based representation to force the number of entities to which the agent attends at each planning step to be small. In experiments, we investigate the bottleneck mechanism with several sets of customized environments featuring different challenges. We consistently observe that the design allows the planning agents to generalize their learned task-solving abilities in compatible unseen environments by attending to the relevant objects, leading to better out-of-distribution generalization performance.
META-Learning State-based Eligibility Traces for More Sample-Efficient Policy Evaluation
Mingde Zhao
Xiao-Wen Chang
Temporal-Difference (TD) learning is a standard and very successful reinforcement learning approach, at the core of both algorithms that lea… (voir plus)rn the value of a given policy, as well as algorithms which learn how to improve policies. TD-learning with eligibility traces provides a way to boost sample efficiency by temporal credit assignment, i.e. deciding which portion of a reward should be assigned to predecessor states that occurred at different previous times, controlled by a parameter