Portrait of Bang Liu

Bang Liu

Associate Academic Member
Canada CIFAR AI Chair
Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
Data Mining
Deep Learning
Generative Models
Learning on Graphs
Natural Language Processing

Biography

Bang Liu is an assistant professor in the Department of Computer Science and Operations Research (DIRO), and a core member of the Applied Research in Computational Linguistics Lab (RALI) at Université de Montréal. He is also an associate academic member of Mila – Quebec Artificial Intelligence Institute and a Canada CIFAR AI Chair.

Liu received his BEng from the University of Science and Technology of China in 2013, and his MSc and PhD degrees from the University of Alberta in 2015 and 2020, respectively. His research interests lie primarily in the areas of natural language processing, multimodal and embodied learning, theory and techniques for AGI (e.g., understanding and improving large language models), and AI for science (e.g., health, material science, XR).

Current Students

PhD - Université de Montréal
Postdoctorate - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
Master's Research - Université de Montréal

Publications

R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning
Shengyao Lu
Keith G Mills
SHANGLING JUI
Di Niu
Systematicity, i.e., the ability to recombine known parts and rules to form new sequences while reasoning over relational data, is critical … (see more)to machine intelligence. A model with strong systematicity is able to train on small-scale tasks and generalize to large-scale tasks. In this paper, we propose R5, a relational reasoning framework based on reinforcement learning that reasons over relational graph data and explicitly mines underlying compositional logical rules from observations. R5 has strong systematicity and being robust to noisy data. It consists of a policy value network equipped with Monte Carlo Tree Search to perform recurrent relational prediction and a backtrack rewriting mechanism for rule mining. By alternately applying the two components, R5 progressively learns a set of explicit rules from data and performs explainable and generalizable relation prediction. We conduct extensive evaluations on multiple datasets. Experimental results show that R5 outperforms various embedding-based and rule induction baselines on relation prediction tasks while achieving a high recall rate in discovering ground truth rules.
Grow-and-Clip: Informative-yet-Concise Evidence Distillation for Answer Explanation
Yanghua Xiao
Interpreting the predictions of existing Question Answering (QA) models is critical to many real-world intelligent applications, such as QA … (see more)systems for healthcare, education, and finance. However, existing QA models lack interpretability and provide no feedback or explanation for end-users to help them understand why a specific prediction is the answer to a question. In this research, we argue that the evidences of an answer is critical to enhancing the interpretability of QA models. Unlike previous research that simply extracts several sentence(s) in the context as evidence, we are the first to explicitly define the concept of evidence as the supporting facts in a context which are informative, concise, and readable. Besides, we provide effective strategies to quantitatively measure the informativeness, conciseness and readability of evidence. Furthermore, we propose Grow-and-Clip Evidence Distillation (GCED) algorithm to extract evidences from the contexts by trade-off informativeness, conciseness, and readability. We conduct extensive experiments on the SQuAD and TriviaQA datasets with several baseline models to evaluate the effect of GCED on interpreting answers to questions. Human evaluation are also carried out to check the quality of distilled evidences. Experimental results show that automatic distilled evidences have human-like informativeness, conciseness and readability, which can enhance the interpretability of the answers to questions.
Feeding What You Need by Understanding What You Learned
Fangli Xu
Bo Long
Siliang Tang
Lingfei Wu
Learning What You Need from What You Did: Product Taxonomy Expansion with User Behaviors Supervision
Sijie Cheng
Zhouhong Gu
Rui Xie
Wei Wu
Yanghua Xiao
Taxonomies have been widely used in various domains to underpin numerous applications. Specially, product taxonomies serve an essential role… (see more) in the e-commerce domain for the recommendation, browsing, and query understanding. However, taxonomies need to constantly capture the newly emerged terms or concepts in e-commerce platforms to keep up-to-date, which is expensive and labor-intensive if it relies on manual maintenance and updates. Therefore, we target the taxonomy expansion task to attach new concepts to existing taxonomies automatically. In this paper, we present a self-supervised and user behavior-oriented product taxonomy expansion framework to append new concepts into existing taxonomies. Our framework extracts hyponymy relations that conform to users' intentions and cognition. Specifically, i) to fully exploit user behavioral information, we extract candidate hyponymy relations that match user interests from query-click concepts; ii) to enhance the semantic information of new concepts and better detect hyponymy relations, we model concepts and relations through both user-generated content and structural information in existing taxonomies and user click logs, by leveraging Pre-trained Language Models and Graph Neural Network combined with Contrastive Learning; iii) to reduce the cost of dataset construction and overcome data skews, we construct a high-quality and balanced training dataset from existing taxonomy with no supervision. Extensive experiments on real-world product taxonomies in Meituan Platform, a leading Chinese vertical e-commerce platform to order take-out with more than 70 million daily active users, demonstrate the superiority of our proposed framework over state-of-the-art methods. Notably, our method enlarges the size of real-world product taxonomies from 39,263 to 94,698 relations with 88% precision. Our implementation is available: https://github.com/AdaCheng/Product_Taxonomy_Expansion.
S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks
Xinlin Li
Yaoliang Yu
Wulong Liu
Chunjing Xu
Vahid Partovi Nia
Refining BERT Embeddings for Document Hashing via Mutual Information Maximization
Zijing Ou
Qinliang Su
Jianxing Yu
Ruihui Zhao
Yefeng Zheng
Existing unsupervised document hashing methods are mostly established on generative models. Due to the difficulties of capturing long depend… (see more)ency structures, these methods rarely model the raw documents directly, but instead to model the features extracted from them (e.g. bag-of-words (BOW), TFIDF). In this paper, we propose to learn hash codes from BERT embeddings after observing their tremendous successes on downstream tasks. As a first try, we modify existing generative hashing models to accommodate the BERT embeddings. However, little improvement is observed over the codes learned from the old BOW or TFIDF features. We attribute this to the reconstruction requirement in the generative hashing, which will enforce irrelevant information that is abundant in the BERT embeddings also compressed into the codes. To remedy this issue, a new unsupervised hashing paradigm is further proposed based on the mutual information (MI) maximization principle. Specifically, the method first constructs appropriate global and local codes from the documents and then seeks to maximize their mutual information. Experimental results on three benchmark datasets demonstrate that the proposed method is able to generate hash codes that outperform existing ones learned from BOW features by a substantial margin.
Graph Neural Networks in Natural Language Processing
Lingfei Wu
Natural language processing (NLP) and understanding aim to read from unformatted text to accomplish different tasks. While word embeddings l… (see more)earned by deep neural networks are widely used, the underlying linguistic and semantic structures of text pieces cannot be fully exploited in these representations. Graph is a natural way to capture the connections between different text pieces, such as entities, sentences, and documents. To overcome the limits in vector space models, researchers combine deep learning models with graph-structured representations for various tasks in NLP and text mining. Such combinations help to make full use of both the structural information in text and the representation learning ability of deep neural networks. In this chapter, we introduce the various graph representations that are extensively used in NLP, and show how different NLP tasks can be tackled from a graph perspective. We summarize recent research works on graph-based NLP, and discuss two case studies related to graph-based text clustering, matching, and multihop machine reading comprehension in detail. Finally, we provide a synthesis about the important open problems of this subfield.
Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting
Yi Cheng
Siyao Li
Ruihui Zhao
Sujian Li
Chenhua Lin
Yefeng Zheng
This paper explores the task of Difficulty-Controllable Question Generation (DCQG), which aims at generating questions with required difficu… (see more)lty levels. Previous research on this task mainly defines the difficulty of a question as whether it can be correctly answered by a Question Answering (QA) system, lacking interpretability and controllability. In our work, we redefine question difficulty as the number of inference steps required to answer it and argue that Question Generation (QG) systems should have stronger control over the logic of generated questions. To this end, we propose a novel framework that progressively increases question difficulty through step-by-step rewriting under the guidance of an extracted reasoning chain. A dataset is automatically constructed to facilitate the research, on which extensive experiments are conducted to test the performance of our method.
Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval
Zijing Ou
Qinliang Su
Jianxing Yu
Jingwen Wang
Ruihui Zhao
Changyou Chen
Yefeng Zheng
With the need of fast retrieval speed and small memory footprint, document hashing has been playing a crucial role in large-scale informatio… (see more)n retrieval. To generate high-quality hashing code, both semantics and neighborhood information are crucial. However, most existing methods leverage only one of them or simply combine them via some intuitive criteria, lacking a theoretical principle to guide the integration process. In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution, and propose to integrate the two types of information with a graph-driven generative model. To deal with the complicated correlations among documents, we further propose a tree-structured approximation method for learning. Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones. Extensive experimental results on three benchmark datasets show that our method achieves superior performance over state-of-the-art methods, demonstrating the effectiveness of the proposed model for simultaneously preserving semantic and neighborhood information.
Semantic and Syntactic Enhanced Aspect Sentiment Triplet Extraction
Zhexue Chen
Hong Huang
Xuanhua Feng Shi
Hai-nan Jin
Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from sentences, where each triplet includes an entity, its associated se… (see more)ntiment, and the opinion span explaining the reason for the sentiment. Most existing research addresses this problem in a multi-stage pipeline manner, which neglects the mutual information between such three elements and has the problem of error propagation. In this paper, we propose a Semantic and Syntactic Enhanced aspect Sentiment triplet Extraction model (S3E2) to fully exploit the syntactic and semantic relationships between the triplet elements and jointly extract them. Specifically, we design a Graph-Sequence duel representation and modeling paradigm for the task of ASTE: we represent the semantic and syntactic relationships between word pairs in a sentence by graph and encode it by Graph Neural Networks (GNNs), as well as modeling the original sentence by LSTM to preserve the sequential information. Under this setting, we further apply a more efficient inference strategy for the extraction of triplets. Extensive evaluations on four benchmark datasets show that S3E2 significantly outperforms existing approaches, which proves our S3E2's superiority and flexibility in an end-to-end fashion.
Encoder-Decoder Neural Architecture Optimization for Keyword Spotting
Tong Mo
Enquire One’s Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion
Ruihui Zhao
X. T. Chen
Yefeng Zheng
Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence. The taxonomy expansion task aims … (see more)to find a position for a new term in an existing taxonomy to capture the emerging knowledge in the world and keep the taxonomy dynamically updated. Previous taxonomy expansion solutions neglect valuable information brought by the hierarchical structure and evaluate the correctness of merely an added edge, which downgrade the problem to node-pair scoring or mini-path classification. In this paper, we propose the Hierarchy Expansion Framework (HEF), which fully exploits the hierarchical structure’s properties to maximize the coherence of expanded taxonomy. HEF makes use of taxonomy’s hierarchical structure in multiple aspects: i) HEF utilizes subtrees containing most relevant nodes as self-supervision data for a complete comparison of parental and sibling relations; ii) HEF adopts a coherence modeling module to evaluate the coherence of a taxonomy’s subtree by integrating hypernymy relation detection and several tree-exclusive features; iii) HEF introduces the Fitting Score for position selection, which explicitly evaluates both path and level selections and takes full advantage of parental relations to interchange information for disambiguation and self-correction. Extensive experiments show that by better exploiting the hierarchical structure and optimizing taxonomy’s coherence, HEF vastly surpasses the prior state-of-the-art on three benchmark datasets by an average improvement of 46.7% in accuracy and 32.3% in mean reciprocal rank.