Bang Liu

rushil.gupta@umontreal.ca

Biography

Bang Liu is an assistant professor in the Department of Computer Science and Operations Research (DIRO), and a core member of the Applied Research in Computational Linguistics Lab (RALI) at Université de Montréal. He is also an associate academic member of Mila – Quebec Artificial Intelligence Institute and a Canada CIFAR AI Chair.

Liu received his BEng from the University of Science and Technology of China in 2013, and his MSc and PhD degrees from the University of Alberta in 2015 and 2020, respectively. His research interests lie primarily in the areas of natural language processing, multimodal and embodied learning, theory and techniques for AGI (e.g., understanding and improving large language models), and AI for science (e.g., health, material science, XR).

Current Students

Qianggang Ding

PhD - Université de Montréal

Farshid Effaty

Postdoctorate - Université de Montréal

Master's Research - Université de Montréal

Gauransh Kumar

Master's Research - Université de Montréal

Yizhan Li

PhD - Université de Montréal

Jeremy Qin

Master's Research - Université de Montréal

Kyle Roth

PhD - Université de Montréal

Haochen Shi

PhD - Université de Montréal

Jinghan Sun

Master's Research - Université de Montréal

Suyuchen Wang

PhD - Université de Montréal

Xiaoqiang Wang

PhD - Université de Montréal

Dekun Wu

PhD - Université de Montréal

Sifan Wu

PhD - Université de Montréal

Huan Zhang

Master's Research - Université de Montréal

Revolutionizing Materials Science with NLP: Introducing MatSci-NLP and HoneyBee

Blog Posts

MatSci-Instruct and HoneyBee training workflow.

October 21, 2024

Bang Liu

Publications

S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks

Xinlin Li

Yaoliang Yu

Wulong Liu

Chunjing Xu

Vahid Partovi Nia

openreview.net

Refining BERT Embeddings for Document Hashing via Mutual Information Maximization

Zijing Ou

Qinliang Su

Jianxing Yu

Ruihui Zhao

Yefeng Zheng

Existing unsupervised document hashing methods are mostly established on generative models. Due to the difficulties of capturing long dependency structures, these methods rarely model the raw documents directly, but instead model the features extracted from them (e.g. bag-of-words (BOW), TFIDF). In this paper, we propose to learn hash codes from BERT embeddings after observing their tremendous successes on downstream tasks. As a first try, we modify existing generative hashing models to accommodate the BERT embeddings. However, little improvement is observed over the codes learned from the old BOW or TFIDF features. We attribute this to the reconstruction requirement in generative hashing, which forces irrelevant information that is abundant in the BERT embeddings to also be compressed into the codes. To remedy this issue, a new unsupervised hashing paradigm is further proposed based on the mutual information (MI) maximization principle. Specifically, the method first constructs appropriate global and local codes from the documents and then seeks to maximize their mutual information. Experimental results on three benchmark datasets demonstrate that the proposed method is able to generate hash codes that outperform existing ones learned from BOW features by a substantial margin.

2021-11-01

Findings of the Association for Computational Linguistics: EMNLP 2021 (published)

Graph Neural Networks in Natural Language Processing

Lingfei Wu

Natural language processing (NLP) and understanding aim to read from unformatted text to accomplish different tasks. While word embeddings learned by deep neural networks are widely used, the underlying linguistic and semantic structures of text pieces cannot be fully exploited in these representations. Graph is a natural way to capture the connections between different text pieces, such as entities, sentences, and documents. To overcome the limits in vector space models, researchers combine deep learning models with graph-structured representations for various tasks in NLP and text mining. Such combinations help to make full use of both the structural information in text and the representation learning ability of deep neural networks. In this chapter, we introduce the various graph representations that are extensively used in NLP, and show how different NLP tasks can be tackled from a graph perspective. We summarize recent research works on graph-based NLP, and discuss two case studies related to graph-based text clustering, matching, and multihop machine reading comprehension in detail. Finally, we provide a synthesis about the important open problems of this subfield.

2021-09-30

Deep Learning on Graphs (published)

Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting

Yi Cheng

Siyao Li

Ruihui Zhao

Sujian Li

Chenhua Lin

Yefeng Zheng

This paper explores the task of Difficulty-Controllable Question Generation (DCQG), which aims at generating questions with required difficulty levels. Previous research on this task mainly defines the difficulty of a question as whether it can be correctly answered by a Question Answering (QA) system, lacking interpretability and controllability. In our work, we redefine question difficulty as the number of inference steps required to answer it and argue that Question Generation (QG) systems should have stronger control over the logic of generated questions. To this end, we propose a novel framework that progressively increases question difficulty through step-by-step rewriting under the guidance of an extracted reasoning chain. A dataset is automatically constructed to facilitate the research, on which extensive experiments are conducted to test the performance of our method.

2021-08-01

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (published)

Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval

Zijing Ou

Qinliang Su

Jianxing Yu

Jingwen Wang

Ruihui Zhao

Changyou Chen

Yefeng Zheng

With the need of fast retrieval speed and small memory footprint, document hashing has been playing a crucial role in large-scale information retrieval. To generate high-quality hashing code, both semantics and neighborhood information are crucial. However, most existing methods leverage only one of them or simply combine them via some intuitive criteria, lacking a theoretical principle to guide the integration process. In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution, and propose to integrate the two types of information with a graph-driven generative model. To deal with the complicated correlations among documents, we further propose a tree-structured approximation method for learning. Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones. Extensive experimental results on three benchmark datasets show that our method achieves superior performance over state-of-the-art methods, demonstrating the effectiveness of the proposed model for simultaneously preserving semantic and neighborhood information.

2021-08-01

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (published)

Semantic and Syntactic Enhanced Aspect Sentiment Triplet Extraction

Zhexue Chen

Hong Huang

Xuanhua Feng Shi

Hai-nan Jin

Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from sentences, where each triplet includes an entity, its associated sentiment, and the opinion span explaining the reason for the sentiment. Most existing research addresses this problem in a multi-stage pipeline manner, which neglects the mutual information between such three elements and has the problem of error propagation. In this paper, we propose a Semantic and Syntactic Enhanced aspect Sentiment triplet Extraction model (S3E2) to fully exploit the syntactic and semantic relationships between the triplet elements and jointly extract them. Specifically, we design a Graph-Sequence dual representation and modeling paradigm for the task of ASTE: we represent the semantic and syntactic relationships between word pairs in a sentence by graph and encode it by Graph Neural Networks (GNNs), as well as modeling the original sentence by LSTM to preserve the sequential information. Under this setting, we further apply a more efficient inference strategy for the extraction of triplets. Extensive evaluations on four benchmark datasets show that S3E2 significantly outperforms existing approaches, which proves our S3E2's superiority and flexibility in an end-to-end fashion.

2021-08-01

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (published)

Encoder-Decoder Neural Architecture Optimization for Keyword Spotting

Tong Mo

2021-06-04

ArXiv (preprint)

Enquire One’s Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion

Suyuchen Wang

Ruihui Zhao

X. T. Chen

Yefeng Zheng

Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence. The taxonomy expansion task aims to find a position for a new term in an existing taxonomy to capture the emerging knowledge in the world and keep the taxonomy dynamically updated. Previous taxonomy expansion solutions neglect valuable information brought by the hierarchical structure and evaluate the correctness of merely an added edge, which downgrade the problem to node-pair scoring or mini-path classification. In this paper, we propose the Hierarchy Expansion Framework (HEF), which fully exploits the hierarchical structure’s properties to maximize the coherence of expanded taxonomy. HEF makes use of taxonomy’s hierarchical structure in multiple aspects: i) HEF utilizes subtrees containing most relevant nodes as self-supervision data for a complete comparison of parental and sibling relations; ii) HEF adopts a coherence modeling module to evaluate the coherence of a taxonomy’s subtree by integrating hypernymy relation detection and several tree-exclusive features; iii) HEF introduces the Fitting Score for position selection, which explicitly evaluates both path and level selections and takes full advantage of parental relations to interchange information for disambiguation and self-correction. Extensive experiments show that by better exploiting the hierarchical structure and optimizing taxonomy’s coherence, HEF vastly surpasses the prior state-of-the-art on three benchmark datasets by an average improvement of 46.7% in accuracy and 32.3% in mean reciprocal rank.

2021-06-03

Proceedings of the Web Conference 2021 (published)

Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management

Zhengxu Hou

Ruihui Zhao

Zijing Ou

Yafei Liu

X. T. Chen

Yefeng Zheng

For task-oriented dialog systems, training a Reinforcement Learning (RL) based Dialog Management module suffers from low sample efficiency and slow convergence speed due to the sparse rewards in RL. To solve this problem, many strategies have been proposed to give proper rewards when training RL, but their rewards lack interpretability and cannot accurately estimate the distribution of state-action pairs in real dialogs. In this paper, we propose a multi-level reward modeling approach that factorizes a reward into a three-level hierarchy: domain, act, and slot. Based on inverse adversarial reinforcement learning, our designed reward model can provide more accurate and explainable reward signals for state-action pairs. Extensive evaluations show that our approach can be applied to a wide range of reinforcement learning-based dialog systems and significantly improves both the performance and the speed of convergence.

2021-06-01

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (published)

Noised Consistency Training for Text Summarization

J. Y. Liu

Qianren Mao

Hao Peng

Hongdong Zhu

Jian-Xin Li

Neural abstractive summarization methods often require large quantities of labeled training data. However, labeling large amounts of summarization data is often prohibitive due to time, financial, and expertise constraints, which has limited the usefulness of summarization systems to practical applications. In this paper, we argue that this limitation can be overcome by a semi-supervised approach: consistency training which is to leverage large amounts of unlabeled data to improve the performance of supervised learning over a small corpus. The consistency regularization semi-supervised learning can regularize model predictions to be invariant to small noise applied to input articles. By adding noised unlabeled corpus to help regularize consistency training, this framework obtains comparative performance without using the full dataset. In particular, we have verified that leveraging large amounts of unlabeled data decently improves the performance of supervised learning over an insufficient labeled dataset.

2021-05-28

ArXiv (preprint)

QBSUM: a Large-Scale Query-Based Document Summarization Dataset from Real-world Applications

Mingjun Zhao

Shengli Yan

Xinwang Zhong

Qian Hao

Haolan Chen

Di Niu

Bo Long

Wei-dong Guo

2021-03-01

Computer Speech & Language (published)

GIANT: Scalable Creation of a Web-scale Ontology

Weidong Guo

Di Niu

Jinwen Luo

Chaoyue Wang

Zhen Wen

Yu Xu

2020-05-31

Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (published)