Bang Liu

Haochen Shi

PhD - Université de Montréal

haochen.shi@mila.quebec

Jeremy Qin

Master's Research - Université de Montréal

jeremy.qin@mila.quebec

Jinghan Sun

Master's Research - Université de Montréal

jinghan.sun@mila.quebec

Kyle Roth

PhD - Université de Montréal

kyle.roth@mila.quebec

Qianggang Ding

PhD - Université de Montréal

qianggang.ding@mila.quebec
Rushil Gupta

Master's Research - Université de Montréal

rushil.gupta@mila.quebec

Sifan Wu

PhD - Université de Montréal

sifan.wu@mila.quebec

Suyuchen Wang

PhD - Université de Montréal

suyuchen.wang@mila.quebec

Tony Yuan

Master's Research - Université de Montréal

tony.yuan@mila.quebec

Xiaoqiang Wang

PhD - Université de Montréal

xiaoqiang.wang@mila.quebec

Xiaotong Lyu

Research Intern - Université de Montréal

xiaotong.lyu@mila.quebec

Yizhan Li

Master's Research - Université de Montréal

yizhan.li@mila.quebec

Zhiyuan Sun

Master's Research - Université de Montréal

zhiyuan.sun@mila.quebec

Publications

Semantic and Syntactic Enhanced Aspect Sentiment Triplet Extraction

Zhexue Chen

Hong Huang

Xuanhua Feng Shi

Hai-nan Jin

Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from sentences, where each triplet includes an entity, its associated sentiment, and the opinion span explaining the reason for the sentiment. Most existing research addresses this problem in a multi-stage pipeline manner, which neglects the mutual information between these three elements and suffers from error propagation. In this paper, we propose a Semantic and Syntactic Enhanced aspect Sentiment triplet Extraction model (S3E2) to fully exploit the syntactic and semantic relationships between the triplet elements and jointly extract them. Specifically, we design a Graph-Sequence dual representation and modeling paradigm for the task of ASTE: we represent the semantic and syntactic relationships between word pairs in a sentence as a graph and encode it with Graph Neural Networks (GNNs), and we model the original sentence with an LSTM to preserve the sequential information. Under this setting, we further apply a more efficient inference strategy for the extraction of triplets. Extensive evaluations on four benchmark datasets show that S3E2 significantly outperforms existing approaches, which demonstrates its superiority and flexibility in an end-to-end fashion.

2021-08-01

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (published)

Encoder-Decoder Neural Architecture Optimization for Keyword Spotting

Tong Mo

2021-06-04

ArXiv (preprint)

Enquire One’s Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion

Suyuchen Wang

Ruihui Zhao

Xi Chen

Yefeng Zheng

Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence. The taxonomy expansion task aims to find a position for a new term in an existing taxonomy to capture the emerging knowledge in the world and keep the taxonomy dynamically updated. Previous taxonomy expansion solutions neglect valuable information brought by the hierarchical structure and evaluate the correctness of merely an added edge, which downgrades the problem to node-pair scoring or mini-path classification. In this paper, we propose the Hierarchy Expansion Framework (HEF), which fully exploits the hierarchical structure’s properties to maximize the coherence of expanded taxonomy. HEF makes use of taxonomy’s hierarchical structure in multiple aspects: i) HEF utilizes subtrees containing the most relevant nodes as self-supervision data for a complete comparison of parental and sibling relations; ii) HEF adopts a coherence modeling module to evaluate the coherence of a taxonomy’s subtree by integrating hypernymy relation detection and several tree-exclusive features; iii) HEF introduces the Fitting Score for position selection, which explicitly evaluates both path and level selections and takes full advantage of parental relations to interchange information for disambiguation and self-correction. Extensive experiments show that by better exploiting the hierarchical structure and optimizing taxonomy’s coherence, HEF vastly surpasses the prior state-of-the-art on three benchmark datasets by an average improvement of 46.7% in accuracy and 32.3% in mean reciprocal rank.

2021-06-03

Proceedings of the Web Conference 2021 (published)

Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management

Zhengxu Hou

Ruihui Zhao

Zijing Ou

Yafei Liu

Xi Chen

Yefeng Zheng

For task-oriented dialog systems, training a Reinforcement Learning (RL) based Dialog Management module suffers from low sample efficiency and slow convergence speed due to the sparse rewards in RL. To solve this problem, many strategies have been proposed to give proper rewards when training RL, but their rewards lack interpretability and cannot accurately estimate the distribution of state-action pairs in real dialogs. In this paper, we propose a multi-level reward modeling approach that factorizes a reward into a three-level hierarchy: domain, act, and slot. Based on inverse adversarial reinforcement learning, our designed reward model can provide more accurate and explainable reward signals for state-action pairs. Extensive evaluations show that our approach can be applied to a wide range of reinforcement learning-based dialog systems and significantly improves both the performance and the speed of convergence.

2021-06-01

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (published)

Noised Consistency Training for Text Summarization

J. Liu

Qianren Mao

Hao Peng

Hongdong Zhu

Jianxin Li

Neural abstractive summarization methods often require large quantities of labeled training data. However, labeling large amounts of summarization data is often prohibitive due to time, financial, and expertise constraints, which has limited the usefulness of summarization systems in practical applications. In this paper, we argue that this limitation can be overcome by a semi-supervised approach: consistency training, which leverages large amounts of unlabeled data to improve the performance of supervised learning over a small corpus. Consistency regularization in semi-supervised learning can regularize model predictions to be invariant to small noise applied to input articles. By adding a noised unlabeled corpus to help regularize consistency training, this framework obtains comparable performance without using the full dataset. In particular, we have verified that leveraging large amounts of unlabeled data decently improves the performance of supervised learning over an insufficient labeled dataset.

2021-05-28

ArXiv (preprint)

QBSUM: a Large-Scale Query-Based Document Summarization Dataset from Real-world Applications

Mingjun Zhao

Shengli Yan

Xinwang Zhong

Qian Hao

Haolan Chen

Di Niu

Bowei Long

Wei-dong Guo

2021-03-01

Computer Speech & Language (published)

GIANT: Scalable Creation of a Web-scale Ontology

Weidong Guo

Di Niu

Jinwen Luo

Chaoyue Wang

Zhen Wen

Yu Xu

2020-05-31

Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (published)

Story Forest

Fred X. Han

Di Niu

Linglong Kong

Kunfeng Lai

Yu Xu

Extracting events accurately from vast news corpora and organizing them logically is critical for news apps and search engines, which aim to organize news information collected from the Internet and present it to users in the most sensible forms. Intuitively speaking, an event is a group of news documents that report the same news incident possibly in different ways. In this article, we describe our experience of implementing a news content organization system at Tencent to discover events from vast streams of breaking news and to evolve news story structures in an online fashion. Our real-world system faces unique challenges in contrast to previous studies on topic detection and tracking (TDT) and event timeline or graph generation, in that we (1) need to accurately and quickly extract distinguishable events from massive streams of long text documents, and (2) must develop the structures of event stories in an online manner, in order to guarantee a consistent user viewing experience. In solving these challenges, we propose Story Forest, a set of online schemes that automatically cluster streaming documents into events, while connecting related events in growing trees to tell evolving stories. A core novelty of our Story Forest system is EventX, a semi-supervised scheme to extract events from massive Internet news corpora. EventX relies on a two-layered, graph-based clustering procedure to group documents into fine-grained events. We conducted extensive evaluations based on (1) 60 GB of real-world Chinese news data, (2) a large Chinese Internet news dataset that contains 11,748 news articles with truth event labels, and (3) the 20 News Groups English dataset, through detailed pilot user experience studies. The results demonstrate the superior capabilities of Story Forest to accurately identify events and organize news text into a logical structure that is appealing to human readers.

2020-05-13

ACM Transactions on Knowledge Discovery from Data (published)

Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus

Haojie Wei

Di Niu

Haolan Chen

Yancheng He

The ability to ask questions is important in both human and machine intelligence. Learning to ask questions helps knowledge acquisition, improves question-answering and machine reading comprehension tasks, and helps a chatbot to keep the conversation flowing with a human. Existing question generation models are ineffective at generating a large amount of high-quality question-answer pairs from unstructured text, since given an answer and an input passage, question generation is inherently a one-to-many mapping. In this paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which aims at automatically generating high-quality and diverse question-answer pairs from unlabeled text corpus at scale by imitating the way a human asks questions. Our system consists of: i) an information extractor, which samples from the text multiple types of assistive information to guide question generation; ii) neural question generators, which generate diverse and controllable questions, leveraging the extracted assistive information; and iii) a neural quality controller, which removes low-quality generated data based on text entailment. We compare our question generation models with existing approaches and resort to voluntary human evaluation to assess the quality of the generated question-answer pairs. The evaluation results suggest that our system dramatically outperforms state-of-the-art neural question generation models in terms of the generation quality, while being scalable in the meantime. With models trained on a relatively smaller amount of data, we can generate 2.8 million quality-assured question-answer pairs from a million sentences found in Wikipedia.

2020-04-20

Proceedings of The Web Conference 2020 (published)

Natural Language Processing and Text Mining with Graph-Structured Representations

2020-01-01

(published)