Jie Fu

Role-Wise Data Augmentation for Knowledge Distillation

Jie Fu

Xue Geng

Zhijian Duan

Bohan Zhuang

Xingdi Yuan

Adam Trischler

Jie Lin

Vijay Chandrasekhar

Chris Pal

Hao Dong

2020-04-19

ArXiv (preprint)

openreview.net

Revision in Continuous Space: Unsupervised Text Style Transfer without Adversarial Learning

Dayiheng Liu

Jie Fu

Yidan Zhang

Chris Pal

Jiancheng Lv

Typical methods for unsupervised text style transfer often rely on two key ingredients: 1) seeking the explicit disentanglement of the content and the attributes, and 2) troublesome adversarial learning. In this paper, we show that neither of these components is indispensable. We propose a new framework that utilizes the gradients to revise the sentence in a continuous space during inference to achieve text style transfer. Our method consists of three key components: a variational auto-encoder (VAE), some attribute predictors (one for each attribute), and a content predictor. The VAE and the two types of predictors enable us to perform gradient-based optimization in the continuous space, which is mapped from sentences in a discrete space, to find the representation of a target sentence with the desired attributes and preserved content. Moreover, the proposed method naturally has the ability to simultaneously manipulate multiple fine-grained attributes, such as sentence length and the presence of specific words, when performing text style transfer tasks. Compared with previous adversarial learning based methods, the proposed method is more interpretable, controllable and easier to train. Extensive experimental studies on three popular text style transfer tasks show that the proposed method significantly outperforms five state-of-the-art methods.

2020-04-03

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Interactive Language Learning by Question Answering

Xingdi Yuan

Marc-Alexandre Côté

Adam Trischler

Humans observe and interact with the world to acquire knowledge. However, most existing machine reading comprehension (MRC) tasks miss the interactive, information-seeking component of comprehension. Such tasks present models with static documents that contain all necessary information, usually concentrated in a single short substring. Thus, models can achieve strong performance through simple word- and phrase-based pattern matching. We address this problem by formulating a novel text-based question answering task: Question Answering with Interactive Text (QAit). In QAit, an agent must interact with a partially observable text-based environment to gather information required to answer questions. QAit poses questions about the existence, location, and attributes of objects found in the environment. The data is built using a text-based game generator that defines the underlying dynamics of interaction with the environment. We propose and evaluate a set of baseline models for the QAit task that includes deep reinforcement learning agents. Experiments show that the task presents a major challenge for machine reading systems, while humans solve it with relative ease.

2019-11-01

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

doi.org

arxiv.org

Structure Learning for Neural Module Networks

Neural Module Networks, originally proposed for the task of visual question answering, are a class of neural network architectures that involve human-specified neural modules, each designed for a specific form of reasoning. In current formulations of such networks, only the parameters of the neural modules and/or the order of their execution are learned. In this work, we further expand this approach and also learn the underlying internal structure of modules in terms of the ordering and combination of simple and elementary arithmetic operators. We utilize a minimum amount of prior knowledge from the human-specified neural modules in the form of different input types and arithmetic operators used in these modules. Our results show that one is indeed able to simultaneously learn both internal module structure and module sequencing without extra supervisory signals for module execution sequencing. With this approach, we report performance comparable to models using hand-designed modules. In addition, we perform a sensitivity analysis of the learned modules w.r.t. the arithmetic operations and infer the analytical expressions of the learned modules.

2019-11-01

Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN) (published)

doi.org

arxiv.org

Learning Sparse Mixture of Experts for Visual Question Answering

Vardaan Pahuja

Jie Fu

Chris Pal

2019-09-19

ArXiv (preprint)

arxiv.org

Learning Multi-Task Communication with Message Passing for Sequence Learning

Xipeng Qiu

We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks, and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labelling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines, but also learn interpretable and transferable patterns across tasks.

2019-07-17

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

Conditional Computation for Continual Learning

Min Lin

Jie Fu

Yoshua Bengio

Catastrophic forgetting of connectionist neural networks is caused by the global sharing of parameters among all training examples. In this study, we analyze parameter sharing under the conditional computation framework where the parameters of a neural network are conditioned on each input example. At one extreme, if each input example uses a disjoint set of parameters, there is no sharing of parameters and thus no catastrophic forgetting. At the other extreme, if the parameters are the same for every example, it reduces to the conventional neural network. We then introduce a clipped version of maxout networks which lies in the middle, i.e. parameters are shared partially among examples. Based on the parameter sharing analysis, we can locate a limited set of examples that are interfered with when learning a new example. We propose to perform rehearsal on this set to prevent forgetting, which we term conditional rehearsal. Finally, we demonstrate the effectiveness of the proposed method in an online non-stationary setup, where updates are made after each new example and the distribution of the received examples shifts over time.

2019-06-16

ArXiv (preprint)

arxiv.org

Multi-task Learning over Graph Structures

Xipeng Qiu

We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labeling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines but also learn interpretable and transferable patterns across tasks.

2018-11-26

ArXiv (preprint)

arxiv.org
