Publications

Investigating the Influence of Selected Linguistic Features on Authorship Attribution using German News Articles

Manuel Sage

Pietro Cruciata

Raed Abdo

Jackie CK Cheung

Yaoyao Fiona Zhao

In this work, we perform authorship attri-bution on a new dataset of German news articles. We seek to classify over 3,700 articles to their … (see more)ﬁve corresponding authors, using four conventional machine learning approaches (na¨ıve Bayes, logistic regression, SVM and kNN) and a convolutional neural network. We analyze the effect of character and word n-grams on the prediction accuracy, as well as the inﬂuence of stop words, punctuation, numbers, and lowercasing when preprocessing raw text. The experiments show that higher order character n-grams (n = 5,6) perform better than lower orders and word n-grams slightly outperform those with characters. Combining both in fusion models further improves results up to 92% for SVM. A multilayer convolutional structure allows the CNN to achieve 90.5% accuracy. We found stop words and punctuation to be important features for author identiﬁcation; removing them leads to a measurable decrease in performance. Finally, we evaluate the topic dependency of the algorithms by gradually replacing named entities, nouns, verbs and eventually all to-kens in the dataset according to their POS-tags.

2019-12-31

SwissText/KONVENS (published)

dblp.uni-trier.de

Investigating the interconnections between human, technology and context in the implementation of a AI-based health information technology: a dynamic technological frame perspective

Catherine Régis

2019-12-31

(published)

www.semanticscholar.org

Joint Learning of Generative Translator and Classifier for Visually Similar Classes

Byungin Yoo

Tristan Sylvain

Yoshua Bengio

Junmo Kim

In this paper, we propose a Generative Translation Classification Network (GTCN) for improving visual classification accuracy in settings wh… (see more)ere classes are visually similar and data is scarce. For this purpose, we propose joint learning from a scratch to train a classifier and a generative stochastic translation network end-to-end. The translation network is used to perform on-line data augmentation across classes, whereas previous works have mostly involved domain adaptation. To help the model further benefit from this data-augmentation, we introduce an adaptive fade-in loss and a quadruplet loss. We perform experiments on multiple datasets to demonstrate the proposed method’s performance in varied settings. Of particular interest, training on 40% of the dataset is enough for our model to surpass the performance of baselines trained on the full dataset. When our architecture is trained on the full dataset, we achieve comparable performance with state-of-the-art methods despite using a light-weight architecture.

2019-12-31

IEEE Access (published)

doi.org

arxiv.org

Language Gans Falling Short

Massimo Caccia

Lucas Caccia

William Fedus

Hugo Larochelle

Joelle Pineau

Laurent Charlin

Generating high-quality text with sufficient diversity is essential for a wide range of Natural Language Generation (NLG) tasks. Maximum-Lik… (see more)elihood (MLE) models trained with teacher forcing have consistently been reported as weak baselines, where poor performance is attributed to exposure bias (Bengio et al., 2015; Ranzato et al., 2015); at inference time, the model is fed its own prediction instead of a ground-truth token, which can lead to accumulating errors and poor samples. This line of reasoning has led to an outbreak of adversarial based approaches for NLG, on the account that GANs do not suffer from exposure bias. In this work, we make several surprising observations which contradict common beliefs. First, we revisit the canonical evaluation framework for NLG, and point out fundamental flaws with quality-only evaluation: we show that one can outperform such metrics using a simple, well-known temperature parameter to artificially reduce the entropy of the model's conditional distributions. Second, we leverage the control over the quality / diversity trade-off given by this parameter to evaluate models over the whole quality-diversity spectrum and find MLE models constantly outperform the proposed GAN variants over the whole quality-diversity space. Our results have several implications: 1) The impact of exposure bias on sample quality is less severe than previously thought, 2) temperature tuning provides a better quality / diversity trade-off than adversarial training while being easier to train, easier to cross-validate, and less computationally expensive. Code to reproduce the experiments is available at github.com/pclucas14/GansFallingShort

2019-12-31

ICLR (published)

doi.org

openreview.net

Learning Classical Planning Transition Functions by Deep Neural Networks

Michaela Urbanovská

Ian G Goodfellow

Yoshua Bengio

Aaron Courville

2019-12-31

(published)

www.semanticscholar.org

Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules

Murray Shanahan

Michael Mozer

Robust perception relies on both bottom-up and top-down signals. Bottom-up signals consist of what's directly observed through sensation. To… (see more)p-down signals consist of beliefs and expectations based on past experience and short-term memory, such as how the phrase `peanut butter and~...' will be completed. The optimal combination of bottom-up and top-down information remains an open question, but the manner of combination must be dynamic and both context and task dependent. To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow. We explore deep recurrent neural net architectures in which bottom-up and top-down signals are dynamically combined using attention. Modularity of the architecture further restricts the sharing and communication of information. Together, attention and modularity direct information flow, which leads to reliable performance improvements in perceptual and language tasks, and in particular improves robustness to distractions and noisy data. We demonstrate on a variety of benchmarks in language modeling, sequential image classification, video prediction and reinforcement learning that the \emph{bidirectional} information flow can improve results over strong baselines.

2019-12-31

ICML (published)

doi.org

proceedings.mlr.press

Learning Graph Structure With A Finite-State Automaton Layer

Daniel D. Johnson

Hugo Larochelle

Daniel Tarlow

2019-12-31

Advances in Neural Information Processing Systems 33 (NeurIPS 2020) (published)

arxiv.org

Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs

Attention and self-attention mechanisms, inspired by cognitive processes, are now central to state-of-the-art deep learning on sequential ta… (see more)sks. However, most recent progress hinges on heuristic approaches that rely on considerable memory and computational resources that scale poorly. In this work, we propose a relevancy screening mechanism, inspired by the cognitive process of memory consolidation, that allows for a scalable use of sparse self-attention with recurrence. We use simple numerical experiments to demonstrate that this mechanism helps enable recurrent systems on generalization and transfer learning tasks. Based on our results, we propose a concrete direction of research to improve scalability and generalization of attentive recurrent networks.

2019-12-31

(published)

www.semanticscholar.org

Learning the Arrow of Time for Problems in Reinforcement Learning

Nasim Rahaman

Steffen Wolf

Anirudh Goyal

Roman Remme

Yoshua Bengio

2019-12-31

ICLR (published)

openreview.net

Measuring Systematic Generalization in Neural Proof Generation with Transformers

Nicolas Gontier

Koustuv Sinha

Siva Reddy

Christopher Pal

We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded… (see more) in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when evaluated on longer-than-trained sequences. However, we observe TLMs improve their generalization performance after being exposed to longer, exhaustive proofs. In addition, we discover that TLMs are able to generalize better using backward-chaining proofs compared to their forward-chaining counterparts, while they find it easier to generate forward chaining proofs. We observe that models that are not trained to generate proofs are better at generalizing to problems based on longer proofs. This suggests that Transformers have efficient internal reasoning strategies that are harder to interpret. These results highlight the systematic generalization behavior of TLMs in the context of logical reasoning, and we believe this work motivates deeper inspection of their underlying reasoning strategies.

2019-12-31

NeurIPS (published)

doi.org

arxiv.org

MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining

Zhi Wen

Xing Han Lu

Siva Reddy

2019-12-31

arXiv (preprint)

doi.org

arxiv.org

Medical Imaging with Deep Learning: MIDL 2020 -- Short Paper Track

Tal Arbel

Ismail Ben Ayed

Marleen de Bruijne

Maxime Descoteaux

Hervé Lombaert

Chris Pal

This compendium gathers all the accepted extended abstracts from the Third International Conference on Medical Imaging with Deep Learning (M… (see more)IDL 2020), held in Montreal, Canada, 6-9 July 2020. Note that only accepted extended abstracts are listed here, the Proceedings of the MIDL 2020 Full Paper Track are published in the Proceedings of Machine Learning Research (PMLR).

2019-12-31

arXiv (preprint)

doi.org

arxiv.org

TRAIL: Responsible AI for Professionals and Leaders

Mila Ventures Founder in Residence

AI Advantage: Productivity in Public Service

Publications

TRAIL: Responsible AI for Professionals and Leaders

Mila Ventures Founder in Residence

AI Advantage: Productivity in Public Service

Popular keywords:

Publications