Publications

The Power of Prompt Tuning for Low-Resource Semantic Parsing

Nathan Schucher

Harm de Vries

Prompt tuning has recently emerged as an effective method for adapting pre-trained language models to a number of language understanding and generation tasks. In this paper, we investigate prompt tuning for semantic parsing, the task of mapping natural language utterances onto formal meaning representations. On the low-resource splits of Overnight and TOPv2, we find that a prompt-tuned T5-xl significantly outperforms its fine-tuned counterpart, as well as strong GPT-3 and BART baselines. We also conduct ablation studies across different model scales and target representations, finding that, with increasing model scale, prompt-tuned T5 models improve at generating target representations that are far from the pre-training distribution.

2021-10-16

ArXiv (preprint)

doi.org

arxiv.org

Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining

Andreas Madsen

Nicholas Meade

Vaibhav Adlakha

Siva Reddy

To explain NLP models, a popular approach is to use importance measures, such as attention, which indicate which input tokens are important for making a prediction. However, an open question is how accurately these explanations reflect a model's logic, a property called faithfulness. To answer this question, we propose Recursive ROAR, a new faithfulness metric. This works by recursively masking allegedly important tokens and then retraining the model. The principle is that this should result in worse model performance compared to masking random tokens. The result is a performance curve as a function of the masking ratio. Furthermore, we propose a summarizing metric using relative area-between-curves (RACU), which allows for easy comparison across papers, models, and tasks. We evaluate 4 different importance measures on 8 different datasets, using both LSTM-attention models and RoBERTa models. We find that the faithfulness of importance measures is both model-dependent and task-dependent. This conclusion contradicts previous evaluations in both the computer vision literature and the faithfulness-of-attention literature.
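The summarizing metric in the abstract above can be illustrated with a small Python sketch that compares a random-masking curve with an importance-masking curve. The normalization used here (dividing by the area between the random curve and a fixed performance floor), the function names, and every number are assumptions made for illustration; they are not the paper's exact definition or results.

```python
def trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


def relative_area_between_curves(ratios, random_perf, measure_perf, floor_perf):
    """Area between the random-masking curve and the importance-masking curve,
    relative to the area between the random curve and a performance floor.

    Hypothetical normalization: 0 means "no better than random masking",
    larger values mean masking the allegedly important tokens hurts more.
    """
    gap = [r - m for r, m in zip(random_perf, measure_perf)]
    max_gap = [r - floor_perf for r in random_perf]
    return trapezoid_area(ratios, gap) / trapezoid_area(ratios, max_gap)


# Hypothetical performance values at masking ratios 0%, 50%, and 100%.
ratios = [0.0, 0.5, 1.0]
random_perf = [0.9, 0.8, 0.5]    # retrained after masking random tokens
measure_perf = [0.9, 0.6, 0.5]   # retrained after masking "important" tokens
score = relative_area_between_curves(ratios, random_perf, measure_perf, floor_perf=0.5)
print(f"relative area between curves: {score:.2f}")
```

A single number of this kind is what makes curves from different models and tasks comparable, which is the motivation the abstract gives for the summarizing metric.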

2021-10-15

ArXiv (preprint)

doi.org

arxiv.org

Evaluation of real-life use of Point-Of-Care Rapid Antigen TEsting for SARS-CoV-2 in schools for outbreak control (EPOCRATES)

A. Blanchard

Marc Desforges

A. Labbé

Christine Nguyen

Y. Petit

Derek Besner

Kate A. Zinszer

Olivier Séguin

Zineb Laghdir

K. Adams

Marie-ève Benoit

Ghislain Leduc

Jean Longtin

Ioannis Ragoussis

David Buckeridge

Caroline Quach

We evaluated the use of rapid antigen detection tests (RADT) for the diagnosis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in school settings to determine RADT performance characteristics compared to PCR. Methods: We did a real-world, prospective observational cohort study where recruited high-school students and staff from two high schools in Montreal (Canada) were followed from January 25th to June 10th, 2021. Twenty-five percent of asymptomatic participants were tested weekly by RADT (nasal) and PCR (gargle). Class contacts of a case were tested. Symptomatic participants were tested by RADT (nasal) and PCR (nasal and gargle). The number of cases per outbreak and the number of outbreaks were compared to other high schools in the same area. Results: Overall, 2,099 students and 286 school staff members consented to participate. The overall RADT specificity varied from 99.8 to 100%, with a lower sensitivity, varying from 28.6% in asymptomatic to 83.3% in symptomatic participants. The number of outbreaks was not different in the 2 participating schools compared to other high schools in the same area, but included a greater proportion of asymptomatic cases. Returning students to school after a 7-day quarantine, with a negative PCR on D6-7 after exposure, did not lead to subsequent outbreaks, as shown by serial testing. Of cases for whom the source was known, 37 of 51 (72.5%) were secondary to household transmission, 13 (25%) to intra-school transmission and one to community contacts between students in the same school. Conclusion: RADT did not perform well as a screening tool in asymptomatic individuals. Reinforcing policies for symptom screening when entering schools and testing symptomatic individuals with RADT on the spot may avoid subsequent significant exposures in class.
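For readers less familiar with the two performance characteristics reported above, the following Python sketch shows how sensitivity and specificity are computed when PCR is treated as the reference test. The counts are hypothetical, chosen only so that the ratios resemble the percentages quoted in the abstract; the study's actual counts are not given here.

```python
def sensitivity(true_pos, false_neg):
    """Share of reference-positive (PCR-positive) cases that the rapid test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of reference-negative (PCR-negative) cases that the rapid test clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only.
asymptomatic_sens = sensitivity(true_pos=2, false_neg=5)
overall_spec = specificity(true_neg=499, false_pos=1)

print(f"sensitivity: {asymptomatic_sens:.1%}")
print(f"specificity: {overall_spec:.1%}")
```

A test with high specificity but low sensitivity rarely raises false alarms yet misses many true infections, which is why the abstract concludes that RADT performed poorly as a screening tool in asymptomatic individuals.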

2021-10-14

medRxiv (preprint)

doi.org

Compositional Generalization in Dependency Parsing

Emily D. Goodwin

Siva Reddy

Timothy O'Donnell

Dzmitry Bahdanau

Compositionality, the ability to combine familiar units like words into novel phrases and sentences, has been the focus of intense interest in artificial intelligence in recent years. To test compositional generalization in semantic parsing, Keysers et al. (2020) introduced Compositional Freebase Queries (CFQ). This dataset maximizes the similarity between the test and train distributions over primitive units, like words, while maximizing the compound divergence: the dissimilarity between test and train distributions over larger structures, like phrases. Dependency parsing, however, lacks a compositional generalization benchmark. In this work, we introduce a gold-standard set of dependency parses for CFQ, and use this to analyze the behaviour of a state-of-the-art dependency parser (Qi et al., 2020) on the CFQ dataset. We find that increasing compound divergence degrades dependency parsing performance, although not as dramatically as semantic parsing performance. Additionally, we find the performance of the dependency parser does not uniformly degrade relative to compound divergence, and the parser performs differently on different splits with the same compound divergence. We explore a number of hypotheses for what causes the non-uniform degradation in dependency parsing performance, and identify a number of syntactic structures that drive the dependency parser's lower performance on the most challenging splits.
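To make the notion of divergence between train and test distributions more concrete, here is a minimal Python sketch. It assumes a Chernoff-coefficient form of divergence, with the weight `alpha` left as a free parameter; the function name, the choice of `alpha`, and the toy distributions are illustrative assumptions and are not taken from the abstract.

```python
def chernoff_divergence(p, q, alpha):
    """1 minus the Chernoff coefficient between two discrete distributions.

    p and q map items (e.g. phrases) to probabilities. The result is 0 for
    identical distributions and 1 for distributions with disjoint support.
    """
    shared = set(p) & set(q)  # items missing from either side contribute 0
    coefficient = sum(p[k] ** alpha * q[k] ** (1.0 - alpha) for k in shared)
    return 1.0 - coefficient


# Toy distributions over hypothetical compound structures.
train = {"A B": 0.5, "A C": 0.5}
test_same = {"A B": 0.5, "A C": 0.5}
test_shifted = {"A B": 0.1, "B C": 0.9}

print(chernoff_divergence(train, test_same, alpha=0.1))
print(chernoff_divergence(train, test_shifted, alpha=0.1))
```

A split with low divergence over words but high divergence over larger structures is the kind of train/test split the abstract describes as the most challenging.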

2021-10-13

ArXiv (preprint)

doi.org

arxiv.org

2D Multi-Class Model for Gray and White Matter Segmentation of the Cervical Spinal Cord at 7T

Nilser Laines Medina

Charley Gros

Julien Cohen-Adad

Virginie Callot

A. Troter

2021-10-13

ArXiv (preprint)

arxiv.org

Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

Gabriele Prato

Simon Guiroy

Ethan Caballero

Irina Rish

Sarath Chandar

The empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for a focus on the best-scaling, and thus most promising, approaches. In this work, we consider the challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase differs from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies both when the target data comes from the same domain as the training data and when it comes from a different domain (i.e., new classes), and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.
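The phrase "power laws (linear log-log plots)" in the abstract above can be illustrated with a short Python sketch: a power law error ≈ a · n^(-b) becomes a straight line after taking logarithms, so its parameters can be recovered by ordinary least squares on the log-transformed data. The function name and the synthetic data are assumptions for illustration; no values here come from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit errors ≈ a * sizes**(-b) by least squares on log(sizes) vs log(errors).

    Returns (a, b). Assumes all sizes and errors are strictly positive.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated from a known power law (a=2.0, b=0.3).
sizes = [1e3, 1e4, 1e5, 1e6]
errors = [2.0 * n ** -0.3 for n in sizes]
a, b = fit_power_law(sizes, errors)
print(f"a = {a:.3f}, b = {b:.3f}")
```

In this parameterization, a larger exponent b corresponds to faster improvement as the training set grows, which is the sense in which the abstract compares convergence rates on new and previously seen classes.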

2021-10-13

ArXiv (preprint)

openreview.net

A cognitive fingerprint in human random number generation

Marc-Andre Schulz

Sebastian Baier

Benjamin Timmermann

Benjamin Böhme

Danilo Bzdok

Karsten Witt

2021-10-12

Scientific Reports (published)

doi.org

Diagnosing as autistic people increasingly distant from prototypes lead neither to clinical benefit nor to the advancement of knowledge

Laurent Mottron

Danilo Bzdok

2021-10-12

Molecular Psychiatry (published)

doi.org

Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning

Nan Rosemary Ke

Aniket Rajiv Didolkar

Sarthak Mittal

Anirudh Goyal

Guillaume Lajoie

Stefan Bauer

Danilo Jimenez Rezende

Yoshua Bengio

Michael Curtis Mozer

Chris Pal

Inducing causal relationships from observations is a classic problem in machine learning. Most work in causality starts from the premise that the causal variables themselves are observed. However, for AI agents such as robots trying to make sense of their environment, the only observables are low-level variables like pixels in images. To generalize well, an agent must induce high-level variables, particularly those which are causal or are affected by causal variables. A central goal for AI and causality is thus the joint discovery of abstract representations and causal structure. However, we note that existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs which are impossible to manipulate parametrically (e.g., number of nodes, sparsity, causal chain length, etc.). In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them. In order to systematically probe the ability of methods to identify these variables and structures, we design a suite of benchmarking RL environments. We evaluate various representation learning algorithms from the literature and find that explicitly incorporating structure and modularity in models can help causal induction in model-based reinforcement learning.

2021-10-11

NeurIPS.cc/2021/Track/Datasets_and_Benchmarks/Round2 (published)

openreview.net

Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations

Pau Rodriguez

Massimo Caccia

Alexandre Lacoste

Lee Zamparo

Issam Hadj Laradji

Laurent Charlin

David Vazquez

Explainability for machine learning models has gained considerable attention within the research community given the importance of deploying more reliable machine-learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction, providing details about the model's decision-making. Current methods tend to generate trivial counterfactuals about a model's decisions, as they often suggest exaggerating or removing the presence of the attribute being classified. For the machine learning practitioner, these types of counterfactuals offer little value, since they provide no new information about undesired model or data biases. In this work, we identify the problem of trivial counterfactual generation and we propose DiVE to alleviate it. DiVE learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss to uncover multiple valuable explanations about the model's prediction. Further, we introduce a mechanism to prevent the model from producing trivial explanations. Experiments on CelebA and Synbols demonstrate that our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods. Code is available at https://github.com/ElementAI/beyond-trivial-explanations.

2021-10-10

2021 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

DoMoBOT: An AI-Empowered Bot for Automated and Interactive Domain Modelling

Rijul Saini

Gunter Mussbacher

Jin Guo

Jörg Kienzle

Domain modelling transforms informal requirements written in natural language in the form of problem descriptions into concise and analyzable domain models. As the manual construction of these domain models is often time-consuming, error-prone, and labor-intensive, several approaches already exist to automate domain modelling. However, the current approaches suffer from lower accuracy of extracted domain models and the lack of support for system-modeller interactions. To better assist modellers, we introduce DoMoBOT, a web-based Domain Modelling BOT. Our proposed bot combines artificial intelligence techniques such as natural language processing and machine learning to extract domain models with higher accuracy. More importantly, our bot incorporates a set of features to bring synergy between automated model extraction and bot-modeller interactions. During these interactions, the bot presents multiple possible solutions to a modeller for modelling scenarios present in a given problem description. The bot further enables modellers to switch to a particular solution and updates the other parts of the domain model proactively. In this tool demo paper, we demonstrate how the implementation and architecture of DoMoBOT support the paradigm of automated and interactive domain modelling for assisting modellers.

2021-10-10

2021 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C) (published)

doi.org

Generative Compositional Augmentations for Scene Graph Prediction

Boris Knyazev

Harm de Vries

Cătălina Cangea

Graham W. Taylor

Aaron Courville

Eugene Belilovsky

Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language. We consider a challenging problem of compositional generalization that emerges in this task due to a long-tail data distribution. Current scene graph generation models are trained on a tiny fraction of the distribution corresponding to the most frequent compositions, e.g. . However, test images might contain zero- and few-shot compositions of objects and relationships, e.g. . Despite each of the object categories and the predicate (e.g. 'on') being frequent in the training data, the models often fail to properly understand such unseen or rare compositions. To improve generalization, it is natural to attempt increasing the diversity of the training distribution. However, in the graph domain this is non-trivial. To that end, we propose a method to synthesize rare yet plausible scene graphs by perturbing real ones. We then propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs and learn from them in a joint fashion. When evaluated on the Visual Genome dataset, our approach yields marginal but consistent improvements in zero- and few-shot metrics. We analyze the limitations of our approach, indicating promising directions for future research.

2021-10-10

2021 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org
