Publications

Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining

To explain NLP models, a popular approach is to use importance measures, such as attention, which indicate which input tokens are important for making a prediction. However, an open question is how well these explanations accurately reflect a model's logic, a property called faithfulness. To answer this question, we propose Recursive ROAR, a new faithfulness metric. This works by recursively masking allegedly important tokens and then retraining the model. The principle is that this should result in worse model performance compared to masking random tokens. The result is a performance curve given a masking-ratio. Furthermore, we propose a summarizing metric using relative area-between-curves (RACU), which allows for easy comparison across papers, models, and tasks. We evaluate 4 different importance measures on 8 different datasets, using both LSTM-attention models and RoBERTa models. We find that the faithfulness of importance measures is both model-dependent and task-dependent. This conclusion contradicts previous evaluations in both computer vision and faithfulness of attention literature.

2022-11-30

Findings of the Association for Computational Linguistics: EMNLP 2022 (published)

doi.org

arxiv.org
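
The Recursive ROAR abstract above describes comparing a performance curve under importance-based masking against a curve under random masking, then summarizing the gap as a relative area between curves. A minimal sketch of that kind of summary follows. The function names, the trapezoidal integration, and especially the normalization (dividing by the area between the random curve and the score at the last masking ratio) are assumptions made for illustration; they are not taken from the paper and may differ from its RACU definition.

```python
def trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve, using the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


def area_between_curves(ratios, random_scores, measure_scores):
    """Signed area between the random-masking and importance-masking curves.

    Positive when masking by the importance measure lowers performance
    more than masking random tokens does.
    """
    gaps = [r - m for r, m in zip(random_scores, measure_scores)]
    return trapezoid_area(ratios, gaps)


def relative_area_between_curves(ratios, random_scores, measure_scores):
    """Area between curves divided by an assumed upper bound on that area.

    ASSUMPTION: the bound is the area between the random curve and a flat
    floor at the random curve's final score. This is a hypothetical
    normalization chosen for the sketch.
    """
    floor = random_scores[-1]
    bound = trapezoid_area(ratios, [r - floor for r in random_scores])
    if bound == 0:
        raise ValueError("random baseline is flat; the ratio is undefined")
    return area_between_curves(ratios, random_scores, measure_scores) / bound


# Made-up scores at masking ratios 0%, 50%, and 100%.
ratios = [0.0, 0.5, 1.0]
random_scores = [0.9, 0.8, 0.5]
measure_scores = [0.9, 0.6, 0.5]
score = relative_area_between_curves(ratios, random_scores, measure_scores)
```

Under this assumed normalization, a larger positive value would suggest that the importance measure picks out tokens the model depends on more than randomly chosen ones, and a value near zero would suggest it does no better than random masking.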

Implementing automation in deep brain stimulation: has the time come?

Marco Bonizzato

Alfonso Fasano

2022-11-30

The Lancet Digital Health (published)

doi.org

Improving Passage Retrieval with Zero-Shot Question Generation

Devendra Singh Sachan

Mike Lewis

Mandar Joshi

Armen Aghajanyan

Wen-tau Yih

Joelle Pineau

Luke Zettlemoyer

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.

2022-11-30

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (published)

doi.org

arxiv.org
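
The re-ranking idea in the abstract above (score each retrieved passage by the probability of the question given that passage, then sort) can be sketched as follows. The paper uses a pre-trained language model for the score; the smoothed unigram scorer below is only a toy stand-in so the example runs without any model. All names here (`question_log_likelihood`, `rerank`, `vocab_size`) and the length normalization are hypothetical choices for this sketch.

```python
import math
from collections import Counter


def question_log_likelihood(question, passage, vocab_size=10_000):
    """Toy stand-in for log p(question | passage).

    ASSUMPTION: an add-one smoothed unigram model built from the passage
    replaces the pre-trained language model described in the abstract.
    The score is averaged over question tokens so that question length
    does not dominate.
    """
    passage_tokens = passage.lower().split()
    counts = Counter(passage_tokens)
    total = len(passage_tokens)
    question_tokens = question.lower().split()
    log_prob = sum(
        math.log((counts[token] + 1) / (total + vocab_size))
        for token in question_tokens
    )
    return log_prob / max(len(question_tokens), 1)


def rerank(question, passages, scorer=question_log_likelihood):
    """Order passages from most to least likely to have produced the question."""
    return sorted(passages, key=lambda p: scorer(question, p), reverse=True)


# Passages as they might come back from any first-stage retriever.
retrieved = [
    "paris is the capital of france",
    "william shakespeare wrote hamlet",
]
ranked = rerank("who wrote hamlet", retrieved)
```

Because `rerank` only needs a scoring function, a language-model-based scorer could be passed in through the `scorer` argument without changing the sorting step, which mirrors the abstract's point that the re-ranker sits on top of any retrieval method.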

In-Processing Fairness Improvement Methods for Regression Data-Driven Building Models: Achieving Uniform Energy Prediction

Ying Sun

Benjamin C. M. Fung

Fariborz Haghighat

2022-11-30

Energy and Buildings (published)

doi.org

A Multifaceted Framework to Evaluate Evasion, Content Preservation, and Misattribution in Authorship Obfuscation Techniques

Malik H. Altakrori

Thomas Scialom

Benjamin C. M. Fung

Jackie CK Cheung

2022-11-30

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (published)

doi.org

QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance

Xiaoqiang Wang

Bang Liu

Siliang Tang

Lingfei Wu

Existing metrics for assessing question generation not only require costly human reference but also fail to take into account the input context of generation, resulting in a lack of deep understanding of the relevance between the generated questions and input contexts. As a result, they may wrongly penalize a legitimate and reasonable candidate question when it (1) involves complicated reasoning with the context or (2) can be grounded by multiple pieces of evidence in the context. In this paper, we propose QRelScore, a context-aware Relevance evaluation metric for Question Generation. Based on off-the-shelf language models such as BERT and GPT2, QRelScore employs both word-level hierarchical matching and sentence-level prompt-based generation to cope with the complicated reasoning and diverse generation from multiple pieces of evidence, respectively. Compared with existing metrics, our experiments demonstrate that QRelScore is able to achieve a higher correlation with human judgments while being much more robust to adversarial samples.

2022-11-30

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (published)

doi.org

arxiv.org

Reference panel guided topological structure annotation of Hi-C data

Yanlin Zhang

Mathieu Blanchette

2022-11-30

Nature Communications (published)

doi.org

Structure-Aware Reinforcement Learning for Node-Overload Protection in Mobile Edge Computing

Anirudha Jitani

Aditya Mahajan

Zhongwen Zhu

Hatem Abou-Zeid

Emmanuel Thepie Fapi

Hakimeh Purmehdi

Mobile Edge Computing (MEC) involves placing computational capability and applications at the edge of the network, providing benefits such as reduced latency, reduced network congestion, and improved performance of applications. The performance and reliability of MEC degrade significantly when the edge server(s) in the cluster are overloaded. In this work, an adaptive admission control policy to prevent edge nodes from getting overloaded is presented. This approach is based on a recently-proposed low complexity RL (Reinforcement Learning) algorithm called SALMUT (Structure-Aware Learning for Multiple Thresholds), which exploits the structure of the optimal admission control policy in multi-class queues for an average-cost setting. We extend the framework to work for the node overload-protection problem in a discounted-cost setting. The proposed solution is validated using several scenarios mimicking real-world deployments in two different settings: computer simulations and a docker testbed. Our empirical evaluations show that the total discounted cost incurred by SALMUT is similar to state-of-the-art deep RL algorithms such as PPO (Proximal Policy Optimization) and A2C (Advantage Actor Critic) but requires an order of magnitude less time to train, outputs an easily interpretable policy, and can be deployed in an online manner.

2022-11-30

IEEE Transactions on Cognitive Communications and Networking (published)

doi.org

arxiv.org

The Emergence of Argument Structure in Artificial Languages

Tom Bosc

Pascal Vincent

Computational approaches to the study of language emergence can help us understand how natural languages are shaped by cognitive and sociocultural factors. Previous work focused on tasks where agents refer to a single entity. In contrast, we study how agents predicate, that is, how they express that some relation holds between several entities. We introduce a setup where agents talk about a variable number of entities that can be partially observed by the listener. In the presence of a least-effort pressure, they tend to discuss only entities that are not observed by the listener. Thus we can obtain artificial phrases that denote a single entity, as well as artificial sentences that denote several entities. In natural languages, if we ignore the verb, phrases are usually concatenated, either in a specific order or by adding case markers to form sentences. Our setup allows us to quantify how much this holds in emergent languages using a metric we call concatenability. We also measure transitivity, which quantifies the importance of word order. We demonstrate the usefulness of this new setup and metrics for studying factors that influence argument structure. We compare agents having access to input representations structured into pre-segmented objects with properties, versus unstructured representations. Our results indicate that the awareness of object structure yields a more natural sentence organization.

2022-11-30

Transactions of the Association for Computational Linguistics (published)

doi.org

Using incorpoRATE to examine clinician willingness to engage in shared decision making: A study of Family Medicine residents

Roland Grad

Amrita Sandhu

Michael Ferrante

Vinita D’Souza

Lily Puterman-Salzman

Samira Abbasgholizadeh Rahimi

Gabrielle Stevens

Glyn Elwyn

2022-11-30

Patient Education and Counseling (published)

doi.org

VDGraph2Vec: Vulnerability Detection in Assembly Code using Message Passing Neural Networks

Ashita Diwan

Miles Q. Li

Benjamin C. M. Fung

Software vulnerability detection is one of the most challenging tasks faced by reverse engineers. Recently, vulnerability detection has received a lot of attention due to a drastic increase in the volume and complexity of software. Reverse engineering is a time-consuming and labor-intensive process for detecting malware and software vulnerabilities. However, with the advent of deep learning and machine learning, it has become possible for researchers to automate the process of identifying potential security breaches in software by developing more intelligent technologies. In this research, we propose VDGraph2Vec, an automated deep learning method to generate representations of assembly code for the task of vulnerability detection. Previous approaches failed to attend to topological characteristics of assembly code while discovering the weakness in the software. VDGraph2Vec embeds the control flow and semantic information of assembly code effectively using the expressive capabilities of message passing neural networks and the RoBERTa model. Our model is able to learn the important features that help distinguish between vulnerable and non-vulnerable software. We carry out our experimental analysis for performance benchmark on three of the most common weaknesses and demonstrate that our model can identify vulnerabilities with high accuracy and outperforms the current state-of-the-art binary vulnerability detection models.

2022-11-30

International Conference on Machine Learning and Applications (published)

doi.org

Bayesian Dynamic Causal Discovery

Alexander Tong

Lazar Atanackovic

Jason Hartford

Yoshua Bengio

Learning the causal structure of observable variables is a central focus for scientific discovery. Bayesian causal discovery methods tackle this problem by learning a posterior over the set of admissible graphs that are equally likely given our priors and observations. Existing methods primarily consider observations from static systems and assume the underlying causal structure takes the form of a directed acyclic graph (DAG). In settings with dynamic feedback mechanisms that regulate the trajectories of individual variables, this acyclicity assumption fails unless we account for time. We treat causal discovery in the unrolled causal graph as a problem of sparse identification of a dynamical system. This imposes a natural temporal causal order between variables and captures cyclic feedback loops through time. Under this lens, we propose a new framework for Bayesian causal discovery for dynamical systems and present a novel generative flow network architecture (Dyn-GFN) tailored for this task. Dyn-GFN imposes an edge-wise sparse prior to sequentially build a k-sparse causal graph. Through evaluation on temporal data, our results show that the posterior learned with Dyn-GFN yields improved Bayes coverage of admissible causal structures relative to state-of-the-art Bayesian causal discovery methods.

2022-11-29

NeurIPS.cc/2022/Workshop/CDS (poster)

openreview.net
