Publications

Embedding Signals on Knowledge Graphs with Unbalanced Diffusion Earth Mover's Distance

Alexander Tong

Guillaume Huguet

Dennis L. Shung

Amine Natik

Manik Kuchroo

Guillaume Lajoie

Guy Wolf

Smita Krishnaswamy

In modern relational machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains. Further…

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de

Enjeux juridiques propres au modèle émergent des patients accompagnateurs dans les milieux de soins au Québec (Legal Issues Arising from the Emerging Model of Accompanying Patients in the Quebec Healthcare System)

Léa Boutrouille

Catherine Régis

Marie-Pascale Pomey

2021-01-01

SSRN Electronic Journal (published)

doi.org

Estimating the Impact of an Improvement to a Revenue Management System: An Airline Application

Greta Laage

Emma Frejinger

William Hamilton

Andrea Lodi

Guillaume Rabusseau

Airlines have used highly complex Revenue Management Systems for decades to maximize revenue. Estimating the impact of changing one component of those systems on an important outcome such as revenue is crucial, yet very challenging. The impact is the difference between the generated value and the value that would have been generated under business as usual, and the latter is not observable. We provide a comprehensive overview of counterfactual prediction models and use them in an extensive computational study based on data from Air Canada to estimate such impact. We focus on predicting the counterfactual revenue and compare it to the observed revenue subject to the impact. Our microeconomic application and small expected treatment impact stand out from the usual synthetic control applications. We present accurate linear and deep-learning counterfactual prediction models which achieve errors of 1.1% and 1%, respectively, and make it possible to estimate a simulated effect quite accurately.

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de

Faults in deep reinforcement learning programs: a taxonomy and a detection approach

Amin Nikanjam

Mohammad Mehdi Morovati

Foutse Khomh

Houssem Ben Braiek

2021-01-01

arXiv (preprint)

doi.org

arxiv.org

Guest Editorial Explainable AI: Towards Fairness, Accountability, Transparency and Trust in Healthcare

Arash Shaban-Nejad

Martin Michalowski

John S. Brownstein

David Buckeridge

2021-01-01

IEEE Journal of Biomedical and Health Informatics (published)

doi.org

Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program)

Joelle Pineau

Philippe Vincent-Lamarre

Koustuv Sinha

Vincent Larivière

Alina Beygelzimer

Florence d'Alché-Buc

E. Fox

Hugo Larochelle

arxiv.org

Inspecting the Factuality of Hallucinated Entities in Abstractive Summarization

Meng Cao

Yue Dong

Jackie Cheung

State-of-the-art abstractive summarization systems often generate hallucinations, i.e., content that is not directly inferable from the source text. Despite being assumed incorrect, many of the hallucinated contents are consistent with world knowledge (factual hallucinations). Including these factual hallucinations into a summary can be beneficial in providing additional background information. In this work, we propose a novel detection approach that separates factual from non-factual hallucinations of entities. Our method is based on an entity’s prior and posterior probabilities according to pre-trained and fine-tuned masked language models, respectively. Empirical results suggest that our method vastly outperforms three strong baselines in both accuracy and F1 scores and has a strong correlation with human judgements on factuality classification tasks. Furthermore, our approach can provide insight into whether a particular hallucination is caused by the summarizer’s pre-training or fine-tuning step.

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de

Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization

Kartik Ahuja

Ethan Caballero

Dinghuai Zhang

Jean-Christophe Gagnon-Audet

Yoshua Bengio

Ioannis Mitliagkas

Irina Rish

The invariance principle from causality is at the heart of notable approaches such as invariant risk minimization (IRM) that seek to address out-of-distribution (OOD) generalization failures. Despite the promising theory, invariance principle-based approaches fail in common classification tasks, where invariant (causal) features capture all the information about the label. Are these failures due to the methods failing to capture the invariance? Or is the invariance principle itself insufficient? To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD. In contrast to the linear regression tasks, we show that for linear classification tasks we need much stronger restrictions on the distribution shifts, or otherwise OOD generalization is impossible. Furthermore, even with appropriate restrictions on distribution shifts in place, we show that the invariance principle alone is insufficient. We prove that a form of the information bottleneck constraint along with invariance helps address key failures when invariant features capture all the information about the label and also retains the existing success when they do not. We propose an approach that incorporates both of these principles and demonstrate its effectiveness in several experiments.

openreview.net

Issue Link Label Recovery and Prediction for Open Source Software

Alexander Nicholson

Jin L.C. Guo

Modern open source software development relies heavily on issue tracking systems to manage feature requests, bug reports, tasks, and other similar artifacts. Together, those “issues” form a complex network with links to each other. The heterogeneous character of issues inherently results in varied link types and therefore makes it very challenging for users to create and maintain link labels manually. Most existing automated issue link construction techniques stop at examining whether links exist between issues. In this work, we focus on the next important question of whether we can assess the type of issue link automatically through a data-driven method. We analyze the links between issues and their labels used in the issue tracking systems of 66 open source projects. Using three projects, we demonstrate promising results when using supervised machine learning classification for the task of link label recovery with careful model selection and tuning, achieving F1 scores between 0.56 and 0.70 for the three studied projects. Further, the performance of our method for future link label prediction is convincing when there is sufficient historical data. Our work marks a first step toward systematically managing and maintaining issue links in practice.

2021-01-01

2021 IEEE 29th International Requirements Engineering Conference Workshops (REW) (published)

doi.org

arxiv.org

Learning Robust State Abstractions for Hidden-Parameter Block MDPs

Amy Zhang

Shagun Sodhani

Khimya Khetarpal

Joelle Pineau

2021-01-01

ICLR (published)

openreview.net