Publications

Deep LDA-Pruned Nets for Efﬁcient Facial Gender Classiﬁcation

Qing Tian

James J. Clark

Many real-time tasks, such as human-computer interaction, require fast and efficient facial gender classification. Although deep CNN nets have been very effective for a multitude of classification tasks, their high space and time demands make them impractical for personal computers and mobile devices without a powerful GPU. In this paper, we develop a 16-layer, yet lightweight, neural network which boosts efficiency while maintaining high accuracy. Our net is pruned from the VGG-16 model [35] starting from the last convolutional (conv) layer, where we find neuron activations are highly uncorrelated given the gender. Through Fisher’s Linear Discriminant Analysis (LDA) [8], we show that this high decorrelation makes it safe to directly discard last conv layer neurons with high within-class variance and low between-class variance. Combined with either Support Vector Machines (SVM) or Bayesian classification, the reduced CNNs are capable of achieving accuracies on the LFW and CelebA datasets comparable to (or even higher than) those of the original net with fully connected layers. On LFW, only four Conv5_3 neurons are able to maintain a comparably high recognition accuracy, which results in a reduction of total network size by a factor of 70 with an 11-fold speedup. Comparisons with a state-of-the-art pruning method [12] (as well as two smaller nets [20, 24]) in terms of accuracy loss and convolutional layer pruning rate are also provided.

2021-01-01

(published)

www.semanticscholar.org
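
The pruning criterion described in the abstract above (discard neurons with high within-class variance and low between-class variance) can be illustrated with a per-neuron Fisher score. This is a minimal sketch and not the authors' implementation: it assumes the activations are roughly decorrelated, so each neuron can be scored on its own, and the names `fisher_scores` and `keep_top_neurons` as well as the toy data are hypothetical.

```python
import numpy as np


def fisher_scores(activations, labels, eps=1e-12):
    """Per-neuron ratio of between-class to within-class variance.

    activations: array of shape (n_samples, n_neurons)
    labels: array of shape (n_samples,) with integer class ids
    """
    activations = np.asarray(activations, dtype=float)
    labels = np.asarray(labels)
    overall_mean = activations.mean(axis=0)
    between = np.zeros(activations.shape[1])
    within = np.zeros(activations.shape[1])
    for cls in np.unique(labels):
        group = activations[labels == cls]
        class_mean = group.mean(axis=0)
        # Scatter of the class mean around the overall mean, weighted by class size.
        between += group.shape[0] * (class_mean - overall_mean) ** 2
        # Scatter of the samples around their own class mean.
        within += ((group - class_mean) ** 2).sum(axis=0)
    return between / (within + eps)


def keep_top_neurons(activations, labels, k):
    """Sorted indices of the k neurons with the highest Fisher score."""
    scores = fisher_scores(activations, labels)
    return np.sort(np.argsort(scores)[::-1][:k])


# Toy data (illustrative only): neuron 0 separates the classes, neuron 1 is noise.
rng = np.random.default_rng(0)
toy_labels = np.array([0] * 50 + [1] * 50)
toy_acts = np.column_stack([
    toy_labels * 5.0 + rng.normal(0.0, 0.5, 100),
    rng.normal(0.0, 1.0, 100),
])
kept = keep_top_neurons(toy_acts, toy_labels, k=1)
```

On the toy data, the discriminative neuron is the one retained; a real network would apply the same scoring to the activations of the last conv layer before retraining or attaching a classifier.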

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Rishabh Agarwal

Max Schwarzer

Pablo Samuel Castro

Aaron Courville

Marc Gendron-Bellemare

Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing their relative performance on a large suite of tasks. Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs. Beginning with the Arcade Learning Environment (ALE), the shift towards computationally-demanding benchmarks has led to the practice of evaluating only a small number of runs per task, exacerbating the statistical uncertainty in point estimates. In this paper, we argue that reliable evaluation in the few-run deep RL regime cannot ignore the uncertainty in results without running the risk of slowing down progress in the field. We illustrate this point using a case study on the Atari 100k benchmark, where we find substantial discrepancies between conclusions drawn from point estimates alone versus a more thorough statistical analysis. With the aim of increasing the field's confidence in reported results with a handful of runs, we advocate for reporting interval estimates of aggregate performance and propose performance profiles to account for the variability in results, as well as present more robust and efficient aggregate metrics, such as interquartile mean scores, to achieve small uncertainty in results. Using such statistical tools, we scrutinize performance evaluations of existing algorithms on other widely used RL benchmarks including the ALE, Procgen, and the DeepMind Control Suite, again revealing discrepancies in prior comparisons. Our findings call for a change in how we evaluate performance in deep RL, for which we present a more rigorous evaluation methodology, accompanied by an open-source library rliable, to prevent unreliable results from stagnating the field. This work received an outstanding paper award at NeurIPS 2021.

openreview.net
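
The interquartile mean and interval estimates mentioned in the abstract above can be sketched in a few lines of NumPy. This is an illustrative sketch only and does not use the rliable API: the function names, the percentile bootstrap, and the choice to resample runs independently per task are assumptions made for the example.

```python
import numpy as np


def interquartile_mean(scores):
    """Mean of the middle 50% of all scores (25% trimmed from each end)."""
    flat = np.sort(np.asarray(scores, dtype=float).ravel())
    cut = int(np.floor(0.25 * flat.size))
    return flat[cut:flat.size - cut].mean()


def bootstrap_iqm_interval(scores, n_resamples=2000, confidence=0.95, seed=0):
    """Point estimate and percentile bootstrap interval for the IQM.

    scores: array of shape (n_runs, n_tasks). Runs are resampled with
    replacement independently for each task (a stratified bootstrap).
    """
    scores = np.asarray(scores, dtype=float)
    n_runs, n_tasks = scores.shape
    rng = np.random.default_rng(seed)
    task_idx = np.arange(n_tasks)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        # run_idx[r, t] picks which original run fills slot r for task t.
        run_idx = rng.integers(0, n_runs, size=(n_runs, n_tasks))
        estimates[i] = interquartile_mean(scores[run_idx, task_idx])
    alpha = (1.0 - confidence) / 2.0
    low, high = np.quantile(estimates, [alpha, 1.0 - alpha])
    return interquartile_mean(scores), (low, high)


# Toy scores (illustrative only): 5 runs on 10 tasks.
toy_scores = np.random.default_rng(42).normal(1.0, 0.3, size=(5, 10))
point, (low, high) = bootstrap_iqm_interval(toy_scores)
```

Reporting `point` together with `(low, high)` rather than a bare mean or median is the kind of interval estimate the abstract advocates; with only a handful of runs the interval is typically wide.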

Embedding Signals on Knowledge Graphs with Unbalanced Diffusion Earth Mover's Distance

Alexander Tong

Guillaume Huguet

Dennis Shung

Amine Natik

Manik Kuchroo

Guillaume Lajoie

Guy Wolf

Smita Krishnaswamy

In modern relational machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains. Further

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de

Enjeux juridiques propres au modèle émergent des patients accompagnateurs dans les milieux de soins au Québec (Legal Issues Arising from the Emerging Model of Accompanying Patients in the Quebec Healthcare System)

Léa Boutrouille

Catherine Régis

Marie-Pascale Pomey

2021-01-01

SSRN Electronic Journal (published)

doi.org

Estimating the Impact of an Improvement to a Revenue Management System: An Airline Application

Greta Laage

Emma Frejinger

William Hamilton

Andrea Lodi

Guillaume Rabusseau

Airlines have been making use of highly complex Revenue Management Systems to maximize revenue for decades. Estimating the impact of changing one component of those systems on an important outcome such as revenue is crucial, yet very challenging. The impact is the difference between the generated value and the value that would have been generated had business continued as usual, which is not observable. We provide a comprehensive overview of counterfactual prediction models and use them in an extensive computational study based on data from Air Canada to estimate such impact. We focus on predicting the counterfactual revenue and compare it to the observed revenue subject to the impact. Our microeconomic application and small expected treatment impact stand out from the usual synthetic control applications. We present accurate linear and deep-learning counterfactual prediction models which achieve errors of 1.1% and 1%, respectively, and allow a simulated effect to be estimated quite accurately.

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de
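
The counterfactual prediction idea in the abstract above (predict the revenue that would have occurred under business as usual, then compare it with the observed revenue) can be sketched with a simple linear model. This is a hedged illustration, not the authors' models or data: the least-squares fit on untreated control units, the function names, and the synthetic series with a simulated 2% uplift are all assumptions.

```python
import numpy as np


def fit_counterfactual_model(controls_pre, treated_pre):
    """Least-squares weights (with intercept) mapping controls to the treated unit."""
    X = np.column_stack([np.ones(len(controls_pre)), controls_pre])
    weights, *_ = np.linalg.lstsq(X, treated_pre, rcond=None)
    return weights


def predict_counterfactual(weights, controls_post):
    """Business-as-usual prediction for the treated unit in the treatment period."""
    X = np.column_stack([np.ones(len(controls_post)), controls_post])
    return X @ weights


def estimate_impact(controls_pre, treated_pre, controls_post, treated_post):
    """Relative impact: (observed - counterfactual) / counterfactual, on period totals."""
    weights = fit_counterfactual_model(controls_pre, treated_pre)
    counterfactual = predict_counterfactual(weights, controls_post)
    return (treated_post.sum() - counterfactual.sum()) / counterfactual.sum()


# Synthetic data (illustrative only): the treated unit is a noisy mix of three
# controls, and a 2% uplift is simulated in the treatment period.
rng = np.random.default_rng(1)
controls = rng.uniform(80.0, 120.0, size=(200, 3))
baseline = controls @ np.array([0.5, 0.3, 0.2]) + rng.normal(0.0, 0.1, 200)
controls_pre, controls_post = controls[:150], controls[150:]
treated_pre = baseline[:150]
treated_post = baseline[150:] * 1.02
impact = estimate_impact(controls_pre, treated_pre, controls_post, treated_post)
```

Because the simulated effect is small, the prediction error of the counterfactual model has to be smaller still for the estimate to be meaningful, which is why the abstract emphasizes prediction errors of about 1%.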

Faults in deep reinforcement learning programs: a taxonomy and a detection approach

Amin Nikanjam

Mohammad Mehdi Morovati

Foutse Khomh

Houssem Ben Braiek

2021-01-01

ArXiv (preprint)

doi.org

arxiv.org

Guest Editorial Explainable AI: Towards Fairness, Accountability, Transparency and Trust in Healthcare

Arash Shaban-Nejad

Martin Michalowski

John S. Brownstein

David Buckeridge

2021-01-01

IEEE journal of biomedical and health informatics (published)

doi.org

Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program)

Joelle Pineau

Philippe Vincent-Lamarre

Koustuv Sinha

Vincent Larivière

Alina Beygelzimer

Florence d'Alché-Buc

E. Fox

Hugo Larochelle

arxiv.org

Inspecting the Factuality of Hallucinated Entities in Abstractive Summarization

Meng Cao

Yue Dong

Jackie Cheung

State-of-the-art abstractive summarization systems often generate hallucinations; i.e., content that is not directly inferable from the source text. Despite being assumed incorrect, many of the hallucinated contents are consistent with world knowledge (factual hallucinations). Including these factual hallucinations in a summary can be beneficial in providing additional background information. In this work, we propose a novel detection approach that separates factual from non-factual hallucinations of entities. Our method is based on an entity’s prior and posterior probabilities according to pre-trained and fine-tuned masked language models, respectively. Empirical results suggest that our method vastly outperforms three strong baselines in both accuracy and F1 scores and has a strong correlation with human judgements on factuality classification tasks. Furthermore, our approach can provide insight into whether a particular hallucination is caused by the summarizer’s pre-training or fine-tuning step.

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de

Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization

Kartik Ahuja

Ethan Caballero

Dinghuai Zhang

Jean-Christophe Gagnon-Audet

Yoshua Bengio

Ioannis Mitliagkas

Irina Rish

The invariance principle from causality is at the heart of notable approaches such as invariant risk minimization (IRM) that seek to address out-of-distribution (OOD) generalization failures. Despite the promising theory, invariance principle-based approaches fail in common classification tasks, where invariant (causal) features capture all the information about the label. Are these failures due to the methods failing to capture the invariance? Or is the invariance principle itself insufficient? To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD. In contrast to the linear regression tasks, we show that for linear classification tasks we need much stronger restrictions on the distribution shifts, or otherwise OOD generalization is impossible. Furthermore, even with appropriate restrictions on distribution shifts in place, we show that the invariance principle alone is insufficient. We prove that a form of the information bottleneck constraint along with invariance helps address key failures when invariant features capture all the information about the label and also retains the existing success when they do not. We propose an approach that incorporates both of these principles and demonstrate its effectiveness in several experiments.

openreview.net
