Publications

Learning to live with Dale's principle: ANNs with separate excitatory and inhibitory units

Jonathan Cornford

Damjan Kalajdzievski

Marco Leite

Amélie Lamarquette

Dimitri Michael Kullmann

The units in artificial neural networks (ANNs) can be thought of as abstractions of biological neurons, and ANNs are increasingly used in ne… (see more)uroscience research. However, there are many important differences between ANN units and real neurons. One of the most notable is the absence of Dale's principle, which ensures that biological neurons are either exclusively excitatory or inhibitory. Dale's principle is typically left out of ANNs because its inclusion impairs learning. This is problematic, because one of the great advantages of ANNs for neuroscience research is their ability to learn complicated, realistic tasks. Here, by taking inspiration from feedforward inhibitory interneurons in the brain we show that we can develop ANNs with separate populations of excitatory and inhibitory units that learn just as well as standard ANNs. We call these networks Dale's ANNs (DANNs). We present two insights that enable DANNs to learn well: (1) DANNs are related to normalization schemes, and can be initialized such that the inhibition centres and standardizes the excitatory activity, (2) updates to inhibitory neuron parameters should be scaled using corrections based on the Fisher Information matrix. These results demonstrate how ANNs that respect Dale's principle can be built without sacrificing learning performance, which is important for future work using ANNs as models of the brain. The results may also have interesting implications for how inhibitory plasticity in the real brain operates.

2021-01-12

ICLR.cc/2021/Conference (poster)

doi.org

openreview.net

Predicting Infectiousness for Proactive Contact Tracing

Yoshua Bengio

Prateek Gupta

Tegan Maharaj

Nasim Rahaman

Martin Weiss

Tristan Deleu

Eilif Benjamin Muller

Meng Qu

Victor Schmidt

Pierre-Luc St-Charles

Hannah Alsdurf

Olexa Bilaniuk

David Buckeridge

gaetan caron

pierre luc carrier

Joumana Ghosn

satya ortiz gagne

Chris Pal

Irina Rish

Bernhard Schölkopf … (see 3 more)

abhinav sharma

Jian Tang

andrew williams

The COVID-19 pandemic has spread rapidly worldwide, overwhelming manual contact tracing in many countries and resulting in widespread lockdo… (see more)wns for emergency containment. Large-scale digital contact tracing (DCT) has emerged as a potential solution to resume economic and social activity while minimizing spread of the virus. Various DCT methods have been proposed, each making trade-offs between privacy, mobility restrictions, and public health. The most common approach, binary contact tracing (BCT), models infection as a binary event, informed only by an individual's test results, with corresponding binary recommendations that either all or none of the individual's contacts quarantine. BCT ignores the inherent uncertainty in contacts and the infection process, which could be used to tailor messaging to high-risk individuals, and prompt proactive testing or earlier warnings. It also does not make use of observations such as symptoms or pre-existing medical conditions, which could be used to make more accurate infectiousness predictions. In this paper, we use a recently-proposed COVID-19 epidemiological simulator to develop and test methods that can be deployed to a smartphone to locally and proactively predict an individual's infectiousness (risk of infecting others) based on their contact history and other information, while respecting strong privacy constraints. Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT). We find a deep-learning based PCT method which improves over BCT for equivalent average mobility, suggesting PCT could help in safe re-opening and second-wave prevention.

2021-01-12

ICLR.cc/2021/Conference (spotlight)

openreview.net

The patient advisor, an organizational resource as a lever for an enhanced oncology patient experience (PAROLE-onco): a longitudinal multiple case study protocol

Marie-Pascale Pomey

Michèle de Guise

Mado Desforges

Karine Bouchard

Cécile Vialaron

Louise Normandin

Monica Iliescu‐Nelea

Israël Fortin

Isabelle Ganache

Catherine Régis

Zeev Rosberger

Danielle Charpentier

L. Bélanger

Michel Dorval

Djahanchah Philip Ghadiri

Mélanie Lavoie-Tremblay

A. Boivin

Jean-François Pelletier

Nicolas Fernandez

Alain M. Danino

2021-01-04

BMC Health Services Research (published)

doi.org

Abg-CoQA: Clarifying Ambiguity in Conversational Question Answering

Meiqi Guo

Mingda Zhang

Siva Reddy

Malihe Alikhani

Effective communication is about the dissemination of properly worded meaningful ideas/messages that are comprehensible to both sen… (see more)der and receiver and which ultimately can attract the desired response or feedback. For machines to engage in a conversation, it is therefore essential to enable them to clarify ambiguity and achieve a common ground. We introduce Abg-CoQA, a novel dataset for clarifying ambiguity in Conversational Question Answering systems. Our dataset contains 9k questions with answers where 1k questions are ambiguous, obtained from 4k text passages from five diverse domains. For ambiguous questions, a clarification conversational turn is collected. We evaluate strong language generation models and conversational question answering models on Abg-CoQA. The best-performing system achieves a BLEU-1 score of 12.9% on generating clarification question, which is 27.9 points behind human performance (40.8%); and a F1 score of 40.1% on question answering after clarification, which is 35.1 points behind human performance (75.2%), indicating there is ample room for improvement.

2021-01-01

Conference on Automated Knowledge Base Construction (published)

doi.org

openreview.net

Accounting for Variance in Machine Learning Benchmarks

Xavier Bouthillier

Pierre Delaunay

Mirko Bronzi

Assya Trofimov

Brennan Nichyporuk

Justin Szeto

Naz Sepah

Edward Raff

Kanika Madan

Vikram Voleti

Samira Ebrahimi Kahou

Vincent Michalski

Dmitriy Serdyuk

Tal Arbel

Chris Pal

Gael Varoquaux

Pascal Vincent

Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the l… (see more)earning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameters choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice impact markedly the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result that adding more sources of variation to an imperfect estimator approaches better the ideal estimator at a 51 times reduction in compute cost. Building on these results, we study the error rate of detecting improvements, on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.

2021-01-01

MLSys (published)

arxiv.org

Active Learning for Capturing Human Decision Policies in a Data Frugal Context

Loïc Grossetête

Alexandre Marois

Bénédicte Chatelais

Christian Gagné

Daniel Lafond

2021-01-01

International Conference on Machine Learning, Optimization, and Data Science (published)

doi.org

ADEPT: An Adjective-Dependent Plausibility Task

Ali Emami

Ian Porada

Alexandra Olteanu

Kaheer Suleman

Adam Trischler

Jackie Cheung

2021-01-01

Annual Meeting of the Association for Computational Linguistics (published)

doi.org

Adversarial Feature Desensitization

Pouya Bashivan

Reza Bayat

Adam Ibrahim

Kartik Ahuja

Mojtaba Faramarzi

Touraj Laleh

Blake Richards

Irina Rish

Neural networks are known to be vulnerable to adversarial attacks -- slight but carefully constructed perturbations of the inputs which can … (see more)drastically impair the network's performance. Many defense methods have been proposed for improving robustness of deep networks by training them on adversarially perturbed inputs. However, these models often remain vulnerable to new types of attacks not seen during training, and even to slightly stronger versions of previously seen attacks. In this work, we propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field. Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs. This is achieved through a game where we learn features that are both predictive and robust (insensitive to adversarial attacks), i.e. cannot be used to discriminate between natural and adversarial data. Empirical results on several benchmarks demonstrate the effectiveness of the proposed approach against a wide range of attack types and attack strengths. Our code is available at https://github.com/BashivanLab/afd.

openreview.net

An Analysis of the Adaptation Speed of Causal Models

Rémi LE PRIOL

Reza Babanezhad Harikandeh

Yoshua Bengio

Simon Lacoste-Julien

2021-01-01

AISTATS (published)

proceedings.mlr.press

arxiv.org

Analyzing the Contribution of Ethical Charters to Building the Future of Artificial Intelligence Governance

Lyse Langlois

Catherine Régis

2021-01-01

Reflections on Artificial Intelligence for Humanity (published)

doi.org

Batch Reinforcement Learning Through Continuation Method

Yijie Guo

Shengyu Feng

Nicolas Le Roux

Ed Chi

Honglak Lee

Minmin Chen

Many real-world applications of reinforcement learning (RL) require the agent to learn from a fixed set of trajectories, without collecting … (see more)new interactions. Policy optimization under this setting is extremely challenging as: 1) the geometry of the objective function is hard to optimize efficiently; 2) the shift of data distributions causes high noise in the value estimation. In this work, we propose a simple yet effective policy iteration approach to batch RL using global optimization techniques known as continuation. By constraining the difference between the learned policy and the behavior policy that generates the fixed trajectories, and continuously relaxing the constraint, our method 1) helps the agent escape local optima; 2) reduces the error in policy evaluation in the optimization procedure. We present results on a variety of control tasks, game environments, and a recommendation task to empirically demonstrate the efficacy of our proposed method.

2021-01-01

ICLR (published)

openreview.net

Can Open Source Licenses Help Regulate Lethal Autonomous Weapons?

Cheng Lin

AJung Moon

Lethal autonomous weapon systems (LAWS, ethal autonomous weapon also known as killer robots) are a real and emerging technology that have th… (see more)e potential to radically transform warfare. Because of the myriad of moral, legal, privacy, and security risks the technology introduces, many scholars and advocates have called for a ban on the development, production, and use of fully autonomous weapons [1], [2].

2021-01-01

IEEE technology & society magazine (published)

doi.org

NLP in the era of generative AI, cognitive sciences, and societal transformation

AI Policy Compass

Student Life and Resources

Publications

NLP in the era of generative AI, cognitive sciences, and societal transformation

AI Policy Compass

Student Life and Resources

Popular keywords:

Publications