Publications

Correction to: The patient advisor, an organizational resource as a lever for an enhanced oncology patient experience (PAROLEonco): a longitudinal multiple case study protocol

Marie-Pascale Pomey

Michèle de Guise

Mado Desforges

Karine Bouchard

Cécile Vialaron

Louise Normandin

Monica Iliescu‐Nelea

Israël Fortin

Isabelle Ganache

Catherine Régis

Zeev Rosberger

Danielle Charpentier

L. Bélanger

Michel Dorval

Djahanchah Philip Ghadiri

Mélanie Lavoie-Tremblay

A. Boivin

Jean-François Pelletier

Nicolas Fernandez

Alain M. Danino

2021-01-14

BMC Health Services Research (published)

doi.org

Assessing the Impact: Does an Improvement to a Revenue Management System Lead to an Improved Revenue?

Greta Laage

Emma Frejinger

Andrea Lodi

Guillaume Rabusseau

2021-01-13

ArXiv (preprint)

arxiv.org

Learning with Gradient Descent and Weakly Convex Losses

Dominic Richards

Michael Rabbat

We study the learning performance of gradient descent when the empirical risk is weakly convex, namely, the smallest negative eigenvalue of the empirical risk's Hessian is bounded in magnitude. By showing that this eigenvalue can control the stability of gradient descent, generalisation error bounds are proven that hold under a wider range of step sizes compared to previous work. Out of sample guarantees are then achieved by decomposing the test error into generalisation, optimisation and approximation errors, each of which can be bounded and traded off with respect to algorithmic parameters, sample size and magnitude of this eigenvalue. In the case of a two layer neural network, we demonstrate that the empirical risk can satisfy a notion of local weak convexity, specifically, the Hessian's smallest eigenvalue during training can be controlled by the normalisation of the layers, i.e., network scaling. This allows test error guarantees to then be achieved when the population risk minimiser satisfies a complexity assumption. By trading off the network complexity and scaling, insights are gained into the implicit bias of neural network scaling, which are further supported by experimental findings.

2021-01-13

ArXiv (preprint)

arxiv.org

Systematic detection of brain protein-coding genes under positive selection during primate evolution and their roles in cognition

Guillaume Dumas

Simon Malesys

Thomas Bourgeron

The human brain differs from that of other primates, but the genetic basis of these differences remains unclear. We investigated the evolutionary pressures acting on almost all human protein-coding genes (N = 11,667; 1:1 orthologs in primates) based on their divergence from those of early hominins, such as Neanderthals, and non-human primates. We confirm that genes encoding brain-related proteins are among the most strongly conserved protein-coding genes in the human genome. Combining our evolutionary pressure metrics for the protein-coding genome with recent data sets, we found that this conservation applied to genes functionally associated with the synapse and expressed in brain structures such as the prefrontal cortex and the cerebellum. Conversely, several genes presenting signatures commonly associated with positive selection appear as causing brain diseases or conditions, such as micro/macrocephaly, Joubert syndrome, dyslexia, and autism. Among those, a number of DNA damage response genes associated with microcephaly in humans such as BRCA1, NHEJ1, TOP3A, and RNF168 show strong signs of positive selection and might have played a role in human brain size expansion during primate evolution. We also showed that cerebellum granule neurons express a set of genes also presenting signatures of positive selection and that may have contributed to the emergence of fine motor skills and social cognition in humans. This resource is available online and can be used to estimate evolutionary constraints acting on a set of genes and to explore their relative contributions to human traits.

2021-01-13

Genome Research (published)

doi.org

Adversarial score matching and improved sampling for image generation

Alexia Jolicoeur-Martineau

Rémi Piché-Taillefer

Remi Tachet des Combes

Ioannis Mitliagkas

Denoising Score Matching with Annealed Langevin Sampling (DSM-ALS) has recently found success in generative modeling. The approach works by first training a neural network to estimate the score of a distribution, and then using Langevin dynamics to sample from the data distribution assumed by the score network. Despite the convincing visual quality of samples, this method appears to perform worse than Generative Adversarial Networks (GANs) under the Fréchet Inception Distance, a standard metric for generative models. We show that this apparent gap vanishes when denoising the final Langevin samples using the score network. In addition, we propose two improvements to DSM-ALS: 1) Consistent Annealed Sampling as a more stable alternative to Annealed Langevin Sampling, and 2) a hybrid training formulation, composed of both Denoising Score Matching and adversarial objectives. By combining these two techniques and exploring different network architectures, we elevate score matching methods and obtain results competitive with state-of-the-art image generation on CIFAR-10.

2021-01-12

ICLR.cc/2021/Conference (poster)

openreview.net

Learning to live with Dale's principle: ANNs with separate excitatory and inhibitory units

Jonathan Cornford

Damjan Kalajdzievski

Marco Leite

Amélie Lamarquette

Dimitri Michael Kullmann

Blake Richards

The units in artificial neural networks (ANNs) can be thought of as abstractions of biological neurons, and ANNs are increasingly used in neuroscience research. However, there are many important differences between ANN units and real neurons. One of the most notable is the absence of Dale's principle, which ensures that biological neurons are either exclusively excitatory or inhibitory. Dale's principle is typically left out of ANNs because its inclusion impairs learning. This is problematic, because one of the great advantages of ANNs for neuroscience research is their ability to learn complicated, realistic tasks. Here, by taking inspiration from feedforward inhibitory interneurons in the brain we show that we can develop ANNs with separate populations of excitatory and inhibitory units that learn just as well as standard ANNs. We call these networks Dale's ANNs (DANNs). We present two insights that enable DANNs to learn well: (1) DANNs are related to normalization schemes, and can be initialized such that the inhibition centres and standardizes the excitatory activity, (2) updates to inhibitory neuron parameters should be scaled using corrections based on the Fisher Information matrix. These results demonstrate how ANNs that respect Dale's principle can be built without sacrificing learning performance, which is important for future work using ANNs as models of the brain. The results may also have interesting implications for how inhibitory plasticity in the real brain operates.

2021-01-12

ICLR.cc/2021/Conference (poster)

doi.org

openreview.net

Predicting Infectiousness for Proactive Contact Tracing

Yoshua Bengio

Prateek Gupta

Tegan Maharaj

Nasim Rahaman

Martin Weiss

Tristan Deleu

Eilif Benjamin Muller

Meng Qu

Victor Schmidt

Pierre-Luc St-Charles

Hannah Alsdurf

Olexa Bilaniuk

David Buckeridge

Gaetan Caron

Pierre Luc Carrier

Joumana Ghosn

Satya Ortiz Gagne

Chris Pal

Irina Rish

Bernhard Schölkopf

Abhinav Sharma

Jian Tang

Andrew Williams

The COVID-19 pandemic has spread rapidly worldwide, overwhelming manual contact tracing in many countries and resulting in widespread lockdowns for emergency containment. Large-scale digital contact tracing (DCT) has emerged as a potential solution to resume economic and social activity while minimizing spread of the virus. Various DCT methods have been proposed, each making trade-offs between privacy, mobility restrictions, and public health. The most common approach, binary contact tracing (BCT), models infection as a binary event, informed only by an individual's test results, with corresponding binary recommendations that either all or none of the individual's contacts quarantine. BCT ignores the inherent uncertainty in contacts and the infection process, which could be used to tailor messaging to high-risk individuals, and prompt proactive testing or earlier warnings. It also does not make use of observations such as symptoms or pre-existing medical conditions, which could be used to make more accurate infectiousness predictions. In this paper, we use a recently-proposed COVID-19 epidemiological simulator to develop and test methods that can be deployed to a smartphone to locally and proactively predict an individual's infectiousness (risk of infecting others) based on their contact history and other information, while respecting strong privacy constraints. Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT). We find a deep-learning based PCT method which improves over BCT for equivalent average mobility, suggesting PCT could help in safe re-opening and second-wave prevention.

2021-01-12

ICLR.cc/2021/Conference (spotlight)

openreview.net

The patient advisor, an organizational resource as a lever for an enhanced oncology patient experience (PAROLE-onco): a longitudinal multiple case study protocol

Marie-Pascale Pomey

Michèle de Guise

Mado Desforges

Karine Bouchard

Cécile Vialaron

Louise Normandin

Monica Iliescu‐Nelea

Israël Fortin

Isabelle Ganache

Catherine Régis

Zeev Rosberger

Danielle Charpentier

L. Bélanger

Michel Dorval

Djahanchah Philip Ghadiri

Mélanie Lavoie-Tremblay

A. Boivin

Jean-François Pelletier

Nicolas Fernandez

Alain M. Danino

2021-01-04

BMC Health Services Research (published)

doi.org

Abg-CoQA: Clarifying Ambiguity in Conversational Question Answering

Meiqi Guo

Mingda Zhang

Siva Reddy

Malihe Alikhani

Effective communication is about the dissemination of properly worded meaningful ideas/messages that are comprehensible to both sender and receiver and which ultimately can attract the desired response or feedback. For machines to engage in a conversation, it is therefore essential to enable them to clarify ambiguity and achieve a common ground. We introduce Abg-CoQA, a novel dataset for clarifying ambiguity in Conversational Question Answering systems. Our dataset contains 9k questions with answers where 1k questions are ambiguous, obtained from 4k text passages from five diverse domains. For ambiguous questions, a clarification conversational turn is collected. We evaluate strong language generation models and conversational question answering models on Abg-CoQA. The best-performing system achieves a BLEU-1 score of 12.9% on generating clarification question, which is 27.9 points behind human performance (40.8%); and a F1 score of 40.1% on question answering after clarification, which is 35.1 points behind human performance (75.2%), indicating there is ample room for improvement.

2021-01-01

Conference on Automated Knowledge Base Construction (published)

doi.org

openreview.net

Accounting for Variance in Machine Learning Benchmarks

Xavier Bouthillier

Pierre Delaunay

Mirko Bronzi

Assya Trofimov

Brennan Nichyporuk

Justin Szeto

Naz Sepah

Edward Raff

Kanika Madan

Vikram Voleti

Samira Ebrahimi Kahou

Vincent Michalski

Dmitriy Serdyuk

Tal Arbel

Chris Pal

Gael Varoquaux

Pascal Vincent

Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameters choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice impact markedly the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result that adding more sources of variation to an imperfect estimator approaches better the ideal estimator at a 51 times reduction in compute cost. Building on these results, we study the error rate of detecting improvements, on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.

2021-01-01

MLSys (published)

arxiv.org