Publications

A Study of Condition Numbers for First-Order Optimization

The study of first-order optimization algorithms (FOA) typically starts with assumptions on the objective functions, most commonly smoothness and strong convexity. These metrics are used to tune the hyperparameters of FOA. We introduce a class of perturbations quantified via a new norm, called *-norm. We show that adding a small perturbation to the objective function has an equivalently small impact on the behavior of any FOA, which suggests that it should have a minor impact on the tuning of the algorithm. However, we show that smoothness and strong convexity can be heavily impacted by arbitrarily small perturbations, leading to excessively conservative tunings and convergence issues. In view of these observations, we propose a notion of continuity of the metrics, which is essential for a robust tuning strategy. Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous. We describe their mutual relations and provide their guaranteed convergence rates for the Gradient Descent algorithm accordingly tuned. Finally, we discuss how our work impacts the theoretical understanding of FOA and their performance.
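As a minimal sketch of the classical tuning the abstract refers to (not the paper's own metrics, norm, or results), the Python snippet below runs gradient descent with step size 1/L on a toy diagonal quadratic, where L is the smoothness constant and mu the strong convexity constant. The curvature values, starting point, and all function names are made up for illustration. For this particular objective, the distance to the minimizer shrinks by a factor of at most 1 - mu/L per step.

```python
# Hypothetical toy example: f(x) = 0.5 * sum(a_i * x_i**2), minimized at x = 0.
# For this quadratic, mu = min(a_i) and L = max(a_i).

def gradient(curvatures, x):
    """Gradient of the diagonal quadratic at x."""
    return [a * xi for a, xi in zip(curvatures, x)]

def gradient_descent(curvatures, x0, steps):
    """Run gradient descent with the classical step size 1/L."""
    step = 1.0 / max(curvatures)
    x = list(x0)
    history = [x]
    for _ in range(steps):
        g = gradient(curvatures, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        history.append(x)
    return history

def norm(x):
    """Euclidean norm, i.e. distance to the minimizer at the origin."""
    return sum(xi * xi for xi in x) ** 0.5

curvatures = [1.0, 10.0]          # made-up values: mu = 1, L = 10
history = gradient_descent(curvatures, [1.0, 1.0], steps=50)
mu, L = min(curvatures), max(curvatures)
rate = 1.0 - mu / L               # per-step contraction bound for this quadratic

for k in (0, 10, 50):
    print(k, norm(history[k]), rate ** k * norm(history[0]))
```

Each printed row compares the actual distance to the minimizer with the bound rate**k times the initial distance. Because the step size depends directly on L, an overestimated L makes every step smaller, which is the kind of conservative tuning the abstract mentions.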

2021-03-17

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics (published)

doi.org

proceedings.mlr.press

[Strengthening the culture of public health surveillance and population health monitoring].

Arnaud Chiolero

David L Buckeridge

Stéphane Cullati

Public health surveillance is the systematic and ongoing collection, analysis and interpretation of data to produce information useful for decision-making. With the development of data science, surveillance methods are evolving through access to big data. More data does not automatically mean more information. For example, the massive amounts of data on Covid-19 were not easily transformed into useful information for decision-making. Further, data scientists often have difficulty making their analyses useful for decision-making. For the implementation of evidence-based and data-driven public health practice, the culture of public health surveillance and population health monitoring needs to be strengthened.

2021-03-16

Revue medicale suisse (published)

pubmed.ncbi.nlm.nih.gov

Price discounting as a hidden risk factor of energy drink consumption

Hiroshi Mamiya

Erica E. M. Moodie

Alexandra M. Schmidt

Yu Ma

David L. Buckeridge

Global consumption of caffeinated energy drinks (CED) has been increasing dramatically despite increasing evidence of their adverse health effects. Temporary price discounting is a rarely investigated but potentially powerful food marketing tactic influencing purchasing of CED. Using grocery transaction records generated by food stores in Montreal, we investigated the association between price discounting and purchasing of CED across socio-economic status operationalized by education and income levels in store neighbourhood. The outcome, log-transformed weekly store-level sales of CED, was modelled as a function of store-level percent price discounting, store- and neighbourhood-level confounders, and an interaction term between discounting and each of tertile education and income in store neighbourhood. The model was separately fit to transactions from supermarkets, pharmacies, supercentres, and convenience stores. There were 18,743, 12,437, 3965, and 49,533 weeks of CED sales from supermarkets, pharmacies, supercentres, and convenience stores, respectively. Percent price discounting was positively associated with log sales of CED for all store types, and the interaction between education and discounting was prominent in supercentres: −0.039 [95% confidence interval (CI): −0.051, −0.028] and −0.039 [95% CI: −0.057, −0.021], for middle- and high-education neighbourhoods relative to low-education neighbourhoods, respectively. Relative to low-income areas, the associations of discounting and log CED sales in supercentres for neighbourhoods with middle- and high-income tertile were 0.022 [95% CI: 0.010, 0.033] and 0.015 [95% CI: −0.001, 0.031], respectively. Price discounting is an important driver of CED consumption and has a varying impact across community education and income.
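One way to read the regression described in this abstract is sketched below. The notation is assumed here and is not taken from the paper: s indexes stores, t indexes weeks, d_{st} is the percent price discount, E_s^{(k)} and I_s^{(k)} are indicators for the education and income tertile of the store neighbourhood (with the low tertile as reference), and z_{st} collects the store- and neighbourhood-level confounders. The published specification, including its error structure, may differ.

```latex
\log(\mathrm{sales}_{st}) = \beta_0 + \beta_1 d_{st}
  + \sum_{k \in \{\mathrm{mid},\,\mathrm{high}\}}
      \left( \alpha_k E_s^{(k)} + \gamma_k I_s^{(k)} \right)
  + \sum_{k \in \{\mathrm{mid},\,\mathrm{high}\}}
      \left( \delta_k\, d_{st} E_s^{(k)} + \eta_k\, d_{st} I_s^{(k)} \right)
  + \boldsymbol{\theta}^{\top} z_{st} + \varepsilon_{st}
```

Under this reading, the interaction estimates reported for supercentres would correspond to the coefficients \delta_k (education) and \eta_k (income), each measured relative to the low tertile.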

2021-03-15

Canadian Journal of Public Health (published)

doi.org

Local Data Debiasing for Fairness Based on Generative Adversarial Training

Ulrich Matchi Aïvodji

François Bidet

Sébastien Gambs

Rosin Claude Ngueveu

Alain Tapp

The widespread use of automated decision processes in many areas of our society raises serious ethical issues with respect to the fairness of the process and the possible resulting discrimination. To solve this issue, we propose a novel adversarial training approach called GANSan for learning a sanitizer whose objective is to prevent the possibility of any discrimination (i.e., direct and indirect) based on a sensitive attribute by removing the attribute itself as well as the existing correlations with the remaining attributes. Our method GANSan is partially inspired by the powerful framework of generative adversarial networks (in particular Cycle-GANs), which offers a flexible way to learn a distribution empirically or to translate between two different distributions. In contrast to prior work, one of the strengths of our approach is that the sanitization is performed in the same space as the original data by only modifying the other attributes as little as possible, thus preserving the interpretability of the sanitized data. Consequently, once the sanitizer is trained, it can be applied to new data locally by an individual on their profile before releasing it. Finally, experiments on real datasets demonstrate the effectiveness of the approach as well as the achievable trade-off between fairness and utility.

2021-03-13

Algorithms (published)

doi.org

arxiv.org

Continuing professional education of Iranian healthcare professionals in shared decision-making: lessons learned

S. A. Rahimi

Charo Rodriguez

Jordie Croteau

Alireza Sadeghpour

Amir-Mohammad Navali

France Légaré

2021-03-11

BMC Health Services Research (published)

doi.org

Staying Ahead of the Epidemiologic Curve: Evaluation of the British Columbia Asthma Prediction System (BCAPS) During the Unprecedented 2018 Wildfire Season

Sarah B. Henderson

Kathryn T. Morrison

Kathleen E. McLean

Yue Ding

Jiayun Yao

Gavin Shaddick

David L Buckeridge

2021-03-11

Frontiers in Public Health (published)

doi.org

Pretraining Reward-Free Representations for Data-Efficient Reinforcement Learning

Philip Bachman

2021-03-08

International Conference on Learning Representations (unknown)

openreview.net

Parallel inference of hierarchical latent dynamics in two-photon calcium imaging of neuronal populations

Luke Y. Prince

Shahab Bakhtiari

Colleen J. Gillon

Blake A. Richards

Dynamic latent variable modelling has provided a powerful tool for understanding how populations of neurons compute. For spiking data, such latent variable modelling can treat the data as a set of point-processes, due to the fact that spiking dynamics occur on a much faster timescale than the computational dynamics being inferred. In contrast, for other experimental techniques, the slow dynamics governing the observed data are similar in timescale to the computational dynamics that researchers want to infer. An example of this is in calcium imaging data, where calcium dynamics can have timescales on the order of hundreds of milliseconds. As such, the successful application of dynamic latent variable modelling to modalities like calcium imaging data will rest on the ability to disentangle the deeper- and shallower-level dynamical systems’ contributions to the data. To date, no techniques have been developed to directly achieve this. Here we solve this problem by extending recent advances using sequential variational autoencoders for dynamic latent variable modelling of neural data. Our system VaLPACa (Variational Ladders for Parallel Autoencoding of Calcium imaging data) solves the problem of disentangling deeper- and shallower-level dynamics by incorporating a ladder architecture that can infer a hierarchy of dynamical systems. Using some built-in inductive biases for calcium dynamics, we show that we can disentangle calcium flux from the underlying dynamics of neural computation. First, we demonstrate with synthetic calcium data that we can correctly disentangle an underlying Lorenz attractor from calcium dynamics. Next, we show that we can infer appropriate rotational dynamics in spiking data from macaque motor cortex after it has been converted into calcium fluorescence data via a calcium dynamics model. Finally, we show that our method applied to real calcium imaging data from primary visual cortex in mice allows us to infer latent factors that carry salient sensory information about unexpected stimuli. These results demonstrate that variational ladder autoencoders are a promising approach for inferring hierarchical dynamics in experimental settings where the measured variable has its own slow dynamics, such as calcium imaging data. Our new, open-source tool thereby provides the neuroscience community with the ability to apply dynamic latent variable modelling to a wider array of data modalities.

2021-03-07

BioRxiv (preprint)

doi.org

Enabling Technologies for Energy Cloud

Thar Intisar Baker

Zehua Guo

Ali Ismail Ali Awad

Shangguang Wang

Benjamin C. M. Fung

2021-03-04

Journal of Parallel and Distributed Computing (published)

doi.org

Training a First-Order Theorem Prover from Synthetic Data

Vlad Firoiu

Eser Aygün

Ankit Anand

Zafarali Ahmed

Xavier Glorot

Laurent Orseau

Lei Zhang

Doina Precup

Shibl Mourad

2021-03-04

ArXiv (preprint)

arxiv.org

Comment on Starke et al.: “Computing schizophrenia: ethical challenges for machine learning in psychiatry”: From machine learning to student learning: pedagogical challenges for psychiatry – Corrigendum

Christophe Gauld

Jean‐Arthur Micoulaud‐Franchi

Guillaume Dumas

2021-03-03

Psychological Medicine (published)

doi.org

A Two-Stream Continual Learning System With Variational Domain-Agnostic Feature Replay

Qicheng Lao

Xiang Jiang

Mohammad Havaei

Yoshua Bengio

Learning in nonstationary environments is one of the biggest challenges in machine learning. Nonstationarity can be caused by either task drift, i.e., the drift in the conditional distribution of labels given the input data, or the domain drift, i.e., the drift in the marginal distribution of the input data. This article aims to tackle this challenge with a modularized two-stream continual learning (CL) system, where the model is required to learn new tasks from a support stream and adapt to new domains in the query stream while maintaining previously learned knowledge. To deal with both drifts within and across the two streams, we propose a variational domain-agnostic feature replay-based approach that decouples the system into three modules: an inference module that filters the input data from the two streams into domain-agnostic representations, a generative module that facilitates the high-level knowledge transfer, and a solver module that applies the filtered and transferable knowledge to solve the queries. We demonstrate the effectiveness of our proposed approach in addressing the two fundamental scenarios and complex scenarios in two-stream CL.

2021-03-02

IEEE Transactions on Neural Networks and Learning Systems (published)

doi.org
