Recent advances in variational inference enable the modelling of highly structured joint distributions, but are limited in their capacity to scale to the high-dimensional setting of stochastic neural networks. This limitation motivates a need for scalable parameterizations of the noise generation process, in a manner that adequately captures the dependencies among the various parameters. In this work, we address this need and present the Kronecker Flow, a generalization of the Kronecker product to invertible mappings designed for stochastic neural networks. We apply our method to variational Bayesian neural networks on predictive tasks, PAC-Bayes generalization bound estimation, and approximate Thompson sampling in contextual bandits. In all setups, our methods prove to be competitive with existing methods and better than the baselines.
2020-06-02
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (published)
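The Kronecker Flow abstract above does not spell out its construction, so the following is only a minimal sketch of the standard linear Kronecker product that the method generalizes, not the authors' flow. It illustrates two well-known properties that make Kronecker structure attractive for scalable parameterizations: a Kronecker-factored map needs far fewer parameters than a dense one, and its log-determinant factorizes as log|det(A ⊗ B)| = m log|det A| + n log|det B| for A of size n × n and B of size m × m. The matrices `A` and `B` and the helper names `kron` and `det` are illustrative assumptions.

```python
import math


def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def det(mat):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in mat]
    size = len(a)
    result = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


# Hypothetical factors chosen only for illustration.
A = [[2.0, 1.0], [0.5, 3.0]]                              # n x n, n = 2
B = [[1.0, 0.0, 2.0], [0.0, 4.0, 1.0], [1.0, 1.0, 1.0]]   # m x m, m = 3
n, m = len(A), len(B)

K = kron(A, B)  # dense (n*m) x (n*m) matrix

# Log-determinant computed on the dense matrix and from the factors alone.
logdet_full = math.log(abs(det(K)))
logdet_factored = m * math.log(abs(det(A))) + n * math.log(abs(det(B)))

# Parameter counts: dense map versus Kronecker-factored map.
params_full = (n * m) ** 2
params_factored = n * n + m * m

print(logdet_full, logdet_factored, params_full, params_factored)
```

The two log-determinants agree up to floating-point error, while the factored form stores n² + m² numbers instead of (nm)²; the gap widens quickly as n and m grow.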
We consider differentiable games where the goal is to find a Nash equilibrium. The machine learning community has recently started using variants of the gradient method (GD). Prime examples are extragradient (EG), the optimistic gradient method (OG) and consensus optimization (CO), which enjoy linear convergence in cases like bilinear games, where the standard GD fails. The full benefits of these relatively new methods are not known as there is no unified analysis for both strongly monotone and bilinear games. We provide new analyses of the EG's local and global convergence properties and use it to get a tighter global convergence rate for OG and CO. Our analysis covers the whole range of settings between bilinear and strongly monotone games. It reveals that these methods converge via different mechanisms at these extremes; in between, it exploits the most favorable mechanism for the given problem. We then prove that EG achieves the optimal rate for a wide class of algorithms with any number of extrapolations. Our tight analysis of EG's convergence rate in games shows that, unlike in convex minimization, EG may be much faster than GD.
2020-06-02
International Conference on Artificial Intelligence and Statistics (unknown)
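The contrast between GD and EG on bilinear games, mentioned in the abstract above, can be seen on the simplest example, min over x and max over y of f(x, y) = x·y, whose unique Nash equilibrium is (0, 0). The sketch below is a toy illustration under assumed settings (the step size, starting point, and iteration count are arbitrary choices, not taken from the paper): simultaneous GD spirals away from the equilibrium, while EG, which evaluates the gradient at an extrapolated point, contracts toward it.

```python
def gd_step(x, y, lr):
    """Simultaneous gradient step: descent in x, ascent in y, for f(x, y) = x * y."""
    return x - lr * y, y + lr * x


def eg_step(x, y, lr):
    """Extragradient step: extrapolate first, then update from the original point."""
    # Extrapolation (look-ahead) point.
    x_half, y_half = x - lr * y, y + lr * x
    # Update the original iterate using the gradient at the look-ahead point.
    return x - lr * y_half, y + lr * x_half


def distance_after(step, n_iters=200, lr=0.1, start=(1.0, 1.0)):
    """Distance to the equilibrium (0, 0) after n_iters steps."""
    x, y = start
    for _ in range(n_iters):
        x, y = step(x, y, lr)
    return (x * x + y * y) ** 0.5


initial_dist = 2 ** 0.5
gd_dist = distance_after(gd_step)
eg_dist = distance_after(eg_step)

print("initial:", initial_dist)
print("GD:", gd_dist)
print("EG:", eg_dist)
```

For this game each GD step multiplies the squared distance by 1 + lr², so GD diverges for any positive step size, whereas each EG step multiplies it by 1 − lr² + lr⁴, which is below 1 for 0 < lr < 1.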
We introduce a principled method to train end-to-end analog neural networks by stochastic gradient descent. In these analog neural networks, the weights to be adjusted are implemented by the conductances of programmable resistive devices such as memristors [Chua, 1971], and the nonlinear transfer functions (or 'activation functions') are implemented by nonlinear components such as diodes. We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models: they possess an energy function as a consequence of Kirchhoff's laws governing electrical circuits. This property enables us to train them using the Equilibrium Propagation framework [Scellier and Bengio, 2017]. Our update rule for each conductance, which is local and relies solely on the voltage drop across the corresponding resistor, is shown to compute the gradient of the loss function. Our numerical simulations, which use the SPICE-based Spectre simulation framework to simulate the dynamics of electrical circuits, demonstrate training on the MNIST classification task, performing comparably or better than equivalent-size software-based neural networks. Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
High-resolution satellite imagery is critical for various earth observation applications related to environment monitoring, geoscience, forecasting, and land use analysis. However, the acquisition cost of such high-quality imagery due to the scarcity of providers and needs for high-frequency revisits restricts its accessibility in many fields. In this work, we present a data-driven, multi-image super resolution approach to alleviate these problems. Our approach is based on an end-to-end deep neural network that consists of an encoder, a fusion module, and a decoder. The encoder extracts co-registered highly efficient feature representations from low-resolution images of a scene. A Gated Recurrent Unit (GRU)-based module acts as the fusion module, aggregating features into a combined representation. Finally, a decoder reconstructs the super-resolved image. The proposed model is evaluated on the PROBA-V dataset released in a recent competition held by the European Space Agency. Our results show that it performs among the top contenders and offers a new practical solution for real-world applications.
2020-05-31
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (published)
The social brain hypothesis proposes that the complexity of human brains has coevolved with increasing complexity of social interactions in primate societies. The present study explored the possible relationships between brain morphology and the richness of more intimate ‘inner’ and wider ‘outer’ social circles by integrating Bayesian hierarchical modeling with a large cohort sample from the UK Biobank resource (n = 10 000). In this way, we examined population volume effects in 36 regions of the ‘social brain’, ranging from lower sensory to higher associative cortices. We observed strong volume effects in the visual sensory network for the group of individuals with satisfying friendships. Further, the limbic network displayed several brain regions with substantial volume variations in individuals with a lack of social support. Our population neuroscience approach thus showed that distinct networks of the social brain show different patterns of volume variations linked to the examined social indices.
2020-05-31
Social Cognitive and Affective Neuroscience (published)
Restless bandits are a class of sequential resource allocation problems concerned with allocating one or more resources among several alternative processes where the evolution of each process depends on the resources allocated to it. Such models capture the fundamental trade-offs between exploration and exploitation. In 1988, Whittle developed an index heuristic for restless bandit problems which has emerged as a popular solution approach due to its simplicity and strong empirical performance. The Whittle index heuristic is applicable if the model satisfies a technical condition known as indexability. In this paper, we present two general sufficient conditions for indexability and identify simpler-to-verify refinements of these conditions. We then present a general algorithm to compute the Whittle index for indexable restless bandits. Finally, we present a detailed numerical study which affirms the strong performance of the Whittle index heuristic.
Current works and future directions on application of machine learning in primary care
S. A. Rahimi
Vera Granikov
Pierre Pluye
In this short paper, we describe current machine learning work in primary care, based on a scoping review that we performed. The review followed the methodological framework proposed by Colquhoun and colleagues. Lastly, we discuss our observations and give directions for future studies in this fast-growing area.
2020-05-26
International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems (published)
Failure to follow medication changes made at hospital discharge is associated with adverse events in 30 days
Daniala L. Weir
Aude Motulsky
Michal Abrahamowicz
Todd C. Lee
Steven Morgan
David L. Buckeridge
Robyn Tamblyn
To evaluate the hypothesis that nonadherence to medication changes made at hospital discharge is associated with an increased risk of adverse events in the 30 days postdischarge.
Patients admitted to hospitals in Montreal, Quebec, between 2014 and 2016.
Prospective cohort study.
Nonadherence to medication changes was measured by comparing medications dispensed in the community with those prescribed at hospital discharge. Patient, health system, and drug regimen‐level covariates were measured using medical services and pharmacy claims data as well as data abstracted from the patient's hospital chart. Multivariable Cox models were used to determine the association between nonadherence to medication changes and the risk of adverse events.
Among 2655 patients who met our inclusion criteria, mean age was 69.5 years (SD 14.7) and 1581 (60%) were males. Almost half of patients (n = 1161, 44%) were nonadherent to at least one medication change, and 860 (32%) were readmitted to hospital, visited the emergency department, or died in the 30 days postdischarge. Patients who were not adherent to any of their medication changes had a 35% higher risk of adverse events compared to those who were adherent to all medication changes (1.41 vs 1.27 events/100 person‐days, adjusted hazard ratio: 1.35, 95% CI: 1.06‐1.71).
Almost half of all patients were not adherent to some or all changes made to their medications at hospital discharge. Nonadherence to all changes was associated with an increased risk of adverse events. Interventions addressing barriers to adherence should be considered moving forward.
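As a quick arithmetic check on the figures reported in the abstract above, the sketch below recomputes the rounded percentages from the stated counts and the crude ratio of the two event rates. All numbers are taken from the abstract; the variable names are illustrative. The crude ratio is not expected to equal the adjusted hazard ratio of 1.35, since the latter comes from multivariable Cox models that account for patient, health system, and drug regimen covariates.

```python
# Counts and rates as reported in the abstract.
n_patients = 2655
n_male = 1581
n_nonadherent = 1161      # nonadherent to at least one medication change
n_adverse = 860           # readmitted, visited the ED, or died within 30 days
rate_nonadherent = 1.41   # events per 100 person-days, adherent to no changes
rate_adherent = 1.27      # events per 100 person-days, adherent to all changes


def percent(count, total=n_patients):
    """Share of the cohort, rounded to the nearest whole percent."""
    return round(100 * count / total)


pct_male = percent(n_male)
pct_nonadherent = percent(n_nonadherent)
pct_adverse = percent(n_adverse)

# Unadjusted ratio of the two reported event rates.
crude_rate_ratio = rate_nonadherent / rate_adherent

print(pct_male, pct_nonadherent, pct_adverse)
print(round(crude_rate_ratio, 2))
```

The recomputed percentages match the rounded values given in the abstract (60%, 44%, and 32%).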