Publications
FL Games: A federated learning framework for distribution shifts
Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server. However, participating clients typically each hold data from a different distribution, whereby predictive models with strong in-distribution generalization can fail catastrophically on unseen domains. In this work, we argue that in order to generalize better across non-i.i.d. clients, it is imperative to learn only correlations that are stable and invariant across domains. We propose FL Games, a game-theoretic framework for federated learning that learns causal features invariant across clients. While training to achieve the Nash equilibrium, the traditional best-response strategy suffers from high-frequency oscillations. We demonstrate that FL Games effectively resolves this challenge and exhibits smooth performance curves. Further, FL Games scales well in the number of clients, requires significantly fewer communication rounds, and is agnostic to device heterogeneity. Through empirical evaluation, we demonstrate that FL Games achieves high out-of-distribution performance on various benchmarks.
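The best-response oscillation mentioned in the abstract is a generic phenomenon in zero-sum games. A minimal sketch below uses matching pennies (not the FL Games objective itself, which is an assumption-free stand-in) to show pure best-response dynamics cycling with period 4 instead of converging:

```python
import numpy as np

# Matching pennies: row player's payoff matrix (column player gets -A).
A = np.array([[1, -1], [-1, 1]])

def row_br(q):
    """Row player's pure best response to the column player's last action q."""
    return int(np.argmax(A @ np.eye(2)[q]))

def col_br(p):
    """Column player's pure best response: minimize the row player's payoff."""
    return int(np.argmin(A.T @ np.eye(2)[p]))

# Simultaneous best-response dynamics from (heads, heads).
p, q = 0, 0
plays = []
for _ in range(12):
    p, q = row_br(q), col_br(p)   # both respond to the previous round
    plays.append((p, q))

# The joint play cycles forever with period 4; it never settles at the
# mixed Nash equilibrium (0.5, 0.5).
assert plays[0] == plays[4] == plays[8]
```

Smoothed variants (e.g. responding to an opponent's historical average rather than its last move) damp this cycling, which is the kind of issue the abstract says FL Games addresses.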
We develop the sparse VAE for unsupervised representation learning on high-dimensional data. The sparse VAE learns a set of latent factors (representations) which summarize the associations in the observed data features. The underlying model is sparse in that each observed feature (i.e. each dimension of the data) depends on a small subset of the latent factors. As examples, in ratings data each movie is only described by a few genres; in text data each word is only applicable to a few topics; in genomics, each gene is active in only a few biological processes. We prove such sparse deep generative models are identifiable: with infinite data, the true model parameters can be learned. (In contrast, most deep generative models are not identifiable.) We empirically study the sparse VAE with both simulated and real data. We find that it recovers meaningful latent factors and has smaller held-out reconstruction error than related methods.
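The per-feature sparsity structure described above can be sketched with a toy linear decoder: each observed feature loads on only a few latent factors via a binary selection mask. The dimensions, mask construction, and variable names are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): 6 observed features, 4 latent factors,
# each feature depending on exactly k = 2 factors.
n_features, n_factors, k = 6, 4, 2

# Sparse dependency structure: each row of W_mask selects k factors.
W_mask = np.zeros((n_features, n_factors))
for j in range(n_features):
    W_mask[j, rng.choice(n_factors, size=k, replace=False)] = 1.0

# Sparse loadings: dense Gaussian weights zeroed outside the mask.
W = rng.normal(size=(n_features, n_factors)) * W_mask

z = rng.normal(size=n_factors)   # latent factors for one data point
x_mean = W @ z                   # decoder mean for the observed features

# Each feature's reconstruction depends on at most k latent factors.
assert all(np.count_nonzero(W_mask[j]) == k for j in range(n_features))
```

In the actual sparse VAE the decoder is a neural network and the sparsity pattern is learned; the point of the sketch is only the masked, per-feature dependency structure that underlies the identifiability claim.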
Impact of a vaccine passport on first-dose COVID-19 vaccine coverage by age and area-level social determinants in the Canadian provinces of Quebec and Ontario: an interrupted time series analysis
Background: In Canada, all provinces implemented vaccine passports in 2021 to increase vaccine uptake and reduce transmission in non-essential indoor spaces. We evaluate the impact of vaccine passport policies on first-dose COVID-19 vaccination coverage by age, area-level income and proportion racialized. Methods: We performed interrupted time-series analyses using vaccine registry data linked to census information in Quebec and Ontario (20.5 million people ≥12 years; unit of analysis: dissemination area). We fit negative binomial regressions to weekly first-dose vaccination, using a natural spline to capture pre-announcement trends, adjusting for baseline vaccination coverage (start: July 3rd; end: October 23rd Quebec, November 13th Ontario). We obtained counterfactual vaccination rates and coverage, and estimated the vaccine passports' impact on vaccination coverage (absolute) and new vaccinations (relative). Results: In both provinces, pre-announcement first-dose vaccination coverage was 82% (≥12 years). The announcement resulted in estimated increases in vaccination coverage of 0.9 percentage points (p.p.; 95% CI: 0.4-1.2) in Quebec and 0.7 p.p. (95% CI: 0.5-0.8) in Ontario. In relative terms, these increases correspond to 23% (95% CI: 10-36%) and 19% (95% CI: 15-22%) more vaccinations. The impact was larger among people aged 12-39 (1-2 p.p.). There was little variability in the absolute impact by area-level income or proportion racialized in either province. Conclusions: In the context of high baseline vaccine coverage across two provinces, the announcement of vaccine passports had a small impact on first-dose coverage, with little impact on reducing economic and racial inequities in vaccine coverage. Findings suggest the need for other policies to further increase vaccination coverage among lower-income and more racialized neighbourhoods and communities.
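The counterfactual logic of the interrupted time-series design can be illustrated on synthetic weekly counts. A simple log-linear trend stands in for the paper's negative-binomial regression with a natural spline, and every number below is made up, not study data:

```python
import numpy as np

# Synthetic weekly first-dose counts with a declining baseline trend
# and a hypothetical 20% bump after an announcement in week 10.
weeks = np.arange(16)
announce = 10
baseline = 1000 * np.exp(-0.1 * weeks)   # pre-policy uptake trend
observed = baseline.copy()
observed[announce:] *= 1.2               # post-announcement increase

# Fit the pre-announcement trend on the log scale (stand-in for the
# negative-binomial/spline model), then extrapolate it forward as the
# counterfactual: what uptake would have been without the policy.
coef = np.polyfit(weeks[:announce], np.log(observed[:announce]), deg=1)
counterfactual = np.exp(np.polyval(coef, weeks))

# Relative impact: extra vaccinations versus the counterfactual.
extra = observed[announce:].sum() / counterfactual[announce:].sum() - 1
print(f"estimated relative increase: {extra:.1%}")
```

Because the synthetic pre-trend is exactly log-linear, the fit recovers it and the estimated relative increase is the injected 20%; with real registry data the spline and negative-binomial likelihood handle curvature and overdispersion that this sketch ignores.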
Throughout the SARS-CoV-2 pandemic, several variants of concern (VOC) have been identified, many of which share recurrent mutations in the spike protein’s receptor binding domain (RBD). This region coincides with known epitopes and can therefore have an impact on immune escape. Protracted infections in immunosuppressed patients have been hypothesized to lead to an enrichment of such mutations and therefore drive evolution towards VOCs. Here, we show that immunosuppressed patients with hematologic cancers develop distinct populations with immune escape mutations throughout the course of their infection. Notably, by investigating the co-occurrence of substitutions on individual sequencing reads in the RBD, we found quasispecies harboring mutations that confer resistance to known monoclonal antibodies (mAbs) such as S:E484K and S:E484A. Furthermore, we provide the first evidence for a viral reservoir based on intra-host phylogenetics. Our results on viral reservoirs can shed light on protracted infections interspersed with periods where the virus is undetectable, and offer an alternative explanation for some long-COVID cases. Our findings also highlight that protracted infections should be treated with combination therapies rather than a single mAb to clear pre-existing resistant mutations.
Decision-making AI agents are often faced with two important challenges: the depth of the planning horizon, and the branching factor due to having many choices. Hierarchical reinforcement learning methods aim to solve the first problem, by providing shortcuts that skip over multiple time steps. To cope with the breadth, it is desirable to restrict the agent's attention at each step to a reasonable number of possible choices. The concept of affordances (Gibson, 1977) suggests that only certain actions are feasible in certain states. In this work, we first characterize "affordances" as a "hard" attention mechanism that strictly limits the available choices of temporally extended options. We then investigate the role of hard versus soft attention in training data collection, abstract value learning in long-horizon tasks, and handling a growing number of choices. To this end, we present an online, model-free algorithm to learn affordances that can be used to further learn subgoal options. Finally, we identify and empirically demonstrate the settings in which the "paradox of choice" arises, i.e. when having fewer but more meaningful choices improves the learning speed and performance of a reinforcement learning agent.
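The "hard" attention view of affordances amounts to masking the option set per state before the agent selects among its values. The mask below is a fixed, hypothetical stand-in for a learned affordance model, and all sizes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: 4 abstract states, 8 temporally extended options.
n_states, n_options = 4, 8
q_values = rng.normal(size=(n_states, n_options))  # option values

# Hypothetical affordance mask: in each state only the first 3 options
# are deemed feasible ("hard" attention over the option set). A learned
# model would make this state-dependent.
affordable = np.zeros((n_states, n_options), dtype=bool)
affordable[:, :3] = True

def greedy_option(state):
    """Greedy selection restricted to affordable options: infeasible
    options are masked out with -inf before the argmax."""
    masked_q = np.where(affordable[state], q_values[state], -np.inf)
    return int(np.argmax(masked_q))

# The chosen option is always affordable, regardless of raw Q-values.
assert all(affordable[s, greedy_option(s)] for s in range(n_states))
```

A "soft" attention variant would down-weight rather than exclude options, e.g. by adding a penalty instead of -inf; the hard mask is what shrinks the effective branching factor.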
Modern deep learning involves training costly, highly overparameterized networks, thus motivating the search for sparser networks that require less compute and memory but can still be trained to the same accuracy as the full network (i.e. matching). Iterative magnitude pruning (IMP) is a state-of-the-art algorithm that can find such highly sparse matching subnetworks, known as winning tickets, that can be retrained from initialization or an early training stage. IMP operates by iterative cycles of training, masking a fraction of smallest magnitude weights, rewinding unmasked weights back to an early training point, and repeating. Despite its simplicity, the underlying principles for when and how IMP finds winning tickets remain elusive. In particular, what useful information does an IMP mask found at the end of training convey to a rewound network near the beginning of training? We find that—at higher sparsities—pairs of pruned networks at successive pruning iterations are connected by a linear path with zero error barrier if and only if they are matching. This indicates that masks found at the end of training encode information about the identity of an axial subspace that intersects a desired linearly connected mode of a matching sublevel set. We leverage this observation to design a simple adaptive pruning heuristic for speeding up the discovery of winning tickets and achieve a 30% reduction in computation time on CIFAR-100. These results make progress toward demystifying the existence of winning tickets with an eye towards enabling the development of more efficient pruning algorithms.
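The IMP cycle described above (train, mask the smallest-magnitude surviving weights, rewind, repeat) can be sketched for a single pruning round on a flat weight vector. The function names and the per-vector (rather than per-layer) pruning are simplifying assumptions; real IMP operates on a trained network:

```python
import numpy as np

def imp_round(weights, init_weights, mask, prune_frac=0.2):
    """One round of iterative magnitude pruning on a flat weight vector.

    `weights` are the trained weights, `init_weights` the stored
    early-training checkpoint, and `mask` the current binary mask.
    Returns the updated mask and the rewound weights.
    """
    surviving = np.abs(weights[mask == 1])
    k = int(prune_frac * surviving.size)        # how many weights to prune
    if k > 0:
        threshold = np.sort(surviving)[k - 1]   # k-th smallest magnitude
        new_mask = mask * (np.abs(weights) > threshold).astype(int)
    else:
        new_mask = mask.copy()
    # Rewind: surviving weights return to their early-training values;
    # pruned weights are frozen at zero.
    rewound = init_weights * new_mask
    return new_mask, rewound

# Toy example: prune 40% of 5 weights, rewind the rest to 1.0.
w = np.array([0.1, -0.5, 2.0, 0.05, 1.0])
init = np.ones(5)
mask = np.ones(5, dtype=int)
new_mask, rewound = imp_round(w, init, mask, prune_frac=0.4)
```

After this round the two smallest-magnitude weights (0.1 and 0.05) are masked out; the full algorithm retrains the rewound network and repeats until the target sparsity is reached.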