Publications

In-Processing Fairness Improvement Methods for Regression Data-Driven Building Models: Achieving Uniform Energy Prediction
Ying Sun
Benjamin C. M. Fung
Fariborz Haghighat
Interpolated Adversarial Training: Achieving Robust Neural Networks Without Sacrificing Too Much Accuracy
Adversarial robustness has become a central goal in deep learning, both in theory and in practice. However, successful methods to improve adversarial robustness (such as adversarial training) greatly hurt generalization performance on the unperturbed data. This could have a major impact on how achieving adversarial robustness affects real-world systems (i.e. many may opt to forego robustness if it can improve accuracy on the unperturbed data). We propose Interpolated Adversarial Training, which employs recently proposed interpolation-based training methods in the framework of adversarial training. On CIFAR-10, adversarial training increases the standard test error (when there is no adversary) from 4.43% to 12.32%, whereas with our Interpolated Adversarial Training we retain adversarial robustness while achieving a standard test error of only 6.45%. With our technique, the relative increase in the standard error for the robust model is reduced from 178.1% to just 45.5%.
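The relative increases quoted in this abstract can be recomputed from the standard test errors. The following is a minimal Python sketch of that arithmetic, assuming "relative increase" means (new - baseline) / baseline; the function name `relative_increase` is illustrative and does not come from the paper. Because the inputs are the rounded percentages given in the abstract, the last digit of the second figure may differ slightly from the reported 45.5%.

```python
def relative_increase(baseline: float, new: float) -> float:
    """Relative increase of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Standard test errors on CIFAR-10 as quoted in the abstract (percent).
BASELINE_ERROR = 4.43       # no adversarial training
ADVERSARIAL_ERROR = 12.32   # plain adversarial training
INTERPOLATED_ERROR = 6.45   # Interpolated Adversarial Training

adv_increase = relative_increase(BASELINE_ERROR, ADVERSARIAL_ERROR)
iat_increase = relative_increase(BASELINE_ERROR, INTERPOLATED_ERROR)

print(f"adversarial training: +{adv_increase:.1f}%")
print(f"interpolated adversarial training: +{iat_increase:.1f}%")
```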
Lifelong Topological Visual Navigation
Commonly, learning-based topological navigation approaches produce a local policy while preserving some loose connectivity of the space through a topological map. Nevertheless, spurious or missing edges in the topological graph often lead to navigation failure. In this work, we propose a sampling-based graph building method, which results in sparser graphs yet with higher navigation performance compared to baseline methods. We also propose graph maintenance strategies that eliminate spurious edges and expand the graph as needed, which improves lifelong navigation performance. Unlike controllers that learn from fixed training environments, we show that our model can be fine-tuned using only a small number of collected trajectory images from a real-world environment where the agent is deployed. We demonstrate successful navigation after fine-tuning on real-world environments, and notably show significant navigation improvements over time by applying our lifelong graph maintenance strategies.
MixEHR-Guided: A guided multi-modal topic modeling approach for large-scale automatic phenotyping using the electronic health record
Yuri Ahuja
Yuesong Zou
David L Buckeridge
Yuemei Li
Evolution of cell size control is canalized towards adders or sizers by cell cycle structure and selective pressures
Felix Proulx-Giraldeau
Jan M Skotheim
Cell size is controlled to be within a specific range to support physiological function. To control their size, cells use diverse mechanisms ranging from ‘sizers’, in which differences in cell size are compensated for in a single cell division cycle, to ‘adders’, in which a constant amount of cell growth occurs in each cell cycle. This diversity raises the question of why a particular cell would implement one mechanism rather than another. To address this question, we performed a series of simulations evolving cell size control networks. The size control mechanism that evolved was influenced by both cell cycle structure and specific selection pressures. Moreover, evolved networks recapitulated known size control properties of naturally occurring networks. If the mechanism is based on a G1 size control and an S/G2/M timer, as found for budding yeast and some human cells, adders likely evolve. But if the G1 phase is significantly longer than the S/G2/M phase, as is often the case in mammalian cells in vivo, sizers become more likely. Sizers also evolve when the cell cycle structure is inverted so that G1 is a timer, while S/G2/M performs size control, as is the case for the fission yeast S. pombe. For some size control networks, cell size consistently decreases in each cycle until a burst of cell cycle inhibitor drives an extended G1 phase much like the cell division cycle of the green alga Chlamydomonas. That these size control networks evolved such self-organized criticality shows how the evolution of complex systems can drive the emergence of critical processes.
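The distinction between sizers and adders drawn in this abstract can be illustrated with a toy calculation. The Python sketch below is not the evolved-network simulation from the paper; it assumes an idealized, noise-free cell that divides symmetrically, and the function names, the target division size of 2.0 and the added size of 1.0 are all hypothetical choices made for illustration.

```python
def next_birth_size(birth_size: float, rule: str,
                    target: float = 2.0, added: float = 1.0) -> float:
    """Birth size of a daughter cell after one cycle with symmetric division."""
    if rule == "sizer":
        division_size = target               # divide on reaching a fixed size
    elif rule == "adder":
        division_size = birth_size + added   # grow by a fixed increment
    else:
        raise ValueError(f"unknown rule: {rule}")
    return division_size / 2.0


def trajectory(birth_size: float, rule: str, cycles: int = 5) -> list:
    """Birth sizes over successive cycles, starting from `birth_size`."""
    sizes = [birth_size]
    for _ in range(cycles):
        sizes.append(next_birth_size(sizes[-1], rule))
    return sizes


# Start both cells 60% larger than the steady-state birth size of 1.0.
sizer_sizes = trajectory(1.6, "sizer")
adder_sizes = trajectory(1.6, "adder")

print("sizer:", [round(s, 3) for s in sizer_sizes])
print("adder:", [round(s, 3) for s in adder_sizes])
```

Under these assumptions the sizer returns to the steady-state birth size after a single division, whereas the adder halves its deviation from that size in every cycle.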
Learning Robust Kernel Ensembles with Kernel Average Pooling
Amirozhan Dehghani
Yifei Ren
SPeCiaL: Self-Supervised Pretraining for Continual Learning
Lucas Caccia
From analytic to synthetic-organizational pluralisms: A pluralistic enactive psychiatry
Christophe Gauld
Kristopher Nielsen
Manon Job
Hugo Bottemanne
Reliance on sole reductionism, whether explanatory, methodological or ontological, is difficult to support in clinical psychiatry. Rather, psychiatry is challenged by a plurality of approaches. There exist multiple legitimate ways of understanding human functionality and disorder, i.e., different systems of representation, different tools, different methodologies and objectives. Pluralistic frameworks have been presented through which the multiplicity of approaches in psychiatry can be understood. In parallel with these frameworks, an enactive approach for psychiatry has been proposed. In this paper, we consider the relationships between the different kinds of pluralistic frameworks and this enactive approach for psychiatry. We compare the enactive approach in psychiatry with wider analytical forms of pluralism. On one side, the enactive framework, anchored in cognitive sciences, the theory of dynamic systems, systems biology, and phenomenology, has recently been proposed as an answer to the challenge of an integrative psychiatry. On the other side, two forms of explanatory pluralism can be described: a non-integrative pluralism and an integrative pluralism. The first is tolerant: it examines the coexistence of different potentially incompatible or untranslatable systems in the scientific or clinical landscape. The second is integrative and proposes to bring together the different levels of understanding and systems of representation. We propose that enactivism is inherently a form of integrative pluralism, but it is at the same time a component of the general framework of explanatory pluralism, composed of a set of so-called analytical approaches. A significant number of mental health professionals already accept the variety of clinical and scientific approaches. In this way, a rigorous understanding of the theoretical positioning of psychiatric actors seems necessary to promote quality clinical practice.
The study of entanglements between an analytical pluralism and a synthetic-organizational enactivist pluralism could prove fruitful.
FedShuffle: Recipes for Better Use of Local Work in Federated Learning
Samuel Horváth
Maziar Sanjabi
Lin Xiao
Peter Richtárik
Michael G. Rabbat
The practice of applying several local updates before aggregation across clients has been empirically shown to be a successful approach to overcoming the communication bottleneck in Federated Learning (FL). Such methods are usually implemented by having clients perform one or more epochs of local training per round while randomly reshuffling their finite dataset in each epoch. Data imbalance, where clients have different numbers of local training samples, is ubiquitous in FL applications, resulting in different clients performing different numbers of local updates in each round. In this work, we propose a general recipe, FedShuffle, that better utilizes the local updates in FL, especially in this regime encompassing random reshuffling and heterogeneity. FedShuffle is the first local update method with theoretical convergence guarantees that incorporates random reshuffling, data imbalance, and client sampling, features that are essential in large-scale cross-device FL. We present a comprehensive theoretical analysis of FedShuffle and show, both theoretically and empirically, that it does not suffer from the objective function mismatch that is present in FL methods that assume homogeneous updates in heterogeneous FL setups, such as FedAvg (McMahan et al., 2017). In addition, by combining the ingredients above, FedShuffle improves upon FedNova (Wang et al., 2020), which was previously proposed to solve this mismatch. Similar to Mime (Karimireddy et al., 2020), we show that FedShuffle with momentum variance reduction (Cutkosky & Orabona, 2019) improves upon non-local methods under a Hessian similarity assumption.
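The setting this abstract describes, local epochs with random reshuffling on clients that hold different amounts of data, can be sketched in a few lines. The Python sketch below only counts local updates per round. It does not implement FedShuffle or any of its corrections, and the client names, dataset sizes, batch size and epoch count are hypothetical values chosen for illustration.

```python
import random


def reshuffled_batches(samples, batch_size, rng):
    """Yield mini-batches that cover `samples` once, in a fresh random order."""
    order = list(samples)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


def count_local_steps(num_samples, epochs, batch_size, rng):
    """Number of local updates a client performs in one round."""
    steps = 0
    for _ in range(epochs):
        # Each epoch reshuffles the client's finite dataset.
        for _batch in reshuffled_batches(range(num_samples), batch_size, rng):
            steps += 1  # one local update per mini-batch
    return steps


rng = random.Random(0)
client_sizes = {"client_a": 10, "client_b": 45, "client_c": 200}  # hypothetical
steps_per_round = {
    name: count_local_steps(n, epochs=2, batch_size=8, rng=rng)
    for name, n in client_sizes.items()
}
print(steps_per_round)
```

With the same number of epochs and the same batch size, the client holding the most data performs many more local updates per round than the smallest one, which is the heterogeneity in local work that the abstract refers to.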
Tackling bias in AI health datasets through the STANDING Together initiative
Shaswath Ganapathi
Johannes Palmer
J. Alderman
Melanie Calvert
Cyrus Espinoza
Jacqui Gath
Marzyeh Ghassemi
Katherine Heller
Francis McKay
Alan Karthikesalingam
S. Kuku
Maxine E. Mackintosh
Sinduja Manohar
Bilal Mateen
Rubeta Matin
Melissa D. McCradden
Lauren Oakden-Rayner
Johan Ordish
Russell Pearson
S. Pfohl
Elizabeth Sapey
Neil J. Sebire
Viknesh Sounderajah
Charlotte Summers
Darren E. Treanor
Alastair Denniston
Xiaoxuan Liu
The 5-year longitudinal diagnostic profile and health services utilization of patients treated with electroconvulsive therapy in Quebec: a population-based study
Simon Lafrenière
Fatemeh Gholi-Zadeh-Kharrat
Caroline Sirois
Victoria Massamba
Louis Rochette
Camille Brousseau-Paradis
Simon Patry
Morgane Lemasson
Geneviève Gariépy
Chantal Mérette
Elham Rahme
Alain Lesage
OSSEM: One-Shot Speaker Adaptive Speech Enhancement Using Meta Learning
Cheng Yu
Tsun-An Hsieh
Yu Tsao
Although deep learning (DL) has achieved notable progress in speech enhancement (SE), further research is still required for a DL-based SE system to adapt effectively and efficiently to particular speakers. In this study, we propose a novel meta-learning-based speaker-adaptive SE approach (called OSSEM) that aims to achieve SE model adaptation in a one-shot manner. OSSEM consists of a modified transformer SE network and a speaker-specific masking (SSM) network. In practice, the SSM network takes an enrolled speaker embedding extracted using ECAPA-TDNN to adjust the input noisy feature through masking. To evaluate OSSEM, we designed a modified Voice Bank-DEMAND dataset, in which one utterance from the testing set was used for model adaptation, and the remaining utterances were used for testing the performance. Moreover, we set restrictions allowing the enhancement process to be conducted in real time, and thus designed OSSEM to be a causal SE system. Experimental results first show that OSSEM can effectively adapt a pretrained SE model to a particular speaker with only one utterance, thus yielding improved SE results. Meanwhile, OSSEM exhibits a competitive performance compared to state-of-the-art causal SE systems.