Publications

Hitting the High-dimensional notes: an ODE for SGD learning dynamics on GLMs and multi-index models
We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g. logistic regression, phase retrieval) with general data-covariance. In particular, we demonstrate a deterministic equivalent of SGD in the form of a system of ordinary differential equations that describes a wide class of statistics, such as the risk and other measures of sub-optimality. This equivalence holds with overwhelming probability when the model parameter count grows proportionally to the number of data. This framework allows us to obtain learning rate thresholds for stability of SGD as well as convergence guarantees. In addition to the deterministic equivalent, we introduce an SDE with a simplified diffusion coefficient (homogenized SGD) which allows us to analyze the dynamics of general statistics of SGD iterates. Finally, we illustrate this theory on some standard examples and show numerical simulations which give an excellent match to the theory.
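To make the setting concrete, here is a minimal sketch of streaming (one-pass) SGD on logistic regression, one of the GLM examples named in the abstract. It is not the authors' code and does not implement their ODE. The function name `streaming_sgd`, the unit-norm "teacher" vector, the isotropic Gaussian features, and the step size `lr / dim` are all assumptions chosen for illustration. The tracked statistic is the overlap between the iterate and the teacher vector.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def streaming_sgd(dim=50, steps=2000, lr=0.5, seed=0):
    """Run one-pass SGD on logistic regression with a fresh sample per step.

    Returns the final weights and the overlap <w, teacher> recorded
    every 100 steps. All settings here are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Hypothetical ground-truth direction, normalized to unit length.
    teacher = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in teacher))
    teacher = [t / norm for t in teacher]

    w = [0.0] * dim
    history = []
    for k in range(steps):
        # Streaming: each step draws a new sample that is never reused.
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        p_true = sigmoid(sum(t * xi for t, xi in zip(teacher, x)))
        y = 1.0 if rng.random() < p_true else 0.0

        # The gradient of the logistic loss with respect to w is (p - y) * x.
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        g = p - y
        w = [wi - (lr / dim) * g * xi for wi, xi in zip(w, x)]

        if k % 100 == 0:
            history.append(sum(wi * ti for wi, ti in zip(w, teacher)))
    return w, history


if __name__ == "__main__":
    _, overlaps = streaming_sgd()
    print([round(o, 3) for o in overlaps])
```

Repeating the run with different seeds or a larger `dim` is one informal way to see how much a summary statistic such as the overlap fluctuates from run to run.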
ASTROPHOT: fitting everything everywhere all at once in astronomical images
Connor J. Stone
Stéphane Courteau
Jean-Charles Cuillandre
Nikhil Arora
We present AstroPhot, a fast, powerful, and user-friendly Python-based astronomical image photometry solver. AstroPhot incorporates automatic differentiation and GPU (or parallel CPU) acceleration, powered by the machine learning library PyTorch. Everything: AstroPhot can fit models for sky, stars, galaxies, PSFs, and more in a principled χ² forward optimization, recovering Bayesian posterior information and covariance of all parameters. Everywhere: AstroPhot can optimize forward models on CPU or GPU; across images that are large, multi-band, multi-epoch, rotated, dithered, and more. All at once: The models are optimized together, thus handling overlapping objects and including the covariance between parameters (including PSF and galaxy parameters). A number of optimization algorithms are available including Levenberg-Marquardt, gradient descent, and No-U-Turn MCMC sampling. With an object-oriented user interface, AstroPhot makes it easy to quickly extract detailed information from complex astronomical data for individual images or large survey programs. This paper outlines novel features of the AstroPhot code and compares it to other popular astronomical image modeling software. AstroPhot is open-source, fully Python based, and freely accessible here: https://github.com/Autostronomy/AstroPhot
BamQuery: a proteogenomic tool to explore the immunopeptidome and prioritize actionable tumor antigens
Maria Virginia Ruiz Cuevas
Marie-Pierre Hardy
Jean-David Larouche
Anca Apavaloaei
Eralda Kina
Krystel Vincent
Patrick Gendron
Jean-Philippe Laverdure
Chantal Durette
Pierre Thibault
Claude Perreault
Gregory Ehx
MHC-I-associated peptides (MAPs) derive from selective yet highly diverse genomic regions, including allegedly non-protein-coding sequences, such as endogenous retroelements (EREs). Quantifying canonical (exonic) and non-canonical MAP-encoding RNA expression in malignant and benign cells is critical for identifying tumor antigens (TAs) but represents a challenge for immunologists. We present BamQuery, a computational tool attributing an exhaustive RNA expression to MAPs of any origin (exon, intron, UTR, intergenic) from bulk and single-cell RNA-sequencing data. We show that non-canonical MAPs (including TAs) can derive from multiple different genomic regions (up to 35,343 for EREs), abundantly expressed in normal tissues. We also show that supposedly tumor-specific mutated MAPs, viral MAPs, and MAPs derived from proteasomal splicing can arise from different unmutated non-canonical genomic regions. The genome-wide approach of BamQuery allows comprehensive mapping of all MAPs in healthy and cancer tissues. BamQuery can also help predict MAP immunogenicity and identify safe and actionable TAs.
Morphological Parameters and Associated Uncertainties for 8 Million Galaxies in the Hyper Suprime-Cam Wide Survey
Aritra Ghosh
C. Megan Urry
Aayush Mishra
Priyamvada Natarajan
David B. Sanders
Daisuke Nagai
Chuan Tian
Nico Cappelluti
Jeyhan S. Kartaltepe
Meredith C. Powell
Amrit Rau
Ezequiel Treister
We use the Galaxy Morphology Posterior Estimation Network (GaMPEN) to estimate morphological parameters and associated uncertainties for …
The evolution of SARS-CoV-2 seroprevalence in Canada: a time-series study, 2020–2023
Tanya J. Murphy
Hanna Swail
Jaspreet Jain
Maureen Anderson
Philip Awadalla
Lesley Behl
P. Brown
C. Charlton
Karen Colwill
S. Drews
A. Gingras
Deena Hinshaw
P. Jha
J. Kanji
Victoria A. Kirsh
Amanda Lang
Marc-andré Langlois
Stephen Lee
Antoine Lewin
Sheila F O’Brien
Chantale Pambrun
Kimberly Skead
David A. Stephens
Derek R. Stein
G. Tipples
Paul G. Van Caeseele
Timothy Grant Evans
Olivia Oxlade
Bruce D. Mazer
David L Buckeridge
Background: During the first year of the COVID-19 pandemic, the proportion of reported cases of COVID-19 among Canadians was under 6%. Although high vaccine coverage was achieved in Canada by fall 2021, the Omicron variant caused unprecedented numbers of infections, overwhelming testing capacity and making it difficult to quantify the trajectory of population immunity. Methods: Using a time-series approach and data from more than 900 000 samples collected by 7 research studies collaborating with the COVID-19 Immunity Task Force (CITF), we estimated trends in SARS-CoV-2 seroprevalence owing to infection and vaccination for the Canadian population over 3 intervals: prevaccination (March to November 2020), vaccine roll-out (December 2020 to November 2021), and the arrival of the Omicron variant (December 2021 to March 2023). We also estimated seroprevalence by geographical region and age. Results: By November 2021, 9.0% (95% credible interval [CrI] 7.3%–11%) of people in Canada had humoral immunity to SARS-CoV-2 from an infection. Seroprevalence increased rapidly after the arrival of the Omicron variant; by Mar. 15, 2023, 76% (95% CrI 74%–79%) of the population had detectable antibodies from infections. The rapid rise in infection-induced antibodies occurred across Canada and was most pronounced in younger age groups and in the Western provinces: Manitoba, Saskatchewan, Alberta and British Columbia. Interpretation: Data up to March 2023 indicate that most people in Canada had acquired antibodies against SARS-CoV-2 through natural infection and vaccination. However, given variations in population seropositivity by age and geography, the potential for waning antibody levels, and new variants that may escape immunity, public health policy and clinical decisions should be tailored to local patterns of population immunity.
A Two-Stage Optimization Framework for Electric Vehicle Fleet Day-ahead Charging Management
Electric vehicles (EVs) have become an important means of transportation worldwide. Protecting EV owners’ privacy while charging EV fleets intelligently remains one of the challenges in smart charging planning and management. Furthermore, in smart charging, the distribution system operator must also coordinate with EV aggregators to ensure that the power system is operated within security limits while reducing charging costs and satisfying EV users’ energy needs. In this paper, a semi-private framework for EV owners is introduced which solves a two-stage optimization problem for the smart control of EV charging. This framework considers charging cost reduction and peak load shaving as well as satisfying power grid constraints. At the higher stage, based on optimal power flow calculations, the proposed control signals are transferred to the lower stage in order to facilitate optimal scheduling in accordance with the mentioned goals. Results obtained with the proposed method on the IEEE 33-bus network show that, compared to uncontrolled charging, the cost of charging and the network peak load are reduced by 5.31% and 4.90%, respectively. Moreover, all the constraints of the power grid are satisfied.
Pretrainable geometric graph neural network for antibody affinity maturation
Mingkai Wang
Bozitao Zhong
Yanling Wu
Tianlei Ying
Increasing the binding affinity of an antibody to its target antigen is a crucial task in antibody therapeutics development. This paper presents a pretrainable geometric graph neural network, GearBind, and explores its potential in in silico affinity maturation. Leveraging multi-relational graph construction, multi-level geometric message passing and contrastive pretraining on mass-scale, unlabeled protein structural data, GearBind outperforms previous state-of-the-art approaches on SKEMPI and an independent test set. A powerful ensemble model based on GearBind is then derived and used to successfully enhance the binding of two antibodies with distinct formats and target antigens. ELISA EC50 values of the designed antibody mutants are decreased by up to 17-fold, and KD values by up to 6.1-fold. These promising results underscore the utility of geometric deep learning and effective pretraining in macromolecule interaction modeling tasks.
A Systematic Literature Review of Fashion, Sustainability, and Consumption Using a Mixed Methods Approach
Osmud Rahman
Dingtao Hu
Benjamin C. M. Fung
With the growing global awareness of the environmental impact of clothing consumption, there has been a notable surge in the publication of journal articles dedicated to “fashion sustainability” in the past decade, specifically from 2010 to 2020. However, despite this wealth of research, many studies remain disconnected and fragmented due to varying research objectives, focuses, and approaches. Conducting a systematic literature review with a mixed methods research approach can help identify key research themes, trends, and developmental patterns, while also shedding light on the complexity of fashion, sustainability, and consumption. To enhance the literature review and analytical process, the current systematic literature review employed text mining techniques and bibliometric visualization tools, including RAKE, VOSviewer, and CitNetExplorer. The findings revealed an increase in the number of publications focusing on “fashion and sustainability” between 2010 and 2021. Most studies were predominantly conducted in the United States, with a specific focus on female consumers. Moreover, a greater emphasis was placed on non-sustainable cues rather than sustainable cues. Additionally, a higher number of case studies was undertaken to investigate three fast-fashion companies. To enhance our knowledge and understanding of this subject, this article highlights several valuable contributions and provides recommendations for future research.
AI4GCC - Track 3: Consumption and the Challenges of Multi-Agent RL
Teacher-Student Architecture for Knowledge Distillation: A Survey
Danyang Liu
X. T. Chen
Ju Wang
Xue Liu
Although deep neural networks (DNNs) have shown a strong capacity to solve large-scale problems in many areas, such DNNs are hard to deploy in real-world systems due to their voluminous parameters. To tackle this issue, Teacher-Student architectures were proposed, where simple student networks with a few parameters can achieve comparable performance to deep teacher networks with many parameters. Recently, Teacher-Student architectures have been effectively and widely embraced on various knowledge distillation (KD) objectives, including knowledge compression, knowledge expansion, knowledge adaptation, and knowledge enhancement. With the help of Teacher-Student architectures, current studies are able to achieve multiple distillation objectives through lightweight and generalized student networks. Different from existing KD surveys that primarily focus on knowledge compression, this survey first explores Teacher-Student architectures across multiple distillation objectives. This survey presents an introduction to various knowledge representations and their corresponding optimization objectives. Additionally, we provide a systematic overview of Teacher-Student architectures with representative learning algorithms and effective distillation schemes. This survey also summarizes recent applications of Teacher-Student architectures across multiple purposes, including classification, recognition, generation, ranking, and regression. Lastly, potential research directions in KD are investigated, focusing on architecture design, knowledge quality, and theoretical studies of regression-based learning, respectively. Through this comprehensive survey, industry practitioners and the academic community can gain valuable insights and guidelines for effectively designing, learning, and applying Teacher-Student architectures on various distillation objectives.
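As a concrete instance of the knowledge-compression objective mentioned above, the following sketch computes the classic softened-softmax distillation loss for a single example. This formulation is a widely used one and is not taken from the abstract itself. The function names, the default `temperature`, and the mixing weight `alpha` are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft (teacher-matching) and a hard (label) term."""
    # Soft term: KL(teacher || student) on temperature-softened distributions,
    # multiplied by T^2 so its gradient scale stays comparable across T.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)

    # Hard term: cross-entropy of the student against the true label at T = 1.
    ce = -math.log(softmax(student_logits)[label])

    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    # Hypothetical logits for a three-class problem.
    print(distillation_loss([1.0, 0.2, -0.5], [2.0, 0.1, -1.0], label=0))
```

When the student's logits equal the teacher's, the soft term vanishes and only the weighted cross-entropy remains, which is a convenient sanity check.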
Assemblies, synapse clustering, and network topology interact with plasticity to explain structure-function relationships of the cortical connectome
András Ecker
Daniela Egas Santander
Marwan Abdellah
Jorge Blanco Alonso
Sirio Bolaños-Puchet
Giuseppe Chindemi
James B Isbister
James King
Pramod Kumbhar
Ioannis Magkanaris
Eilif B Muller
Michael W Reimann
Synaptic plasticity underlies the brain’s ability to learn and adapt. While experiments in brain slices have revealed mechanisms and protocols for the induction of plasticity between pairs of neurons, how these synaptic changes are coordinated in biological neuronal networks to ensure the emergence of learning remains poorly understood. Simulation and modeling have emerged as important tools to study learning in plastic networks, but have yet to achieve a scale that incorporates realistic network structure, active dendrites, and multi-synapse interactions, key determinants of synaptic plasticity. To rise to this challenge, we endowed an existing large-scale cortical network model, incorporating data-constrained dendritic processing and multi-synaptic connections, with a calcium-based model of functional plasticity that captures the diversity of excitatory connections extrapolated to in vivo-like conditions. This allowed us to study how dendrites and network structure interact with plasticity to shape stimulus representations at the microcircuit level. In our simulations, plasticity acted sparsely and specifically, and firing rates and weight distributions remained stable without additional homeostatic mechanisms. At the circuit level, we found plasticity was driven by co-firing stimulus-evoked functional assemblies, spatial clustering of synapses on dendrites, and the topology of the network connectivity. As a result of the plastic changes, the network became more reliable with more stimulus-specific responses. We confirmed our testable predictions in the MICrONS datasets, an openly available electron microscopic reconstruction of a large volume of cortical tissue. Our results quantify at a large scale how the dendritic architecture and higher-order structure of cortical microcircuits play a central role in functional plasticity and provide a foundation for elucidating their role in learning.
Bayesian modelling disentangles language versus executive control disruption in stroke
Gesa Hartwigsen
Jae-Sung Lim
Hee-Joon Bae
Kyung-Ho Yu
Hugo J. Kuijf
Nick A. Weaver
J. Matthijs Biesbroek
Stroke is the leading cause of long-term disability worldwide. Incurred brain damage disrupts cognition, often with persisting deficits in language and executive capacities. Despite their clinical relevance, the commonalities and differences of language versus executive control impairments remain under-specified. We tailored a Bayesian hierarchical modeling solution in a largest-of-its-kind cohort (1080 stroke patients) to deconvolve language and executive control in the brain substrates of stroke insults. Four cognitive factors distinguished left- and right-hemispheric contributions to ischemic tissue lesion. One factor delineated language and general cognitive performance and was mainly associated with damage to left-hemispheric brain regions in the frontal and temporal cortex. A factor for executive control summarized control and visual-constructional abilities. This factor was strongly related to right-hemispheric brain damage of posterior regions in the occipital cortex. The interplay of language and executive control was reflected in two factors: executive speech functions and verbal memory. Impairments on both were mainly linked to left-hemispheric lesions. These findings shed light onto the causal implications of hemispheric specialization for cognition and make steps towards subgroup-specific treatment protocols after stroke.