Publications

Inferring electric vehicle charging patterns from smart meter data for impact studies
Élodie Campeau
Ilhan Kocar
Innovative transfusion strategies for blood deserts in disaster settings
Ayla Gerk
Robert Glatter
Long-term outcomes of critically ill patients with hematological malignancies: what is the impact of the coronavirus disease 2019 pandemic? Author's reply
Laveena Munshi
Sangeeta Mehta
MAP: Model Merging with Amortized Pareto Front Using Limited Computation
Li Li
Zhiqi Bu
Huan He
Yonghui Wu
Jiang Bian
Yong Chen
ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics
Dhananjay Bhaskar
David R. Johnson
João Felipe Rocha
Egbert Castro
Jackson Grady
Alex T. Grigas
Michael Perlmutter
Corey S. O'Hern
Understanding the dynamic nature of protein structures is essential for comprehending their biological functions. While significant progress has been made in predicting static folded structures, modeling protein motions on microsecond to millisecond scales remains challenging. To address these challenges, we introduce a novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE), which leverages the geometric scattering transform alongside transformer-based attention mechanisms to capture protein dynamics from molecular dynamics (MD) simulations. ProtSCAPE utilizes the multi-scale nature of the geometric scattering transform to extract features from protein structures conceptualized as graphs and integrates these features with dual attention structures that focus on residues and amino acid signals, generating latent representations of protein trajectories. Furthermore, ProtSCAPE incorporates a regression head to enforce temporally coherent latent representations.
Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks
Mastering complex sequential tasks continues to pose a significant challenge in robotics. While there has been progress in learning long-horizon manipulation tasks, most existing approaches lack rigorous mathematical guarantees for ensuring reliable and successful execution. In this paper, we extend previous work on learning long-horizon tasks and stable policies, focusing on improving task success rates while reducing the amount of training data needed. Our approach introduces a novel method that (1) segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals, and (2) learns globally stable dynamical system policies to guide the robot to each subgoal, even in the face of sensory noise and random disturbances. We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms. Code is available at https://github.com/Alestaubin/stable-imitation-policy-with-waypoints
SOAK: Same/Other/All K-fold cross-validation for estimating similarity of patterns in data subsets
Gabrielle Thibault
C. S. Bodine
Paul Nelson Arellano
Alexander F. Shenkin
Olivia Jasmine Lindly
In many real-world applications of machine learning, we want to know whether it is possible to train on the data that we have gathered so far, and obtain accurate predictions on a new test data subset that is qualitatively different in some respect (time period, geographic region, etc.). Another question is whether data subsets are similar enough that it is beneficial to combine subsets during model training. We propose SOAK, Same/Other/All K-fold cross-validation, a new method which can be used to answer both questions. SOAK systematically compares models which are trained on different subsets of data, and then used for prediction on a fixed test subset, to estimate the similarity of learnable/predictable patterns in data subsets. We show results of using SOAK on six new real data sets (with geographic/temporal subsets, to check if predictions are accurate on new subsets), three image pair data sets (subsets are different image types, to check that we get smaller prediction error on similar images), and 11 benchmark data sets with predefined train/test splits (to check similarity of predefined splits).
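The Same/Other/All comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `soak_splits` and `soak_mean_baseline`, the round-robin fold assignment, the constant mean predictor, and the choice to exclude the held-out fold from every training set are all assumptions made for illustration.

```python
from statistics import mean


def soak_splits(subsets, n_folds):
    """Yield (target_subset, fold, train_name, train_idx, test_idx) tuples.

    Assumption: fold ids are assigned round-robin within each subset;
    a real analysis would normally shuffle rows first.
    """
    counts = {}
    folds = []
    for s in subsets:
        folds.append(counts.get(s, 0) % n_folds)
        counts[s] = counts.get(s, 0) + 1
    rows = range(len(subsets))
    for target in sorted(set(subsets)):
        for k in range(n_folds):
            # Fixed test set: fold k of the target subset.
            test_idx = [i for i in rows if subsets[i] == target and folds[i] == k]
            # "same": remaining rows of the target subset.
            same = [i for i in rows if subsets[i] == target and folds[i] != k]
            # "other": rows of every other subset (held-out fold excluded).
            other = [i for i in rows if subsets[i] != target and folds[i] != k]
            for name, train_idx in (("same", same), ("other", other), ("all", same + other)):
                yield target, k, name, train_idx, test_idx


def soak_mean_baseline(y, subsets, n_folds=2):
    """Mean squared error per (target subset, train set), averaged over folds.

    The "model" is a hypothetical stand-in that predicts the training mean;
    each subset needs at least n_folds rows so no split is empty.
    """
    errors = {}
    for target, k, name, train_idx, test_idx in soak_splits(subsets, n_folds):
        prediction = mean(y[i] for i in train_idx)
        mse = mean((y[i] - prediction) ** 2 for i in test_idx)
        errors.setdefault((target, name), []).append(mse)
    return {key: mean(values) for key, values in errors.items()}


# Toy data: two subsets with different label distributions.
y = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
subsets = ["a"] * 4 + ["b"] * 4
for key, err in sorted(soak_mean_baseline(y, subsets).items()):
    print(key, err)
```

In this toy case the "other" and "all" errors exceed the "same" error for both subsets, which is the kind of pattern that would suggest the subsets are not similar enough to benefit from being combined.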
Spatial Action Unit Cues for Interpretable Deep Facial Expression Recognition
Soufiane Belharbi
Alessandro Lameiras Koerich
Simon Bacon
Eric Granger
Although state-of-the-art classifiers for facial expression recognition (FER) can achieve a high level of accuracy, they lack interpretability, an important feature for end-users. Experts typically associate spatial action units (AUs) from a codebook to facial regions for the visual interpretation of expressions. In this paper, the same expert steps are followed. A new learning strategy is proposed to explicitly incorporate AU cues into classifier training, making it possible to train deep interpretable models. During training, this AU codebook is used, along with the input image expression label and facial landmarks, to construct an AU heatmap that indicates the most discriminative image regions of interest with respect to the facial expression. This valuable spatial cue is leveraged to train a deep interpretable classifier for FER. This is achieved by constraining the spatial layer features of a classifier to be correlated with AU heatmaps. Using a composite loss, the classifier is trained to correctly classify an image while yielding interpretable visual layer-wise attention correlated with AU maps, simulating the expert decision process. Our strategy only relies on image class expression for supervision, without additional manual annotations. Our new strategy is generic, and can be applied to any deep CNN- or transformer-based classifier without requiring any architectural change or significant additional training time. Our extensive evaluation on two public benchmarks, the RAF-DB and AffectNet datasets, shows that our proposed strategy can improve layer-wise interpretability without degrading classification performance. In addition, we explore a common type of interpretable classifiers that rely on class activation mapping (CAM) methods, and show that our approach can also improve CAM interpretability.
A Survey of Diversification Techniques in Search and Recommendation
Yansen Zhang
Fuyuan Lyu
Bowei He
Bhaskar Mitra
Xue Liu
Diversifying search results is an important research topic in retrieval systems in order to satisfy both the various interests of customers and the equal market exposure of providers. There has been growing attention to diversity-aware research in recent years, accompanied by a proliferation of literature on methods to promote diversity in search and recommendation. However, the diversity-aware studies in retrieval systems lack a systematic organization and are rather fragmented. In this survey, we are the first to propose a unified taxonomy for classifying the metrics and approaches of diversification in both search and recommendation, which are two of the most extensively researched fields of retrieval systems. We begin the survey with a brief discussion of why diversity is important in retrieval systems, followed by a summary of the various diversity concerns in search and recommendation, highlighting their relationship and differences. For the survey’s main body, we present a unified taxonomy of diversification metrics and approaches in retrieval systems, from both the search and recommendation perspectives. In the later part of the survey, we discuss the open research questions of diversity-aware research in search and recommendation in an effort to inspire future innovations and encourage the implementation of diversity in real-world systems.
The Canadian VirusSeq Data Portal and Duotang: open resources for SARS-CoV-2 viral sequences and genomic epidemiology
Erin E. Gill
Baofeng Jia
Carmen Lia Murall
Raphaël Poujol
Muhammad Zohaib Anwar
Nithu Sara John
Justin Richardsson
Ashley Hobb
Abayomi S. Olabode
Alexandru Lepsa
Ana T. Duggan
Andrea D. Tyler
Arnaud N'Guessan
Atul Kachru
Brandon Chan
Catherine Yoshida
Christina K. Yung
David Bujold
Dusan Andric
Edmund Su
Emma J. Griffiths
Gary Van Domselaar
Gordon W. Jolly
Heather K. E. Ward
Henrich Feher
Jared Baker
Jared T. Simpson
Jaser Uddin
Jiannis Ragoussis
Jon Eubank
Jörg H. Fritz
José Héctor Gálvez
Karen Fang
Kim Cullion
Leonardo Rivera
Linda Xiang
Matthew A. Croxen
Mitchell Shiell
Natalie Prystajecky
Pierre-Olivier Quirion
Rosita Bajari
Samantha Rich
Samira Mubareka
Sandrine Moreira
Scott Cain
Steven G. Sutcliffe
Susanne A. Kraemer
Yelizar Alturmessov
Yann Joly
Marc Fiume
Terrance P. Snutch
Cindy Bell
Catalina Lopez-Correa
Julie G. Hussin
Jeffrey B. Joy
Caroline Colijn
Paul M. K. Gordon
William W. L. Hsiao
Art F. Y. Poon
Natalie C. Knox
Mélanie Courtot
Lincoln Stein
Sarah P. Otto
Guillaume Bourque
B. Jesse Shapiro
Fiona S. L. Brinkman
The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform the public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN – VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and enhanced, standardized contextual data that were unavailable in other repositories and that meet FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the portal data submission pipeline contains data quality checking procedures and appropriate acknowledgement of data generators that encourages collaboration. From inception to execution, the portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts ensured a commitment to Canadian privacy laws, data security standards, and organizational processes. This portal has been coupled with other resources, such as Viral AI, and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight this portal (https://virusseq-dataportal.ca/), including its contextual data not available elsewhere, and the Duotang (https://covarr-net.github.io/duotang/duotang.html), a web platform that presents key genomic epidemiology and modelling analyses on circulating and emerging SARS-CoV-2 variants in Canada.
Duotang presents dynamic changes in variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations, with a text overview of the current situation. The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the portal (COVID-MVP, CoVizu), are all open source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.
The oneirogen hypothesis: modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms
Classical psychedelics induce complex visual hallucinations in humans, generating percepts that are coherent at a low level, but which have surreal, dream-like qualities at a high level. While there are many hypotheses as to how classical psychedelics could induce these effects, there are no concrete mechanistic models that capture the variety of observed effects in humans, while remaining consistent with the known pharmacological effects of classical psychedelics on neural circuits. In this work, we propose the “oneirogen hypothesis”, which posits that the perceptual effects of classical psychedelics are a result of their pharmacological actions inducing neural activity states that truly are more similar to dream-like states. We simulate classical psychedelics’ effects by manipulating neural network models trained on perceptual tasks with the Wake-Sleep algorithm. This established machine learning algorithm leverages two activity phases, a perceptual phase (wake) where sensory inputs are encoded, and a generative phase (dream) where the network internally generates activity consistent with stimulus-evoked responses. We simulate the action of psychedelics by partially shifting the model to the ‘Sleep’ state, which entails a greater influence of top-down connections, in line with the impact of psychedelics on apical dendrites. The effects resulting from this manipulation capture a number of experimentally observed phenomena including the emergence of hallucinations, increases in stimulus-conditioned variability, and large increases in synaptic plasticity. We further provide a number of testable predictions which could be used to validate or invalidate our oneirogen hypothesis.
Automating MedSAM by Learning Prompts with Weak Few-Shot Supervision
Christian Desrosiers