Publications

A responsible framework for applying artificial intelligence on medical images and signals at the point-of-care: the PACS-AI platform.

Pascal Thériault-Lauzier

Denis Cobin

Olivier Tastet

Élodie Labrecque Langlais

B. Taji

Guson Kang

A. Chong

Derek So

An Tang

Judy Wawira Gichoya

A. Chandar

Pierre-Luc Deziel

Julie G Hussin

Samuel Kadoury

Robert Avram

2024-05-31

Canadian Journal of Cardiology (published)

doi.org

Revisiting the 2023 wildfire season in Canada

Flavie Pelletier

Jeffrey A. Cardille

Michael A. Wulder

Joanne C. White

Txomin Hermosilla

2024-05-31

Science of Remote Sensing (published)

doi.org

Stimulus information guides the emergence of behavior-related signals in primary somatosensory cortex during learning

Mariangela Panniello

Colleen J. Gillon

Roberto Maffulli

Marco Celotto

Blake A. Richards

Stefano Panzeri

Michael M. Kohl

Neurons in the primary cortex carry sensory- and behavior-related information, but it remains an open question how this information emerges and intersects together during learning. Current evidence points to two possible learning-related changes: sensory information increases in the primary cortex or sensory information remains stable, but its readout efficiency in association cortices increases. We investigated this question by imaging neuronal activity in mouse primary somatosensory cortex before, during, and after learning of an object localization task. We quantified sensory- and behavior-related information and estimated how much sensory information was used to instruct perceptual choices as learning progressed. We find that sensory information increases from the start of training, while choice information is mostly present in the later stages of learning. Additionally, the readout of sensory information becomes more efficient with learning as early as in the primary sensory cortex. Together, our results highlight the importance of primary cortical neurons in perceptual learning.

2024-05-31

Cell Reports (published)

doi.org

Climate Variable Downscaling with Conditional Normalizing Flows

Christina Winkler

Paula Harder

David Rolnick

Predictions of global climate models typically operate on coarse spatial scales due to the large computational costs of climate simulations. This has led to considerable interest in methods for statistical downscaling, a similar process to super-resolution in the computer vision context, to provide more local and regional climate information. In this work, we apply conditional normalizing flows to the task of climate variable downscaling. We showcase its successful performance on an ERA5 water content dataset for different upsampling factors. Additionally, we show that the method allows us to assess the predictive uncertainty in terms of standard deviation from the fitted conditional distribution mean.

2024-05-30

ArXiv (preprint)

doi.org

arxiv.org

How well do models of visual cortex generalize to out of distribution samples?

Yifei Ren

Pouya Bashivan

2024-05-30

PLOS Computational Biology (published)

doi.org

On shallow planning under partial observability

Randy Lefebvre

Audrey Durand

2024-05-30

rl-conference.cc/RLC/2024/Workshop/Deployable_RL (published)

openreview.net

On the Costs and Benefits of Adopting Lifelong Learning for Software Analytics -- Empirical Study on Brown Build and Risk Prediction

Doriane Olewicki

Sarra Habchi

Mathieu Nayrolles

Mojtaba Faramarzi

A. Chandar

Bram Adams

Nowadays, software analytics tools using machine learning (ML) models to, for example, predict the risk of a code change are well established. However, as the goals of a project shift over time, and developers and their habits change, the performance of said models tends to degrade (drift) over time. Current retraining practices typically require retraining a new model from scratch on a large updated dataset when performance decay is observed, thus incurring a computational cost; also there is no continuity between the models as the past model is discarded and ignored during the new model training. Even though the literature has taken interest in online learning approaches, those have rarely been integrated and evaluated in industrial environments. This paper evaluates the use of lifelong learning (LL) for industrial use cases at Ubisoft, evaluating both the performance and the required computational effort in comparison to the retraining-from-scratch approaches commonly used by the industry. LL is used to continuously build and maintain ML-based software analytics tools using an incremental learner that progressively updates the old model using new data. To avoid so-called "catastrophic forgetting" of important older data points, we adopt a replay buffer of older data, which still allows us to drastically reduce the size of the overall training dataset, and hence model training time.

2024-05-30

Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice (published)

doi.org

arxiv.org

Deep Grokking: Would Deep Neural Networks Generalize Better?

Simin Fan

Razvan Pascanu

Martin Jaggi

Recent research on the grokking phenomenon has illuminated the intricacies of neural networks' training dynamics and their generalization behaviors. Grokking refers to a sharp rise of the network's generalization accuracy on the test set, which occurs long after an extended overfitting phase, during which the network perfectly fits the training set. While the existing research primarily focuses on shallow networks such as 2-layer MLP and 1-layer Transformer, we explore grokking on deep networks (e.g. 12-layer MLP). We empirically replicate the phenomenon and find that deep neural networks can be more susceptible to grokking than their shallower counterparts. Meanwhile, we observe an intriguing multi-stage generalization phenomenon when increasing the depth of the MLP model where the test accuracy exhibits a secondary surge, which is scarcely seen on shallow models. We further uncover compelling correspondences between the decreasing of feature ranks and the phase transition from overfitting to the generalization stage during grokking. Additionally, we find that the multi-stage generalization phenomenon often aligns with a double-descent pattern in feature ranks. These observations suggest that internal feature rank could serve as a more promising indicator of the model's generalization behavior compared to the weight-norm. We believe our work is the first one to dive into grokking in deep neural networks, and investigate the relationship of feature rank and generalization performance.

2024-05-28

ArXiv (preprint)

doi.org

arxiv.org

Forward-Backward Knowledge Distillation for Continual Clustering

Mohammadreza Sadeghi

Zihan Wang

Narges Armanfard

Unsupervised Continual Learning (UCL) is a burgeoning field in machine learning, focusing on enabling neural networks to sequentially learn tasks without explicit label information. Catastrophic Forgetting (CF), where models forget previously learned tasks upon learning new ones, poses a significant challenge in continual learning, especially in UCL, where labeled information of data is not accessible. CF mitigation strategies, such as knowledge distillation and replay buffers, often face memory inefficiency and privacy issues. Although current research in UCL has endeavored to refine data representations and address CF in streaming data contexts, there is a noticeable lack of algorithms specifically designed for unsupervised clustering. To fill this gap, in this paper, we introduce the concept of Unsupervised Continual Clustering (UCC). We propose Forward-Backward Knowledge Distillation for unsupervised Continual Clustering (FBCC) to counteract CF within the context of UCC. FBCC employs a single continual learner (the "teacher") with a cluster projector, along with multiple student models, to address the CF issue. The proposed method consists of two phases: Forward Knowledge Distillation, where the teacher learns new clusters while retaining knowledge from previous tasks with guidance from specialized student models, and Backward Knowledge Distillation, where a student model mimics the teacher's behavior to retain task-specific knowledge, aiding the teacher in subsequent tasks. FBCC marks a pioneering approach to UCC, demonstrating enhanced performance and memory efficiency in clustering across various tasks, outperforming the application of clustering algorithms to the latent space of state-of-the-art UCL algorithms.

2024-05-28

ArXiv (preprint)

doi.org

arxiv.org

On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization

Jordi Armengol-Estapé

Vincent Michalski

Ramnath Kumar

Pierre-Luc St-Charles

Doina Precup

S Ebrahimi Kahou

Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that cross-modal learning can improve representations for few-shot classification. More specifically, language is a rich modality that can be used to guide visual learning. In this work, we experiment with a multi-modal architecture for few-shot learning that consists of three components: a classifier, an auxiliary network, and a bridge network. While the classifier performs the main classification task, the auxiliary network learns to predict language representations from the same input, and the bridge network transforms high-level features of the auxiliary network into modulation parameters for layers of the few-shot classifier using conditional batch normalization. The bridge should encourage a form of lightweight semantic alignment between language and vision which could be useful for the classifier. However, after evaluating the proposed approach on two popular few-shot classification benchmarks we find that a) the improvements do not reproduce across benchmarks, and b) when they do, the improvements are due to the additional compute and parameters introduced by the bridge network. We contribute insights and recommendations for future work in multi-modal meta-learning, especially when using language representations.

2024-05-28

ArXiv (preprint)

doi.org

arxiv.org

Arbuscular and ectomycorrhizal tree seedling growth is inhibited by competition from neighboring roots and associated fungal hyphae

Vlad Parasquive

Jacques Brisson

Étienne Laliberté

Pierre Luc Chagnon

2024-05-27

Plant and Soil (published)

doi.org

ERS0: Enhancing Military Cybersecurity with AI-Driven SBOM for Firmware Vulnerability Detection and Asset Management

Max Beninger

Philippe Charland

Steven H. H. Ding

Benjamin C. M. Fung

Firmware vulnerability detection and asset management through a software bill of material (SBOM) approach is integral to defensive military operations. SBOMs provide a comprehensive list of software components, enabling military organizations to identify vulnerabilities within critical systems, including those controlling various functions in military platforms, as well as in operational technologies and Internet of Things devices. This proactive approach is essential for supply chain security, ensuring that software components are sourced from trusted suppliers and have not been tampered with during production, distribution, or through updates. It is a key element of defense strategies, allowing for rapid assessment, response, and mitigation of vulnerabilities, ultimately safeguarding military capabilities and information from cyber threats. In this paper, we propose ERS0, an SBOM system, driven by artificial intelligence (AI), for detecting firmware vulnerabilities and managing firmware assets. We harness the power of pre-trained large-scale language models to effectively address a wide array of string patterns, extending our coverage to thousands of third-party library patterns. Furthermore, we employ AI-powered code clone search models, enabling a more granular and precise search for vulnerabilities at the binary level, reducing our dependence on string analysis only. Additionally, our AI models extract high-level behavioral functionalities in firmware, such as communication and encryption, allowing us to quantitatively define the behavioral scope of firmware. In preliminary comparative assessments against open-source alternatives, our solution has demonstrated better SBOM coverage, accuracy in vulnerability identification, and a wider array of features.

2024-05-27

2024 16th International Conference on Cyber Conflict: Over the Horizon (CyCon) (published)

doi.org
