Publications

Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities
Quantitative Analysis of Miniature Synaptic Calcium Transients Using Positive Unlabeled Deep Learning
Anthony Bilodeau
Theresa Wiesner
Gabriel Leclerc
Mado Lemieux
Gabriel Nadeau
Katrine Castonguay
Bolin Fan
Simon Labrecque
Renée Hložek
Paul De Koninck
Ca2+ imaging methods are widely used for studying cellular activity in the brain, allowing detailed analysis of dynamic processes across various scales. Enhanced by high-contrast optical microscopy and fluorescent Ca2+ sensors, this technique can be used to reveal localized Ca2+ fluctuations within neurons, including in sub-cellular compartments, such as the dendritic shaft or spines. Despite advances in Ca2+ sensors, the analysis of miniature Synaptic Calcium Transients (mSCTs), characterized by variability in morphology and low signal-to-noise ratios, remains challenging. Traditional threshold-based methods struggle with the detection and segmentation of these small, dynamic events. Deep learning (DL) approaches offer promising solutions but are limited by the need for large annotated datasets. Positive Unlabeled (PU) learning addresses this limitation by leveraging unlabeled instances to increase dataset size and enhance performance. This approach is particularly useful in the case of mSCTs that are scarce and small, associated with a very small proportion of the foreground pixels. PU learning significantly increases the effective size of the training dataset, improving model performance. Here, we present a PU learning-based strategy for detecting and segmenting mSCTs. We evaluate the performance of two 3D deep learning models, StarDist-3D and 3D U-Net, which are well established for the segmentation of small volumetric structures in microscopy datasets. By integrating PU learning, we enhance the 3D U-Net’s performance, demonstrating significant gains over traditional methods. This work pioneers the application of PU learning in Ca2+ imaging analysis, offering a robust framework for mSCT detection and segmentation. We also demonstrate how this quantitative analysis pipeline can be used for subsequent mSCT feature analysis.
We characterize morphological and kinetic changes of mSCTs associated with the application of chemical long-term potentiation (cLTP) stimulation in cultured rat hippocampal neurons. Our data-driven approach shows that a cLTP-inducing stimulus leads to the emergence of new active dendritic regions and differentially affects mSCT subtypes.
Randomized Confidence Bounds for Stochastic Partial Monitoring
The partial monitoring (PM) framework provides a theoretical formulation of sequential learning problems with incomplete feedback. On each round, a learning agent plays an action while the environment simultaneously chooses an outcome. The agent then observes a feedback signal that is only partially informative about the (unobserved) outcome. The agent leverages the received feedback signals to select actions that minimize the (unobserved) cumulative loss. In contextual PM, the outcomes depend on some side information that is observable by the agent before selecting the action on each round. In this paper, we consider the contextual and non-contextual PM settings with stochastic outcomes. We introduce a new class of PM strategies based on the randomization of deterministic confidence bounds. We also extend regret guarantees to settings where existing stochastic strategies are not applicable. Our experiments show that the proposed RandCBP and RandCBPsidestar strategies have favorable performance against state-of-the-art baselines in multiple PM games. To advocate for the adoption of the PM framework, we design a use case on the real-world problem of monitoring the error rate of any deployed classification system.
A Reinforcement Learning Pipeline for Band Gap-directed Crystal Generation
Santiago Miret
Mariano Phielipp
A. Chandar
Property-driven AI-automated material discovery presents unique challenges owing to the complex nature of the chemical structural space and computationally expensive simulations. For crystalline solids, the band gap is an important property for designing semiconductors and batteries. However, optimizing crystals for a target band gap is difficult and not well-explored. Reinforcement learning (RL) shows promise towards optimizing crystals, as it can freely explore the chemical space. However, it relies on regular band gap evaluations, which can only be accurately computed through expensive Density Functional Theory (DFT) simulations. In this study, we propose an active learning-inspired pipeline that combines RL and DFT simulations for optimizing crystal compositions given a target band gap. The pipeline includes an RL policy for predicting atom types and a band gap network that is fine-tuned with DFT data. Preliminary results indicate the need for furthering the state-of-the-art to address the inherent challenges of the problem.
Robust Data-driven Prescriptiveness Optimization
Mehran Poursoltani
Angelos Georghiou
The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information to provide more anticipative decisions. The wide range of methods and contexts of application have motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.
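As a rough illustration of the bisection idea mentioned in the abstract above, the sketch below searches for the largest achievable value of a scalar objective using a monotone feasibility check. The function `bisect_max` and the toy oracle are hypothetical; in the paper's setting each feasibility check would presumably be a linear program, which is replaced here by a simple stand-in predicate.

```python
def bisect_max(is_feasible, lo, hi, tol=1e-6):
    """Return (approximately) the largest t in [lo, hi] with is_feasible(t) True.

    Assumes feasibility is monotone: feasible at t implies feasible at all
    smaller t, and that lo itself is feasible.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            lo = mid  # mid is achievable, so search the upper half
        else:
            hi = mid  # mid is not achievable, so search the lower half
    return lo


# Toy stand-in for "solve one linear program and report feasibility".
# The threshold 0.37 is an arbitrary illustrative value, not a result.
toy_oracle = lambda t: t <= 0.37

best = bisect_max(toy_oracle, lo=0.0, hi=1.0)
```

Each iteration halves the search interval, so the number of oracle calls grows only logarithmically in the desired precision.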
SelfIE: Self-Interpretation of Large Language Model Embeddings
Haozhe Chen
Carl Vondrick
Chengzhi Mao
Simple and Scalable Strategies to Continually Pre-train Large Language Models
SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning
Matthias Weissenbacher
Yoshinobu Kawahara
An open challenge in reinforcement learning (RL) is the effective deployment of a trained policy to new or slightly different situations as well as semantically-similar environments. We introduce Symmetry-Invariant Transformer (SiT), a scalable vision transformer (ViT) that leverages both local and global data patterns in a self-supervised manner to improve generalisation. Central to our approach is Graph Symmetric Attention, which refines the traditional self-attention mechanism to preserve graph symmetries, resulting in invariant and equivariant latent representations. We showcase SiT’s superior generalization over ViTs on MiniGrid and Procgen RL benchmarks, and its sample efficiency on Atari 100k and CIFAR10.
Stealing part of a production language model
Nicholas Carlini
Daniel Paleka
Krishnamurthy Dj Dvijotham
Thomas Steinke
Jonathan Hayase
A. Feder Cooper
Katherine Lee
Matthew Jagielski
Milad Nasr
Arthur Conmy
Eric Wallace
Florian Tramèr
We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under \\
Stochastic positional embeddings improve masked image modeling
Amir Bar
Assaf Shocher
Mahmoud Assran
P Vincent
Trevor Darrell
Amir Globerson
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
Jordi Orbay
Quan Vuong
Yevgen Chebotar
Ted Xiao
Alex Irpan
Sergey Levine
Aleksandra Faust
Aviral Kumar
Value functions are a central component of deep reinforcement learning (RL). These functions, parameterized by neural networks, are trained using a mean squared error regression objective to match bootstrapped target values. However, scaling value-based RL methods that use regression to large networks, such as high-capacity Transformers, has proven challenging. This difficulty is in stark contrast to supervised learning: by leveraging a cross-entropy classification loss, supervised methods have scaled reliably to massive networks. Observing this discrepancy, in this paper, we investigate whether the scalability of deep RL can also be improved simply by using classification in place of regression for training value functions. We demonstrate that value functions trained with categorical cross-entropy significantly improve performance and scalability in a variety of domains. These include: single-task RL on Atari 2600 games with SoftMoEs, multi-task RL on Atari with large-scale ResNets, robotic manipulation with Q-transformers, playing Chess without search, and a language-agent Wordle task with high-capacity Transformers, achieving state-of-the-art results on these domains. Through careful analysis, we show that the benefits of categorical cross-entropy primarily stem from its ability to mitigate issues inherent to value-based RL, such as noisy targets and non-stationarity. Overall, we argue that a simple shift to training value functions with categorical cross-entropy can yield substantial improvements in the scalability of deep RL at little-to-no cost.
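To make the idea of training a value function by classification more concrete, here is a minimal NumPy sketch of one common way to turn a scalar target into a categorical one: a "two-hot" encoding over fixed bins, trained with cross-entropy. The bin range, bin count, and function names are illustrative assumptions, and this is not necessarily the exact target construction the authors use.

```python
import numpy as np


def two_hot(target, v_min, v_max, num_bins):
    """Spread a scalar target over its two nearest bin centers (assumed scheme)."""
    centers = np.linspace(v_min, v_max, num_bins)
    step = centers[1] - centers[0]
    pos = (float(np.clip(target, v_min, v_max)) - v_min) / step
    lower = min(int(np.floor(pos)), num_bins - 1)
    upper = min(lower + 1, num_bins - 1)
    w_upper = pos - lower  # fraction of mass assigned to the upper bin
    probs = np.zeros(num_bins)
    probs[lower] += 1.0 - w_upper
    probs[upper] += w_upper
    return probs, centers


def cross_entropy(logits, target_probs):
    """Categorical cross-entropy between softmax(logits) and target_probs."""
    shifted = logits - logits.max()  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-(target_probs * log_probs).sum())


def expected_value(logits, centers):
    """Recover a scalar value estimate as the mean of the predicted distribution."""
    z = np.exp(logits - logits.max())
    return float((z / z.sum()) @ centers)


# Illustrative usage with made-up numbers.
probs, centers = two_hot(target=0.3, v_min=0.0, v_max=1.0, num_bins=5)
loss = cross_entropy(np.zeros(5), probs)
```

The encoding preserves the scalar target in expectation (`probs @ centers` equals the clipped target), so the network's scalar prediction can still be read off with `expected_value` while the loss itself is a classification loss.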
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Elvis Dopgima Dohmatob
Yunzhen Feng
Pu Yang
Francois Charton
Julia Kempe
As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will the scaling laws change in the inevitable regime where synthetic data makes its way into the training corpus? Will future models still improve, or be doomed to degenerate up to total (model) collapse? We develop a theoretical framework of model collapse through the lens of scaling laws. We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data. Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2.