Publications

Prompt learning with bounding box constraints for medical image segmentation.
Mehrdad Noori
Sahar Dastani
Christian Desrosiers
Pixel-wise annotations are notoriously laborious and costly to obtain in the medical domain. To mitigate this burden, weakly supervised approaches based on bounding box annotations, which are much easier to acquire, offer a practical alternative. Vision foundation models have recently shown noteworthy segmentation performance when provided with prompts such as points or bounding boxes. Prompt learning exploits these models by adapting them to downstream tasks and automating segmentation, thereby reducing user intervention. However, existing prompt learning approaches depend on fully annotated segmentation masks. This paper proposes a novel framework that combines the representational power of foundation models with the annotation efficiency of weakly supervised segmentation. More specifically, our approach automates prompt generation for foundation models using only bounding box annotations. Our proposed optimization scheme integrates multiple constraints derived from box annotations with pseudo-labels generated by the prompted foundation model. Extensive experiments across multi-modal datasets reveal that our weakly supervised method achieves an average Dice score of 84.90% in a limited data setting, outperforming existing fully-supervised and weakly-supervised approaches. The code will be available upon acceptance.
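For readers unfamiliar with the metric reported above, the Dice score between a predicted and a reference binary mask is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition only; the function name `dice_score`, the nested-list mask format, and the empty-mask convention are assumptions for illustration and are not taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    # Overlap: pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x2 masks: one pixel overlaps, the prediction has two foreground pixels.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
print(f"Dice = {dice_score(pred, target):.4f}")
```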
Spatially and non-spatially tuned hippocampal neurons are linear perceptual and nonlinear memory encoders
Kaicheng Yan
Benjamin Corrigan
Roberto Gulli
Julio Martinez-Trujillo
The hippocampus has long been regarded as a neural map of physical space, with its neurons categorized as spatially or non-spatially tuned according to their response selectivity. However, growing evidence suggests that this dichotomy oversimplifies the complex roles hippocampal neurons play in integrating spatial and non-spatial information. Through computational modeling and in-vivo electrophysiology in macaques, we show that neurons classified as spatially tuned primarily encode linear combinations of immediate behaviorally relevant factors, while those labeled as non-spatially tuned rely on nonlinear mechanisms to integrate temporally distant experiences. Furthermore, our findings reveal a temporal gradient in the primate CA3 region, where spatial selectivity diminishes as neurons encode increasingly distant past events. Finally, using artificial neural networks, we demonstrate that nonlinear recurrent connections are crucial for capturing the response dynamics of non-spatially tuned neurons, particularly those encoding memory-related information. These findings challenge the traditional dichotomy of spatial versus non-spatial representations and instead suggest a continuum of linear and nonlinear computations that underpin hippocampal function. This framework provides new insights into how the hippocampus bridges perception and memory, informing on its role in episodic memory, spatial navigation, and associative learning.
CellMemory: hierarchical interpretation of out-of-distribution cells using bottlenecked transformer
Qifei Wang
Yiwen Hu
Yanjie Chen
Yuwei Wang
Guochao Li
Yun Li
Jinfeng Chen
Xuegong Zhang
James Zou
Manolis Kellis
Dianbo Liu
Lan Jiang
Identifying the genetic and molecular drivers of phenotypic heterogeneity among individuals is vital for understanding human health and for diagnosing, monitoring, and treating diseases. To this end, international consortia such as the Human Cell Atlas and the Tabula Sapiens are creating comprehensive cellular references. Due to the massive volume of data generated, machine learning methods, especially transformer architectures, have been widely employed in related studies. However, applying machine learning to cellular data presents several challenges. One such challenge is making the methods interpretable with respect to both the input cellular information and its context. Another less explored challenge is the accurate representation of cells outside existing references, referred to as out-of-distribution (OOD) cells. The out-of-distribution status could be attributed to various physiological conditions, such as comparing diseased cells, particularly tumor cells, with healthy reference data, or significant technical variations, such as using transfer learning from single-cell reference to spatial query data. Inspired by the global workspace theory in cognitive neuroscience, we introduce CellMemory, a bottlenecked Transformer with improved generalization capabilities designed for the hierarchical interpretation of OOD cells unseen during reference building. Even without pre-training, it exceeds the performance of large language models pre-trained with tens of millions of cells. In particular, when deciphering spatially resolved single-cell transcriptomics data, CellMemory demonstrates the ability to interpret data at the granule level accurately. Finally, we harness CellMemory's robust representational capabilities to elucidate malignant cells and their founder cells in different patients, providing reliable characterizations of the cellular changes caused by the disease.
Learning to combine top-down context and feed-forward representations under ambiguity with apical and basal dendrites
Guillaume Etter
Busra Tugce Gurbuz
One of the hallmark features of neocortical anatomy is the presence of extensive top-down projections into primary sensory areas, with many impinging on the distal apical dendrites of pyramidal neurons. While it is known that they exert a modulatory effect, altering the gain of responses, their functional role remains an active area of research. It is hypothesized that these top-down projections carry contextual information that can help animals to resolve ambiguities in sensory data. One proposed mechanism of contextual integration is a non-linear integration of distinct input streams at apical and basal dendrites of pyramidal neurons. Computationally, however, it is yet to be demonstrated how such an architecture could leverage distinct compartments for flexible contextual integration and sensory processing when both sensory and context signals can be unreliable. Here, we implement an augmented deep neural network with distinct apical and basal compartments that integrates a) contextual information from top-down projections to apical compartments, and b) sensory representations driven by bottom-up projections to basal compartments, via a biophysically inspired rule. In addition, we develop a new multi-scenario contextual integration task using a generative image modeling approach. In addition to generalizing previous contextual integration tasks, it better captures the diversity of scenarios where neither contextual nor sensory information is fully reliable. To solve this task, this model successfully learns to select among integration strategies. We find that our model outperforms those without the "apical prior" when contextual information contradicts sensory input. Altogether, this suggests that the apical prior and biophysically inspired integration rule could be key components necessary for handling the ambiguities that animals encounter in the diverse contexts of the real world.
Multi-Agent Matrix Games with Individual learners: How Exploration-Exploitation Strategies Impact the Emergence of Coordination
Coordination between independent learning agents in a multi-agent environment is an important problem where AI systems may impact each other's learning process. In this paper, we study how individual agents converge to the optimal equilibrium in multi-agent settings where coordination is necessary to achieve optimality. Specifically, we cover the case of coordination to maximize every individual payoff and coordination to maximize the collective payoff (cooperation). We study the emergence of such coordination behaviours in two-player matrix games with unknown payoff matrices and noisy bandit feedback. We consider five different environments along with widely used deterministic and stochastic bandit strategies. We study how different learning strategies and observation noise influence convergence to the optimal equilibrium. Our results indicate that coordination often emerges more easily from interactions between deterministic agents, especially when they follow the same learning behaviour. However, stochastic learning strategies appear to be more robust in the presence of many optimal joint actions. Overall, noisy observations often help stabilize learning behaviours.
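The setting described above (two independent learners, an unknown payoff matrix, noisy bandit feedback) can be sketched in a few lines. This is only an illustrative toy under stated assumptions: the function `run_matrix_game`, the shared-payoff example matrix, the epsilon-greedy strategy with sample-average updates, and all default parameters are hypothetical choices, not the environments or strategies evaluated in the paper.

```python
import random


def run_matrix_game(payoffs, epsilon=0.1, noise_std=0.1, steps=2000, seed=0):
    """Two independent epsilon-greedy learners on a shared-payoff matrix game.

    payoffs[i][j] is the (hypothetical) common reward when the row agent plays
    i and the column agent plays j. Each agent observes only its own action and
    a noisy reward, never the other agent's action or the matrix itself.
    Returns the final greedy joint action (row_action, col_action).
    """
    rng = random.Random(seed)
    n_actions = (len(payoffs), len(payoffs[0]))
    q = [[0.0] * n for n in n_actions]      # per-agent value estimates
    counts = [[0] * n for n in n_actions]   # per-agent pull counts

    def greedy(agent):
        # Ties are broken in favour of the lowest action index.
        return max(range(n_actions[agent]), key=lambda a: q[agent][a])

    for _ in range(steps):
        actions = []
        for agent in range(2):
            if rng.random() < epsilon:
                actions.append(rng.randrange(n_actions[agent]))  # explore
            else:
                actions.append(greedy(agent))                    # exploit
        reward = payoffs[actions[0]][actions[1]]
        for agent, action in enumerate(actions):
            # Each agent receives its own independently perturbed observation.
            noisy = reward + rng.gauss(0.0, noise_std)
            counts[agent][action] += 1
            # Incremental sample-average update of the chosen action's value.
            q[agent][action] += (noisy - q[agent][action]) / counts[agent][action]

    return greedy(0), greedy(1)


# Toy game with one optimal joint action (0, 0) and a weaker one at (1, 1).
coordination_game = [[1.0, 0.0], [0.0, 0.5]]
print(run_matrix_game(coordination_game))
```

Setting `epsilon=0.0` and `noise_std=0.0` makes both learners fully deterministic, which is one simple way to contrast deterministic and stochastic behaviour in this toy.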
Opening the Scope of Openness in AI
Tamara Paris
Jin L.C. Guo
Relative Explanations for Contextual Problems with Endogenous Uncertainty: An Application to Competitive Facility Location
Jasone Ramírez-Ayerbe
A Survey of State Representation Learning for Deep Reinforcement Learning
Representation learning methods are an important tool for addressing the challenges posed by complex observation spaces in sequential decision-making problems. Recently, many methods have used a wide variety of approaches for learning meaningful state representations in reinforcement learning, allowing better sample efficiency, generalization, and performance. This survey aims to provide a broad categorization of these methods within a model-free online setting, exploring how they tackle the learning of state representations differently. We categorize the methods into six main classes, detailing their mechanisms, benefits, and limitations. Through this taxonomy, our aim is to enhance the understanding of this field and provide a guide for new researchers. We also discuss techniques for assessing the quality of representations, and detail relevant future directions.
The challenge of hidden gifts in multi-agent reinforcement learning
Blake Aaron Richards
Cooperation between people is not always obvious. Sometimes we benefit from actions that others have taken even when we are unaware that they took those actions. For example, if your neighbor chooses not to take a parking spot in front of your house when you are not there, you can benefit, even without being aware that they took this action. These “hidden gifts” represent an interesting challenge for multi-agent reinforcement learning (MARL), since assigning credit to your own actions correctly when the beneficial actions of others are hidden is non-trivial. Here, we study the impact of hidden gifts with a very simple MARL task. In this task, agents in a grid-world environment have individual doors to unlock in order to obtain individual rewards. In addition, if all the agents unlock their doors, the group receives a larger collective reward. However, there is only one key for all of the doors, such that the collective reward can only be obtained when the agents drop the key for others after they use it. Notably, there is nothing to indicate to an agent that the other agents have dropped the key, so the act of dropping the key for others is a “hidden gift”. We show that several different state-of-the-art RL algorithms, including MARL algorithms, fail to learn how to obtain the collective reward in this simple task. Interestingly, we find that independent model-free policy gradient agents can solve the task when we provide them with information about their action history, but MARL agents still cannot solve the task with action history. Finally, we derive a correction term for these independent agents, inspired by learning-aware approaches, which reduces the variance in learning and helps them to converge to collective success more reliably. These results show how credit assignment in multi-agent settings can be particularly challenging in the presence of “hidden gifts”, and demonstrate that learning awareness can benefit these settings.
Towards Sustainable Investment Policies Informed by Opponent Shaping
Addressing climate change requires global coordination, yet rational economic actors often prioritize immediate gains over collective welfare, resulting in social dilemmas. InvestESG is a recently proposed multi-agent simulation that captures the dynamic interplay between investors and companies under climate risk. We provide a formal characterization of the conditions under which InvestESG exhibits an intertemporal social dilemma, deriving theoretical thresholds at which individual incentives diverge from collective welfare. Building on this, we apply Advantage Alignment, a scalable opponent shaping algorithm shown to be effective in general-sum games, to influence agent learning in InvestESG. We offer theoretical insights into why Advantage Alignment systematically favors socially beneficial equilibria by biasing learning dynamics toward cooperative outcomes. Our results demonstrate that strategically shaping the learning processes of economic agents can result in better outcomes that could inform policy mechanisms to better align market incentives with long-term sustainability goals.
Rethinking Full Finetuning from Pretraining Checkpoints in Active Learning for African Languages
Bonaventure F. P. Dossou
Jackie CK Cheung
Revisiting Laplacian Representations for Value Function Approximation in Deep RL
Rishav
A. Chandar
Yash Chandak
S Ebrahimi Kahou
Proto-value functions (PVFs) introduced Laplacian embeddings as an effective feature basis for value-function approximation; however, their utility remained limited to small, fully known state spaces. Recent work has scaled Laplacian embeddings to high-dimensional inputs, using them for reward shaping and option discovery in goal-directed tasks, yet only as auxiliary signals, rather than directly using them as features for value functions. In this paper, we learn Laplacian eigenvectors online and employ them as features for Q-learning in 23 Atari games. We empirically demonstrate that these online-learned embeddings substantially improve model-free RL in large, high-dimensional domains. We demonstrate that enriching state representations with action embeddings yields additional gains under both behavior-policy and uniform-random policies. Additionally, we introduce the Fusion architecture, which augments the representation with useful inductive bias at the embedding level. To assess the usefulness of each embedding used in the Fusion architecture, we use Shapley value analysis.
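As background for the classical, small-state-space case mentioned at the start of this abstract, the sketch below builds Laplacian eigenvector features for a tiny chain of states and uses them in a linear value estimate. It is not the paper's online method: it assumes a fully known n-state chain, relies on the known closed form for the eigenvectors of a path-graph Laplacian, and every function name (`path_laplacian`, `chain_pvf`, `features`, `linear_value`) is hypothetical.

```python
import math


def path_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-state chain, as nested lists."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        # Each edge (i, i+1) adds 1 to both degrees and -1 off the diagonal.
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    return lap


def chain_pvf(n, k):
    """k-th Laplacian eigenpair of the chain, from its known closed form."""
    theta = math.pi * k / n
    eigenvalue = 2.0 - 2.0 * math.cos(theta)
    eigenvector = [math.cos(theta * (i + 0.5)) for i in range(n)]
    return eigenvalue, eigenvector


def features(n, num_features):
    """Feature matrix: row s holds entry s of the first num_features eigenvectors."""
    vectors = [chain_pvf(n, k)[1] for k in range(num_features)]
    return [[vec[s] for vec in vectors] for s in range(n)]


def linear_value(phi_s, weights):
    """Linear value estimate V(s) = w . phi(s) on top of the Laplacian features."""
    return sum(w * f for w, f in zip(weights, phi_s))


phi = features(6, 3)
weights = [0.5, -0.2, 0.1]  # arbitrary illustrative weights, not learned
print([round(linear_value(row, weights), 3) for row in phi])
```

The smoothest eigenvectors (smallest eigenvalues) vary slowly along the chain, which is why they serve as a compact basis for value functions that change gradually between neighbouring states.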