Pouya Bashivan

Associate Academic Member
Assistant Professor, McGill University, Department of Physiology
Research Topics
Computational Neuroscience

Biography

Pouya Bashivan is an assistant professor in the Department of Physiology at McGill University, a member of McGill’s Integrated Program in Neuroscience, and an associate academic member of Mila – Quebec Artificial Intelligence Institute.

Before joining McGill University, Bashivan was a postdoctoral fellow at Mila, where he worked with Irina Rish and Blake Richards. Prior to that, he was a postdoctoral researcher in the Department of Brain and Cognitive Sciences and at the McGovern Institute for Brain Research at MIT, where he worked with James DiCarlo.

He received his PhD in computer engineering from the University of Memphis in 2016, and his BSc and MSc degrees in electrical and control engineering from K.N. Toosi University of Technology (Tehran).

The goal of research in Bashivan’s lab is to develop neural network models that leverage memory to solve complex tasks. While we often rely on task-performance measures to find improved neural network models and learning algorithms, we also use neural and behavioral measurements from humans and other animals to evaluate how closely these models resemble biologically evolved brains. We believe that these additional constraints could expedite progress toward engineering a human-level artificially intelligent agent.

Current Students

Master's Research - McGill University
Master's Research - McGill University
Collaborating researcher - McGill University
PhD - McGill University
PhD - McGill University

Publications

A Geometric Lens on RL Environment Complexity Based on Ricci Curvature
We introduce Ollivier-Ricci Curvature (ORC) as an information-geometric tool for analyzing the local structure of reinforcement learning (RL) environments. We establish a novel connection between ORC and the Successor Representation (SR), enabling a geometric interpretation of environment dynamics decoupled from reward signals. Our analysis shows that states with positive and negative ORC values correspond to regions where random walks converge and diverge respectively, which are often critical for effective exploration. ORC is highly correlated with established environment complexity metrics, yet integrates naturally with standard RL frameworks based on SR and provides both global and local complexity measures. Leveraging this property, we propose an ORC-based intrinsic reward that guides agents toward divergent regions and away from convergent traps. Empirical results demonstrate that our curvature-driven reward substantially improves exploration performance across diverse environments, outperforming both random and count-based intrinsic reward baselines.
Spatially and non-spatially tuned hippocampal neurons are linear perceptual and nonlinear memory encoders
Maxime Daigle
Kaicheng Yan
Benjamin Corrigan
Roberto Gulli
Julio Martinez-Trujillo
Emergent brain-like representations in a goal-directed neural network model of visual search
Motahareh Pourrahimi
Building spatial world models from sparse transitional episodic memories
Zizhan He
Maxime Daigle
Many animals possess a remarkable capacity to rapidly construct flexible mental models of their environments. These world models are crucial for ethologically relevant behaviors such as navigation, exploration, and planning. The ability to form episodic memories and make inferences based on these sparse experiences is believed to underpin the efficiency and adaptability of these models in the brain. Here, we ask: Can a neural network learn to construct a spatial model of its surroundings from sparse and disjoint episodic memories? We formulate the problem in a simulated world and propose a novel framework, the Episodic Spatial World Model (ESWM), as a potential answer. We show that ESWM is highly sample-efficient, requiring minimal observations to construct a robust representation of the environment. It is also inherently adaptive, allowing for rapid updates when the environment changes. In addition, we demonstrate that ESWM readily enables near-optimal strategies for exploring novel environments and navigating between arbitrary points, all without the need for additional training.
Caption This, Reason That: VLMs Caught in the Middle
Zihan Weng
Lucas Gomez
Taylor Whittington Webb
Real-time fine finger motion decoding for transradial amputees with surface electromyography
Zihan Weng
Yang Xiao
Peiyang Li
Chanlin Yi
Hailin Ma
Guang Yao
Yuan Lin
Fali Li
Dezhong Yao
Jingming Hou
Yangsong Zhang
Peng Xu
Credit-based self organizing maps: training deep topographic networks with minimal performance degradation
Amirozhan Dehghani
Xinyu Qian
Asa Farahani
In the primate neocortex, neurons with similar function are often found to be spatially close. Kohonen's self-organizing map (SOM) has been one of the most influential approaches for simulating brain-like topographical organization in artificial neural network models. However, integrating these maps into deep neural networks with a multitude of layers has been challenging, with self-organized deep neural networks suffering from substantially diminished capacity to perform visual recognition. We identified a key factor leading to the performance degradation in self-organized topographical neural network models: the discord between predominantly bottom-up learning updates in the self-organizing maps, and those derived from top-down, credit-based learning approaches. To address this, we propose an alternative self-organization algorithm, tailored to align with the top-down learning processes in deep neural networks. This model not only emulates critical aspects of cortical topography but also significantly narrows the performance gap between non-topographical and topographical models. This advancement underscores the substantial importance of top-down assigned credits in shaping topographical organization. Our findings are a step in reconciling topographical modeling with the functional efficacy of neural network models, paving the way for more brain-like neural architectures.
RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection
Ali Saheb Pasand
Training and fine-tuning Large Language Models (LLMs) require significant memory due to the substantial growth in the size of weight parameters and optimizer states. While methods like low-rank adaptation (LoRA), which introduce low-rank trainable modules in parallel to frozen pre-trained weights, effectively reduce memory usage, they often fail to preserve the optimization trajectory and are generally less effective for pre-training models. On the other hand, approaches such as GaLore that project gradients onto lower-dimensional spaces maintain the training trajectory and perform well in pre-training but suffer from high computational complexity, as they require repeated singular value decomposition on large matrices. In this work, we propose Randomized Gradient Projection (RGP), which outperforms GaLore, the current state-of-the-art in efficient fine-tuning, on the GLUE task suite, while being 74% faster on average and requiring similar memory.
Learning adversarially robust kernel ensembles with kernel average pooling
Reza Bayat
Adam Ibrahim
Amirozhan Dehghani
Yifei Ren
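
The Ricci-curvature abstract in the publication list above builds on the Successor Representation (SR). As background only, here is a minimal sketch of the standard closed-form SR for a fixed policy, M = (I - gamma * P)^-1, computed for a uniform random walk. It is not the paper's ORC method; the four-state ring environment, the function names, and the discount value are illustrative assumptions.

```python
import numpy as np


def random_walk_transition(adjacency):
    """Row-normalize an adjacency matrix into a uniform random-walk transition matrix.

    Assumes every state has at least one neighbor (no zero-degree rows).
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=1, keepdims=True)
    return adjacency / degree


def successor_representation(P, gamma=0.9):
    """Closed-form SR: M = sum_t gamma^t P^t = (I - gamma * P)^-1, for 0 <= gamma < 1."""
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)


# Hypothetical toy environment: four states arranged in a ring.
adjacency = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
])

P = random_walk_transition(adjacency)
M = successor_representation(P, gamma=0.9)

# M[s, s2] is the expected discounted number of visits to s2 when starting from s.
print(np.round(M, 3))
```

Each row of M sums to 1 / (1 - gamma), because the discounted visit counts over all states add up to the total discounted time. Since M depends only on the transition structure and the policy, not on rewards, it is a natural object for reward-free analyses of environment dynamics.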