Pouya Bashivan

Associate Academic Member

Assistant Professor, McGill University, Department of Physiology

Research Topics

Computational Neuroscience

Biography

Pouya Bashivan is an assistant professor in the Department of Physiology at McGill University, a member of McGill’s Integrated Program in Neuroscience, and an associate academic member of Mila – Quebec Artificial Intelligence Institute.

Before joining McGill University, Bashivan was a postdoctoral fellow at Mila, where he worked with Irina Rish and Blake Richards. Prior to that, he was a postdoctoral researcher in the Department of Brain and Cognitive Sciences and at the McGovern Institute for Brain Research at MIT, where he worked with James DiCarlo.

He received his PhD in computer engineering from the University of Memphis in 2016, and his BSc and MSc degrees in electrical and control engineering from K.N. Toosi University of Technology (Tehran).

The goal of research in Bashivan’s lab is to develop neural network models that leverage memory to solve complex tasks. While we often rely on task-performance measures to find improved neural network models and learning algorithms, we also use neural and behavioral measurements from humans and other animal brains to evaluate the similarity of these models to biologically evolved brains. We believe that these additional constraints could expedite the progress towards engineering a human-level artificially intelligent agent.

Current Students

Maxime Daigle

Master's Research - McGill University

Ozhan Dehghani

Master's Research - McGill University

amirozhan.dehghani@mail.mcgill.ca

Github

Google Scholar

Lucas Gomez

Master's Research - McGill University

Website

Herbie he

Collaborating researcher - McGill University

Website

Github

Xiaoxuan Lei

PhD - McGill University

Motahareh Pourrahimi

PhD - McGill University

Co-supervisor :

Irina Rish

Github

Publications

A Geometric Lens on RL Environment Complexity Based on Ricci Curvature

Ali Saheb Pasand

Pablo Samuel Castro

Pouya Bashivan

We introduce Ollivier-Ricci Curvature (ORC) as an information-geometric tool for analyzing the local structure of reinforcement learning (RL… (see more)) environments. We establish a novel connection between ORC and the Successor Representation (SR), enabling a geometric interpretation of environment dynamics decoupled from reward signals. Our analysis shows that states with positive and negative ORC values correspond to regions where random walks converge and diverge respectively, which are often critical for effective exploration. ORC is highly correlated with established environment complexity metrics, yet integrates naturally with standard RL frameworks based on SR and provides both global and local complexity measures. Leveraging this property, we propose an ORC-based intrinsic reward that guides agents toward divergent regions and away from convergent traps. Empirical results demonstrate that our curvature-driven reward substantially improves exploration performance across diverse environments, outperforming both random and count-based intrinsic reward baselines.

2025-07-01

rl-conference.cc/RLC/2025/Workshop/RLBrew (published)

openreview.net

Spatially and non-spatially tuned hippocampal neurons are linear perceptual and nonlinear memory encoders

Maxime Daigle

Kaicheng Yan

Benjamin Corrigan

Roberto Gulli

Julio Martinez-Trujillo

Pouya Bashivan

2025-06-24

bioRxiv (preprint)

doi.org

Emergent brain-like representations in a goal-directed neural network model of visual search

Motahareh Pourrahimi

Pouya Bashivan

2025-06-10

bioRxiv (preprint)

doi.org

Building spatial world models from sparse transitional episodic memories

Zizhan He

Maxime Daigle

Pouya Bashivan

2025-05-19

ArXiv (preprint)

arxiv.org

Building spatial world models from sparse transitional episodic memories

Zizhan He

Maxime Daigle

Pouya Bashivan

Many animals possess a remarkable capacity to rapidly construct flexible mental models of their environments. These world models are crucial… (see more) for ethologically relevant behaviors such as navigation, exploration, and planning. The ability to form episodic memories and make inferences based on these sparse experiences is believed to underpin the efficiency and adaptability of these models in the brain. Here, we ask: Can a neural network learn to construct a spatial model of its surroundings from sparse and disjoint episodic memories? We formulate the problem in a simulated world and propose a novel framework, the Episodic Spatial World Model (ESWM), as a potential answer. We show that ESWM is highly sample-efficient, requiring minimal observations to construct a robust representation of the environment. It is also inherently adaptive, allowing for rapid updates when the environment changes. In addition, we demonstrate that ESWM readily enables near-optimal strategies for exploring novel environments and navigating between arbitrary points, all without the need for additional training.

2025-05-01

arXiv (published)

doi.org

arxiv.org

Caption This, Reason That: VLMs Caught in the Middle

Zihan Weng

Lucas Gomez

Taylor Whittington Webb

Pouya Bashivan

2025-05-01

arXiv (published)

doi.org

arxiv.org

Real-time fine finger motion decoding for transradial amputees with surface electromyography

Zihan Weng

Yang Xiao

Peiyang Li

Chanlin Yi

Pouya Bashivan

Hailin Ma

Guang Yao

Yuan Lin

Fali Li

Dezhong Yao 0001

Jingming Hou

Yangsong Zhang

Peng Xu

2025-05-01

Neural Networks (published)

doi.org

Credit-based self organizing maps: training deep topographic networks with minimal performance degradation

Amirozhan Dehghani

Xinyu Qian

Asa Farahani

Pouya Bashivan

In the primate neocortex, neurons with similar function are often found to be spatially close. Kohonen's self-organizing map (SOM) has been … (see more)one of the most influential approaches for simulating brain-like topographical organization in artificial neural network models. However, integrating these maps into deep neural networks with multitude of layers has been challenging, with self-organized deep neural networks suffering from substantially diminished capacity to perform visual recognition. We identified a key factor leading to the performance degradation in self-organized topographical neural network models: the discord between predominantly bottom-up learning updates in the self-organizing maps, and those derived from top-down, credit-based learning approaches. To address this, we propose an alternative self organization algorithm, tailored to align with the top-down learning processes in deep neural networks. This model not only emulates critical aspects of cortical topography but also significantly narrows the performance gap between non-topographical and topographical models. This advancement underscores the substantial importance of top-down assigned credits in shaping topographical organization. Our findings are a step in reconciling topographical modeling with the functional efficacy of neural network models, paving the way for more brain-like neural architectures.

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection

Ali Saheb Pasand

Pouya Bashivan

Training and fine-tuning Large Language Models (LLMs) require significant memory due to the substantial growth in the size of weight paramet… (see more)ers and optimizer states. While methods like low-rank adaptation (LoRA), which introduce low-rank trainable modules in parallel to frozen pre-trained weights, effectively reduce memory usage, they often fail to preserve the optimization trajectory and are generally less effective for pre-training models. On the other hand, approaches, such as GaLore, that project gradients onto lower-dimensional spaces maintain the training trajectory and perform well in pre-training but suffer from high computational complexity, as they require repeated singular value decomposition on large matrices. In this work, we propose Randomized Gradient Projection (RGP), which outperforms GaLore, the current state-of-the-art in efficient fine-tuning, on the GLUE task suite, while being 74% faster on average and requiring similar memory.

2024-12-10

Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop (published)

proceedings.mlr.press

Learning adversarially robust kernel ensembles with kernel average pooling

Pouya Bashivan

Reza Bayat

Adam Ibrahim

Amirozhan Dehghani

Yifei Ren

2024-12-01

Expert systems with applications (published)

doi.org

Learning adversarially robust kernel ensembles with kernel average pooling

Pouya Bashivan

Reza Bayat

Adam Ibrahim

Amirozhan Dehghani

Yifei Ren

2024-12-01

Expert systems with applications (published)

doi.org

Learning adversarially robust kernel ensembles with kernel average pooling

Pouya Bashivan

Reza Bayat

Adam Ibrahim

Amirozhan Dehghani

Yifei Ren

2024-12-01

Expert systems with applications (published)

doi.org

AI Advantage

Leveraging AI for a Sustainable Future

Mila AI Policy Fellowship

AI Advantage

Leveraging AI for a Sustainable Future

Pouya Bashivan

Biography

Current Students

Publications

AI Advantage

Leveraging AI for a Sustainable Future

Mila AI Policy Fellowship

AI Advantage

Leveraging AI for a Sustainable Future

Popular keywords:

Pouya Bashivan

Biography

Current Students

Publications