
Pouya Bashivan

Associate Academic Member
Assistant Professor, McGill University, Department of Physiology
Research Topics
Computational Neuroscience

Biography

Pouya Bashivan is an assistant professor in the Department of Physiology at McGill University, a member of McGill’s Integrated Program in Neuroscience, and an associate academic member of Mila – Quebec Artificial Intelligence Institute.

Before joining McGill University, Bashivan was a postdoctoral fellow at Mila, where he worked with Irina Rish and Blake Richards. Prior to that, he was a postdoctoral researcher in the Department of Brain and Cognitive Sciences and at the McGovern Institute for Brain Research at MIT, where he worked with James DiCarlo.

He received his PhD in computer engineering from the University of Memphis in 2016, and his BSc and MSc degrees in electrical and control engineering from K.N. Toosi University of Technology (Tehran).

The goal of research in Bashivan’s lab is to develop neural network models that leverage memory to solve complex tasks. While we often rely on task-performance measures to find improved neural network models and learning algorithms, we also use neural and behavioral measurements from the brains of humans and other animals to evaluate the similarity of these models to biologically evolved brains. We believe that these additional constraints could expedite progress towards engineering a human-level artificially intelligent agent.

Current Students

Master's Research - Université de Montréal
Master's Research - McGill University
Master's Research - McGill University
Research Intern - McGill University
PhD - McGill University
PhD - McGill University

Publications

Geometry of naturalistic object representations in recurrent neural network models of working memory
Xiaoxuan Lei
Takuya Ito
Burst firing optimizes invariant coding of natural communication signals by electrosensory neural populations
Michael G. Metzen
Amin Akhshi
Anmar Khadra
Maurice J. Chacron
Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Rishika Bhagwatkar
Shravan Nayak
Reza Bayat
Alexis Roger
Daniel Z Kaplan
Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they become increasingly prevalent, ensuring their robustness against adversarial attacks is paramount. This work systematically investigates the impact of model design choices on the adversarial robustness of VLMs against image-based attacks. Additionally, we introduce novel, cost-effective approaches to enhance robustness through prompt formatting. By rephrasing questions and suggesting potential adversarial perturbations, we demonstrate substantial improvements in model robustness against strong image-based attacks such as Auto-PGD. Our findings provide important guidelines for developing more robust VLMs, particularly for deployment in safety-critical environments.
iWISDM: Assessing instruction following in multimodal models at scale
Xiaoxuan Lei
Lucas Gomez
Hao Yuan Bai
The ability to perform complex tasks from detailed instructions is key to the remarkable achievements of our species. As humans, we are not only capable of performing a wide variety of tasks but also very complex ones that may entail hundreds or thousands of steps to complete. Large language models and their more recent multimodal counterparts that integrate textual and visual inputs have achieved unprecedented success in performing complex tasks. Yet, most existing benchmarks are largely confined to single-modality inputs (either text or vision), thus narrowing the scope of multimodal integration assessments, particularly for instruction-following in multimodal contexts. To bridge this gap, we introduce the instructed-Virtual VISual Decision Making (iWISDM) environment engineered to generate a limitless array of vision-language tasks of varying complexity. Using iWISDM, we compiled three distinct benchmarks of instruction-following visual tasks across varying complexity levels and evaluated several newly developed multimodal models on these benchmarks. Our findings establish iWISDM as a robust benchmark for assessing the instructional adherence of both existing and emergent multimodal models and highlight a large gap in these models’ ability to precisely follow instructions.
Local lateral connectivity is sufficient for replicating cortex-like topographical organization in deep neural networks
Xinyu Qian
Amirozhan Dehghani
Asa Borzabadi Farahani
Across the primate cortex, neurons that perform similar functions tend to be spatially grouped together. In high-level visual cortex, this widely observed biological rule manifests itself as a modular organization of neuronal clusters, each tuned to a specific object category. The tendency toward short connections is one of the most widely accepted views of why such an organization exists in the brains of many animals. Yet, how such a feat is implemented at the neural level remains unclear. Here, using artificial deep neural networks as test beds, we demonstrate that a topographical organization similar to that in the primary, intermediate, and high-level human visual cortex emerges when units in these models are laterally connected and their weight parameters are tuned by top-down credit assignment. Importantly, the emergence of the modular organization in the absence of explicit topography-inducing learning rules and objectives questions their necessity and suggests that local lateral connectivity alone may be sufficient for the formation of the topographic organization across the cortex.
A Hybrid CNN-Transformer Approach for Continuous Fine Finger Motion Decoding from sEMG Signals
Zihan Weng
Xiabing Zhang
Yufeng Mou
Chanlin Yi
Fali Li
Peng Xu
This work presents a novel approach that synergistically integrates convolutional neural networks (CNNs) and Transformer models for decoding continuous fine finger motions from surface electromyography (sEMG) signals. This integration capitalizes on CNNs’ proficiency in extracting rich temporal and spatial features from multichannel sEMG data and the Transformer’s superior capability in recognizing complex patterns and long-range dependencies. A significant advancement in this field is the use of a custom-developed Epidermal Electrode Array Sleeve (EEAS) for capturing high-fidelity sEMG signals, enabling more accurate and reliable signal acquisition than traditional methods. The decoded joint angles could be used in seamless and intuitive human-machine interaction in various applications, such as virtual reality, augmented reality, robotic control, and prosthetic control. Evaluations demonstrate the superior performance of the proposed CNN-Transformer hybrid architecture in decoding continuous fine finger motions, outperforming individual CNN and Transformer models. The synergistic integration of CNNs and Transformers presents a powerful framework for sEMG decoding, offering exciting opportunities for naturalistic and intuitive human-machine interaction applications. Its robustness and efficiency make it an ideal choice for real-world applications, promising to enhance the interface between humans and machines significantly. The implications of this research extend to advancing the understanding of human neuromuscular signals and their application in computing interfaces.
How well do models of visual cortex generalize to out of distribution samples?
Yifei Ren
Improving Adversarial Robustness in Vision-Language Models with Architecture and Prompt Design
Rishika Bhagwatkar
Shravan Nayak
The feature landscape of visual cortex
Rudi Tong
Ronan da Silva
Dongyan Lin
Arna Ghosh
James Wilsenach
Erica Cianfarano
Stuart Trenholm
Understanding computations in the visual system requires a characterization of the distinct feature preferences of neurons in different visual cortical areas. However, we know little about how feature preferences of neurons within a given area relate to that area’s role within the global organization of visual cortex. To address this, we recorded from thousands of neurons across six visual cortical areas in mouse and leveraged generative AI methods combined with closed-loop neuronal recordings to identify each neuron’s visual feature preference. First, we discovered that the mouse’s visual system is globally organized to encode features in a manner invariant to the types of image transformations induced by self-motion. Second, we found differences in the visual feature preferences of each area and that these differences generalized across animals. Finally, we observed that a given area’s collection of preferred stimuli (‘own-stimuli’) drive neurons from the same area more effectively through their dynamic range compared to preferred stimuli from other areas (‘other-stimuli’). As a result, feature preferences of neurons within an area are organized to maximally encode differences among own-stimuli while remaining insensitive to differences among other-stimuli. These results reveal how visual areas work together to efficiently encode information about the external world.
Using modular connectome-based predictive modeling to reveal brain-behavior relationships of individual differences in working memory
Huayi Yang
Junjun Zhang
Zhenlan Jin
Ling Li
Towards Out-of-Distribution Adversarial Robustness
Adam Ibrahim
Charles Guille-Escuret
Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fails to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different
Learning Robust Kernel Ensembles with Kernel Average Pooling
Adam Ibrahim
Amirozhan Dehghani
Yifei Ren