
Pouya Bashivan

Associate Academic Member
Assistant Professor, McGill University, Department of Physiology
Research Topics
Computational Neuroscience

Biography

Pouya Bashivan is an assistant professor in the Department of Physiology and a member of the Integrated Program in Neuroscience at McGill University, as well as an associate member of Mila – Quebec Artificial Intelligence Institute. Before joining McGill University, he was a postdoctoral researcher at Mila, working with Irina Rish and Blake Richards. Prior to that, he was a postdoctoral researcher in the Department of Brain and Cognitive Sciences and the McGovern Institute for Brain Research at the Massachusetts Institute of Technology (MIT), where he worked with Professor James DiCarlo. He received a PhD in computer engineering from the University of Memphis in 2016, after earning bachelor's and master's degrees in electrical and control engineering from KNT University (Tehran, Iran).

The goal of his lab's research is to develop neural network models that leverage memory to solve complex tasks. While we often rely on task performance measures to find improved neural network models and learning algorithms, we also use neural and behavioural measurements from the brains of humans and other animals to evaluate how similar these models are to biologically evolved brains. We believe these additional constraints could accelerate progress toward engineering a human-level artificially intelligent agent.

Current Students

Research Master's - McGill
Research Master's - McGill
Research Collaborator - McGill
PhD - McGill
PhD - McGill
Co-supervisor:

Publications

Real-time fine finger motion decoding for transradial amputees with surface electromyography
Zihan Weng
Yang Xiao
Peiyang Li
Chanlin Yi
Hailin Ma
Guang Yao
Yuan Lin
Fali Li
Dezhong Yao
Jingming Hou
Yangsong Zhang
Peng Xu
Caption This, Reason That: VLMs Caught in the Middle
Zihan Weng
Lucas Gomez
Taylor Whittington Webb
Building spatial world models from sparse transitional episodic memories
Zizhan He
Maxime Daigle
Many animals possess a remarkable capacity to rapidly construct flexible mental models of their environments. These world models are crucial for ethologically relevant behaviors such as navigation, exploration, and planning. The ability to form episodic memories and make inferences based on these sparse experiences is believed to underpin the efficiency and adaptability of these models in the brain. Here, we ask: Can a neural network learn to construct a spatial model of its surroundings from sparse and disjoint episodic memories? We formulate the problem in a simulated world and propose a novel framework, the Episodic Spatial World Model (ESWM), as a potential answer. We show that ESWM is highly sample-efficient, requiring minimal observations to construct a robust representation of the environment. It is also inherently adaptive, allowing for rapid updates when the environment changes. In addition, we demonstrate that ESWM readily enables near-optimal strategies for exploring novel environments and navigating between arbitrary points, all without the need for additional training.
Credit-based self organizing maps: training deep topographic networks with minimal performance degradation
Amirozhan Dehghani
Xinyu Qian
Asa Farahani
In the primate neocortex, neurons with similar function are often found to be spatially close. Kohonen's self-organizing map (SOM) has been one of the most influential approaches for simulating brain-like topographical organization in artificial neural network models. However, integrating these maps into deep neural networks with a multitude of layers has been challenging, with self-organized deep neural networks suffering from substantially diminished capacity to perform visual recognition. We identified a key factor leading to the performance degradation in self-organized topographical neural network models: the discord between predominantly bottom-up learning updates in the self-organizing maps, and those derived from top-down, credit-based learning approaches. To address this, we propose an alternative self-organization algorithm, tailored to align with the top-down learning processes in deep neural networks. This model not only emulates critical aspects of cortical topography but also significantly narrows the performance gap between non-topographical and topographical models. This advancement underscores the substantial importance of top-down assigned credits in shaping topographical organization. Our findings are a step in reconciling topographical modeling with the functional efficacy of neural network models, paving the way for more brain-like neural architectures.
RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection
Ali Saheb Pasand
Training and fine-tuning Large Language Models (LLMs) require significant memory due to the substantial growth in the size of weight parameters and optimizer states. While methods like low-rank adaptation (LoRA), which introduce low-rank trainable modules in parallel to frozen pre-trained weights, effectively reduce memory usage, they often fail to preserve the optimization trajectory and are generally less effective for pre-training models. On the other hand, approaches, such as GaLore, that project gradients onto lower-dimensional spaces maintain the training trajectory and perform well in pre-training but suffer from high computational complexity, as they require repeated singular value decomposition on large matrices. In this work, we propose Randomized Gradient Projection (RGP), which outperforms GaLore, the current state-of-the-art in efficient fine-tuning, on the GLUE task suite, while being 74% faster on average and requiring similar memory.
Learning adversarially robust kernel ensembles with kernel average pooling
Reza Bayat
Adam Ibrahim
Amirozhan Dehghani
Yifei Ren
Geometry of naturalistic object representations in recurrent neural network models of working memory
Xiaoxuan Lei
Takuya Ito
Working memory is a central cognitive ability crucial for intelligent decision-making. Recent experimental and computational work studying working memory has primarily used categorical (i.e., one-hot) inputs, rather than ecologically relevant, multidimensional naturalistic ones. Moreover, studies have primarily investigated working memory during single or few cognitive tasks. As a result, an understanding of how naturalistic object information is maintained in working memory in neural networks is still lacking. To bridge this gap, we developed sensory-cognitive models, comprising a convolutional neural network (CNN) coupled with a recurrent neural network (RNN), and trained them on nine distinct N-back tasks using naturalistic stimuli. By examining the RNN's latent space, we found that: (1) Multi-task RNNs represent both task-relevant and irrelevant information simultaneously while performing tasks; (2) The latent subspaces used to maintain specific object properties in vanilla RNNs are largely shared across tasks, but highly task-specific in gated RNNs such as GRU and LSTM; (3) Surprisingly, RNNs embed objects in new representational spaces in which individual object features are less orthogonalized relative to the perceptual space; (4) The transformation of working memory encodings (i.e., embedding of visual inputs in the RNN latent space) into memory was shared across stimuli, yet the transformations governing the retention of a memory in the face of incoming distractor stimuli were distinct across time. Our findings indicate that goal-driven RNNs employ chronological memory subspaces to track information over short time spans, enabling testable predictions with neural data.
Burst firing optimizes invariant coding of natural communication signals by electrosensory neural populations
Michael G. Metzen
Amin Akhshi
Anmar Khadra
Maurice J. Chacron