
Pouya Bashivan

Associate Academic Member
Assistant Professor, McGill University, Department of Physiology
Research Topics
Computational Neuroscience

Biography

Pouya Bashivan is an Assistant Professor in the Department of Physiology and a member of the Integrated Program in Neuroscience at McGill University, as well as an associate member of Mila – Quebec Artificial Intelligence Institute. Before joining McGill University, he was a postdoctoral fellow at Mila, working with Irina Rish and Blake Richards. Prior to that, he was a postdoctoral researcher in the Department of Brain and Cognitive Sciences and at the McGovern Institute for Brain Research at the Massachusetts Institute of Technology (MIT), where he worked with Professor James DiCarlo. He received a PhD in computer engineering from the University of Memphis in 2016, after earning bachelor's and master's degrees in electrical and control engineering from KNT University (Tehran, Iran).

The goal of his lab's research is to develop neural network models that leverage memory to solve complex tasks. While we often rely on task-performance measures to find improved neural network models and learning algorithms, we also use neural and behavioral measurements from the brains of humans and other animals to evaluate how similar these models are to biologically evolved brains. We believe these additional constraints could accelerate progress toward engineering a human-level artificially intelligent agent.

Current Students

Master's (research) - McGill
Master's (research) - McGill
Research Intern - McGill
PhD - McGill
PhD - McGill
Co-supervisor:

Publications

Credit-based self organizing maps: training deep topographic networks with minimal performance degradation
Amirozhan Dehghani
Xinyu Qian
Asa Farahani
In the primate neocortex, neurons with similar function are often found to be spatially close. Kohonen's self-organizing map (SOM) has been one of the most influential approaches for simulating brain-like topographical organization in artificial neural network models. However, integrating these maps into deep neural networks with a multitude of layers has been challenging, with self-organized deep neural networks suffering from substantially diminished capacity to perform visual recognition. We identified a key factor leading to the performance degradation in self-organized topographical neural network models: the discord between predominantly bottom-up learning updates in the self-organizing maps, and those derived from top-down, credit-based learning approaches. To address this, we propose an alternative self-organization algorithm, tailored to align with the top-down learning processes in deep neural networks. This model not only emulates critical aspects of cortical topography but also significantly narrows the performance gap between non-topographical and topographical models. This advancement underscores the substantial importance of top-down assigned credits in shaping topographical organization. Our findings are a step in reconciling topographical modeling with the functional efficacy of neural network models, paving the way for more brain-like neural architectures.
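For context, the classical bottom-up Kohonen update that the paper contrasts with top-down, credit-based learning can be sketched as follows. This is an illustrative sketch of a standard SOM step, not the authors' proposed algorithm; grid size, learning rate, and neighbourhood width are arbitrary choices.

```python
import numpy as np

def som_update(weights, x, lr=0.1, sigma=1.0):
    """One Kohonen SOM step: pull the best-matching unit (BMU) and its
    grid neighbours toward the input x, weighted by grid proximity."""
    # weights: (grid_h, grid_w, dim); x: (dim,)
    dists = np.linalg.norm(weights - x, axis=-1)           # input distance per unit
    bmu = np.unravel_index(np.argmin(dists), dists.shape)  # best-matching unit
    gy, gx = np.indices(dists.shape)
    grid_d2 = (gy - bmu[0]) ** 2 + (gx - bmu[1]) ** 2      # squared grid distance to BMU
    h = np.exp(-grid_d2 / (2 * sigma ** 2))                # Gaussian neighbourhood
    return weights + lr * h[..., None] * (x - weights)
```

Note that this rule uses only input similarity and grid distance; no task loss or credit signal enters the update, which is the discord with backpropagation the paper addresses.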
RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection
Ali Saheb Pasand
Training and fine-tuning Large Language Models (LLMs) require significant memory due to the substantial growth in the size of weight parameters and optimizer states. While methods like low-rank adaptation (LoRA), which introduce low-rank trainable modules in parallel to frozen pre-trained weights, effectively reduce memory usage, they often fail to preserve the optimization trajectory and are generally less effective for pre-training models. On the other hand, approaches, such as GaLore, that project gradients onto lower-dimensional spaces maintain the training trajectory and perform well in pre-training but suffer from high computational complexity, as they require repeated singular value decomposition on large matrices. In this work, we propose Randomized Gradient Projection (RGP), which outperforms GaLore, the current state-of-the-art in efficient fine-tuning, on the GLUE task suite, while being 74% faster on average and requiring similar memory.
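The gradient-projection idea can be illustrated with a minimal sketch. Assumed details here (a Gaussian random projection and a plain SGD update) are for illustration only; the paper's actual RGP method, projector choice, and optimizer-state handling may differ.

```python
import numpy as np

def rgp_step(W, grad, lr=0.01, rank=8, rng=None):
    """One randomized-gradient-projection-style update: compress the
    gradient onto a random low-rank subspace, then project it back
    and apply it, instead of computing an SVD as in GaLore."""
    rng = np.random.default_rng(rng)
    m, _ = grad.shape
    P = rng.normal(size=(m, rank)) / np.sqrt(rank)  # random projector (m, r)
    g_low = P.T @ grad                              # (r, n) compressed gradient;
                                                    # optimizer states would live here
    g_back = P @ g_low                              # (m, n) back-projected update
    return W - lr * g_back
```

The memory saving comes from keeping optimizer states (e.g., Adam moments) at the compressed (r, n) size; drawing P from a fixed random distribution avoids the repeated SVDs that make GaLore costly.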
Learning adversarially robust kernel ensembles with kernel average pooling
Reza Bayat
Adam Ibrahim
Amirozhan Dehghani
Yifei Ren
Geometry of naturalistic object representations in recurrent neural network models of working memory
Xiaoxuan Lei
Takuya Ito
Working memory is a central cognitive ability crucial for intelligent decision-making. Recent experimental and computational work studying working memory has primarily used categorical (i.e., one-hot) inputs, rather than ecologically relevant, multidimensional naturalistic ones. Moreover, studies have primarily investigated working memory during single or few cognitive tasks. As a result, an understanding of how naturalistic object information is maintained in working memory in neural networks is still lacking. To bridge this gap, we developed sensory-cognitive models, comprising a convolutional neural network (CNN) coupled with a recurrent neural network (RNN), and trained them on nine distinct N-back tasks using naturalistic stimuli. By examining the RNN's latent space, we found that: (1) Multi-task RNNs represent both task-relevant and irrelevant information simultaneously while performing tasks; (2) The latent subspaces used to maintain specific object properties in vanilla RNNs are largely shared across tasks, but highly task-specific in gated RNNs such as GRU and LSTM; (3) Surprisingly, RNNs embed objects in new representational spaces in which individual object features are less orthogonalized relative to the perceptual space; (4) The transformation of working memory encodings (i.e., embedding of visual inputs in the RNN latent space) into memory was shared across stimuli, yet the transformations governing the retention of a memory in the face of incoming distractor stimuli were distinct across time. Our findings indicate that goal-driven RNNs employ chronological memory subspaces to track information over short time spans, enabling testable predictions with neural data.
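The structure of the N-back tasks the models are trained on can be illustrated with a minimal label generator. This helper is hypothetical, written only to show the task logic; the paper's tasks use naturalistic stimuli and nine task variants rather than this toy form.

```python
def nback_targets(stimuli, n):
    """Labels for an N-back task: 1 if the current stimulus matches the
    one presented n steps earlier, else 0 (the first n trials can never
    match, so they are labeled 0)."""
    return [1 if i >= n and stimuli[i] == stimuli[i - n] else 0
            for i in range(len(stimuli))]
```

Solving this requires the model to hold the last n stimuli in memory at every step, which is why the RNN's latent space must maintain object information across intervening distractors.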
Burst firing optimizes invariant coding of natural communication signals by electrosensory neural populations
Michael G. Metzen
Amin Akhshi
Anmar Khadra
Maurice J. Chacron
Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Rishika Bhagwatkar
Shravan Nayak
Reza Bayat
Alexis Roger
Daniel Z Kaplan
Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they become increasingly prevalent, ensuring their robustness against adversarial attacks is paramount. This work systematically investigates the impact of model design choices on the adversarial robustness of VLMs against image-based attacks. Additionally, we introduce novel, cost-effective approaches to enhance robustness through prompt formatting. By rephrasing questions and suggesting potential adversarial perturbations, we demonstrate substantial improvements in model robustness against strong image-based attacks such as Auto-PGD. Our findings provide important guidelines for developing more robust VLMs, particularly for deployment in safety-critical environments.
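For readers unfamiliar with the attack family evaluated here, a single L-infinity projected-gradient-descent (PGD) step can be sketched as below. This is a generic textbook sketch, not the Auto-PGD variant used in the paper (Auto-PGD additionally adapts the step size and restarts); the budget values are common defaults, assumed for illustration.

```python
import numpy as np

def pgd_step(x, x_orig, grad, eps=8/255, alpha=2/255):
    """One L-inf PGD attack step: move pixels along the sign of the loss
    gradient, then project back into the eps-ball around the original
    image and clip to the valid [0, 1] pixel range."""
    x_adv = x + alpha * np.sign(grad)                   # ascend the loss
    x_adv = np.clip(x_adv, x_orig - eps, x_orig + eps)  # stay within the budget
    return np.clip(x_adv, 0.0, 1.0)                     # stay a valid image
```

Iterating this step yields a perturbation that is nearly invisible (at most eps per pixel) yet can flip a model's prediction, which is the threat model the robustness evaluations target.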
Local lateral connectivity is sufficient for replicating cortex-like topographical organization in deep neural networks
Xinyu Qian
Amirozhan Dehghani
Asa Borzabadi Farahani
Across the primate cortex, neurons that perform similar functions tend to be spatially grouped together. In high-level visual cortex, this widely observed biological rule manifests itself as a modular organization of neuronal clusters, each tuned to a specific object category. The tendency toward short connections is one of the most widely accepted views of why such an organization exists in the brains of many animals. Yet, how such a feat is implemented at the neural level remains unclear. Here, using artificial deep neural networks as test beds, we demonstrate that a topographical organization similar to that in the primary, intermediate, and high-level human visual cortex emerges when units in these models are laterally connected and their weight parameters are tuned by top-down credit assignment. Importantly, the emergence of the modular organization in the absence of explicit topography-inducing learning rules and objectives questions their necessity and suggests that local lateral connectivity alone may be sufficient for the formation of the topographic organization across the cortex.
A Hybrid CNN-Transformer Approach for Continuous Fine Finger Motion Decoding from sEMG Signals
Zihan Weng
Xiabing Zhang
Yufeng Mou
Chanlin Yi
Fali Li
Peng Xu
This work presents a novel approach that synergistically integrates convolutional neural networks (CNNs) and Transformer models for decoding continuous fine finger motions from surface electromyography (sEMG) signals. This integration capitalizes on CNNs’ proficiency in extracting rich temporal and spatial features from multichannel sEMG data and the Transformer’s superior capability in recognizing complex patterns and long-range dependencies. A significant advancement in this field is the use of a custom-developed Epidermal Electrode Array Sleeve (EEAS) for capturing high-fidelity sEMG signals, enabling more accurate and reliable signal acquisition than traditional methods. The decoded joint angles could be used in seamless and intuitive human-machine interaction in various applications, such as virtual reality, augmented reality, robotic control, and prosthetic control. Evaluations demonstrate the superior performance of the proposed CNN-Transformer hybrid architecture in decoding continuous fine finger motions, outperforming individual CNN and Transformer models. The synergistic integration of CNNs and Transformers presents a powerful framework for sEMG decoding, offering exciting opportunities for naturalistic and intuitive human-machine interaction applications. Its robustness and efficiency make it an ideal choice for real-world applications, promising to enhance the interface between humans and machines significantly. The implications of this research extend to advancing the understanding of human neuromuscular signals and their application in computing interfaces.
How well do models of visual cortex generalize to out of distribution samples?
Yifei Ren