Pouya Bashivan

2025-06-10

bioRxiv (prépublication)

Caption This, Reason That: VLMs Caught in the Middle

Zihan Weng

Lucas Gomez

Taylor Whittington Webb

2025-05-24

ArXiv (prépublication)

Building spatial world models from sparse transitional episodic memories

Zizhan He

Maxime Daigle

2025-05-19

ArXiv (prépublication)

Building spatial world models from sparse transitional episodic memories

Zizhan He

Maxime Daigle

Many animals possess a remarkable capacity to rapidly construct flexible mental models of their environments. These world models are crucial… (voir plus) for ethologically relevant behaviors such as navigation, exploration, and planning. The ability to form episodic memories and make inferences based on these sparse experiences is believed to underpin the efficiency and adaptability of these models in the brain. Here, we ask: Can a neural network learn to construct a spatial model of its surroundings from sparse and disjoint episodic memories? We formulate the problem in a simulated world and propose a novel framework, the Episodic Spatial World Model (ESWM), as a potential answer. We show that ESWM is highly sample-efficient, requiring minimal observations to construct a robust representation of the environment. It is also inherently adaptive, allowing for rapid updates when the environment changes. In addition, we demonstrate that ESWM readily enables near-optimal strategies for exploring novel environments and navigating between arbitrary points, all without the need for additional training.

2025-05-19

ArXiv (prépublication)

Real-time fine finger motion decoding for transradial amputees with surface electromyography

Zihan Weng

Yang Xiao

Peiyang Li

Chanlin Yi

Hailin Ma

Guang Yao

Yuan Lin

Fali Li

Dezhong Yao 0001

Jingming Hou

Yangsong Zhang

Peng Xu

2025-05-01

Neural Networks (publié)

Credit-based self organizing maps: training deep topographic networks with minimal performance degradation

Amirozhan Dehghani

Xinyu Qian

Asa Farahani

In the primate neocortex, neurons with similar function are often found to be spatially close. Kohonen's self-organizing map (SOM) has been … (voir plus)one of the most influential approaches for simulating brain-like topographical organization in artificial neural network models. However, integrating these maps into deep neural networks with multitude of layers has been challenging, with self-organized deep neural networks suffering from substantially diminished capacity to perform visual recognition. We identified a key factor leading to the performance degradation in self-organized topographical neural network models: the discord between predominantly bottom-up learning updates in the self-organizing maps, and those derived from top-down, credit-based learning approaches. To address this, we propose an alternative self organization algorithm, tailored to align with the top-down learning processes in deep neural networks. This model not only emulates critical aspects of cortical topography but also significantly narrows the performance gap between non-topographical and topographical models. This advancement underscores the substantial importance of top-down assigned credits in shaping topographical organization. Our findings are a step in reconciling topographical modeling with the functional efficacy of neural network models, paving the way for more brain-like neural architectures.

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

RGP: Achieving Memory-Efficient Model Fine-tuning Via Randomized Gradient Projection

Ali Saheb Pasand

Training and fine-tuning Large Language Models (LLMs) require significant memory due to the substantial growth in the size of weight paramet… (voir plus)ers and optimizer states. While methods like low-rank adaptation (LoRA), which introduce low-rank trainable modules in parallel to frozen pre-trained weights, effectively reduce memory usage, they often fail to preserve the optimization trajectory and are generally less effective for pre-training models. On the other hand, approaches, such as GaLore, that project gradients onto lower-dimensional spaces maintain the training trajectory and perform well in pre-training but suffer from high computational complexity, as they require repeated singular value decomposition on large matrices. In this work, we propose Randomized Gradient Projection (RGP), which outperforms GaLore, the current state-of-the-art in efficient fine-tuning, on the GLUE task suite, while being 74% faster on average and requiring similar memory.

2024-12-10

Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop (publié)

proceedings.mlr.press

Learning adversarially robust kernel ensembles with kernel average pooling

Reza Bayat

Adam Ibrahim

Amirozhan Dehghani

Yifei Ren

2024-12-01

Expert systems with applications (publié)

Learning adversarially robust kernel ensembles with kernel average pooling

Reza Bayat

Adam Ibrahim

Amirozhan Dehghani

Yifei Ren

2024-12-01

Expert systems with applications (publié)

Learning adversarially robust kernel ensembles with kernel average pooling

Reza Bayat

Adam Ibrahim

Amirozhan Dehghani

Yifei Ren

2024-12-01

Expert systems with applications (publié)

Geometry of naturalistic object representations in recurrent neural network models of working memory

Xiaoxuan Lei

Takuya Ito

Working memory is a central cognitive ability crucial for intelligent decision-making. Recent experimental and computational work studying w… (voir plus)orking memory has primarily used categorical (i.e., one-hot) inputs, rather than ecologically relevant, multidimensional naturalistic ones. Moreover, studies have primarily investigated working memory during single or few cognitive tasks. As a result, an understanding of how naturalistic object information is maintained in working memory in neural networks is still lacking. To bridge this gap, we developed sensory-cognitive models, comprising a convolutional neural network (CNN) coupled with a recurrent neural network (RNN), and trained them on nine distinct N-back tasks using naturalistic stimuli. By examining the RNN's latent space, we found that: (1) Multi-task RNNs represent both task-relevant and irrelevant information simultaneously while performing tasks; (2) The latent subspaces used to maintain specific object properties in vanilla RNNs are largely shared across tasks, but highly task-specific in gated RNNs such as GRU and LSTM; (3) Surprisingly, RNNs embed objects in new representational spaces in which individual object features are less orthogonalized relative to the perceptual space; (4) The transformation of working memory encodings (i.e., embedding of visual inputs in the RNN latent space) into memory was shared across stimuli, yet the transformations governing the retention of a memory in the face of incoming distractor stimuli were distinct across time. Our findings indicate that goal-driven RNNs employ chronological memory subspaces to track information over short time spans, enabling testable predictions with neural data.

2024-11-04

ArXiv (prépublication)

Geometry of naturalistic object representations in recurrent neural network models of working memory

Xiaoxuan Lei

Takuya Ito

2024-09-25

NeurIPS.cc/2024/Conference (poster)