Publications

The Roles of Neural Networks in Language Acquisition
Masoud Jasbi
How can modern neural networks like language models be useful to the field of language acquisition, and more broadly cognitive science, if they are not a priori designed to be cognitive models? As developments towards natural language understanding and generation have improved by leaps and bounds, with models like GPT‐4, the question of how they can inform our understanding of human language acquisition has re‐emerged. As such, it is critical to examine how in practice linking hypotheses between models and human learners can be safely established. To address these questions, we propose a model taxonomy, including four modelling approaches, each having differing goals, from exploratory hypothesis generation to hypothesis differentiation and testing. We show how the goals of these approaches align with the overarching goals of science and linguistics by connecting our taxonomy to the realist versus instrumentalist approaches in philosophy of science. We survey recent work having adopted each of our modelling approaches and address the importance of computational modelling in language acquisition studies.
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
The dominant paradigm for RLHF is online and on-policy RL: synchronously generating from the large language model (LLM) policy, labelling with a reward model, and learning using feedback on the LLM's own outputs. While performant, this paradigm is computationally inefficient. Inspired by classical deep RL literature, we propose separating generation and learning in RLHF. This enables asynchronous generation of new samples while simultaneously training on old samples, leading to faster training and more compute-optimal scaling. However, asynchronous training relies on an underexplored regime, online but off-policy RLHF: learning on samples from previous iterations of our model. To understand the challenges in this regime, we investigate a fundamental question: how much off-policyness can we tolerate for asynchronous training to speed up learning but maintain performance? Among several RLHF algorithms we tested, we find that online DPO is most robust to off-policy data, and robustness increases with the scale of the policy model. We study further compute optimizations for asynchronous RLHF but find that they come at a performance cost, giving rise to a trade-off. Finally, we verify the scalability of asynchronous RLHF by training LLaMA 3.1 8B on an instruction-following task 40% faster than a synchronous run while matching final performance.
Modulation of leg trajectory by transcranial magnetic stimulation during walking
H. Bourgeois
Rose Guay-Hottin
E.-M. Meftah
M. Martinez
D. Barthélemy
The primary motor cortex is involved in initiation and adaptive control of locomotion. However, the role of the motor cortex in controlling gait trajectories remains unclear. In animals, cortical neuromodulation allows for precise control of step height. We hypothesized that a similar control framework applies to humans, whereby cortical stimulation would primarily increase foot elevation. Transcranial magnetic stimulation (TMS) was applied over the motor cortex to assess the involvement of the corticospinal tract over the limb trajectory during human walking. Eight healthy adults (aged 20-32 years) participated in treadmill walking at 1.5 km/h. TMS was applied over the left motor cortex at an intensity of 120% of the threshold to elicit a dorsiflexion of the right ankle during the swing phase of gait. Electromyographic (EMG) measurements and three-dimensional (3D) lower limb kinematics were collected. When delivered during the early swing phase, TMS led to a significant increase in the maximum height of the right toe by a mean of 40.7% ± 14.9% (25.6 mm ± 9.4 mm; p = 0.0352) and knee height by 57.8% ± 16.8% (32 mm ± 9.3 mm; p = 0.008) across participants. These findings indicate that TMS can influence limb trajectory during walking, highlighting its potential as a tool for studying cortical control of locomotion.
Overcoming State and Action Space Disparities in Multi-Domain, Multi-Task Reinforcement Learning
Reginald McLean
Kai Yuan
Isaac Woungang
Nariman Farsad
Current multi-task reinforcement learning (MTRL) methods can perform a large number of tasks with a single policy. However, when attempting to interact with a new domain, the MTRL agent would need to be re-trained due to differences in domain dynamics and structure. Because of these limitations, we are forced to train multiple policies even though tasks may have shared dynamics, which requires more samples and is thus sample inefficient. In this work, we explore the ability of MTRL agents to learn in various domains with various dynamics by simultaneously learning in multiple domains, without the need to fine-tune extra policies. In doing so, we find that an MTRL agent trained in multiple domains achieves an increase in sample efficiency of up to 70% while maintaining the overall success rate of the MTRL agent.
Stick-breaking Attention
Shawn Tan
Yikang Shen
Songlin Yang
Rameswar Panda
Symmetry-Aware Generative Modeling through Learned Canonicalization
Arnab Kumar Mondal
Sékou-Oumar Kaba
Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The existing generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.
Fine-Tuning Web Agents: It Works, But It's Trickier Than You Think
Recent advancements in large language models (LLMs) have sparked interest in developing autonomous web agents capable of performing digital tasks through web interfaces in a human-like manner. However, even the strongest closed-source models often struggle to achieve robust results on several benchmarks, while a notable performance gap exists between them and open-source counterparts. This study investigates the potential of fine-tuning to enhance the performance of a smaller, lower-performing but cost-efficient LLM by leveraging successful traces from stronger LLMs, referred to as experts. We outline a comprehensive pipeline for data collection, filtering, and supervised fine-tuning and explore various behavior cloning parameters. Our experiments provide key insights into the challenges of fine-tuning LLMs into web agents on benchmarks like MiniWoB and WorkArena. Notably, we find that the fine-tuned agents' ability to predict expert trajectories does not consistently lead to improved downstream task performance. This raises issues such as off-policy bias and the loss of reasoning abilities during fine-tuning. We discuss potential solutions to these challenges and make both the codebase and a dataset of 140M tokens open-source for the community to build upon.
Health satisfaction outcome from integrated autonomous mobile clinics
Yuzhang Huang
Shaoshan Liu
Zhongying Pan
Carl Wu
Herng-Chia Chiu
Leiyu Shi
Autonomous mobile clinics (AMCs) have the potential to revolutionize healthcare delivery by bringing healthcare services to patients at the order of the patient’s fingertips. In particular, AMCs can act as an essential touch point of integrated care, which is a worldwide response to the fragmented delivery of health services that focuses on more coordinated and integrated forms of care provision. However, the impact of AMCs on health satisfaction outcome effectiveness remains unknown. In this article, in collaboration with United Family Healthcare (UFH), we study the potential effectiveness improvement of integrated care delivery through AMCs. Supplementary Information: The online version contains supplementary material available at 10.1038/s41598-024-75611-x.
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Boxing Chen
The growth in prominence of large language models (LLMs) in everyday life can be largely attributed to their generative abilities, yet some of this is also owed to the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, limiting their reliability. On another is the increasing focus on the computational limitations associated with traditional self-attention based LLMs, which has brought about new alternatives, in particular recurrent models, meant to overcome them. Yet it remains uncommon to consider these two concerns simultaneously. Do changes in architecture exacerbate or alleviate existing concerns about hallucinations? Do they affect how and where they occur? Through an extensive evaluation, we study how these architecture-based inductive biases affect the propensity to hallucinate. While hallucination remains a general phenomenon not limited to specific architectures, the situations in which hallucinations occur and the ease with which specific types of hallucinations can be induced can differ significantly based on the model architecture. These findings highlight the need for better understanding both these problems in conjunction with each other, as well as for considering how to design more universal techniques for handling hallucinations.