REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
Le Zhang
Bo Wang
Xipeng Qiu
We present REARANK, a large language model (LLM)-based listwise reasoning reranking agent. REARANK explicitly reasons before reranking, sign… (see more)ificantly improving both performance and interpretability. Leveraging reinforcement learning and data augmentation, REARANK achieves substantial improvements over baseline models across popular information retrieval benchmarks, notably requiring only 179 annotated samples. Built on top of Qwen2.5-7B, our REARANK-7B demonstrates performance comparable to GPT-4 on both in-domain and out-of-domain benchmarks and even surpasses GPT-4 on reasoning-intensive BRIGHT benchmarks. These results underscore the effectiveness of our approach and highlight how reinforcement learning can enhance LLM reasoning capabilities in reranking.
SCAR: Shapley Credit Assignment for More Efficient RLHF
Meng Cao
Shuyuan Zhang
Xiaojun Chang
The NaijaVoices Dataset: Cultivating Large-Scale, High-Quality, Culturally-Rich Speech Data for African Languages
Chris Emezue
The NaijaVoices Community
Busayo Awobade
Abraham Owodunni
Handel Emezue
Gloria Monica Tobechukwu Emezue
N. N. Emezue
Sewade Ogun
Bunmi Akinremi
Chris Pal
BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Behavioural Change
Manuela Gonz'alez-Gonz'alez
Soufiane Belharbi
Muhammad Osama Zeeshan
Masoumeh Sharafi
Muhammad Haseeb Aslam
Alessandro Lameiras Koerich
Simon Bacon
Eric Granger
Recognizing complex emotions linked to ambivalence and hesitancy (A/H) can play a critical role in the personalization and effectiveness of … (see more)digital behaviour change interventions. These subtle and conflicting emotions are manifested by a discord between multiple modalities, such as facial and vocal expressions, and body language. Although experts can be trained to identify A/H, integrating them into digital interventions is costly and less effective. Automatic learning systems provide a cost-effective alternative that can adapt to individual users, and operate seamlessly within real-time, and resource-limited environments. However, there are currently no datasets available for the design of ML models to recognize A/H. This paper introduces a first Behavioural Ambivalence/Hesitancy (BAH) dataset collected for subject-based multimodal recognition of A/H in videos. It contains videos from 224 participants captured across 9 provinces in Canada, with different age, and ethnicity. Through our web platform, we recruited participants to answer 7 questions, some of which were designed to elicit A/H while recording themselves via webcam with microphone. BAH amounts to 1,118 videos for a total duration of 8.26 hours with 1.5 hours of A/H. Our behavioural team annotated timestamp segments to indicate where A/H occurs, and provide frame- and video-level annotations with the A/H cues. Video transcripts and their timestamps are also included, along with cropped and aligned faces in each frame, and a variety of participants meta-data. We include results baselines for BAH at frame- and video-level recognition in multi-modal setups, in addition to zero-shot prediction, and for personalization using unsupervised domain adaptation. The limited performance of baseline models highlights the challenges of recognizing A/H in real-world videos. The data, code, and pretrained weights are available.
Caption This, Reason That: VLMs Caught in the Middle
Zihan Weng
Lucas Gomez
Taylor Whittington Webb
Leveraging Per-Instance Privacy for Machine Unlearning
Nazanin Mohammadi Sepahvand
Anvith Thudi
Berivan Isik
Ashmita Bhattacharyya
Nicolas Papernot
Eleni Triantafillou
Daniel M. Roy
LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs
Pooneh Mousavi
Shubham Gupta
Foundation models based on large language models (LLMs) have shown great success in handling various tasks and modalities. However, adapting… (see more) these models for general-purpose audio-language tasks is challenging due to differences in acoustic environments and task variations. In this work, we introduce LiSTEN Learning Soft Token Embeddings for Neural Audio LLMs), a framework for adapting LLMs to speech and audio tasks. LiSTEN uses a dynamic prompt selection strategy with learnable key-value pairs, allowing the model to balance general and task-specific knowledge while avoiding overfitting in a multitask setting. Our approach reduces dependence on large-scale ASR or captioning datasets, achieves competitive performance with fewer trainable parameters, and simplifies training by using a single-stage process. Additionally, LiSTEN enhances interpretability by analyzing the diversity and overlap of selected prompts across different tasks.
Response letter to “Confounding by indication and exposure misclassification may undermine corticosteroid effect estimates in ICU patients with alcohol-related hepatitis”
Maxime Gasperment
Hafid AIT-OUFELLA
Introduction to the special issue on Computational Terminology
Patrick Drouin
Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning
Ghada Sokar
Uncovering a Universal Abstract Algorithm for Modular Addition in Neural Networks
Gavin McCracken
Gabriela Moisescu-Pareja
Vincent Létourneau
Jonathan Love
We propose a testable universality hypothesis, asserting that seemingly disparate neural network solutions observed in the simple task of mo… (see more)dular addition are unified under a common abstract algorithm. While prior work interpreted variations in neuron-level representations as evidence for distinct algorithms, we demonstrate - through multi-level analyses spanning neurons, neuron clusters, and entire networks - that multilayer perceptrons and transformers universally implement the abstract algorithm we call the approximate Chinese Remainder Theorem. Crucially, we introduce approximate cosets and show that neurons activate exclusively on them. Furthermore, our theory works for deep neural networks (DNNs). It predicts that universally learned solutions in DNNs with trainable embeddings or more than one hidden layer require only O(log n) features, a result we empirically confirm. This work thus provides the first theory-backed interpretation of multilayer networks solving modular addition. It advances generalizable interpretability and opens a testable universality hypothesis for group multiplication beyond modular addition.
Dimension-adapted Momentum Outscales SGD
Damien Ferbach
Katie Everett
Elliot Paquette