Publications

Neither Valid Nor Reliable? Investigating the Use of LLMs as Judges
Reasoning with Preference Constraints: A Benchmark for Language Models in Many-to-One Matching Markets
Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
Istabrak Abbes
Matthew D Riemer
Tsuguchika Tabaru
Hiroaki Kingetsu
Reward the Reward Designer: Making Reinforcement Learning Useful for Clinical Decision Making
Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework
Clea Chataigner
Rebecca Ma
Elliot Creager
Towards Democratizing LLMs: Investigating Multilingual Mixture-of-Experts Models
Unifying Mechanistic Interpretations of Neural Networks Trained on Modular Addition
FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation
Neural audio codecs are a fundamental component of modern generative audio pipelines. Although recent codecs achieve strong low-bitrate reconstruction and provide powerful representations for downstream tasks, most are non-streamable, limiting their use in real-time applications. We present FocalCodec-Stream, a hybrid codec based on focal modulation that compresses speech into a single binary codebook at 0.55 - 0.80 kbps with a theoretical latency of 80 ms. Our approach combines multi-stage causal distillation of WavLM with targeted architectural improvements, including a lightweight refiner module that enhances quality under latency constraints. Experiments show that FocalCodec-Stream outperforms existing streamable codecs at comparable bitrates, while preserving both semantic and acoustic information. The result is a favorable trade-off between reconstruction quality, downstream task performance, latency, and efficiency. Code and checkpoints will be released at https://github.com/lucadellalib/focalcodec.
GROOD: Gradient-Aware Out-of-Distribution Detection
Mostafa Elaraby
Yann Batiste Pequignot
Paul Novello
On the compatibility of generative AI and generative linguistics
Masoud Jasbi
Why all roads don't lead to Rome: Representation geometry varies across the human visual cortical hierarchy
caskade: building Pythonic scientific simulators