Publications

Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework
Cléa Chataigner
Rebecca Ma
Elliot Creager
Towards Democratizing LLMs: Investigating Multilingual Mixture-of-Experts Models
Towards a generalizable, unified framework for decoding from multimodal neural activity
Nanda H Krishna
Avery Hee-Woon Ryoo
Matthew G Perich
Recent advances in neural decoding have led to the development of large-scale deep learning-based neural decoders that can generalize across sessions and subjects. However, existing approaches predominantly focus on single modalities of neural activity, limiting their applicability to specific modalities and tasks. In this work, we present a multimodal extension of the POYO framework that jointly processes neuronal spikes and local field potentials (LFPs) for behavioural decoding. Our approach employs flexible tokenization schemes for both spikes and LFPs, enabling efficient processing of heterogeneous neural populations without preprocessing requirements like binning. Through experiments on data from nonhuman primates performing motor tasks, we demonstrate that multimodal pretraining yields superior decoding performance compared to unimodal baselines. We also show evidence of cross-modal transfer: models pretrained on both modalities outperform LFP-only models when fine-tuned solely on LFPs, suggesting a path toward more cost-effective brain-computer interfaces that can use performant LFP-based decoders. Our models also exhibit robustness to missing modalities during inference when trained with modality masking, and scale effectively with both model size and pretraining data. Overall, this work represents an important first step towards unified, general-purpose neural decoders capable of leveraging diverse neural signals for a variety of brain-computer interface applications.
Unifying Mechanistic Interpretations of Neural Networks Trained on Modular Addition
Virtual Consistency for Audio Editing
Mirco Ravanelli
Yusuf Cem Sübakan
Free-form, text-based audio editing remains a persistent challenge, despite progress in inversion-based neural methods. Current approaches rely on slow inversion procedures, limiting their practicality. We present a virtual-consistency based audio editing system that bypasses inversion by adapting the sampling process of diffusion models. Our pipeline is model-agnostic, requiring no fine-tuning or architectural changes, and achieves substantial speed-ups over recent neural editing baselines. Crucially, it achieves this efficiency without compromising quality, as demonstrated by quantitative benchmarks and a user study involving 16 participants.
Accelerated Inorganic Materials Design with Generative AI Agents
Teruyasu Mizoguchi
Catalyst GFlowNet for electrocatalyst design: A hydrogen evolution reaction case study
Efficient and inexpensive energy storage is essential for accelerating the adoption of renewable energy and ensuring a stable supply, despite fluctuations in sources such as wind and solar. Electrocatalysts play a key role in hydrogen energy storage (HES), allowing the energy to be stored as hydrogen. However, the development of affordable and high-performance catalysts for this process remains a significant challenge. We introduce Catalyst GFlowNet, a generative model that leverages machine learning-based predictors of formation and adsorption energy to design crystal surfaces that act as efficient catalysts. We demonstrate the performance of the model through a proof-of-concept application to the hydrogen evolution reaction, a key reaction in HES, for which we successfully identified platinum as the most efficient known catalyst. In future work, we aim to extend this approach to the oxygen evolution reaction, where current optimal catalysts are expensive metal oxides, and open the search space to discover new materials. This generative modeling framework offers a promising pathway for accelerating the search for novel and efficient catalysts.
Concept-based Steering of Large Language Models for Conditional Molecular Generation
Modern LLMs, with their internet-scale pretraining and advanced human-level capabilities across specialized tasks, have demonstrated promising performance in molecular discovery using existing text-based molecular representations, such as SMILES and SELFIES. However, generating valid, unique, and high-fidelity molecules while precisely controlling for multiple properties simultaneously remains challenging. While prior works demonstrated success by fine-tuning language models on a novel corpus of molecules with property-conditioned tags, real-world applications require generating molecules from diverse property distributions, previously unseen in the training data. To this end, we present Concept-based Activation STeering (CAST), the first approach to apply activation steering to directly edit a model's internal representation for conditional molecular generation. CAST offers a lightweight, flexible alternative to fine-tuning by computing property-conditioned steering vectors via a concept network that does not require retraining the LLM. Through extensive experiments on datasets such as Therapeutics Data Commons, we show that CAST consistently outperforms existing methods on both in-distribution and out-of-distribution conditional generation tasks. We also conduct comprehensive ablation studies to highlight the extent of control our concept-guided steering provides on the molecules generated by the LLM.
Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification
Mingyang Li
Hengguan Huang
The curriculum effect in visual learning: the role of readout dimensionality
Christopher C. Pack
Variational Visible Layers: A Practical Framework for Uncertainty Estimation
Spherical Harmonic Exponentials for Efficient Glossy Reflections
Ari Silvennoinen
Peter‐Pike Sloan
Michał Iwanicki
We propose a high‐performance and compact method for computing glossy specular reflections. Commonly‐used prefiltered environment maps have large storage requirements and high error due to constrained treatment of view‐dependence. We propose a factorized spherical harmonic exponential representation that exploits new observations of the benefits of log‐space reconstruction for reflectance. Our method is compact, properly accounts for view‐dependent reflections, and is more accurate than the state‐of‐the‐industry solutions. We achieve higher quality results with an order of magnitude less memory, all with efficient and alias‐free reconstruction of glossy reflections from environment lights and continuously‐varying material roughness.