Publications

Operationalizing Quantized Disentanglement
Vitória Barin-Pacela
P Vincent
RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring
Simulate intelligently: Causal incremental reinforcement learning for streamlined industrial chemical process design optimization
Eslam G. Al-Sakkari
Mohamed Ali
Daria C. Boffito
The role of Large Language Models in IoT security: A systematic review of advances, challenges, and opportunities
Saeid Jamshidi
Negar Shahabi
Amin Nikanjam
Kawser Wazed Nafi
Carol Fung
Understanding the role of depth in the neural tangent kernel for overparameterized neural networks
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
We present WebMMU, a multilingual benchmark that evaluates three core web tasks: (1) website visual question answering, (2) code editing involving HTML/CSS/JavaScript, and (3) mockup-to-code generation. Unlike prior benchmarks that treat these tasks separately, WebMMU unifies them using expert-annotated, real-world web data to assess models' abilities in complex multi-step reasoning, precise element grounding, and functional UI comprehension and coding. Our evaluation shows that while multimodal large language models (MLLMs) perform well on basic information extraction, they struggle with reasoning and grounding, editing code to preserve functionality, and generating design-to-code that maintains hierarchy and supports multilingual content. These findings reveal key limitations in current MLLMs and underscore the need for improved multimodal and cross-lingual reasoning to build future web agents capable of automating diverse web development tasks.
Why Less is More (Sometimes): A Theory of Data Curation
Elvis Dopgima Dohmatob
Discovery of Sustainable Refrigerants through Physics-Informed RL Fine-Tuning of Sequence Models
Most refrigerants currently used in air-conditioning systems, such as hydrofluorocarbons, are potent greenhouse gases and are being phased down. Large-scale molecular screening has been applied to the search for alternatives, but in practice only about 300 refrigerants are known, and only a few additional candidates have been suggested without experimental validation. This scarcity of reliable data limits the effectiveness of purely data-driven methods. We present Refgen, a generative pipeline that integrates machine learning with physics-grounded inductive biases. Alongside fine-tuning for valid molecular generation, Refgen incorporates predictive models for critical properties, equations of state, thermochemical polynomials, and full vapor compression cycle simulations. These models enable reinforcement learning fine-tuning under thermodynamic constraints, enforcing consistency and guiding discovery toward molecules that balance efficiency, safety, and environmental impact. By embedding physics into the learning process, Refgen leverages scarce data effectively and enables de novo refrigerant discovery beyond the known set of compounds.
Curly Flow Matching for Learning Non-gradient Field Dynamics
Katarina Petrović
Viggo Moro
Kacper Kapuśniak
İsmail İlkan Ceylan
Michael M. Bronstein
Avishek Bose
Modeling the transport dynamics of natural processes from population-level observations is a ubiquitous problem in the natural sciences. Such models rely on key assumptions about the underlying process in order to enable faithful learning of governing dynamics that mimic the actual system behavior. The de facto assumption in current approaches relies on the principle of least action that results in gradient field dynamics and leads to trajectories minimizing an energy functional between two probability measures. However, many real-world systems, such as cell cycles in single-cell RNA, are known to exhibit non-gradient, periodic behavior, which fundamentally cannot be captured by current state-of-the-art methods such as flow and bridge matching. In this paper, we introduce Curly Flow Matching (Curly-FM), a novel approach that is capable of learning non-gradient field dynamics by designing and solving a Schrödinger bridge problem with a non-zero drift reference process -- in stark contrast to typical zero-drift reference processes -- which is constructed using inferred velocities in addition to population snapshot data. We showcase Curly-FM by solving the trajectory inference problems for single cells, computational fluid dynamics, and ocean currents with approximate velocities. We demonstrate that Curly-FM can learn trajectories that better match both the reference process and population marginals. Curly-FM expands flow matching models beyond the modeling of populations and towards the modeling of known periodic behavior in physical systems. Our code repository is accessible at: https://github.com/kpetrovicc/curly-flow-matching.git
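To make the contrast with zero-drift references concrete, here is a minimal NumPy sketch of the core idea: a conditional flow-matching regression target along a straight interpolant, optionally biased by a non-zero reference drift. All names and the rotational toy drift are invented for illustration; this is a sketch of the mechanism, not the Curly-FM implementation.

```python
import numpy as np

def cfm_target(x0, x1, t, ref_drift=None):
    """Conditional flow-matching regression target at time t.

    With ref_drift=None this reduces to standard (zero-drift) flow matching:
    the target velocity along the straight interpolant is x1 - x0.
    Passing a non-zero reference drift biases the target toward that field,
    a toy stand-in for Curly-FM's non-gradient reference process.
    """
    xt = (1.0 - t) * x0 + t * x1      # linear interpolant between endpoints
    u = x1 - x0                       # straight-line target velocity
    if ref_drift is not None:
        u = u + ref_drift(xt)         # add the reference-process drift at x_t
    return xt, u

def rotational_drift(x):
    """Toy non-gradient (divergence-free) field: 90-degree rotation in 2D."""
    return np.stack([-x[..., 1], x[..., 0]], axis=-1)

x0 = np.array([1.0, 0.0])
x1 = np.array([0.0, 1.0])
xt, u_zero = cfm_target(x0, x1, 0.5)                    # zero-drift reference
_,  u_curl = cfm_target(x0, x1, 0.5, rotational_drift)  # rotational reference
```

A purely rotational field like `rotational_drift` has zero divergence and is not the gradient of any potential, which is exactly the kind of dynamics the abstract says gradient-field methods cannot capture.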
Gistify! Codebase-Level Understanding via Runtime Execution
Hyunji Lee
Minseon Kim
Chinmay Singh
Matheus Pereira
Atharv Sonwane
Isadora White
Elias Stengel-Eskin
Mohit Bansal
Zhengyan Shi
Xingdi Yuan
Lucas Caccia
As coding agents are increasingly deployed in large codebases, the need to automatically design challenging, codebase-level evaluations is central. We propose Gistify, a task where a coding LLM must create a single, minimal, self-contained file that can reproduce a specific functionality of a codebase. The coding LLM is given full access to a codebase along with a specific entrypoint (e.g., a python command), and the generated file must replicate the output of the same command run under the full codebase, while containing only the essential components necessary to execute the provided command. Success on Gistify requires structural understanding of the codebase and accurate modeling of its execution flow, as well as the ability to produce potentially large code patches. Our findings show that current state-of-the-art models struggle to reliably solve Gistify tasks, especially ones with long execution traces.
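The success criterion described above — the gistified file must replicate the output of the same entrypoint run under the full codebase — can be sketched as a simple output-equivalence check. This is a toy harness, not the paper's evaluation code; the function name and the comparison policy (exit code plus stdout) are assumptions for illustration.

```python
import subprocess
import sys

def outputs_match(full_cmd, gist_cmd, timeout=60):
    """Toy Gistify-style check: the gistified single file passes only if
    running it reproduces the exit code and stdout of the same entrypoint
    executed under the full codebase."""
    ref = subprocess.run(full_cmd, capture_output=True, text=True, timeout=timeout)
    out = subprocess.run(gist_cmd, capture_output=True, text=True, timeout=timeout)
    return ref.returncode == out.returncode and ref.stdout == out.stdout

# Trivial demonstration with inline commands standing in for the two runs.
same = outputs_match([sys.executable, "-c", "print('ok')"],
                     [sys.executable, "-c", "print('ok')"])
diff = outputs_match([sys.executable, "-c", "print('ok')"],
                     [sys.executable, "-c", "print('nope')"])
```

In practice the two commands would be the original entrypoint run from the repository root and the same command pointed at the generated self-contained file.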
Multi-Representation Attention Framework for Underwater Bioacoustic Denoising and Recognition
Youssef Soulaymani
Pierre Cauchy
Scaling Latent Reasoning via Looped Language Models
Ruiming Zhu
Zixuan Wang
Kai Hua
Ziniu Li
Haoran Que
Boyi Wei
Zixin Wen
Fan Yin
He Xing
Li Li
Jiajun Shi
Kaijing Ma
Shanda Li
Taylor Kergan
Andrew C. Smith
Xin Qu
Mude Hui
Bohong Wu
Qiyang Min
Hongzhi Huang
Xun Zhou
Wei Ye
Jiaheng Liu
Jian Yang
Yunfeng Shi
Chenghua Lin
Enduo Zhao
Tianle Cai
Ge Zhang
Jason K. Eshraghian
Modern LLMs are trained to "think" primarily via explicit text generation, such as chain-of-thought (CoT), which defers reasoning to post-training and under-leverages pre-training data. We present and open-source Ouro, named after the recursive Ouroboros, a family of pre-trained Looped Language Models (LoopLM) that instead build reasoning into the pre-training phase through (i) iterative computation in latent space, (ii) an entropy-regularized objective for learned depth allocation, and (iii) scaling to 7.7T tokens. The Ouro 1.4B and 2.6B models match the performance of SOTA LLMs of up to 12B parameters across a wide range of benchmarks. Through controlled experiments, we show this advantage stems not from increased knowledge capacity, but from superior knowledge manipulation capabilities. We also show that LoopLM yields reasoning traces more aligned with final outputs than explicit CoT. We hope our results show the potential of LoopLM as a novel scaling direction in the reasoning era. Our model is available here: http://ouro-llm.github.io.
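The two mechanisms named in the abstract — iterative computation in latent space and entropy-regularized depth allocation — can be illustrated with a minimal NumPy sketch: a single weight-tied block applied repeatedly, with a softmax over per-depth exit scores acting as a learned depth distribution. Every name and the toy dynamics here are invented for illustration; this is a sketch of the mechanism, not the Ouro architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 8, 4                                # hidden size, max loop depth
W = rng.normal(scale=0.3, size=(d, d))     # one weight-tied block, reused K times
w_exit = rng.normal(size=d)                # scores each depth's hidden state

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

h = rng.normal(size=d)
states, scores = [], []
for _ in range(K):                         # iterative computation in latent space
    h = np.tanh(W @ h)                     # same parameters at every iteration
    states.append(h)
    scores.append(w_exit @ h)

p = softmax(np.array(scores))              # depth-allocation distribution
out = sum(pk * hk for pk, hk in zip(p, states))  # depth-averaged latent output
entropy = -np.sum(p * np.log(p + 1e-12))   # entropy term one could regularize
```

Because the block's parameters are shared across iterations, extra "depth" costs compute but no extra parameters, which is consistent with the abstract's claim that the gains come from knowledge manipulation rather than added capacity.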