Publications

Spatial CAPTCHA: Generatively Benchmarking Spatial Reasoning for Human-Machine Differentiation
Arina Kharlamova
Bowei He
Xue Liu
Online services rely on CAPTCHAs as a first line of defense against automated abuse, yet recent advances in multi-modal large language models (MLLMs) have eroded the effectiveness of conventional designs that focus on text recognition or 2D image understanding. To address this challenge, we present **Spatial CAPTCHA**, a novel human-verification framework that leverages fundamental differences in spatial reasoning between humans and MLLMs. Unlike existing CAPTCHAs that rely on low-level perception tasks vulnerable to modern AI, Spatial CAPTCHA generates dynamic questions requiring geometric reasoning, perspective-taking, occlusion handling, and mental rotation, skills that are intuitive for humans but difficult for current AI systems. The system employs a procedural generation pipeline with constraint-based difficulty control, automated correctness verification, and human-in-the-loop validation to ensure scalability, robustness, and adaptability. Evaluation on a corresponding benchmark, **Spatial-CAPTCHA-Bench**, demonstrates that humans vastly outperform 10 state-of-the-art MLLMs, with the best model achieving only 31.0% Pass@1 accuracy. A comparison with Google reCAPTCHA further confirms the effectiveness of Spatial CAPTCHA as both a security mechanism and a diagnostic tool for spatial reasoning in AI.
Spatial pattern regression for meteorological fields interpolation
Vihotogbé Houssou
Species Loss Scenarios Identify Canada's Northern Ecosystems as Disproportionately Vulnerable
Isaac Eckert
Dominique Caron
SSFL: Discovering Sparse Unified Subnetworks at Initialization for Efficient Federated Learning
Riyasat Ohib
Bishal Thapaliya
Jingyu Liu
Vince D. Calhoun
Sergey Plis
In this work, we propose Salient Sparse Federated Learning (SSFL), a streamlined approach for sparse federated learning with efficient communication. SSFL identifies a sparse subnetwork prior to training, leveraging parameter saliency scores computed separately on local client data in non-IID scenarios and then aggregated to determine a global mask. Only the sparse model weights are trained and communicated each round between the clients and the server. On standard benchmarks including CIFAR-10, CIFAR-100, and Tiny-ImageNet, SSFL consistently improves the accuracy-sparsity trade-off, achieving more than 20% relative error reduction on CIFAR-10 compared to the strongest sparse baseline, while reducing communication costs by
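The mask-at-initialization scheme above can be sketched in a few lines. This is an illustrative approximation, not SSFL's exact procedure: the SNIP-style saliency score |w·g| and the keep fraction are assumptions standing in for the paper's scoring and sparsity schedule.

```python
def client_saliency(weights, grads):
    # Per-parameter saliency on one client's local (possibly non-IID) data.
    # |w * g| is a SNIP-style proxy; SSFL's exact score may differ.
    return [abs(w * g) for w, g in zip(weights, grads)]

def aggregate_global_mask(per_client_scores, keep_fraction):
    # Sum saliencies across clients, then keep the globally top-scoring fraction.
    n = len(per_client_scores[0])
    total = [sum(s[i] for s in per_client_scores) for i in range(n)]
    k = max(1, int(n * keep_fraction))
    top = set(sorted(range(n), key=total.__getitem__, reverse=True)[:k])
    return [1 if i in top else 0 for i in range(n)]

def masked_update(weights, update, mask):
    # Only masked (active) weights are trained and communicated each round.
    return [w + u if m else w for w, u, m in zip(weights, update, mask)]
```

Because the mask is fixed before training, every round only the active coordinates need to travel between clients and server, which is where the communication savings come from.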
Supervised Multimodal Model for Plasma Spray Diagnostics and Spray Health Monitoring
Sareh Soleimani
Cristian Cojocaru
Kintak Raymond Yu
Suspected Biliary Atresia in Brazil: Impact of Regional Healthcare Variations on Diagnostic Timeliness
Luiza Telles
Paulo Henrique Moreira Melo
Ana Maria Bicudo Diniz
Gabriele Lech
Ayla Gerk
Lauren Kratky
David P. Mooney
Joaquim Bustorff-Silva
Tequila: Deadzone-free Ternary Quantization for Large Language Models
Hong Huang
Decheng Wu
Rui Cen
Guanghua Yu
Zonghang Li
Kai Liu
Jianchen Zhu
Peng Chen
Xue Liu
Dapeng Wu
Quantization techniques are essential for the deployment of Large Language Models (LLMs) on edge devices. However, prevailing methods often rely on mixed-precision multiplication that lacks efficient hardware support, making them impractical to deploy. Ternary weight quantization addresses this by constraining weights to {-1, 0, 1}, replacing expensive multiplications with hardware-efficient additions. However, such aggressive compression leads to significant accuracy degradation, even after costly quantization-aware training on massive data. We identify the core issue as _**deadzone trapping**: a large number of weights are trapped at the deadzone boundary._ This occurs because these weights receive only noisy, less informative gradients, preventing a stable escape from the deadzone and severely impeding model capacity and optimization. To address this issue, we propose **Tequila**, a trapping-free quantization optimization method that reactivates deadzone-trapped weights by repurposing them as dynamic biases. This allows the repurposed weights to provide a continuous signal in the forward pass and, critically, to receive direct, meaningful gradient signals during backpropagation, thereby enhancing model capacity and optimization with nearly _zero_ inference overhead. Extensive evaluations demonstrate that Tequila outperforms state-of-the-art (SOTA) ternary quantization methods across five benchmarks. Specifically, on the ARC benchmark, it achieves
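The deadzone the abstract refers to is concrete in standard ternary quantization. The sketch below shows it under assumed conventions (a symmetric threshold `delta` and a boundary width `eps`, both hypothetical choices, not Tequila's): weights with |w| ≤ delta map to 0, and weights hovering near |w| ≈ delta are the "trapped" population.

```python
def ternarize(weights, delta):
    # Standard ternary quantization: the "deadzone" is [-delta, delta],
    # where weights round to 0 and contribute nothing to the forward pass.
    return [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]

def deadzone_fraction(weights, delta, eps=0.05):
    # Fraction of weights sitting near the deadzone boundary |w| ~ delta.
    # These flip between 0 and +/-1 under small updates and receive only
    # noisy straight-through gradients, which is the trapping the paper targets.
    return sum(1 for w in weights if abs(abs(w) - delta) < eps) / len(weights)
```

Tequila's contribution, as described above, is to give these boundary weights a second role (dynamic biases) so they keep a continuous forward signal and a direct gradient rather than oscillating at the threshold.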
The 2nd Workshop on Foundation Models for Science: Real-World Impact and Science-First Design
Wuyang Chen
Yongji Wang
N. Benjamin Erichson
Bo Li
Damian Borth
Swarat Chaudhuri
Scientific foundation models should be built for science, not for generic AI tastes or leaderboard prestige. This workshop centers problem-driven design: models that measurably advance real scientific inquiries (e.g., forecasting extreme climate events, accelerating materials discovery, understanding biological mechanisms), co-developed with domain experts and validated against field data, experiments, and downstream impact. We argue that foundation models for science must be built differently from language and vision. Scientific data are physical, causal, spatiotemporal, and often scarce or biased; objectives must reflect mechanistic fidelity, not just predictive accuracy. This calls for scientific priors and constraints, robust uncertainty quantification (UQ), and architectures that natively handle multi-modality (e.g., grids, meshes, spectra, time series, point clouds, text, images, code). It also demands tight integration with classical scientific tools (simulators, PDE solvers, optimization and inference engines, and HPC workflows) to yield hybrid systems that are faster, more accurate, and more trustworthy. We will highlight opportunities and hard problems unique to science: enforcing conservation laws and symmetries; learning across vast spatial and temporal scales; representing extreme events and tipping points; calibrating and validating UQ; and developing evaluation protocols that reward mechanistic insight and actionable reliability. The goal is a roadmap for building, training, and deploying scientific foundation models that accelerate discovery while respecting the structure of the natural world.
The Design Space of Tri-Modal Masked Diffusion Models
Victor Turrisi
Bruno Kacper Mlodozeniec
Pau Rodriguez Lopez
Lokesh Boominathan
Nikhil Bhendawade
Amitis Shidani
Joris Pelemans
Theo X. Olausson
Paul Dixon
Joao Monteiro
Pierre Ablin
Vishnu Banna
Arno Blaas
Nick Henderson
Kari Noriy
Dan Busbridge
Josh Susskind
Marco Cuturi
Irina Belousova
Luca Zappella
Russ Webb
Jason Ramapuram
Discrete diffusion models have emerged as strong alternatives to autoregressive language models, with recent work initializing and fine-tuning a base unimodal model for bimodal generation. Diverging from previous approaches, we introduce the first tri-modal masked diffusion model pretrained from scratch on text, image-text, and audio-text data. We systematically analyze multimodal scaling laws, modality mixing ratios, noise schedules, and batch-size effects, and we provide optimized inference sampling defaults. Our batch-size analysis yields a novel stochastic differential equation (SDE)-based reparameterization that eliminates the need for tuning the optimal batch size as reported in recent work. This reparameterization decouples the physical batch size, often chosen based on compute constraints (GPU saturation, FLOP efficiency, wall-clock time), from the logical batch size, chosen to balance gradient variance during stochastic optimization. Finally, we pretrain a preliminary 3B-parameter tri-modal model on 6.4T tokens, demonstrating the capabilities of a unified design and achieving strong results in text generation, text-to-image tasks, and text-to-speech tasks. Our work represents the largest-scale systematic open study of multimodal discrete diffusion models conducted to date, providing insights into scaling behaviors across multiple modalities.
The Expressive Limits of Diagonal SSMs for State-Tracking
State-Space Models (SSMs) have recently been shown to achieve strong empirical performance on a variety of long-range sequence modeling tasks while remaining efficient and highly parallelizable. However, the theoretical understanding of their expressive power remains limited. In this work, we study the expressivity of input-dependent complex-valued diagonal (DCD) SSMs on sequential state-tracking tasks for abstract groups. It is easy to show that a single DCD SSM layer with a universal decoder can track any Abelian group at finite precision by decomposing it into a product of cyclic groups. We show that this is tight by proving that such a model cannot track any non-Abelian group at finite precision. We further establish the expressivity of multi-layer DCD SSMs. We show that a
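The Abelian upper bound has a simple concrete instance. The sketch below (an illustration, not the paper's construction) tracks the cyclic group Z_n with a single complex diagonal state: each input x multiplies the state by the input-dependent scalar exp(2πi·x/n), so the phase of the state accumulates the group sum, and a decoder reads the element back off the phase. A product of cyclic groups would simply use one diagonal entry per factor.

```python
import cmath

def track_cyclic(inputs, n):
    # One complex diagonal state: h_t = a(x_t) * h_{t-1},
    # with input-dependent transition a(x) = exp(2*pi*i*x / n).
    h = 1 + 0j
    for x in inputs:
        h *= cmath.exp(2j * cmath.pi * x / n)
    # Decoder: recover the accumulated group element from the phase of h.
    return round(cmath.phase(h) * n / (2 * cmath.pi)) % n
```

Because the transition is a unit-modulus scalar per diagonal entry, the update commutes, which is exactly why this construction cannot extend to non-Abelian groups: their composition order matters, while products of diagonal matrices do not.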
The Geometry and Topology of Circuits: the Manifolds of Modular Addition
The Clock and Pizza interpretations, associated with architectures differing in either uniform or learnable attention, were introduced to argue that different architectural designs can yield distinct circuits for modular addition. In this work, we show that this is not the case: both the uniform and trainable attention architectures implement the same algorithm via topologically and geometrically equivalent representations. Our methodology goes beyond the interpretation of individual neurons and weights. Instead, we identify all of the neurons corresponding to each learned representation and then study the collective group of neurons as one entity. This method reveals that each learned representation is a manifold that we can study using tools from topology. Based on this insight, we statistically analyze the learned representations across hundreds of circuits to demonstrate the similarity between learned modular addition circuits that arise naturally from common deep learning paradigms.
The Impact of Pediatric Surgery Global Travel Fellowships: A Study by the Canadian Association of Paediatric Surgeons Global Partnership Committee.
Sacha Williams
Natasha Bejjani
Elena Guadagno
Robert Baird
Shahrzad Joharifard
Melanie Morris
Robin Petroze
Sherif Emil