Publications

Dynamics and Representation Structure of Local Approximations to Gradient-Based Learning in Linear Recurrent Neural Networks

Biological and neuromorphic recurrent neural networks (RNNs) are subject to spatial and temporal locality constraints on the information that can plausibly be used during learning. A common strategy to satisfy these constraints is to modify gradient descent by neglecting non-local terms to varying degrees, as in random feedback local online (RFLO) learning and truncated backpropagation through time (tBPTT). However, the learning dynamics of these algorithms, and how they compare with BPTT, remain poorly understood. We apply dynamical systems theory to data-aligned linear RNNs, whose dynamics can be separated into orthogonal modes, to compare stationary solutions, stability properties, and convergence rates, finding qualitatively distinct behaviour for RFLO versus BPTT and one-step tBPTT. We further observe that the solutions learned by RFLO are restricted to low-rank perturbations of initial parameters, a result which holds beyond the data-aligned setting. Our work provides analytical insight into how locality constraints shape learning dynamics, with implications for neuroscientific models of learning and alternative optimization approaches for RNNs.

2026-05-28

arXiv (preprint)

doi.org

arxiv.org

MosaicLeaks: Privacy Risks in Querying-in-the-Open for Deep Research Agents

Alexander Gurung

Spandana Gella

Alexandre Drouin

Issam H. Laradji

Perouz Taslakian

Rafael Pardinas

Deep research agents increasingly combine private local documents with external tools like web retrieval, creating a privacy risk: an agent's external queries may leak sensitive information from its local context. This risk is amplified by the mosaic effect, where individual queries may appear harmless but become revealing in aggregate. We introduce MosaicLeaks, a benchmark of 1,001 multi-hop deep research tasks that chain private enterprise documents and a public web corpus, forcing agents to make external queries that depend on local information. We evaluate leakage with an adversary LLM that observes only the agent's external queries and attempts to infer private information at three levels: the agent's research intent, answers to specific private questions, and verifiable claims about the enterprise documents. We find that models across families and sizes frequently leak at all three levels, that zero-shot privacy prompting reduces but does not eliminate leakage, and that reinforcement learning for task performance alone worsens leakage. To address this, we propose Privacy-Aware Deep Research (PA-DR), an RL framework that combines situational rewards for task success with a learned privacy classifier to provide dense credit assignment over both per-query and mosaic-level leakage. Training Qwen3-4B-Instruct with PA-DR improves accuracy from 48.7% to 58.7% and reduces answer and full-information leakage from 34.0% to 9.9%.

2026-05-28

arXiv (preprint)

doi.org

arxiv.org

Quantitative Equational Logic

Giorgio Bacci

Radu Mardare

Prakash Panangaden

Gordon Plotkin

We develop a quantitative analogue of equational reasoning, which we call quantitative equational logic. The quantitative equations use, instead of classical equality, quantitative equalities, which are equalities indexed with nonnegative reals. Thus, s =_ε t means that “s and t are points in a metric space and their distance is less than ε”. Quantitative equalities will be used to encode behavioural distances, with ε being an upper bound on the measure of dissimilarity between two terms. We develop the metatheory of this subject. We define a notion of quantitative algebra, which is the quantitative analogue of universal algebra. We prove a completeness theorem for quantitative equational logic, and we show that we obtain monads on suitable categories of metric spaces. We present a set of examples where the free algebra of a quantitative equational theory corresponds to some well-known structure. These examples are Hausdorff metrics from quantitative semilattices, p-Wasserstein metrics (hence also the Kantorovich metric), and the total variation metric.

2026-05-28

Journal of the ACM (published)

doi.org

Reinforcement Learning with Pairwise Preferences in Long-Term Decision Problems

Jonathan Colaço Carr

Prakash Panangaden

Doina Precup

Benjamin Van Roy

Reinforcement learning problems typically define the goal as maximizing the expected value of a scalar reward function. However, pairwise preferences are often easier to specify than scalar rewards, and they express certain goals that scalar rewards cannot. Methods for reinforcement learning with pairwise preferences have thus received growing interest. Unfortunately, these methods are inefficient in problems with long time horizons, and they lack guarantees on the performance of Markov policies relative to history-dependent policies, which bridge the theory and practice of reinforcement learning. We therefore propose the Markov decision contest as a new problem model for reinforcement learning with pairwise preferences. We prove that stationary Markov policies are optimal among all history-dependent policies, that solving a Markov decision contest exactly is in P, and that a simple iterative algorithm converges to an optimal policy at a sublinear rate. Lastly, in a set of high-dimensional decision problems with long time horizons, we show that our approximate algorithm is significantly more learning-efficient than prior work.

2026-05-28

arXiv (preprint)

doi.org

arxiv.org

Beyond Go/No-Go Decisions: A Regional Selection Framework for Uncertainty-Aware Molecule Screening

Weihua Shi

Yixuan Li

Tian Bai

Yijie Zhang

Kaiqiong Zhao

Marc‐André Legault

Hui Peng

Yue Zhao

Eric D. Kolaczyk

Xiang Yu

Archer Y. Yang

In drug discovery, quantitative structure–activity relationship (QSAR) models are widely used to guide Go/No-Go decisions within the Design–Make–Test–Analyze (DMTA) cycle. However, conventional decision heuristics typically rely on a single cutoff, leading to a rigid binary select/discard paradigm. This approach is particularly ill-suited for borderline compounds near the decision boundary, where screening decisions are especially sensitive to prediction uncertainty and premature choices may either discard viable leads or advance likely failures, thereby increasing downstream assay costs. To address this limitation, we propose Regional Selection (RS), an uncertainty-aware three-way decision framework that partitions compounds into Predicted Pass, Predicted Fail, and Predicted Indeterminate regions. By explicitly reserving high-uncertainty compounds for targeted follow-up, RS avoids the pitfalls of premature binary classification. We formalize this framework through Regional Selection Inference (RSI), which casts region assignment as a multiple-hypothesis testing problem. We develop two implementations of RSI: an empirical calibration-based method (RSI-EC), which thresholds uncertainty-normalized scores via empirical calibration, and a conformal selection-based method (RSI-CS), which constructs conformal p-values for region assignment. RSI-EC is supported by large-sample calibration arguments, whereas RSI-CS provides finite-sample, distribution-free guarantees under exchangeability. Extensive evaluations across 15 high-dimensional QSAR benchmarks show that both RSI procedures reliably control the false discovery rate while maintaining high screening power. In limited-data regimes, RSI-CS yields particularly stable FDR control, whereas RSI-EC can be slightly less conservative; both perform strongly as sample sizes increase.
We further study a cost-aware extension that incorporates asymmetric downstream costs through the score construction while keeping the nominal FDR target fixed. This extension introduces a tuning parameter that can reduce realized downstream cost, with dataset-dependent trade-offs against screening power. Overall, RSI offers a mathematically grounded and resource-aware alternative to single-threshold screening, allowing discovery teams to better balance decision confidence with assay budgets.
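For readers unfamiliar with the ingredients named in this abstract, the following is a minimal generic sketch of conformal p-values combined with the Benjamini-Hochberg step-up rule for FDR control. It is not the RSI-CS procedure from the paper: the function names, the score convention (larger score means stronger evidence against the null), and the use of plain Benjamini-Hochberg are all assumptions made for illustration.

```python
import numpy as np

def conformal_p_values(calib_scores, test_scores):
    """Conformal p-value for each test score (hypothetical helper).

    Assumes calib_scores come from null examples and that a larger
    score means stronger evidence against the null.
    """
    calib = np.asarray(calib_scores, dtype=float)
    test = np.asarray(test_scores, dtype=float)
    # For each test score, count calibration scores at least as large.
    counts = (calib[None, :] >= test[:, None]).sum(axis=1)
    return (1.0 + counts) / (len(calib) + 1.0)

def benjamini_hochberg(p_values, alpha):
    """Boolean mask of hypotheses rejected by the BH step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # BH thresholds alpha * k / m for the k-th smallest p-value.
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if passed.any():
        # Reject everything up to the largest k that meets its threshold.
        k = np.max(np.nonzero(passed)[0])
        rejected[order[: k + 1]] = True
    return rejected
```

In a three-way setting like the one described, one could imagine running such a test once for the "pass" direction and once for the "fail" direction and leaving the remaining compounds indeterminate, but that wiring is a guess and not taken from the source.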

2026-05-27

ChemRxiv (accepted)

doi.org

EASE Configuration Facilitates A Reproducible Science of LLM Social Simulations

Sneheel Sarangi

Maximilian Puelma Touzel

Aurélien Bück-Kaeffer

Zachary Yang

Jean-François Godbout

Reihaneh Rabbany

LLMs are increasingly deployed to simulate social interactions, yet many of the existing simulators remain ad hoc and monolithic. This lack of architectural standardization prevents reproducible research and complicates downstream evaluation. We advance a rigorous science of LLM-based multi-agent simulation by modularizing core components into Environments, Agents, Simulation engines, and Evaluation metrics (EASE). We demonstrate the utility of EASE configuration by wrapping it in an experimental study schema for orchestrating workflows centered around answering explicit research questions in generated scenarios. We contribute SiliSocS, an open-source, research-ready Silicon Society Sandbox implementing a study-structured EASE configuration to enable highly configurable and reproducible LLM-based social simulations. Using SiliSocS and EASE, we present three case studies, showcasing the system's comprehensive assessment of existing questions, ability to dive deeper into complex questions, and elaboration of existing studies, respectively. Together, these case studies highlight the limitations of current modeling approaches and isolate the impacts of design choices on key results.

2026-05-27

arXiv (preprint)

doi.org

arxiv.org

GFETM: Genome Foundation-based Embedded Topic Model for scATAC-seq Modeling

Yimin Fan

Yu Li

Jun Ding

Yue Li

Single-cell Assay for Transposase-Accessible Chromatin with sequencing (scATAC-seq) enables investigation of open chromatin landscapes at single-cell resolution, but its analysis remains challenging because of sparsity, noise, and dataset-specific peak vocabularies. Genome Foundation Models (GFMs), pre-trained on large DNA sequence corpora, offer a potential source of transferable sequence information for scATAC-seq modeling. We introduce the Genome Foundation Embedded Topic Model (GFETM), an interpretable framework that combines GFMs with the Embedded Topic Model (ETM) for sequence-informed scATAC-seq analysis. By integrating GFM-derived DNA sequence embeddings into a topic-model decoder, GFETM improves clustering quality on standard benchmarks and captures cell-state-specific transcription factor activity through motif scoring and attention-based interpretation.

2026-05-27

FM4LS @ International Conference on Machine Learning (poster)

openreview.net

ImmunoFoundation: A Multimodal Foundation Model for Immunogenicity Prediction and Peptide Optimization

João Felipe Rocha

Hiren Madhu

Jenny Yongjia Liu

Apurva Mishra

Chen Liu

Rishabh Anand

Rex Ying

Smita Krishnaswamy

Peptide immunogenicity, whether a peptide presented by an MHC molecule elicits a T-cell response, is central to designing vaccines, cancer immunotherapy, and therapeutic proteins. Existing tools rely on a single modality, such as peptide sequences or peptide-MHC interactions, and often ignore the T-cell response that depends on the TCR-peptide-MHC complex (TCR-pMHC) and its three-dimensional structure. The scarcity of labeled TCR-pMHC data with known structures makes it difficult to build a model that captures how all components of the TCR-pMHC contribute to immunogenicity. However, a foundation model of TCR-pMHCs can learn transferable representations across components, which can be adapted to immunogenicity, binding, and TCR specificity tasks, even with limited labeled data. We introduce ImmunoFoundation, a self-supervised multimodal backbone for protein-complex representation, fine-tuned for peptide-MHC immunogenicity. The model couples an ESM-2 sequence encoder with a graph transformer over structure, fused via cross-modal attention. Pretraining follows a curriculum that progressively introduces structural inductive bias. ImmunoFoundation outperforms prior multimodal class-I predictors on cancer neoepitope and infectious-disease tasks.

2026-05-27

FM4LS @ International Conference on Machine Learning (poster)

openreview.net

Leveraging Routing Dynamics in Mixture-of-Experts Models for Efficient Language Adaptation

Mixture-of-Experts (MoE) models are widely used to scale language models, yet their expert routing behavior and adaptation in a multilingual setting remain underexplored. In this work, we study multilingual routing dynamics during continual pre-training of an English-centric MoE model on a multilingual corpus, analyzing how expert usage varies across languages. We find that continual multilingual pre-training leads to diffused, language-agnostic routing in early and middle layers, with language specialization primarily emerging in the final layers. We also show that token-level vocabulary overlap between languages plays an important role in how languages are routed. Motivated by these findings, we propose a parameter-efficient adaptation strategy that updates language-specific and shared experts in the final MoE layers. Experiments on MultiBLiMP and Belebele show that our method achieves a strong performance-efficiency trade-off, attaining competitive performance relative to fine-tuning complete final layers, while updating less than 2% of the parameters. Overall, our findings provide insights into where and how language specialization emerges in MoEs during continual pre-training and provide practical insights for low-resource multilingual adaptation. Our code is available at https://github.com/aditi184/moe-routing-adaptation.
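As a rough illustration of what "analyzing how expert usage varies across languages" could look like in practice, the sketch below computes a per-expert usage distribution from router logits under top-k routing and compares two such distributions. It is not taken from the linked repository: the function names, the top-k assumption, and the choice of total variation distance are hypothetical.

```python
import numpy as np

def expert_usage(router_logits, top_k):
    """Fraction of routing slots assigned to each expert (hypothetical helper).

    router_logits: array of shape (num_tokens, num_experts) for one layer.
    Assumes each token is routed to its top_k highest-scoring experts.
    """
    logits = np.asarray(router_logits, dtype=float)
    num_experts = logits.shape[1]
    # Indices of the top_k experts for every token.
    top = np.argsort(-logits, axis=1)[:, :top_k]
    counts = np.bincount(top.ravel(), minlength=num_experts)
    return counts / counts.sum()

def usage_distance(usage_a, usage_b):
    """Total variation distance between two usage distributions."""
    a = np.asarray(usage_a, dtype=float)
    b = np.asarray(usage_b, dtype=float)
    return 0.5 * np.abs(a - b).sum()
```

Under these assumptions, a distance near 0 for a given layer would suggest language-agnostic routing, while a distance near 1 would suggest that the two languages rely on largely disjoint experts.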

2026-05-27

arXiv (preprint)

doi.org

arxiv.org

Rethinking Literature Search Evaluation: Deep Research Helps, and Human Citation Lists Are Not a Ground Truth

Gaurav Sahu

Laurent Charlin

Christopher Pal

We study large-scale literature search from two complementary angles: improving the retrieval pipeline, and stress-testing the human reference list as an evaluation target. First, we implement a Deep Research pipeline that processes the full query paper and expands the retrieved results breadth-first along their bibliographies, and show that it substantially outperforms vanilla API-only search, raising recall on RollingEval-Jun25 (a 250-paper literature-search benchmark) from below 20% to above 80%. Second, we use a neutral LLM-as-a-judge to determine if human references are sound ground truth for the task. We find significant limitations: only 51% of human citations are judged moderately relevant or higher, against 86-88% for the strongest AI-based re-rankers. We study this gap on the OpenAlex co-authorship graph, finding that humans are 2.5x more likely than the best AI re-rankers to cite a direct collaborator. Together, our results argue against single-axis literature-search evaluation: recall, topical-relevance scoring, ranked-list diversity, and a co-authorship-distance diagnostic each measure complementary properties of citation quality and should be reported jointly.

2026-05-27

arXiv (preprint)

doi.org

arxiv.org

STING dampens the unfolded protein response to enable the presentation of self-antigens on MHC-I during inflammation

Ahmed M. Fahmy

Ali Ahmadi

Joël Lanoix

Tyler Cannon

Moustafa Nouh Badr Elemeery

Camberly Hernandez Paredes

Benoit Barrette

Éric Bonneil

Yong Zhong Xu

Maha Ibrahim

Guillermo Arango-Duque

Éric Audemard

Sébastien Lemieux

Éric Chevet

Erwin Schurr

P. Pierre

Samantha Gruenheid

Pierre Thibault

Heidi M. McBride

Michel Desjardins

A growing body of evidence supports the contribution of the long-lasting adaptive immune system in Parkinson’s disease (PD). We showed that the PD-associated protein PINK1 negatively regulates the presentation of mitochondrial antigens (MitAP) on MHC-I molecules. In vivo evidence indicated that MitAP activation in mice, in the absence of PINK1, led to cytotoxic CD8+ T cell stimulation and severe motor impairments, reversible by L-DOPA. We show here that following TLR4 activation, MitAP is engaged through a pathway involving cGAS-STING, which acts as a rheostat to dampen the unfolded protein response (UPR). Without STING, the stress response is amplified, leading to a translational attenuation that inhibits the expression of XBP1s, a transcription factor required for MitAP. STING activity also regulates the repertoire of peptides displayed at the cell surface during inflammation, highlighting a potential role in immunosurveillance. These findings establish STING and the UPR as key immune regulators targetable for therapeutic intervention during autoimmune diseases and PD.

2026-05-27

bioRxiv (preprint)

doi.org

UniSafe: Modality-Agnostic Hateful Content Detection via Shared-Space Projection

Siam Shibly Antar

Syem Shibly Ador

Steven H. H. Ding

Benjamin C. M. Fung

2026-05-27

ACM Web Conference (published)

doi.org
