Publications

Imitation Game: Reproducing Deep Learning Bugs Leveraging an Intelligent Agent

Mehil B. Shah

Mohammad Masudur Rahman

Despite their wide adoption in various domains (e.g., healthcare, finance, software engineering), Deep Learning (DL)-based applications suff… (see more)er from many bugs, failures, and vulnerabilities. Reproducing these bugs is essential for their resolution, but it is extremely challenging due to the inherent nondeterminism of DL models and their tight coupling with hardware and software environments. According to recent studies, only about 3% of DL bugs can be reliably reproduced using manual approaches. To address these challenges, we present RepGen, a novel, automated, and intelligent approach for reproducing deep learning bugs. RepGen constructs a learning-enhanced context from a project, develops a comprehensive plan for bug reproduction, employs an iterative generate-validate-refine mechanism, and thus generates such code using an LLM that reproduces the bug at hand. We evaluate RepGen on 106 real-world deep learning bugs and achieve a reproduction rate of 80.19%, a 19.81% improvement over the state-of-the-art measure. A developer study involving 27 participants shows that RepGen improves the success rate of DL bug reproduction by 23.35%, reduces the time to reproduce by 56.8%, and lowers participants'cognitive load.

2025-11-30

arXiv (published)

doi.org

arxiv.org

Mitochondria‐nucleus crosstalk characterizes Alzheimer's disease across 1,5 million brain cells

Emerging insight from stem cell research reinforces Alzheimer's disease (AD) to affect mitochondrial protein expression. Compelling new evid… (see more)ence points to mitochondrial reactive oxygen species (ROS) as potential driving player in Aβ toxicity, mediated through glial cells and ultimately impacting neuronal health. A comprehensive understanding of how oxidative phosphorylation variations relate to cell function remains largely unexplored, especially through a cell type lens. Leveraging today's largest single‐nucleus RNA sequencing dataset of AD, we unveil how cell‐type‐specific mitochondrial alterations reverberate in the nuclear transcriptome, in 424 AD patients and healthy controls from ROSMAP. By adopting a supervised latent factor modelling approach, we identified distinct gene modules capturing unique aspects of the mitochondrial crosstalk in 6 major brain cell types across 5,427 nuclear and 13 mitochondrial genes. We found that nuclear‐mitochondrial crosstalk varies distinctly with cell identity, reflecting metabolic demands and functional specialization. In neurons and oligodendrocytes, ATP synthase (complex V) takes a central role, whereas type 1 NADH dehydrogenase (complex I) is more prominent in astrocytes, microglia, and OPCs. Screening across >1 million gene expression profiles from ∼20,000 drug perturbations identified mitochondrial‐nuclear signatures that resemble those activated by parthenolide and niclosamide—two chemical compounds previously associated with oxidative stress and cytotoxicity via ubiquitination—as most predictive of AD. Microglia and OPCs achieved the highest overall classification accuracy, with stronger predictive performance observed in males than in females. Mapping gene module expressions to the Allen Human Brain Atlas revealed shared whole‐brain patterns highlighting the precuneus, which we implicated in ubiquitin‐cascade‐enriched modules. Clinical phenotyping revealed that males with higher AD risk, as indicated by their mitochondrial‐nuclear scores on glial gene modules, exhibited a greater pathological burden, including higher amyloid load, Parkinson's‐like symptoms, and neuroticism‐related traits. Finally, by comparing our findings with 2.5 million CRISPRi‐based perturbations, we identified neural signatures associated with female‐biased transcription factors and fatty acid biosynthesis, while glial signatures were linked to DNA damage and oxidative stress. By integrating multiple layers of biological data from established reference atlases, our analysis of mitochondria‐nuclear crosstalk revealed distinct transcriptional signatures associated with AD risk in glial and neural cells, with these associations exhibiting sex‐biased patterns.

2025-11-30

Alzheimer's & Dementia (published)

doi.org

On Mobile Ad Hoc Networks for Coverage of Partially Observable Worlds

Edwin Meriaux

Shuo Wen

Louis-Roy Langevin

Doina Precup

Antonio Lor'ia

Gregory Dudek

2025-11-30

arXiv (published)

doi.org

arxiv.org

Neural Coherence : Find higher performance to out-of-distribution tasks from few samples

Simon Guiroy

Mats Richter

A. Chandar

Christopher Pal

2025-11-30

arXiv (published)

doi.org

arxiv.org

OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction

Emily Jin

Andrei Cristian Nica

Mikhail Galkin

Jarrid Rector-Brooks

Kin Long Kelvin Lee

Santiago Miret

Frances H. Arnold

Michael M. Bronstein

Avishek Bose

Alexander Tong

Cheng-Hao Liu

2025-11-30

arXiv (published)

doi.org

arxiv.org

Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling

Saurav Jha

Muhammad Jehanzeb Mirza

Wei Lin

Shiqi Yang

A. Chandar

Vision-Language Models (VLMs) remain limited in spatial reasoning tasks that require multi-view understanding and embodied perspective shift… (see more)s. Recent approaches such as MindJourney attempt to mitigate this gap through test-time scaling where a world model imagines action-conditioned trajectories and a heuristic verifier selects helpful views from such trajectories. In this work, we systematically examine how such test-time verifiers behave across benchmarks, uncovering both their promise and their pitfalls. Our uncertainty-based analyses show that MindJourney's verifier provides little meaningful calibration, and that random scoring often reduces answer entropy equally well, thus exposing systematic action biases and unreliable reward signals. To mitigate these, we introduce a Verification through Spatial Assertions (ViSA) framework that grounds the test-time reward in verifiable, frame-anchored micro-claims. This principled verifier consistently improves spatial reasoning on the SAT-Real benchmark and corrects trajectory-selection biases through more balanced exploratory behavior. However, on the challenging MMSI-Bench, none of the verifiers, including ours, achieve consistent scaling, suggesting that the current world models form an information bottleneck where imagined views fail to enrich fine-grained reasoning. Together, these findings chart the bad, good, and ugly aspects of test-time verification for world-model-based reasoning. Our code is available at https://github.com/chandar-lab/visa-for-mindjourney.

2025-11-30

arXiv (published)

doi.org

arxiv.org

QMon: Monitoring the Execution of Quantum Circuits with Mid-Circuit Measurement and Reset

Ning Ma

Jianjun Zhao

Foutse Khomh

Shaukat Ali

Heng Li

Unlike classical software, where logging and runtime tracing can effectively reveal internal execution status, quantum circuits possess uniq… (see more)ue properties, such as the no-cloning theorem and measurement-induced collapse, that prevent direct observation or duplication of their states. These characteristics make it especially challenging to monitor the execution of quantum circuits, complicating essential tasks such as debugging and runtime monitoring. This paper presents QMON, a practical methodology that leverages mid-circuit measurements and reset operations to monitor the internal states of quantum circuits while preserving their original runtime behavior. QMON enables the instrumentation of monitoring operators at developer-specified locations within the circuit, allowing comparisons between expected and observed quantum-state probabilities at those locations. We evaluated QMON by analyzing its impact on circuit behavior, monitoring coverage, and effectiveness in bug localization. Experimental results involving 154 quantum circuits show that all circuits preserve their intended functionality after instrumentation and that QMON successfully detects and localizes various programming errors. Although monitoring coverage is limited by the need to preserve delicate quantum properties, such as entanglement, QMON effectively detects errors while introducing no or negligible disturbance to the original quantum states. QMON facilitates the development of more robust and reliable quantum software as the field continues to mature.

2025-11-30

arXiv (published)

doi.org

arxiv.org

Revisiting the Learning Objectives of Vision-Language Reward Models

Simon Roy

Samuel Barbeau

Giovanni Beltrame

Christian Desrosiers

Nicolas Thome

Learning generalizable reward functions is a core challenge in embodied intelligence. Recent work leverages contrastive vision language mode… (see more)ls (VLMs) to obtain dense, domain-agnostic rewards without human supervision. These methods adapt VLMs into reward models through increasingly complex learning objectives, yet meaningful comparison remains difficult due to differences in training data, architectures, and evaluation settings. In this work, we isolate the impact of the learning objective by evaluating recent VLM-based reward models under a unified framework with identical backbones, finetuning data, and evaluation environments. Using Meta-World tasks, we assess modeling accuracy by measuring consistency with ground truth reward and correlation with expert progress. Remarkably, we show that a simple triplet loss outperforms state-of-the-art methods, suggesting that much of the improvements in recent approaches could be attributed to differences in data and architectures.

2025-11-30

arXiv (published)

doi.org

arxiv.org

Securing the Model Context Protocol: Defending LLMs Against Tool Poisoning and Adversarial Attacks

Saeid Jamshidi

Kawser Wazed Nafi

Arghavan Moradi Dakhel

Negar Shahabi

Foutse Khomh

Naser Ezzati-Jivan

2025-11-30

arXiv (published)

doi.org

arxiv.org

Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks

Abhranil Chandra

Ayush Agrawal

Arian Hosseini

Sebastian Fischmeister

Rishabh Agarwal

Navin Goyal

Aaron Courville

2025-11-30

arXiv (published)

doi.org

arxiv.org

Shielded Controller Units for RL with Operational Constraints Applied to Remote Microgrids

Hadi Nekoei

Alexandre Blondin Mass'e

Rachid Hassani

A. Chandar

Vincent Mai

Reinforcement learning (RL) is a powerful framework for optimizing decision-making in complex systems under uncertainty, an essential challe… (see more)nge in real-world settings, particularly in the context of the energy transition. A representative example is remote microgrids that supply power to communities disconnected from the main grid. Enabling the energy transition in such systems requires coordinated control of renewable sources like wind turbines, alongside fuel generators and batteries, to meet demand while minimizing fuel consumption and battery degradation under exogenous and intermittent load and wind conditions. These systems must often conform to extensive regulations and complex operational constraints. To ensure that RL agents respect these constraints, it is crucial to provide interpretable guarantees. In this paper, we introduce Shielded Controller Units (SCUs), a systematic and interpretable approach that leverages prior knowledge of system dynamics to ensure constraint satisfaction. Our shield synthesis methodology, designed for real-world deployment, decomposes the environment into a hierarchical structure where each SCU explicitly manages a subset of constraints. We demonstrate the effectiveness of SCUs on a remote microgrid optimization task with strict operational requirements. The RL agent, equipped with SCUs, achieves a 24% reduction in fuel consumption without increasing battery degradation, outperforming other baselines while satisfying all constraints. We hope SCUs contribute to the safe application of RL to the many decision-making challenges linked to the energy transition.

2025-11-30

arXiv (published)

doi.org

arxiv.org

Sliding Window Recurrences for Sequence Models

Dragos Secrieru

Garyk Brixi

Yoshua Bengio

Taiji Suzuki

Michael Poli

Stefano Massaroli

Multi-hybrid architectures are poised to take over language modeling due to better quality and performance. We introduce a hierarchical deco… (see more)mposition framework for linear recurrences that allows us to develop algorithms aligned with GPU memory hierarchies, yielding Sliding Window Recurrences. We focus specifically on truncating recurrences to hardware-aligned windows which are naturally jagged, limiting costly inter-warp communication. Using SWR, we develop Phalanx layers that serve as drop-in replacements for windowed attention or linear recurrences. In 1B parameter multi-hybrid models, Phalanx achieves over 10-40% speedup across 4K to 32K context length over optimized Transformers while matching perplexity.

2025-11-30

arXiv (published)

doi.org

arxiv.org

Mila on Udemy

Disinformation 2.0: When AI Blurs the Lines

AI Policy Fellowship Publications

Publications

Mila on Udemy

Disinformation 2.0: When AI Blurs the Lines

AI Policy Fellowship Publications

Popular keywords:

Publications