Publications

Robust and Interpretable Relational Reasoning with Large Language Models and Symbolic Solvers
Ge Zhang
Mohammad Alomrani
Jiaming Zhou
Yaochen Hu
B. Wang
Qun Liu
Yingxue Zhang
Jianye HAO
Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly relational reasoning problems such as kinship or spatial reasoning. In this paper, we present Path-of-Thoughts (PoT), a novel framework designed to tackle relational reasoning by decomposing the task into three key stages: graph extraction, path identification, and reasoning. Unlike previous approaches, PoT efficiently extracts a task-agnostic graph that identifies crucial entities, relations, and attributes within the problem context. Subsequently, PoT identifies relevant reasoning chains within the graph corresponding to the posed question, facilitating inference of potential answers. Experimental evaluations on four benchmark datasets that demand long reasoning chains demonstrate that PoT surpasses state-of-the-art baselines by a significant margin (up to 21.3%) without requiring fine-tuning or extensive LLM calls. Furthermore, as opposed to prior neuro-symbolic methods, PoT exhibits improved resilience against LLM errors by leveraging the compositional nature of graphs.
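The path identification stage described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the graph extraction stage has already produced (head, relation, tail) triples, and the function name `find_relation_path`, the relation labels, and the example entities are all hypothetical. It uses a plain breadth-first search to recover the chain of relations linking the two entities named in a question.

```python
from collections import deque


def find_relation_path(edges, source, target):
    """Return the shortest chain of (head, relation, tail) hops from
    source to target, or None if the target is unreachable."""
    # Build an adjacency list from the extracted triples.
    graph = {}
    for head, relation, tail in edges:
        graph.setdefault(head, []).append((relation, tail))

    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for relation, neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None


# Hypothetical triples, standing in for the output of graph extraction.
edges = [
    ("Alice", "mother_of", "Bob"),
    ("Bob", "father_of", "Carol"),
    ("Carol", "sister_of", "Dan"),
]
path = find_relation_path(edges, "Alice", "Dan")
```

The returned chain could then be handed to an LLM or a symbolic solver for the final reasoning stage; how PoT actually composes relations along the path is not specified in the abstract.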
Cervical Spinal Cord Magnetization Transfer Ratio and Its Relationship With Clinical Outcomes in Multiple Sclerosis
Lisa Eunyoung Lee
Irene M. Vavasour
Melanie Guenette
Katherine Sawicka
Neda Rashidi‐Ranjbar
Nathan Churchill
Akash Chopra
Adelia Adelia
Pierre-Louis Benveniste
Anthony Traboulsee
Nathalie Arbour
Fabrizio Giuliani
Larry D. Lynd
Scott B. Patten
Alexandre Prat
Alice Schabas
Penelope Smyth
Roger Tam
Yunyan Zhang … (6 more)
Simon J. Graham
Mojgan Hodaie
Anthony Feinstein
Shannon Kolind
Tom A. Schweizer
Jiwon Oh
Objective: The cervical spinal cord (cSC) is highly relevant to clinical dysfunction in multiple sclerosis (MS) but remains understudied using quantitative magnetic resonance imaging (MRI). We assessed magnetization transfer ratio (MTR), a semi‐quantitative MRI measure sensitive to MS‐related tissue microstructural changes, in the cSC and its relationship with clinical outcomes in radiologically isolated syndrome (RIS) and MS. Methods: MTR data were acquired from 52 RIS, 201 relapsing–remitting MS (RRMS), 47 primary progressive MS (PPMS), and 43 control (CON) participants across four sites in the Canadian Prospective Cohort Study to Understand Progression in MS (CanProCo) using 3.0 T MRI systems. Mean MTR was compared between groups in the whole cSC and in sub‐regions between C2 and C4. Multiple linear regression was used to evaluate relationships between MTR and clinical outcomes, including the expanded disability status scale (EDSS), walking speed test (WST), and manual dexterity test (MDT). Results: There were consistent group differences in MTR, which were most pronounced between PPMS and CON (−5.8% to −3.7%, p ≤ 0.01). In PPMS, lower MTR was associated with greater disability as measured by EDSS (β = −0.3 to −0.1, p ≤ 0.03), WST (β = −0.9 to −0.5, p ≤ 0.04), and MDT (β = −0.6 and −0.5, p = 0.04). In RRMS, MTR was associated with only EDSS (β = −0.1, p ≤ 0.03). Interpretation: In this large sample of RIS and MS, cSC MTR was lowest in PPMS, with associations between MTR and clinical outcomes in MS but not RIS. These findings suggest that MTR provides important information about the underlying tissue microstructural integrity of the cSC relevant to clinical disability in established MS.
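For readers unfamiliar with the measure, MTR is conventionally computed voxel-wise from two acquisitions, one without and one with an off-resonance saturation pulse. The abstract does not give the formula; the standard definition is shown here, with the symbols $M_0$ (signal without saturation) and $M_{\mathrm{sat}}$ (signal with saturation) chosen for illustration:

```latex
\[
  \mathrm{MTR} = \frac{M_0 - M_{\mathrm{sat}}}{M_0} \times 100\%
\]
```

Under this definition, a lower MTR means less signal is lost to magnetization transfer, which is generally interpreted as reduced macromolecular (for example, myelin) content.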
Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification
Mingyang Li
Hengguan Huang
Multimodal large language models (MLLMs) have enormous potential to perform few-shot in-context learning in the context of medical image analysis. However, safe deployment of these models into real-world clinical practice requires an in-depth analysis of the accuracy of their predictions and of their associated calibration errors, particularly across different demographic subgroups. In this work, we present the first investigation into the calibration biases and demographic unfairness of MLLMs' predictions and confidence scores in few-shot in-context learning for medical image classification. We introduce CALIN, an inference-time calibration method designed to mitigate the associated biases. Specifically, CALIN estimates the amount of calibration needed, represented by calibration matrices, using a bi-level procedure that progresses from the population level to the subgroup level prior to inference. It then applies this estimate to calibrate the predicted confidence scores during inference. Experimental results on three medical imaging datasets (PAPILA for fundus image classification, HAM10000 for skin cancer classification, and MIMIC-CXR for chest X-ray classification) demonstrate CALIN's effectiveness at ensuring fair confidence calibration in its predictions, while improving overall prediction accuracy and exhibiting a minimal fairness-utility trade-off. Our codebase can be found at https://github.com/xingbpshen/medical-calibration-fairness-mllm.
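To make the notion of subgroup calibration error concrete, the sketch below computes the expected calibration error (ECE), one common calibration metric, separately for each demographic group. This is an illustration only: the abstract does not say which metric the paper uses, this is not the CALIN method itself, and the function names, bin count, and toy data are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average over equal-width confidence bins of
    |mean confidence - accuracy| within each bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


def ece_by_group(confidences, correct, groups, n_bins=10):
    """Compute ECE separately for each subgroup label."""
    result = {}
    for label in set(groups):
        idx = [i for i, g in enumerate(groups) if g == label]
        result[label] = expected_calibration_error(
            [confidences[i] for i in idx], [correct[i] for i in idx], n_bins
        )
    return result


# Toy data (hypothetical): group "B" is noticeably less well calibrated.
confidences = [0.95, 0.95, 0.65, 0.65]
correct = [1, 1, 1, 0]
groups = ["A", "A", "B", "B"]

overall = expected_calibration_error(confidences, correct)
per_group = ece_by_group(confidences, correct, groups)
```

A large gap between the per-group values is one way a calibration bias across subgroups could show up, even when the overall figure looks acceptable.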
A Novel Sequential Framework for Transmission Network Expansion Planning: Benders Decomposition Preceding Semidefinite Programming
Elmira Fathipasandideh
Hussein Suprême
Dalal Asber
The transmission network expansion planning (TNEP) problem is inherently complex because of its nonlinear and nonconvex nature, arising from the inclusion of AC power flow constraints, discrete investment decisions, and multiple operating scenarios. These characteristics make the problem computationally challenging, particularly when scaling to larger systems with multistage planning horizons. Addressing this complexity requires advanced methodologies that balance solution accuracy and computational efficiency. This paper presents a novel two-step framework for TNEP that first applies Benders decomposition to separate investment and operational decisions, followed by semidefinite linearization to reformulate the operational subproblems. The proposed approach enhances solution quality by ensuring convexity in the subproblems and improves computational efficiency through decomposition. Numerical results for 6-, 10-, and 24-bus test systems demonstrate that the proposed method achieves superior performance compared to existing approaches in terms of solution accuracy and computational efficiency.
Small Encoders Can Rival Large Decoders in Detecting Groundedness
Istabrak Abbes
Fernando Rodriguez
Alaa Boukhary
Adam Elwood
Augmenting large language models (LLMs) with external context significantly improves their performance in natural language processing (NLP) tasks. However, LLMs struggle to answer queries reliably when the provided context lacks information, often resorting to ungrounded speculation or internal knowledge. Groundedness - generating responses strictly supported by the context - is essential for ensuring factual consistency and trustworthiness. This study focuses on detecting whether a given query is grounded in a document provided in context before the costly answer generation by LLMs. Such a detection mechanism can significantly reduce both inference time and resource consumption. We show that lightweight, task-specific encoder models such as RoBERTa and NomicBERT, fine-tuned on curated datasets, can achieve accuracy comparable to state-of-the-art LLMs, such as Llama3 8B and GPT4o, in groundedness detection while reducing inference latency by orders of magnitude. The code is available at: https://github.com/chandarlab/Hallucinate-less
T-GRAB: A Synthetic Diagnostic Benchmark for Learning on Temporal Graphs
Dynamic graph learning methods have recently emerged as powerful tools for modelling relational data evolving through time. However, despite extensive benchmarking efforts, it remains unclear whether current Temporal Graph Neural Networks (TGNNs) effectively capture core temporal patterns such as periodicity, cause-and-effect, and long-range dependencies. In this work, we introduce the Temporal Graph Reasoning Benchmark (T-GRAB), a comprehensive set of synthetic tasks designed to systematically probe the capabilities of TGNNs to reason across time. T-GRAB provides controlled, interpretable tasks that isolate key temporal skills: counting/memorizing periodic repetitions, inferring delayed causal effects, and capturing long-range dependencies over both spatial and temporal dimensions. We evaluate 11 temporal graph learning methods on these tasks, revealing fundamental shortcomings in their ability to generalize temporal patterns. Our findings offer actionable insights into the limitations of current models, highlight challenges hidden by traditional real-world benchmarks, and motivate the development of architectures with stronger temporal reasoning abilities. The code for T-GRAB can be found at: https://github.com/alirezadizaji/T-GRAB.
Diffusion Tree Sampling: Scalable inference-time alignment of diffusion models
Adapting a pretrained diffusion model to new objectives at inference time remains an open problem in generative modeling. Existing steering methods suffer from inaccurate value estimation, especially at high noise levels, which biases guidance. Moreover, information from past runs is not reused to improve sample quality, resulting in inefficient use of compute. Inspired by the success of Monte Carlo Tree Search, we address these limitations by casting inference-time alignment as a search problem that reuses past computations. We introduce a tree-based approach that samples from the reward-aligned target density by propagating terminal rewards back through the diffusion chain and iteratively refining value estimates with each additional generation. Our proposed method, Diffusion Tree Sampling (DTS), produces asymptotically exact samples from the target distribution in the limit of infinite rollouts, and its greedy variant, Diffusion Tree Search (DTS