Publications

Iterative Monte Carlo Tree Search for Neural Architecture Search
Mehraveh Javan
Matthew Toews
LT-Soups: Bridging Head and Tail Classes via Subsampled Model Soups
Masih Aminbeidokhti
Subhankar Roy
Eric Granger
Elisa Ricci
Real-world datasets typically exhibit long-tailed (LT) distributions, where a few head classes dominate and many tail classes are severely underrepresented. While recent work shows that parameter-efficient fine-tuning (PEFT) methods like LoRA and AdaptFormer preserve tail-class performance on foundation models such as CLIP, we find that they do so at the cost of head-class accuracy. We identify the head-tail ratio, the proportion of head to tail classes, as a crucial but overlooked factor influencing this trade-off. Through controlled experiments on CIFAR100 with varying imbalance ratio (
Comparability of Canadian SARS-CoV-2 seroprevalence estimates with statistical adjustment for socio-demographic representation
Yuan Yu
Jiacheng Chen
Matthew J. Knight
Sheila F. O’Brien
David L. Buckeridge
Carmen L. Charlton
W. Alton Russell
OBJECTIVE: SARS-CoV-2 serological surveillance used blood donors, research cohorts, and residual patient samples. Differences in socio-demographic characteristics across these sources may bias seroprevalence estimates, necessitating statistical adjustment. METHODS: We re-analyzed data from six serosurveillance sources, comparing the estimated percent of the population positive for SARS-CoV-2 anti-nucleocapsid antibodies for six regions during periods when the sources' sample collection overlapped. We assessed the concordance between sources with and without using multilevel regression and poststratification (MRP) to adjust for differences in representation by age, sex, and race. RESULTS: Across regions and timepoints, unadjusted seroprevalence differed between sources by up to 20%. MRP did not consistently improve comparability of seroprevalence across sources. In 2022, seroprevalence was consistently highest among blood donors, and MRP increased regional seroprevalence across all sources (except in Manitoba during January-April 2022 in ABC Study). In a secondary regression analysis, immunoassay kit and sample type (dried blood spot or venous blood draw) strongly influenced the odds that a sample was classified as seropositive. CONCLUSION: Adjusting for representativeness using common socio-demographic variables did not systematically improve concordance in seropositivity estimates between serosurveillance sources. While discrepancies between sources might be influenced by studies' representativeness of characteristics we did not assess, methods for measuring seropositivity appear to explain much of the differences between sources. Serosurveillance findings are influenced by many aspects of study design beyond representativeness, such as sample type (venous blood draw or dried blood spots), choice of immunoassay, and laboratory procedures such as dilution or immunoassay calibration.
Efficient, Non‐Destructive Transfer of Wafer‐Scale Monolayer MoS₂ by Interface Engineering
Zheng Wei
Yongqing Cai
Jieying Liu
Liyan Zhang
Jiaojiao Zhao
Li Li
Qinqin Wang
Huimin Zhang
Zhihua Zhang
Dongxia Shi
Luojun Du
Grounding Computer Use Agents on Human Demonstrations
Xiangru Jian
Kevin Qinghong Lin
Kaixin Li
Johan Obando-Ceron
Juan A. Rodriguez
Adriana Romero-Soriano
Christopher Pal
Sai Rajeswar
Building reliable computer-use agents requires grounding: accurately connecting natural language instructions to the correct on-screen elements. While large datasets exist for web and mobile interactions, high-quality resources for desktop environments are limited. To address this gap, we introduce GroundCUA, a large-scale desktop grounding dataset built from expert human demonstrations. It covers 87 applications across 12 categories and includes 56K screenshots, with every on-screen element carefully annotated for a total of over 3.56M human-verified annotations. From these demonstrations, we generate diverse instructions that capture a wide range of real-world tasks, providing high-quality data for model training. Using GroundCUA, we develop the GroundNext family of models that map instructions to their target UI elements. At both 3B and 7B scales, GroundNext achieves state-of-the-art results across five benchmarks using supervised fine-tuning, while requiring less than one-tenth the training data of prior work. Reinforcement learning post-training further improves performance, and when evaluated in an agentic setting on the OSWorld benchmark using o3 as planner, GroundNext attains comparable or superior results to models trained with substantially more data. These results demonstrate the critical role of high-quality, expert-driven datasets in advancing general-purpose computer-use agents.
Ibom NLP: A Step Toward Inclusive Natural Language Processing for Nigeria's Minority Languages
Oluwadara Kalejaiye
Luel Hagos Beyene
Mmekut-Mfon Gabriel Edet
A. D. Akpan
Eno-Abasi Urua
Anietie U Andy
LLMs and Cultural Values: the Impact of Prompt Language and Explicit Cultural Framing
Bram Bulté
Large Language Models (LLMs) are rapidly being adopted by users across the globe, who interact with them in a diverse range of languages. At the same time, there are well-documented imbalances in the training data and optimisation objectives of this technology, raising doubts as to whether LLMs can represent the cultural diversity of their broad user base. In this study, we look at LLMs and cultural values and examine how prompt language and cultural framing influence model responses and their alignment with human values in different countries. We probe 10 LLMs with 63 items from the Hofstede Values Survey Module and World Values Survey, translated into 11 languages, and formulated as prompts with and without different explicit cultural perspectives. Our study confirms that both prompt language and cultural perspective produce variation in LLM outputs, but with an important caveat: While targeted prompting can, to a certain extent, steer LLM responses in the direction of the predominant values of the corresponding countries, it does not overcome the models' systematic bias toward the values associated with a restricted set of countries in our dataset: the Netherlands, Germany, the US, and Japan. All tested models, regardless of their origin, exhibit remarkably similar patterns: They produce fairly neutral responses on most topics, with selective progressive stances on issues such as social tolerance. Alignment with cultural values of human respondents is improved more with an explicit cultural perspective than with a targeted prompt language. Unexpectedly, combining both approaches is no more effective than cultural framing with an English prompt. These findings reveal that LLMs occupy an uncomfortable middle ground: They are responsive enough to changes in prompts to produce variation, but too firmly anchored to specific cultural defaults to adequately represent cultural diversity.
Medication-based mortality prediction in COPD using machine learning and conventional statistical methods.
Ana Paula Pena-Gralle
Amélie Forget
Yohann Chiu
M. Beauchesne
Lucie Blais
Oneiris: An AI-augmented Brain-Computer Interface for Exploring Personal and Collective Dreamscapes
Antoine Bellemare‐Pepin
Karim Jerbi CoCo Lab
Suzanne Kite
DeLLMphi: A Multi-Turn Method for Multi-Agent Forecasting
Andrew Robert Williams
Victoria Feere
Nasim Rahaman
The Delphi method is a structured forecasting process that engages experts in iterative prediction and reflection. Each round, experts submit forecasts to a mediator, receive an aggregated and synthesized response highlighting key arguments, and update their forecasts based on collective insight. However, Delphi panels are labour intensive, slow and hard to reproduce, requiring diverse knowledgeable participants to engage periodically across weeks or months. To address these constraints, we propose DeLLMphi, a forecasting method that replaces human experts and mediators with LLMs. We show (i) that providing example superforecaster reasoning traces and predictions helps to elicit more accurate forecasts from LLM experts, (ii) that the mediator plays the crucial role of surfacing different lines of reasoning and points of disagreement, and (iii) that multiple rounds and experts lead to better forecasts, showing that multi-turn interaction is key to DeLLMphi.
Longitudinal functional connectivity during rest and task is differentially related to Alzheimer's pathology and episodic memory in older adults
Larissa Fischer
Jenna N. Adams
Eóin N. Molloy
Jennifer Tremblay-Mercier
Jordana Remz
Alexa Pichet Binette
M. Natasha Rajah
Sylvia Villeneuve
Anne Maass
PREVENT-AD Research Group
Changes in functional connectivity (FC) strength involving the medial temporal lobe (MTL) and posteromedial cortex (PMC) are related to early Alzheimer’s pathology and alterations in episodic memory performance in cognitively unimpaired older adults, but their dynamics remain unclear. We examined how longitudinal changes in FC involving MTL and PMC during resting-state, episodic memory encoding, and retrieval relate to subsequent amyloid- and tau-PET burden, longitudinal episodic memory performance, and the APOE4 genotype in 152 cognitively unimpaired older adults from the PREVENT-AD cohort. We found APOE4- and fMRI paradigm-dependent associations of change in FC strength with pathology burden and change in episodic memory performance. Decreasing FC over time, or “hypoconnectivity”, within PMC during rest in APOE4 carriers and during retrieval in APOE4 non-carriers was related to more amyloid and tau, respectively. Conversely, increasing FC over time, or “hyperconnectivity”, within MTL during encoding in APOE4 carriers and between MTL and PMC during retrieval independent of APOE4 status was related to more tau. Further, increasing FC between MTL and PMC during rest, unlike during encoding, was beneficial for episodic memory. Our study highlights that pathology-related episodic memory network changes manifest differently during rest and task and have differential implications for episodic memory trajectories. The online version contains supplementary material available at 10.1038/s41598-025-21596-0.
Simultaneous detection and estimation in olfactory sensing
Matthew Y. He
Venkatesh N. Murthy
Cengiz Pehlevan
Jacob A. Zavatone-Veth
The mammalian olfactory system shows an exceptional ability for rapid and accurate decoding of both the identity and concentration of odorants. Previous works have used the theory of compressed sensing to elucidate the algorithmic basis for this capability: decoding odor information from the responses of a restricted repertoire of receptors is possible because only a few relevant odorants are present in any given sensory scene. However, existing circuit models for olfactory decoding still cannot contend with the complexity of naturalistic olfactory scenes; they are limited to detection of a handful of odorants. Here, we propose a model for olfactory compressed sensing inspired by simultaneous localization and mapping algorithms in navigation: the set of odors that are present in a given scene, and the concentration of those present odors, are inferred separately. To enable rapid inference of odor presence in a biologically-plausible recurrent circuit, our model leverages the framework of Mirrored Langevin Dynamics, which gives a general recipe for sampling from constrained distributions using rate-based dynamics. This results in a recurrent circuit model that can accurately infer presence and concentration at scale and can be mapped onto the primary cell types of the olfactory bulb. This framework offers a path towards circuit models, for olfactory sensing and beyond, that both perform well in naturalistic environments and make experimentally-testable predictions for neural response dynamics.