Publications

A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring

Usman Anwar

Julianna Piskorz

David D. Baek

David Africa

Jim Weatherall

Max Tegmark

Christian Schroeder de Witt

Mihaela van der Schaar

David Krueger

Large language models are beginning to show steganographic capabilities. Such capabilities could allow misaligned models to evade oversight mechanisms. Yet principled methods to detect and quantify such behaviours are lacking. Classical definitions of steganography, and detection methods based on them, require a known reference distribution of non-steganographic signals. For the case of steganographic reasoning in LLMs, knowing such a reference distribution is not feasible; this renders these approaches inapplicable. We propose an alternative, **decision-theoretic view of steganography**. Our central insight is that steganography creates an asymmetry in usable information between agents who can and cannot decode the hidden content (present within a steganographic signal), and this otherwise latent asymmetry can be inferred from the agents’ observable actions. To formalise this perspective, we introduce generalised

2026-02-28

AIWILD @ International Conference on Learning Representations (published)

doi.org

openreview.net

Is Depth Heterogeneity a Barrier to Model Merging?

Model merging offers a way to combine the capabilities of several networks at test time without retraining or additional finetuning, but most merging methods assume identical architectures. Depth differences are commonly viewed as a major obstacle because they remove clear layer correspondences. We test this assumption by merging residual networks that differ only in depth, using a simple training-free pipeline based on identity expansion and permutation alignment. Across both same-task and multitask image classification experiments, heterogeneous merges closely match homogeneous ones. The results suggest that, for residual networks, depth mismatch is not the main barrier to effective model merging, and that the main difficulty in model merging comes from aligning independently trained weights in a homogeneous setting.

2026-02-28

TTU_Main_Track @ International Conference on Learning Representations (published)

openreview.net

From Large-Scale Winds to Urban Decision Making: A Cross-Scale Framework for Wind-Aware UAV Navigation

Shaoxiang Qin

Fuyuan Lyu

Di Zhou

Xue Liu

Xiongye Xiao

Anima Anandkumar

Liangzhu Leon Wang

Large-scale weather and climate models provide reliable wind information at regional scales, yet their outputs are typically too coarse for direct UAV decision making in geometrically complex urban environments. This paper investigates how large-scale atmospheric information can be transformed into city-scale wind representations and utilized for downstream navigation decisions. We propose a cross-scale prediction and decision framework that takes background wind conditions from existing weather or climate models and combines them with detailed 3D urban geometry to predict time-averaged urban wind fields using a 3D neural operator. The predicted wind fields are then incorporated into a wind-aware UAV trajectory optimization problem to minimize energy consumption under kinematic feasibility and safety constraints. By comparing trajectories planned against a wind-agnostic baseline, we demonstrate significant efficiency gains enabled by AI-predicted wind, specifically 10.3% savings in tailwinds, 7.7% in headwinds, and 3.9% in crosswind conditions. These results indicate that learning decision-relevant urban wind representations offers a practical pathway for bridging large-scale atmospheric information and fine-scale urban decision making.

2026-02-28

AI_and_PDE @ International Conference on Learning Representations (poster)

openreview.net

Machine learning–based prediction of Metabolic Syndrome risk in the Quebec population

Haowei Qiu

Shayan Nejadshamsi

Ting Wang

Stella S. Daskalopoulou

Samira Abbasgholizadeh Rahimi

Objective: This study evaluates multiple machine learning approaches to predict metabolic syndrome (MetS) risk in the Quebec, Canada population. We further perform explainability analysis to interpret model predictions and identify key features driving risk classification. Methods and analysis: This study followed the Minimum Information about Clinical Artificial Intelligence Modeling (MI-CLAIM) guideline for reporting. We used cross-sectional data from the Canadian Community Health Survey (2015–2018) for the population living in the province of Quebec, which includes 42,279 participants. Partial sampling was used to obtain a balanced dataset for model development. We evaluated seven machine learning models for the defined classification task, including Logistic Regression, XGBoost, LightGBM, TabNet, NODE, 1D-CNN and Regularisation Cocktails. Performance was assessed using accuracy, precision, recall, F1-score, AUROC, and AUPRC, and interpretability was examined using SHAP to identify key predictors of MetS risk. Results: After partial sampling, 7,866 participants (4,856 high-risk and 3,010 low-risk MetS cases) were included in the machine learning analysis. XGBoost and NODE showed the strongest performance. XGBoost achieved the highest accuracy (80.4%) and AUROC (84.1%), while NODE achieved the highest precision (80.1%) and AUPRC (86.0%). Explainability analysis identified age, perceived health, and sex as the most important features contributing to MetS risk predictions. Conclusion: This study shows that machine learning can accurately predict MetS risk using self-reported health survey data from the Quebec population. Comparison of classical and deep learning approaches identified the optimal predictive model, and explainability analyses identified the most important features contributing to the risk predictions, which align with established clinical evidence. These results support a machine learning–driven initial screening framework for population-level early identification of high-risk individuals, enabling targeted interventions and efficient allocation of healthcare resources.

2026-02-28

BMJ Digital Health & AI (published)

doi.org

Objective Misalignment in LLM-based Multi Agent Social Deception Game

Large language model–based multi-agent systems have attracted increasing attention for their strong performance in collaborative tasks and social simulations. However, these interactive settings also introduce vulnerabilities, as a single agent's hidden goals and misaligned behavior can propagate misleading or malicious information throughout the system. In this work, we study these risks in the context of social deception games. We focus on the Werewolf Game, which requires agents to reason, communicate, and collaborate under asymmetric and incomplete information. We modify the individual objectives of some agents to induce benevolent, individualistic, and malevolent strategies that can make agents depart from the objectives of their own team. We evaluate how objective divergence affects game outcomes, collaboration, and goal satisfaction. Misaligned agents often succeed in achieving their own objectives, with effects amplified by role-based power asymmetries. Qualitative analyses further show that agents remain coherent and adaptive, strategically adjusting their reasoning, communication, voting behavior, and influence on group dynamics. These results indicate that risks in LLM-based multi-agent systems extend beyond collaborative task settings and persist even in environments where competition is structurally expected.

2026-02-28

AIWILD @ International Conference on Learning Representations (published)

openreview.net

Piezoelectric tuning of thermal conductivity in nano-architected gallium nitride metamaterials

Jun Cai

Alireza Seyedkanani

Benyamin Shahryari

Hsiu-Chin Lin

Abdolhamid Akbarzadeh

2026-02-28

International Journal of Heat and Mass Transfer (published)

doi.org

PPO-CIS: A deep reinforcement learning framework for real-time toxicity detection in social media

Arezo Bodaghi

Benjamin C.M. Fung

Ketra A. Schmitt

2026-02-28

Knowledge-Based Systems (published)

doi.org

Scalable Multi-Agent Reinforcement Learning Framework for Multi-Machine Tending

Abdalwhab Abdalwhab

Giovanni Beltrame

David St-Onge

Robotic manipulators hold significant untapped potential for manufacturing industries, particularly when deployed in multi-robot configurations that can enhance resource utilization, increase throughput, and reduce costs. However, industrial manipulators typically operate in isolated one-robot, one-machine setups, limiting both utilization and scalability. Even mobile robot implementations generally rely on centralized architectures, creating vulnerability to single points of failure and requiring robust communication infrastructure. This paper introduces SMAPPO (Scalable Multi-Agent Proximal Policy Optimization), a scalable input-size invariant multi-agent reinforcement learning model for decentralized multi-robot management in industrial environments. MAPPO (Multi-Agent Proximal Policy Optimization) represents the current state-of-the-art approach. We optimized an existing simulator to handle complex multi-agent reinforcement learning scenarios and designed a new multi-machine tending scenario for evaluation. Our novel observation encoder enables SMAPPO to handle varying numbers of agents, machines, and storage areas with minimal or no retraining. Results demonstrate SMAPPO's superior performance compared to the state-of-the-art MAPPO across multiple conditions: full retraining (up to 61% improvement), curriculum learning (up to 45% increased productivity and up to 49% fewer collisions), zero-shot generalization to significantly different scale scenarios (up to 272% better performance without retraining), and adaptability under extremely low initial training (up to 100% increase in parts delivery).

2026-02-28

IEEE Robotics and Automation Letters (published)

doi.org

Semantic Anchor Transport: Robust Test-Time Adaptation for Vision-Language Models

Shambhavi Mishra

Julio Silva-Rodríguez

Ismail Ben Ayed

Marco Pedersoli

Jose Dolz

Large pre-trained vision-language models (VLMs) like CLIP exhibit strong zero-shot performance but struggle under distributional shifts. We propose Semantic Anchor Transport (SAT), a method that generates pseudo-labels for test samples by aligning visual embeddings with reliable text-based semantic anchors using Optimal Transport for batch-wise label assignment. These pseudo-labels enable efficient test-time adaptation through principled cross-modal alignment. We further incorporate multi-template distillation to leverage diverse textual clues, replicating multi-view contrastive learning without added computational cost. Extensive experiments demonstrate consistent performance gains over state-of-the-art methods across multiple benchmarks while maintaining computational efficiency.

2026-02-28

TTU_Main_Track @ International Conference on Learning Representations (published)

doi.org

openreview.net

Street review: A participatory AI-based framework for assessing streetscape inclusivity

Rashid Mushkani

Shin Koseki

Urban centers undergo social, demographic, and cultural changes that shape public street use and require systematic evaluation of public spaces. This study presents Street Review, a mixed-methods approach that combines participatory research with AI-based analysis to assess streetscape inclusivity. In Montréal, Canada, 28 residents participated in semi-directed interviews and image evaluations, supported by the analysis of approximately 45,000 street-view images from Mapillary. The approach produced visual analytics, such as heatmaps, to correlate subjective user ratings with physical attributes like sidewalk, maintenance, greenery, and seating. Findings reveal variations in perceptions of inclusivity and accessibility across demographic groups, demonstrating that incorporating diverse user feedback can enhance machine learning models through careful data-labeling and co-production strategies. The Street Review framework offers a systematic method for urban planners and policy analysts to inform planning, policy development, and management of public streets.

2026-02-28

Cities (published)

doi.org

arxiv.org

A Systematic Literature Review of Automated Feedback Generation in Education

Yajie Song

Yimei Zhang

Maria Cutumisu

Feedback that is individualized and immediate is essential to improving learning outcomes, but providing it to every learner is difficult. Automatic feedback generation (AFG) aims to alleviate this problem, especially with technology-enhanced learning environments. This systematic literature review of AFG in education, following the PRISMA framework, examines 34 peer-reviewed publications. The findings revealed that the reviewed studies (1) gained momentum after 2019; (2) often used secondary cognitive data to evaluate AFG approaches; (3) mainly targeted the computer science domain; (4) frequently combined multiple methods to generate feedback; (5) employed multiple performance evaluations; and (6) mostly provided written feedback aimed at correcting student errors. This review also highlighted several gaps, including the lack of (1) in-depth cognitive and affective data from user studies to evaluate feedback and understand how students interpret it; (2) research on feedback use and strategies to close the feedback loop; (3) AFG systems for ill-defined domains with strong transferability; (4) elaborated feedback that scaffolds problem-solving rather than giving answers; (5) feedback using multiple modalities and valences; and (6) integration of learning theories in AFG design. This review advances understanding of current AFG practices, evaluates and extends conceptual frameworks of AFG, and provides insights for future AFG design and evaluation.

2026-02-28

International Journal of Technology in Education (published)

doi.org

Understanding Representation Gaps across Scales in Tropical Tree Species Classification from Drone Imagery

Sulagna Saha

Arthur Ouaknine

Étienne Laliberté

Carol Altimas

Evan M. Gora

Adriane Esquivel Muelbert

Ian R. McGregor

Cesar Gutierrez

Vanessa E. Rubio

David Rolnick

Accurate classification of tropical tree species from unoccupied aerial vehicle (UAV) imagery remains challenging due to high species diversity and strong visual similarity among species at typical image resolutions (centimeters per pixel). In contrast, models trained on close-up citizen science photographs captured with smartphones achieve strong plant species classification performance. Recent advances in UAV data acquisition now enable the collection of close-up images that are spatially registered with top-view aerial imagery and approach the level of visual detail found in smartphone photographs, with the trade-off that such high-resolution photos cannot be acquired for many trees. In this work, we evaluate the performance of existing methods using paired top-view and close-up UAV imagery collected in a species-rich tropical forest. Through fine-tuning experiments, we quantify the performance gap between vision foundation models and in-domain generalist plant recognition models across both image types (high-resolution close-up versus coarser-resolution top-view imagery). We show that classification performance is consistently higher on close-up images than on top-view aerial imagery, and that this performance gap widens for rare species. Finally, we propose that self-supervised representation alignment across these two spatial scales offers a promising approach for integrating fine-grained visual information into canopy-level species classification models based on top-view UAV imagery. Leveraging high-resolution close-up UAV imagery to enhance canopy-level species classification could substantially improve large-scale monitoring of tropical forest biodiversity.

2026-02-28

ML4RS @ International Conference on Learning Representations (published)

openreview.net
