Publications

Multi-Robot Decentralized Collaborative SLAM in Planetary Analogue Environments: Dataset, Challenges, and Lessons Learned
Pierre-Yves Lajoie
Karthik Soma
Alice Lemieux-Bourque
Rongge Zhang
Vivek Shankar Varadharajan
Decentralized collaborative simultaneous localization and mapping (C-SLAM) is essential to enable multirobot missions in unknown environment… (voir plus)s without relying on preexisting localization and communication infrastructure. This technology is anticipated to play a key role in the exploration of the Moon, Mars, and other planets. In this article, we share insights and lessons learned from C-SLAM experiments involving three robots operating on a Mars analogue terrain and communicating over an ad hoc network. We examine the impact of limited and intermittent communication on C-SLAM performance, as well as the unique localization challenges posed by planetary-like environments. Additionally, we introduce a novel dataset collected during our experiments, which includes real-time peer-to-peer inter-robot throughput and latency measurements. This dataset aims to support future research on communication-constrained, decentralized multirobot operations.
A Multi-Robot Exploration Planner for Space Applications
Vivek Shankar Vardharajan
We propose a distributed multi-robot exploration planning method designed for complex, unconstrained environments featuring steep elevation … (voir plus)changes. The method employs a two-tiered approach: a local exploration planner that constructs a grid graph to maximize exploration gain and a global planner that maintains a sparse navigational graph to track visited locations and frontier information. The global graphs are periodically synchronized among robots within communication range to maintain an updated representation of the environment. Our approach integrates localization loop closure estimates to correct global graph drift. In simulation and field tests, the proposed method achieves 50% lower computational runtime compared to state-of-the-art methods while demonstrating superior exploration coverage. We evaluate its performance in two simulated subterranean environments and in field experiments at a Mars-analog terrain.
NeoBERT: A Next-Generation BERT
Mariam El Mezouar
John Xavier Morris
A. Chandar
Recent innovations in architecture, pre-training, and fine-tuning have led to the remarkable in-context learning and reasoning abilities of … (voir plus)large auto-regressive language models such as LLaMA and DeepSeek. In contrast, encoders like BERT and RoBERTa have not seen the same level of progress despite being foundational for many downstream NLP applications. To bridge this gap, we introduce NeoBERT, a next-generation encoder that redefines the capabilities of bidirectional models by integrating state-of-the-art advancements in architecture, modern data, and optimized pre-training methodologies. NeoBERT is designed for seamless adoption: it serves as a plug-and-play replacement for existing base models, relies on an optimal depth-to-width ratio, and leverages an extended context length of 4,096 tokens. Despite its compact 250M parameter footprint, it achieves state-of-the-art results on the massive MTEB benchmark, outperforming BERT large, RoBERTa large, NomicBERT, and ModernBERT under identical fine-tuning conditions. In addition, we rigorously evaluate the impact of each modification on GLUE and design a uniform fine-tuning and evaluation framework for MTEB. We release all code, data, checkpoints, and training scripts to accelerate research and real-world adoption.
Neural Incremental Dynamic Inversion Control of a Multirotor Robotic Airship
Ely Carneiro de Paiva
José Raul Azinheira
Rafael de Angelis Cordeiro
José Reginaldo H. Carvalho
Apolo Marton
Nteasee: A mixed methods study of expert and general population perspectives on deploying AI for health in African countries
Mercy Nyamewaa Asiedu
Iskandar Haykel
Awa Dieng
K. Kauer
Tousif Ahmed
Florence Ofori
Charisma Chan
Stephen R. Pfohl
Katherine Heller
Online Influence Campaigns: Strategies and Vulnerabilities
Ethan Kosak-Hine
Tom Gibbs
U. Montr'eal
Ivado
M. University
In order to combat the creation and spread of harmful content online, this paper defines and contextualizes the concept of inauthentic, soci… (voir plus)etal-scale manipulation by malicious actors. We review the literature on societally harmful content and how it proliferates to analyze the manipulation strategies used by such actors and the vulnerabilities they target. We also provide an overview of three case studies of extensive manipulation campaigns to emphasize the severity of the problem. We then address the role that Artificial Intelligence plays in the development and dissemination of harmful content, and how its evolution presents new threats to societal cohesion for countries across the globe. Our survey aims to increase our understanding of not just particular aspects of these threats, but also the strategies underlying their deployment, so we can effectively prepare for the evolving cybersecurity landscape.
Open Technical Problems in Open-Weight AI Model Risk Management
Stephen Casper
Kyle O'Brien
Shayne Longpre
Elizabeth Seger
Kevin Klyman
Rishi Bommasani
Aniruddha Nrusimha
Ilia Shumailov
Sören Mindermann
Steven Basart
Frank Rudzicz
Avijit Ghosh
Andrew Strait
Robert Kirk
Dan Hendrycks
J. Zico Kolter
Geoffrey Irving
Yarin Gal … (voir 2 de plus)
Dylan Hadfield-Menell
OpenFake: An Open Dataset and Platform Toward Real-World Deepfake Detection
Deepfakes, synthetic media created using advanced AI techniques, pose a growing threat to information integrity, particularly in politically… (voir plus) sensitive contexts. This challenge is amplified by the increasing realism of modern generative models, which our human perception study confirms are often indistinguishable from real images. Yet, existing deepfake detection benchmarks rely on outdated generators or narrowly scoped datasets (e.g., single-face imagery), limiting their utility for real-world detection. To address these gaps, we present OpenFake, a large politically grounded dataset specifically crafted for benchmarking against modern generative models with high realism, and designed to remain extensible through an innovative crowdsourced adversarial platform that continually integrates new hard examples. OpenFake comprises nearly four million total images: three million real images paired with descriptive captions and almost one million synthetic counterparts from state-of-the-art proprietary and open-source models. Detectors trained on OpenFake achieve near-perfect in-distribution performance, strong generalization to unseen generators, and high accuracy on a curated in-the-wild social media test set, significantly outperforming models trained on existing datasets. Overall, we demonstrate that with high-quality and continually updated benchmarks, automatic deepfake detection is both feasible and effective in real-world settings.
PairBench: Are Vision-Language Models Reliable at Comparing What They See?
Sai Rajeswar
Adriana Romero
Valentina Zantedeschi
Joao Monteiro
Understanding how effectively large vision language models (VLMs) compare visual inputs is crucial across numerous applications, yet this fu… (voir plus)ndamental capability remains insufficiently assessed. While VLMs are increasingly deployed for tasks requiring comparative judgment, including automated evaluation, re-ranking, and retrieval-augmented generation, no systematic framework exists to measure their performance in these scenarios. We present PairBench, a simple framework that evaluates VLMs as customizable similarity tools using widely available image datasets. Our approach introduces four key metrics for reliable comparison: alignment with human annotations, consistency across pair ordering, distribution smoothness, and controllability through prompting. Our analysis reveals that no model consistently excels across all metrics, with each demonstrating distinct strengths and weaknesses. Most concerning is the widespread inability of VLMs to maintain symmetric similarity scores. Interestingly, we demonstrate that performance on our benchmark strongly correlates with popular benchmarks used for more complex tasks, while providing additional metrics into controllability, smoothness and ordering. This makes PairBench a unique and comprehensive framework to evaluate the performance of VLMs for automatic evaluation depending on the task.
PEACE: Prompt Engineering Automation for CLIPSeg Enhancement for Safe-Landing Zone Segmentation
Rongge Zhang
Antoine Robillard
Safe landing is essential in robotics applications, from industrial settings to space exploration. As artificial intelligence advances, we h… (voir plus)ave developed PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), a system that automatically generates and refines prompts for identifying landing zones in changing environments. Traditional approaches using fixed prompts for open-vocabulary models struggle with environmental changes and can lead to dangerous outcomes when conditions are not represented in the predefined prompts. PEACE addresses this limitation by dynamically adapting to shifting data distributions. Our key innovation is the dual segmentation of safe and unsafe landing zones, allowing the system to refine the results by removing unsafe areas from potential landing sites. Using only monocular cameras and image segmentation, PEACE can safely guide descent operations from 100 meters to altitudes as low as 20 meters. The testing shows that PEACE significantly outperforms the standard CLIP and CLIPSeg prompting methods, improving the successful identification of safe landing zones from 57% to 92%. We have also demonstrated enhanced performance when replacing CLIPSeg with FastSAM. The complete source code is available as an open-source software 1.
Performance modulations phase-locked to action depend on internal state
Gustavo Rohenkohl
Pascal Fries
Several studies have probed perceptual performance at different times after a self-paced motor action and found frequency-specific modulatio… (voir plus)ns of perceptual performance phase-locked to the action. Such action-related modulation has been reported for various frequencies and modulation strengths. In an attempt to establish a basic effect at the population level, we had a relatively large number of participants (n=50) perform a self-paced button press followed by a detection task at threshold, and we applied both fixed- and random-effects tests. The combined data of all trials and participants surprisingly did not show any significant action-related modulation. However, based on previous studies, we explored the possibility that such modulation depends on the participant’s internal state. Indeed, when we split trials based on performance in neighboring trials, then trials in periods of low performance showed an action-related modulation at ≈17 Hz. When we split trials based on the performance in the preceding trial, we found that trials following a “miss” showed an action-related modulation at ≈17 Hz. Finally, when we split participants based on their false-alarm rate, we found that participants with no false alarms showed an action-related modulation at ≈17 Hz. All these effects were significant in random-effects tests, supporting an inference on the population. Together, these findings indicate that action-related modulations are not always detectable. However, the results suggest that specific internal states such as lower attentional engagement and/or higher decision criterion are characterized by a modulation in the beta-frequency range.
Personalized Negative Reservoir for Incremental Learning in Recommender Systems
Yuening Wang
Yingxue Zhang
Mark J. Coates