Portrait of Giovanni Beltrame

Giovanni Beltrame

Affiliate Member
Full Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Research Topics
Autonomous Robotics Navigation
Computer Vision
Distributed Systems
Human-Robot Interaction
Online Learning
Reinforcement Learning
Robotics
Swarm Intelligence

Biography

Giovanni Beltrame obtained his PhD in computer engineering from Politecnico di Milano in 2006, after which he worked as a microelectronics engineer at the European Space Agency on a number of projects, from radiation-tolerant systems to computer-aided design.

In 2010, he moved to Montréal, where he is currently a professor at Polytechnique Montréal in the Computer and Software Engineering Department.

Beltrame directs the Making Innovative Space Technology (MIST) Lab, where he has more than twenty-five students and postdocs under his supervision. He has completed several projects in collaboration with industry and government agencies in the area of robotics, disaster response and space exploration. He and his team have participated in several field missions with ESA, the Canadian Space Agency (CSA) and NASA, including BRAILLE, PANAGAEA-X and IGLUNA.

His research interests include the modelling and design of embedded systems, AI and robotics, and he has published his findings in top journals and conferences.

Current Students

PhD - Polytechnique Montréal
Co-supervisor :
Collaborating researcher - Polytechnique Montréal Montreal
Co-supervisor :
Master's Research - Polytechnique Montréal
Co-supervisor :
PhD - Polytechnique Montréal
Co-supervisor :
Master's Research - Université de Montréal
Co-supervisor :
PhD - Polytechnique Montréal
Co-supervisor :

Publications

Can Vision Foundation Models Navigate? Zero-Shot Real-World Evaluation and Lessons Learned
Visual Navigation Models (VNMs) promise generalizable, robot navigation by learning from large-scale visual demonstrations. Despite growing … (see more)real-world deployment, existing evaluations rely almost exclusively on success rate, whether the robot reaches its goal, which conceals trajectory quality, collision behavior, and robustness to environmental change. We present a real-world evaluation of five state-of-the-art VNMs (GNM, ViNT, NoMaD, NaviBridger, and CrossFormer) across two robot platforms and five environments spanning indoor and outdoor settings. Beyond success rate, we combine path-based metrics with vision-based goal-recognition scores and assess robustness through controlled image perturbations (motion blur, sunflare). Our analysis uncovers three systematic limitations: (a) even architecturally sophisticated diffusion and transformer-based models exhibit frequent collisions, indicating limited geometric understanding; (b) models fail to discriminate between different locations that are perceptually similar, however some semantics differences are present, causing goal prediction errors in repetitive environments; and (c) performance degrades under distribution shift. We will publicly release our evaluation codebase and dataset to facilitate reproducible benchmarking of VNMs.
Scalable Multi-Agent Reinforcement Learning Framework for Multi-Machine Tending
Abdalwhab Abdalwhab
David St-Onge
Robotic manipulators hold significant untapped potential for manufacturing industries, particularly when deployed in multi-robot configurati… (see more)ons that can enhance resource utilization, increase throughput, and reduce costs. However, industrial manipulators typically operate in isolated one-robot, one-machine setups, limiting both utilization and scalability. Even mobile robot implementations generally rely on centralized architectures, creating vulnerability to single points of failure and requiring robust communication infrastructure. This paper introduces SMAPPO (Scalable Multi-Agent Proximal Policy Optimization), a scalable input-size invariant multi-agent reinforcement learning model for decentralized multi-robot management in industrial environments. MAPPO (Multi-Agent Proximal Policy Optimization) represents the current state-of-the-art approach. We optimized an existing simulator to handle complex multi-agent reinforcement learning scenarios and designed a new multi-machine tending scenario for evaluation. Our novel observation encoder enables SMAPPO to handle varying numbers of agents, machines, and storage areas with minimal or no retraining. Results demonstrate SMAPPO's superior performance compared to the state-of-the-art MAPPO across multiple conditions: full retraining (up to 61% improvement), curriculum learning (up to 45% increased productivity and up to 49% fewer collisions), zero-shot generalization to significantly different scale scenarios (up to 272% better performance without retraining), and adaptability under extremely low initial training (up to 100% increase in parts delivery).
Sociodynamics of Reinforcement Learning
Reinforcement Learning (RL) has emerged as a core algorithmic paradigm explicitly driving innovation in a growing number of industrial appli… (see more)cations, including large language models and quantitative finance. Furthermore, computational neuroscience has long found evidence of natural forms of RL in biological brains. Therefore, it is crucial for the study of social dynamics to develop a scientific understanding of how RL shapes population behaviors. We leverage the framework of Evolutionary Game Theory (EGT) to provide building blocks and insights toward this objective. We propose a methodology that enables simulating large populations of RL agents in simple game theoretic interaction models. More specifically, we derive fast and parallelizable implementations of two fundamental revision protocols from multi-agent RL - Policy Gradient (PG) and Opponent-Learning Awareness (LOLA) - tailored for population simulations of random pairwise interactions in stateless normal-form games. Our methodology enables us to simulate large populations of 200,000 independent co-learning agents, yielding compelling insights into how non-stationarity-aware learners affect social dynamics. In particular, we find that LOLA learners promote cooperation in the Stag Hunt model, delay cooperative outcomes in the Hawk-Dove model, and reduce strategy diversity in the Rock-Paper-Scissors model.
Swarm robotics localization: comparing methods from infrared to foundation models
Ali Imran
Vivek Shankar Vardharajan
Rafael Gomes Braga
David St-Onge
E-RGB-D: Real-Time Event-Based Perception with Structured Light
Seyed Ehsan Marjani Bajestani
Event-based cameras (ECs) have emerged as bio-inspired sensors that report pixel brightness changes asynchronously, offering unmatched speed… (see more) and efficiency in vision sensing. Despite their high dynamic range, temporal resolution, low power consumption, and computational simplicity, traditional monochrome ECs face limitations in detecting static or slowly moving objects and lack color information essential for certain applications. To address these challenges, we present a novel approach that integrates a Digital Light Processing (DLP) projector, forming Active Structured Light (ASL) for RGB-D sensing. By combining the benefits of ECs and projection-based techniques, our method enables the detection of color and the depth of each pixel separately. Dynamic projection adjustments optimize bandwidth, ensuring selective color data acquisition and yielding colorful point clouds without sacrificing spatial resolution. This integration, facilitated by a commercial TI LightCrafter 4500 projector and a monocular monochrome EC, not only enables frameless RGB-D sensing applications but also achieves remarkable performance milestones. With our approach, we achieved a color detection speed equivalent to 1400 fps and 4 kHz of pixel depth detection, significantly advancing the realm of computer vision across diverse fields from robotics to 3D reconstruction methods. Our code is publicly available: https://github.com/MISTLab/event_based_rgbd_ros
Revisiting the Learning Objectives of Vision-Language Reward Models
Simon Roy
Samuel Barbeau
Christian Desrosiers
Learning generalizable reward functions is a core challenge in embodied intelligence. Recent work leverages contrastive vision language mode… (see more)ls (VLMs) to obtain dense, domain-agnostic rewards without human supervision. These methods adapt VLMs into reward models through increasingly complex learning objectives, yet meaningful comparison remains difficult due to differences in training data, architectures, and evaluation settings. In this work, we isolate the impact of the learning objective by evaluating recent VLM-based reward models under a unified framework with identical backbones, finetuning data, and evaluation environments. Using Meta-World tasks, we assess modeling accuracy by measuring consistency with ground truth reward and correlation with expert progress. Remarkably, we show that a simple triplet loss outperforms state-of-the-art methods, suggesting that much of the improvements in recent approaches could be attributed to differences in data and architectures.
Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots
Abdalwhab Bakheet Mohamed Abdalwhab
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborat… (see more)ive robots can tackle that can also greatly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. We introduce a multi-agent multi-machine-tending learning framework using mobile robots based on multi-agent reinforcement learning (MARL) techniques, with the design of a suitable observation and reward. Moreover, we integrate an attention-based encoding mechanism into the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine-tending scenarios. Our model (AB-MAPPO) outperforms MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provided an extensive ablation study to support our design decisions.
Personalizing brain stimulation: continual learning for sleep spindle detection
Hugo R Jourde
S Ehsan M Bajestani
Emily B J Coffey
Objective. Personalized stimulation, in which algorithms used to detect neural events adapt to a user’s unique neural characteristics, may… (see more) be crucial to enable optimized and consistent stimulation quality for both fundamental research and clinical applications. Precise stimulation of sleep spindles-transient patterns of brain activity that occur during non rapid eye movement sleep that are involved in memory consolidation-presents an exciting frontier for studying memory functions; however, this endeavour is challenged by the spindles’ fleeting nature, inter-individual variability, and the necessity of real-time detection. Approach. We tackle these challenges using a novel continual learning framework. Using a pre-trained model capable of both online classification of sleep stages and spindle detection, we implement an algorithm that refines spindle detection, tailoring it to the individual throughout one or more nights without manual intervention. Main results. Our methodology achieves accurate, subject-specific targeting of sleep spindles and enables advanced closed-loop stimulation studies. While fine-tuning alone offers minimal benefits for single nights, our approach combining weight averaging demonstrates significant improvement over multiple nights, effectively mitigating catastrophic forgetting. Significance. This work represents an important step towards signal-level personalization of brain stimulation that can be applied to different brain stimulation paradigms including closed-loop brain stimulation, and to different neural events. Applications in fundamental neuroscience may enhance the investigative potential of brain stimulation to understand cognitive processes such as the role of sleep spindles in memory consolidation, and may lead to novel therapeutic applications.
GNN-based Decentralized Perception in Multirobot Systems for Predicting Worker Actions
Ali Imran
David St-Onge
In industrial environments, predicting human actions is essential for ensuring safe and effective collaboration between humans and robots. T… (see more)his paper introduces a perception framework that enables mobile robots to understand and share information about human actions in a decentralized way. The framework first allows each robot to build a spatial graph representing its surroundings, which it then shares with other robots. This shared spatial data is combined with temporal information to track human behavior over time. A swarm-inspired decision-making process is used to ensure all robots agree on a unified interpretation of the human's actions. Results show that adding more robots and incorporating longer time sequences improve prediction accuracy. Additionally, the consensus mechanism increases system resilience, making the multi-robot setup more reliable in dynamic industrial settings.
Neurophysiological effects of targeting sleep spindles with closed-loop auditory stimulation
Hugo R Jourde
Emily B J Coffey
Sleep spindles are neural events unique to nonrapid eye movement sleep that play key roles in memory reactivation and consolidation. However… (see more), much of the evidence for their function remains correlational rather than causal. Closed-loop brain stimulation uses real-time monitoring of neural events (often via electroencephalography; EEG) to deliver precise auditory, magnetic, or electrical stimulation for research or therapeutic purposes. Automated online algorithms to detect and stimulate sleep spindles have recently been validated, but the time- and frequency-resolved physiological responses generated by them have not yet been documented. Building on the recent findings that sleep spindles do not block the transmission of sound to cortex, the present work investigates the neurophysiological responses to closed-loop auditory stimulation of sleep spindles. EEG data were collected from 10 healthy human adults (6 nights each), whilst sleep spindles were detected and in half the nights, targeted with auditory stimulation. Spindles were successfully stimulated before their offset in 97.6% of detections and did not disturb sleep. Comparing stimulation with sham, we observed that stimulation resulted in increased sigma activity (11–16 Hz) at about 1 second poststimulation but that stimulation occurring at the beginning of the spindle also resulted in early termination of the spindle. Finally, we observed that stimulating an evoked spindle did not elicit additional sigma activity. Our results validate the use of closed-loop auditory stimulation targeting sleep spindles, and document its neural effects, as a basis for future causal investigations concerning spindles’ roles in memory consolidation.
Learning Multi-agent Multi-machine Tending by Mobile Robots
Abdalwhab Abdalwhab
S Ebrahimi Kahou
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborat… (see more)ive robots can tackle that can also highly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework by mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resources utilization. Furthermore, we provided an extensive ablation study to support our various design decisions.
A Blockchain Framework for Equitable and Secure Task Allocation in Robot Swarms
Alexandre Pacheco
Xue Liu
Marco Dorigo
Recent studies demonstrate the potential of blockchain to enable robots in a swarm to achieve secure consensus about the environment, partic… (see more)ularly when robots are homogeneous and perform identical tasks. Typically, robots receive rewards for their contributions to consensus achievement, but no studies have yet targeted heterogeneous swarms, in which the robots have distinct physical capabilities suited to different tasks. We present a novel framework that leverages domain knowledge to decompose the swarm mission into a hierarchy of tasks within smart contracts. This allows the robots to reach a consensus about both the environment and the action plan, allocating tasks among robots with diverse capabilities to improve their performance while maintaining security against faults and malicious behaviors. We refer to this concept as equitable and secure task allocation. Validated in Simultaneous Localization and Mapping missions, our approach not only achieves equitable task allocation among robots with varying capabilities, improving mapping accuracy and efficiency, but also shows resilience against malicious attacks.