Portrait of Giovanni Beltrame

Giovanni Beltrame

Affiliate Member
Full Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Research Topics
Autonomous Robotics Navigation
Computer Vision
Distributed Systems
Human-Robot Interaction
Online Learning
Reinforcement Learning
Robotics
Swarm Intelligence

Biography

Giovanni Beltrame obtained his PhD in computer engineering from Politecnico di Milano in 2006, after which he worked as a microelectronics engineer at the European Space Agency on a number of projects, from radiation-tolerant systems to computer-aided design.

In 2010, he moved to Montréal, where he is currently a professor at Polytechnique Montréal in the Computer and Software Engineering Department.

Beltrame directs the Making Innovative Space Technology (MIST) Lab, where he has more than twenty-five students and postdocs under his supervision. He has completed several projects in collaboration with industry and government agencies in the area of robotics, disaster response and space exploration. He and his team have participated in several field missions with ESA, the Canadian Space Agency (CSA) and NASA, including BRAILLE, PANAGAEA-X and IGLUNA.

His research interests include the modelling and design of embedded systems, AI and robotics, and he has published his findings in top journals and conferences.

Current Students

PhD - Polytechnique Montréal
Co-supervisor :
Collaborating researcher - Polytechnique Montréal Montreal
Co-supervisor :
Master's Research - Polytechnique Montréal
Co-supervisor :
PhD - Polytechnique Montréal
Co-supervisor :
Master's Research - Université de Montréal
Co-supervisor :
PhD - Polytechnique Montréal
Co-supervisor :

Publications

3D Foundation Model-Based Loop Closing for Decentralized Collaborative SLAM
Pierre-Yves Lajoie
Benjamin Ramtoula
Daniele De Martini
Decentralized Collaborative Simultaneous Localization and Mapping (C-SLAM) techniques often struggle to identify map overlaps due to signifi… (see more)cant viewpoint variations among robots. Motivated by recent advancements in 3D foundation models, which can register images despite large viewpoint differences, we propose a robust loop closing approach that leverages these models to establish inter-robot measurements. In contrast to resource-intensive methods requiring full 3D reconstruction within a centralized map, our approach integrates foundation models into existing SLAM pipelines, yielding scalable and robust multi-robot mapping. Our contributions include: 1) integrating 3D foundation models to reliably estimate relative poses from monocular image pairs within decentralized C-SLAM; 2) introducing robust outlier mitigation techniques critical to the use of these relative poses and 3) developing specialized pose graph optimization formulations that efficiently resolve scale ambiguities. We evaluate our method against state-of-the-art approaches, demonstrating improvements in localization and mapping accuracy, alongside significant gains in computational and memory efficiency. These results highlight the potential of our approach for deployment in large-scale multi-robot scenarios.
Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots
Abdalwhab Abdalwhab
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborat… (see more)ive robots can tackle that can also greatly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. We introduce a multi-agent multi-machine-tending learning framework using mobile robots based on multi-agent reinforcement learning (MARL) techniques, with the design of a suitable observation and reward. Moreover, we integrate an attention-based encoding mechanism into the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine-tending scenarios. Our model (AB-MAPPO) outperforms MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provided an extensive ablation study to support our design decisions.
Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots
Abdalwhab Bakheet Mohamed Abdalwhab
David St-Onge
A Blockchain Framework for Equitable and Secure Task Allocation in Robot Swarms
Recent studies demonstrate the potential of blockchain to enable robots in a swarm to achieve secure consensus about the environment, partic… (see more)ularly when robots are homogeneous and perform identical tasks. Typically, robots receive rewards for their contributions to consensus achievement, but no studies have yet targeted heterogeneous swarms, in which the robots have distinct physical capabilities suited to different tasks. We present a novel framework that leverages domain knowledge to decompose the swarm mission into a hierarchy of tasks within smart contracts. This allows the robots to reach a consensus about both the environment and the action plan, allocating tasks among robots with diverse capabilities to improve their performance while maintaining security against faults and malicious behaviors. We refer to this concept as equitable and secure task allocation. Validated in Simultaneous Localization and Mapping missions, our approach not only achieves equitable task allocation among robots with varying capabilities, improving mapping accuracy and efficiency, but also shows resilience against malicious attacks.
GNN-based Decentralized Perception in Multirobot Systems for Predicting Worker Actions
Ali Imran
David St-Onge
In industrial environments, predicting human actions is essential for ensuring safe and effective collaboration between humans and robots. T… (see more)his paper introduces a perception framework that enables mobile robots to understand and share information about human actions in a decentralized way. The framework first allows each robot to build a spatial graph representing its surroundings, which it then shares with other robots. This shared spatial data is combined with temporal information to track human behavior over time. A swarm-inspired decision-making process is used to ensure all robots agree on a unified interpretation of the human's actions. Results show that adding more robots and incorporating longer time sequences improve prediction accuracy. Additionally, the consensus mechanism increases system resilience, making the multi-robot setup more reliable in dynamic industrial settings.
Learning Multi-agent Multi-machine Tending by Mobile Robots
Abdalwhab Abdalwhab
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborat… (see more)ive robots can tackle that can also highly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework by mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resources utilization. Furthermore, we provided an extensive ablation study to support our various design decisions.
GNN-based Decentralized Perception in Multirobot Systems for Predicting Worker Actions
Ali Imran
David St-Onge
In industrial environments, predicting human actions is essential for ensuring safe and effective collaboration between humans and robots. T… (see more)his paper introduces a perception framework that enables mobile robots to understand and share information about human actions in a decentralized way. The framework first allows each robot to build a spatial graph representing its surroundings, which it then shares with other robots. This shared spatial data is combined with temporal information to track human behavior over time. A swarm-inspired decision-making process is used to ensure all robots agree on a unified interpretation of the human's actions. Results show that adding more robots and incorporating longer time sequences improve prediction accuracy. Additionally, the consensus mechanism increases system resilience, making the multi-robot setup more reliable in dynamic industrial settings.
Multi-Robot Decentralized Collaborative SLAM in Planetary Analogue Environments: Dataset, Challenges, and Lessons Learned
Pierre-Yves Lajoie
Karthik Soma
Alice Lemieux-Bourque
Rongge Zhang
Vivek Shankar Vardharajan
A Multi-Robot Exploration Planner for Space Applications
Vivek Shankar Vardharajan
A Multi-Robot Exploration Planner for Space Applications
Vivek Shankar Vardharajan
We propose a distributed multi-robot exploration planning method designed for complex, unconstrained environments featuring steep elevation … (see more)changes. The method employs a two-tiered approach: a local exploration planner that constructs a grid graph to maximize exploration gain and a global planner that maintains a sparse navigational graph to track visited locations and frontier information. The global graphs are periodically synchronized among robots within communication range to maintain an updated representation of the environment. Our approach integrates localization loop closure estimates to correct global graph drift. In simulation and field tests, the proposed method achieves 50% lower computational runtime compared to state-of-the-art methods while demonstrating superior exploration coverage. We evaluate its performance in two simulated subterranean environments and in field experiments at a Mars-analog terrain.
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation
Ricardo de Azambuja
Real-time aerial image segmentation plays an important role in the environmental perception of Uncrewed Aerial Vehicles (UAVs). We introduce… (see more) BlabberSeg, an optimized Vision-Language Model built on CLIPSeg for on-board, real-time processing of aerial images by UAVs. BlabberSeg improves the efficiency of CLIPSeg by reusing prompt and model features, reducing computational overhead while achieving real-time open-vocabulary aerial segmentation. We validated BlabberSeg in a safe landing scenario using the Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI) framework, which uses visual servoing and open-vocabulary segmentation. BlabberSeg reduces computational costs significantly, with a speed increase of 927.41% (16.78 Hz) on a NVIDIA Jetson Orin AGX (64GB) compared with the original CLIPSeg (1.81Hz), achieving real-time aerial segmentation with negligible loss in accuracy (2.1% as the ratio of the correctly segmented area with respect to CLIPSeg). BlabberSeg's source code is open and available online.
Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration