Giovanni Beltrame

Affiliate Member
Full Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Research Topics
Online Learning
Reinforcement Learning
Swarm Intelligence
Human-Robot Interaction
Autonomous Robotic Navigation
Robotics
Distributed Systems
Computer Vision

Biography

Giovanni Beltrame obtained a PhD in computer engineering from Politecnico di Milano in 2006, after which he worked as a microelectronics engineer at the European Space Agency (ESA) on a number of projects, ranging from radiation-tolerant systems to computer-aided design. In 2010, he moved to Montréal. He is currently a professor in the Department of Computer Engineering and Software Engineering at Polytechnique Montréal. He directs the MIST Lab, which focuses on space technologies and where he supervises more than 25 students and postdocs. He has completed several projects in collaboration with industry and government agencies in the areas of robotics, disaster response, and space exploration. With his team, he has participated in several field missions with ESA, the Canadian Space Agency (CSA), and NASA (BRAILLE, PANGAEA-X, and IGLUNA, among others). His research focuses on the modeling and design of embedded systems, artificial intelligence, and robotics, topics on which he has published several articles in leading journals and conferences.

Current Students

PhD - Polytechnique
Research Collaborator - Polytechnique Montreal
Research Master's - Polytechnique
PhD - Polytechnique
PhD - Polytechnique

Publications

Learning Multi-agent Multi-machine Tending by Mobile Robots
Abdalwhab Abdalwhab
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborative robots can tackle that can also highly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework by mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into the Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provided an extensive ablation study to support our various design decisions.
GNN-based Decentralized Perception in Multirobot Systems for Predicting Worker Actions
Ali Imran
David St-Onge
In industrial environments, predicting human actions is essential for ensuring safe and effective collaboration between humans and robots. This paper introduces a perception framework that enables mobile robots to understand and share information about human actions in a decentralized way. The framework first allows each robot to build a spatial graph representing its surroundings, which it then shares with other robots. This shared spatial data is combined with temporal information to track human behavior over time. A swarm-inspired decision-making process is used to ensure all robots agree on a unified interpretation of the human's actions. Results show that adding more robots and incorporating longer time sequences improve prediction accuracy. Additionally, the consensus mechanism increases system resilience, making the multi-robot setup more reliable in dynamic industrial settings.
A Multi-Robot Exploration Planner for Space Applications
Vivek Shankar Vardharajan
We propose a distributed multi-robot exploration planning method designed for complex, unconstrained environments featuring steep elevation changes. The method employs a two-tiered approach: a local exploration planner that constructs a grid graph to maximize exploration gain and a global planner that maintains a sparse navigational graph to track visited locations and frontier information. The global graphs are periodically synchronized among robots within communication range to maintain an updated representation of the environment. Our approach integrates localization loop closure estimates to correct global graph drift. In simulation and field tests, the proposed method achieves 50% lower computational runtime compared to state-of-the-art methods while demonstrating superior exploration coverage. We evaluate its performance in two simulated subterranean environments and in field experiments at a Mars-analog terrain.
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation
Haechan Mark Bong
Ricardo de Azambuja
Real-time aerial image segmentation plays an important role in the environmental perception of Uncrewed Aerial Vehicles (UAVs). We introduce BlabberSeg, an optimized Vision-Language Model built on CLIPSeg for on-board, real-time processing of aerial images by UAVs. BlabberSeg improves the efficiency of CLIPSeg by reusing prompt and model features, reducing computational overhead while achieving real-time open-vocabulary aerial segmentation. We validated BlabberSeg in a safe landing scenario using the Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI) framework, which uses visual servoing and open-vocabulary segmentation. BlabberSeg reduces computational costs significantly, with a speed increase of 927.41% (16.78 Hz) on an NVIDIA Jetson Orin AGX (64GB) compared with the original CLIPSeg (1.81 Hz), achieving real-time aerial segmentation with negligible loss in accuracy (2.1% as the ratio of the correctly segmented area with respect to CLIPSeg). BlabberSeg's source code is open and available online.
Physical Simulation for Multi-agent Multi-machine Tending
Abdalwhab Abdalwhab
David St-Onge
Multi-Objective Risk Assessment Framework for Exploration Planning Using Terrain and Traversability Analysis
Riana Gagnon Souleiman
Vivek Shankar Vardharajan
Frequency-based View Selection in Gaussian Splatting Reconstruction
Monica Li
Pierre-Yves Lajoie
Three-dimensional reconstruction is a fundamental problem in robotics perception. We examine the problem of active view selection to perform 3D Gaussian Splatting reconstructions with as few input images as possible. Although 3D Gaussian Splatting has made significant progress in image rendering and 3D reconstruction, the quality of the reconstruction is strongly impacted by the selection of 2D images and the estimation of camera poses through Structure-from-Motion (SfM) algorithms. Current methods to select views that rely on uncertainties from occlusions, depth ambiguities, or neural network predictions directly are insufficient to handle the issue and struggle to generalize to new scenes. By ranking the potential views in the frequency domain, we are able to effectively estimate the potential information gain of new viewpoints without ground truth data. By overcoming current constraints on model architecture and efficacy, our method achieves state-of-the-art results in view selection, demonstrating its potential for efficient image-based 3D reconstruction.
Swarming Out of the Lab: Comparing Relative Localization Methods for Collective Behavior
Rafael Gomes Braga
Vivek Shankar Vardharajan
David St-Onge
Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration
Rongge Zhang
Haechan Mark Bong
Exploration in unknown and unstructured environments is a pivotal requirement for robotic applications. A robot’s exploration behavior can be inherently affected by the performance of its Simultaneous Localization and Mapping (SLAM) subsystem, although SLAM and exploration are generally studied separately. In this paper, we formulate exploration as an active mapping problem and extend it with semantic information. We introduce a novel active metric-semantic SLAM approach, leveraging recent research advances in information theory and spectral graph theory: we combine semantic mutual information and the connectivity metrics of the underlying pose graph of the SLAM subsystem. We use the resulting utility function to evaluate different trajectories to select the most favorable strategy during exploration. Exploration and SLAM metrics are analyzed in experiments. Running our algorithm on the Habitat dataset, we show that, while maintaining efficiency close to the state-of-the-art exploration methods, our approach effectively increases the performance of metric-semantic SLAM with a 21% reduction in average map error and a 9% improvement in average semantic classification accuracy.