Haechan Mark Bong

Co-supervisor

Jana Pavlasek

Research topics

Applied AI

Foundation models

Autonomous robot navigation

Optimization

Deep neural networks

Robotics

Computer vision

Website

Google Scholar

Publications

Multi-Robot Decentralized Collaborative SLAM in Planetary Analogue Environments: Dataset, Challenges, and Lessons Learned

Pierre-Yves Lajoie

Karthik Soma

Alice Lemieux-Bourque

Rongge Zhang

Vivek Shankar Vardharajan

2025-01-01

IEEE Transactions on Field Robotics (published)

BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation

Ricardo de Azambuja

Real-time aerial image segmentation plays an important role in the environmental perception of Uncrewed Aerial Vehicles (UAVs). We introduce BlabberSeg, an optimized Vision-Language Model built on CLIPSeg for on-board, real-time processing of aerial images by UAVs. BlabberSeg improves the efficiency of CLIPSeg by reusing prompt and model features, reducing computational overhead while achieving real-time open-vocabulary aerial segmentation. We validated BlabberSeg in a safe landing scenario using the Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI) framework, which uses visual servoing and open-vocabulary segmentation. BlabberSeg reduces computational costs significantly, with a speed increase of 927.41% (16.78 Hz) on an NVIDIA Jetson Orin AGX (64GB) compared with the original CLIPSeg (1.81 Hz), achieving real-time aerial segmentation with negligible loss in accuracy (2.1% as the ratio of the correctly segmented area with respect to CLIPSeg). BlabberSeg's source code is open and available online.

2024-10-16

arXiv (preprint)

Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration

Rongge Zhang

2024-10-14

2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration

Rongge Zhang

Exploration in unknown and unstructured environments is a pivotal requirement for robotic applications. A robot’s exploration behavior can be inherently affected by the performance of its Simultaneous Localization and Mapping (SLAM) subsystem, although SLAM and exploration are generally studied separately. In this paper, we formulate exploration as an active mapping problem and extend it with semantic information. We introduce a novel active metric-semantic SLAM approach, leveraging recent research advances in information theory and spectral graph theory: we combine semantic mutual information and the connectivity metrics of the underlying pose graph of the SLAM subsystem. We use the resulting utility function to evaluate different trajectories to select the most favorable strategy during exploration. Exploration and SLAM metrics are analyzed in experiments. Running our algorithm on the Habitat dataset, we show that, while maintaining efficiency close to the state-of-the-art exploration methods, our approach effectively increases the performance of metric-semantic SLAM with a 21% reduction in average map error and a 9% improvement in average semantic classification accuracy.

2024-08-27

arXiv (preprint)

PEACE: Prompt Engineering Automation for CLIPSeg Enhancement in Aerial Robotics

Rongge Zhang

Ricardo de Azambuja

From industrial to space robotics, safe landing is an essential component for flight operations. With the growing interest in artificial intelligence, we direct our attention to learning-based safe landing approaches. This paper extends our previous work, DOVESEI, which focused on a reactive UAV system by harnessing the capabilities of open-vocabulary image segmentation. Prompt-based safe landing zone segmentation using an open-vocabulary model is no longer just an idea, but has been proven feasible by the work of DOVESEI. However, a heuristic selection of words for the prompt is not a reliable solution, since it cannot take the changing environment into consideration, and detrimental consequences can occur if the observed environment is not well represented by the given prompt. Therefore, we introduce PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), powering DOVESEI to automate the prompt generation and engineering to adapt to data distribution shifts. Our system is capable of performing safe landing operations with collision avoidance at altitudes as low as 20 meters using only monocular cameras and image segmentation. We take advantage of DOVESEI's dynamic focus to circumvent abrupt fluctuations in the terrain segmentation between frames in a video stream. PEACE shows promising improvements in prompt generation and engineering for aerial images compared to the standard prompt used for CLIP and CLIPSeg. Combining DOVESEI and PEACE, our system was able to improve successful safe landing zone selections by 58.62% compared to using only DOVESEI. All the source code is open source and available online.

2023-09-29

arXiv (preprint)