
Gregory Dudek

Associate Academic Member
Full Professor and Research Director of the Mobile Robotics Lab, McGill University, School of Computer Science
Vice President and Lab Head of AI Research, Samsung AI Center in Montréal

Biography

Gregory Dudek is a full professor in the School of Computer Science at McGill University, a member of the Centre for Intelligent Machines (CIM), and Research Director of the Mobile Robotics Lab. He is also Vice President and Lab Head of AI Research at the Samsung AI Center in Montréal and an associate academic member of Mila - Quebec Artificial Intelligence Institute.

Dudek has authored and co-authored over 300 research publications on a wide range of subjects, including visual object description, recognition, RF localization, robotic navigation and mapping, distributed system design, 5G telecommunications and biological perception.

He co-authored the book “Computational Principles of Mobile Robotics” (Cambridge University Press) with Michael Jenkin. He has chaired and been involved in numerous national and international conferences and professional activities concerned with robotics, machine sensing and computer vision.

Dudek’s research interests include perception for mobile robotics, navigation and position estimation, environment and shape modelling, computational vision and collaborative filtering.


Publications

Eliminating Space Scanning: Fast mmWave Beam Alignment with UWB Radios
Ju Wang
X. T. Chen
Xue Liu
Due to their large bandwidth and impressive data speed, millimeter-wave (mmWave) radios are expected to play a key role in 5G and beyond (e.g., 6G) communication networks. Yet, to release mmWave's true power, the highly directional mmWave beams need to be aligned perfectly. Most existing beam alignment methods adopt an exhaustive or semi-exhaustive space scanning, which introduces delays of up to several seconds. To eliminate the need for complex space scanning, this article presents an Ultra-wideband (UWB)-assisted mmWave communication framework, which leverages co-located UWB antennas to estimate the best angles for mmWave beam alignment. One major challenge in applying this idea in the real world is the barrier of limited antenna numbers: Commercial-Off-The-Shelf (COTS) devices are usually equipped with only a small number of UWB antennas, which are not enough for existing algorithms to provide an accurate angle estimation. To solve this challenge, we design a novel Multi-Frequency MUltiple SIgnal Classification (MF-MUSIC) algorithm, which extends the classic MUltiple SIgnal Classification (MUSIC) algorithm to the frequency domain and overcomes the antenna limitation barrier in the spatial domain. Extensive real-world experiments and numerical simulations illustrate the advantage of the proposed MF-MUSIC algorithm. MF-MUSIC uses only three antennas to achieve an accurate angle estimation, which differs by a mere 0.15° (or a relative difference of 3.6%) from the state-of-the-art 16-antenna-based angle estimation method.
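The classic MUSIC algorithm that MF-MUSIC extends can be sketched in a few lines. This is a minimal narrowband MUSIC for a half-wavelength uniform linear array, not the paper's multi-frequency variant; the function name, array geometry, and grid resolution below are illustrative assumptions.

```python
import numpy as np

def music_spectrum(X, n_sources, n_grid=181):
    """Classic MUSIC pseudospectrum for a uniform linear array.

    X: (n_antennas, n_snapshots) complex baseband samples;
    half-wavelength element spacing is assumed.
    """
    n_ant = X.shape[0]
    R = X @ X.conj().T / X.shape[1]           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)      # eigenvalues in ascending order
    En = eigvecs[:, : n_ant - n_sources]      # noise-subspace eigenvectors
    angles = np.linspace(-90, 90, n_grid)
    spectrum = np.empty(n_grid)
    for i, theta in enumerate(angles):
        # Steering vector for arrival angle theta (degrees)
        a = np.exp(-1j * np.pi * np.arange(n_ant) * np.sin(np.deg2rad(theta)))
        # Peaks appear where the steering vector is orthogonal to the noise subspace
        spectrum[i] = 1.0 / np.real(a.conj() @ En @ En.conj().T @ a)
    return angles, spectrum
```

Peaks of the returned pseudospectrum estimate the arrival angles; MF-MUSIC's contribution is recovering such peaks reliably with far fewer antennas by pooling information across frequencies.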
Augmenting Transit Network Design Algorithms with Deep Learning
Andrew Holliday
This paper considers the use of deep learning models to enhance optimization algorithms for transit network design. Transit network design is the problem of determining routes for transit vehicles that minimize travel time and operating costs, while achieving full service coverage. State-of-the-art meta-heuristic search algorithms give good results on this problem, but can be very time-consuming. In contrast, neural networks can learn sub-optimal but fast-to-compute heuristics based on large amounts of data. Combining these approaches, we develop a fast graph neural network model for transit planning, and use it to initialize state-of-the-art search algorithms. We show that this combination can improve the results of these algorithms on a variety of metrics by up to 17%, without increasing their run time; or they can match the quality of the original algorithms while reducing the computing time by up to a factor of 50.
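The hybrid pattern here, a learned heuristic seeding a slower search, can be illustrated without the graph neural network itself. In this sketch the `init` argument stands in for the network's fast initial solution; the hill-climbing loop stands in for the meta-heuristic. All names are illustrative, not the paper's implementation.

```python
def local_search(score, init, neighbors, iters=200):
    """Greedy hill-climbing: repeatedly move to the best-scoring neighbor.

    A learned model supplies `init`, a good starting solution, so the
    search spends its iterations refining rather than discovering.
    """
    best = init
    for _ in range(iters):
        cand = max(neighbors(best), key=score)
        if score(cand) <= score(best):
            break                  # local optimum reached
        best = cand
    return best
```

Seeding from a strong `init` is what lets the combined system match the original algorithm's quality in far fewer search iterations.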
Bayesian Q-learning With Imperfect Expert Demonstrations
Guided exploration with expert demonstrations improves data efficiency for reinforcement learning, but current algorithms often overuse expert information. We propose a novel algorithm to speed up Q-learning with the help of a limited amount of imperfect expert demonstrations. The algorithm avoids excessive reliance on expert data by relaxing the optimal expert assumption and gradually reducing the usage of uninformative expert data. Experimentally, we evaluate our approach on a sparse-reward chain environment and six more complicated Atari games with delayed rewards. With the proposed methods, we can achieve better results than Deep Q-learning from Demonstrations (Hester et al., 2017) in most environments.
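The core idea, trust the expert early and rely on it less over time, can be illustrated with tabular Q-learning on a sparse-reward chain, the first benchmark the abstract mentions. The environment, decay schedule, and 80% expert accuracy below are illustrative assumptions, not the paper's Bayesian algorithm.

```python
import numpy as np

def train_chain(n_states=6, episodes=400, alpha=0.5, gamma=0.95, seed=0):
    """Tabular Q-learning on a chain: action 1 moves right, 0 moves left.

    Reward 1 is given only on reaching the rightmost state. An imperfect
    expert (correct 80% of the time) guides exploration with a probability
    that decays to zero over training, so early episodes lean on the
    expert and later ones learn from the agent's own estimates.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))
    for ep in range(episodes):
        expert_prob = max(0.0, 1.0 - ep / (episodes / 2))  # gradually ignore expert
        s = 0
        for _ in range(4 * n_states):
            if rng.random() < expert_prob:
                a = 1 if rng.random() < 0.8 else 0         # imperfect expert advice
            elif rng.random() < 0.1:
                a = int(rng.integers(2))                   # epsilon exploration
            else:
                a = int(np.argmax(Q[s]))                   # greedy action
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
            s = s2
            if s == n_states - 1:
                break
    return Q
```

After training, the greedy policy should move right in every state, showing that the decaying expert influence bootstraps learning without permanently biasing the value estimates.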
IL-flOw: Imitation Learning from Observation using Normalizing Flows
Wei-Di Chang
Juan Higuera
Learning Assisted Identification of Scenarios Where Network Optimization Algorithms Under-Perform
Dmitriy Rivkin
X. T. Chen
Xue Liu
We present a generative adversarial method that uses deep learning to identify network load traffic conditions in which network optimization algorithms under-perform other known algorithms: the Deep Convolutional Failure Generator (DCFG). The spatial distribution of network load presents challenges for network operators for tasks such as load balancing, in which a network optimizer attempts to maintain high quality communication while at the same time abiding by capacity constraints. Testing a network optimizer for all possible load distributions is challenging if not impossible. We propose a novel method that searches for load situations where a target network optimization method underperforms a baseline, which are key test cases that can be used for future refinement and performance optimization. By modeling a realistic network simulator's quality assessments with a deep network and, in parallel, optimizing a load generation network, our method efficiently searches the high dimensional space of load patterns and reliably finds cases in which a target network optimization method under-performs a baseline by a significant margin.
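The search loop at the heart of such failure mining can be sketched without the learned generator: sample candidate load patterns and keep those where the target optimizer trails the baseline by a margin. The random sampler below is a stand-in for the paper's trained generation network, and the scoring functions are placeholders.

```python
import numpy as np

def mine_failures(target_score, baseline_score, sample_load,
                  n_samples=2000, margin=0.1, seed=0):
    """Random-search stand-in for a learned failure generator.

    Collects load patterns where the target optimizer scores worse than
    the baseline by more than `margin`; these become test cases for
    refining the target method.
    """
    rng = np.random.default_rng(seed)
    failures = []
    for _ in range(n_samples):
        load = sample_load(rng)
        if baseline_score(load) - target_score(load) > margin:
            failures.append(load)
    return failures
```

The DCFG's contribution is replacing this blind sampling with a generator optimized to produce such failures directly, which matters when failure regions are rare in the high-dimensional load space.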
Latent Attention Augmentation for Robust Autonomous Driving Policies
Ran Cheng
Christopher Agia
Florian Shkurti
Model-free reinforcement learning has become a viable approach for vision-based robot control. However, sample complexity and adaptability to domain shifts remain persistent challenges when operating in high-dimensional observation spaces (images, LiDAR), such as those that are involved in autonomous driving. In this paper, we propose a flexible framework by which a policy’s observations are augmented with robust attention representations in the latent space to guide the agent’s attention during training. Our method encodes local and global descriptors of the augmented state representations into a compact latent vector, and scene dynamics are approximated by a recurrent network that processes the latent vectors in sequence. We outline two approaches for constructing attention maps: a supervised pipeline leveraging semantic segmentation networks, and an unsupervised pipeline relying only on classical image processing techniques. We conduct our experiments in simulation and test the learned policy against varying seasonal effects and weather conditions. Our design decisions are supported in a series of ablation studies. The results demonstrate that our state augmentation method both improves learning efficiency and encourages robust domain adaptation when compared to common end-to-end frameworks and methods that learn directly from intermediate representations.
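An unsupervised attention map built from classical image processing, the second pipeline the abstract mentions, can be approximated with a gradient-magnitude saliency map. The Sobel-based implementation below is one simple stand-in; the exact operators used in the paper are not specified here.

```python
import numpy as np

def edge_attention(img):
    """Normalized Sobel gradient magnitude as a crude attention map.

    img: 2-D grayscale array; returns a map in [0, 1] of the same shape,
    highlighting edges that a driving policy might attend to.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal gradient
    ky = kx.T                                                    # vertical gradient
    pad = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    for i in range(3):                       # correlate with both kernels
        for j in range(3):
            patch = pad[i:i + img.shape[0], j:j + img.shape[1]]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    mag = np.hypot(gx, gy)
    return mag / mag.max() if mag.max() > 0 else mag
```

Such a map would then be concatenated with (or used to weight) the image features before latent encoding; the appeal of the unsupervised route is that it needs no segmentation labels.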
Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain
Stefan Wapnick
Travis Manderson
We present a reward-predictive, model-based deep learning method featuring trajectory-constrained visual attention for local planning in visual navigation tasks. Our method learns to place visual attention at locations in latent image space which follow trajectories caused by vehicle control actions to enhance predictive accuracy during planning. The attention model is jointly optimized by the task-specific loss and an additional trajectory-constraint loss, allowing adaptability yet encouraging a regularized structure for improved generalization and reliability. Importantly, visual attention is applied in latent feature map space instead of raw image space to promote efficient planning. We validated our model in visual navigation tasks of planning low turbulence, collision-free trajectories in off-road settings and hill climbing with locking differentials in the presence of slippery terrain. Experiments involved randomized, procedurally generated simulation and real-world environments. We found our method improved generalization and learning efficiency when compared to no-attention and self-attention alternatives.
An Autonomous Probing System for Collecting Measurements at Depth from Small Surface Vehicles
Yuying Huang
Yiming Yao
Johanna Hansen
Jeremy Mallette
Sandeep Manjanna
This paper presents the portable autonomous probing system (APS), a low-cost robotic design for collecting water quality measurements at targeted depths from an autonomous surface vehicle (ASV). This system fills an important but often overlooked niche in marine sampling by enabling mobile sensor observations throughout the near-surface water column without the need for advanced underwater equipment. We present a probe delivery mechanism built with commercially available components and describe the corresponding open-source simulator and winch controller. Finally, we demonstrate the system in a field deployment and discuss design trade-offs and areas for future improvement. Project details are available on our website: https://johannah.github.io/publication/sample-at-depth
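A targeted-depth winch can be closed-loop controlled very simply. The proportional, rate-limited controller below is an illustrative sketch of the control problem, not the project's open-source controller; gains, limits, and the pure-integrator probe model are assumptions.

```python
def winch_rate(depth, target, kp=0.8, max_rate=0.25):
    """Proportional spool rate (m/s), clamped to the winch's speed limit.

    Positive rate pays out cable, lowering the probe.
    """
    error = target - depth
    return max(-max_rate, min(max_rate, kp * error))

def lower_probe(target, dt=0.1, tol=0.02, max_steps=2000):
    """Simulate lowering the probe until it settles at the target depth.

    Models the probe depth as directly following the commanded spool rate,
    ignoring drag, drift, and cable dynamics.
    """
    depth = 0.0
    for _ in range(max_steps):
        depth += winch_rate(depth, target) * dt
        if abs(depth - target) < tol:
            break
    return depth
```

The rate clamp matters in practice: it keeps the probe from free-spooling on large depth errors, at the cost of a longer transit to deep targets.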
Multimodal dynamics modeling for off-road autonomous vehicles
Travis Manderson
Aurélio Noca
Dynamics modeling in outdoor and unstructured environments is difficult because different elements in the environment interact with the robot in ways that can be hard to predict. Leveraging multiple sensors to perceive maximal information about the robot's environment is thus crucial when building a model to perform predictions about the robot's dynamics with the goal of doing motion planning. We design a model capable of long-horizon motion predictions, leveraging vision, lidar and proprioception, which is robust to arbitrarily missing modalities at test time. We demonstrate in simulation that our model is able to leverage vision to predict traction changes. We then test our model using a real-world challenging dataset of a robot navigating through a forest, performing predictions in trajectories unseen during training. We try different modality combinations at test time and show that, while our model performs best when all modalities are present, it is still able to perform better than the baseline even when receiving only raw vision input and no proprioception, as well as when only receiving proprioception. Overall, our study demonstrates the importance of leveraging multiple sensors when doing dynamics modeling in outdoor conditions.
Learning Intuitive Physics with Multimodal Generative Models
Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions. This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. Visual information captures object properties such as 3D shape and location, while tactile information provides critical cues about interaction forces and resulting object motion when it makes contact with the environment. Utilizing a novel See-Through-your-Skin (STS) sensor that provides high resolution multimodal sensing of contact surfaces, our system captures both the visual appearance and the tactile properties of objects. We interpret the dual stream signals from the sensor using a Multimodal Variational Autoencoder (MVAE), allowing us to capture both modalities of contacting objects and to develop a mapping from visual to tactile interaction and vice-versa. Additionally, the perceptual system can be used to infer the outcome of future physical interactions, which we validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
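MVAE-style models commonly fuse modalities as a product of Gaussian experts in latent space, which has a closed form as precision-weighted averaging. The sketch below shows that fusion step in isolation; it is a generic instance of the technique, not the paper's exact network, and a standard-normal prior expert is an assumption of this formulation.

```python
import numpy as np

def product_of_experts(mus, logvars):
    """Combine per-modality Gaussian posteriors N(mu_i, var_i).

    mus, logvars: (n_experts, dim) arrays from each modality's encoder.
    A standard-normal prior expert is included, so a missing modality is
    handled simply by dropping its row.
    Returns the fused mean and variance.
    """
    prec = np.exp(-np.asarray(logvars))        # expert precisions 1/var_i
    prec_sum = 1.0 + prec.sum(axis=0)          # +1 is the N(0, I) prior's precision
    mu = (np.asarray(mus) * prec).sum(axis=0) / prec_sum
    var = 1.0 / prec_sum
    return mu, var
```

This is what makes vision-to-touch inference possible in such models: encoding only the available modality still yields a valid fused posterior to decode the other modality from.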
Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor
Francois Hogan
M. Jenkin
Yogesh Girdhar
We introduce a new class of vision-based sensor and associated algorithmic processes that combine visual imaging with high-resolution tactile sensing, all in a uniform hardware and computational architecture. We demonstrate the sensor’s efficacy for both multi-modal object recognition and metrology. Object recognition is typically formulated as a unimodal task, but by combining two sensor modalities we show that we can achieve several significant performance improvements. This sensor, named the See-Through-your-Skin sensor (STS), is designed to provide rich multi-modal sensing of contact surfaces. Inspired by recent developments in optical tactile sensing technology, we address a key missing feature of these sensors: the ability to capture a visual perspective of the region beyond the contact surface. Whereas optical tactile sensors are typically opaque, we present a sensor with a semitransparent skin that has the dual capabilities of acting as a tactile sensor and/or as a visual camera depending on its internal lighting conditions. This paper details the design of the sensor, showcases its dual sensing capabilities, and presents a deep learning architecture that fuses vision and touch. We validate the ability of the sensor to classify household objects, recognize fine textures, and infer their physical properties both through numerical simulations and experiments with a smart countertop prototype.
MBAIL: Multi-Batch Best Action Imitation Learning utilizing Sample Transfer and Policy Distillation
Dingwei Wu
M. Jenkin
Steve Liu
Batch reinforcement learning (RL) aims to learn a good control policy from a previously collected dataset without requiring additional interactions with the environment. Unfortunately, in the real world, we may only have a limited amount of training data for tasks we are interested in. Most batch RL methods are intended to learn a policy over one fixed dataset, and are not intended to learn a policy that can perform well over other tasks. Leveraging the advantages of batch RL while dealing with limited training data is thus another real-world challenge. In this work, we propose to add sample transfer and policy distillation to a leading batch RL approach. The proposed methods are evaluated on multiple control tasks to showcase their effectiveness.
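Policy distillation is commonly implemented as a KL divergence between the teacher's softened action distribution and the student's. The loss below is that generic formulation, shown for illustration; the temperature and the details of how MBAIL applies it are assumptions not taken from the abstract.

```python
import numpy as np

def distill_loss(teacher_q, student_logits, tau=1.0):
    """KL(teacher || student) over action distributions.

    teacher_q: (batch, n_actions) Q-values from the teacher, softened by
    temperature tau; student_logits: the student's (batch, n_actions) logits.
    """
    def softmax(x):
        z = x - x.max(axis=-1, keepdims=True)   # stabilize the exponentials
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p = softmax(np.asarray(teacher_q, dtype=float) / tau)  # teacher targets
    q = softmax(np.asarray(student_logits, dtype=float))   # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

Minimizing this loss pulls the student's action preferences toward the teacher's across the batch, which is how knowledge from a policy trained on one dataset can be transferred to another task's policy.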