Publications

Fast Fine-Tuning Using Curriculum Domain Adaptation

Lulan Shen

Ibtihel Amara

Ruofeng Li

Brett Meyer

James J. Clark

Current deep neural networks (DNNs) have achieved remarkable accuracy in various downstream tasks. However, their training and fine-tuning a… (voir plus)re challenging due to several factors, such as limited computational resources, extended training and fine-tuning times, and over-fitting due to small datasets. To address these challenges, we propose a three-stage fast fine-tuning method that efficiently trains DNNs for edge devices. Our method combines curriculum learning and domain adaptation techniques to accelerate training while achieving comparable performance. First, we develop a data curriculum approach, which ranks the dataset according to difficulty and split it into the source domain (containing easy data) and the target domain (containing difficult data). Second, we adapt the pretrained model from the source domain to the target domain using an unsupervised domain adaptation (UDA) method called Deep CORAL. Finally, we continue training the adapted model on the source domain with fewer epochs. Our method achieves high accuracy quickly on various modern neural network architectures and datasets such as CIFAR-10, CIFAR-100, and CINIC-10.

2023-06-01

Canadian Conference on Computer and Robot Vision (publié)

doi.org

Geometry Regularized Autoencoders

Andres F. Duque Correa

Sacha Morin

Guy Wolf

Kevin R. Moon

A fundamental task in data exploration is to extract low dimensional representations that capture intrinsic geometry in data, especially for… (voir plus) faithfully visualizing data in two or three dimensions. Common approaches use kernel methods for manifold learning. However, these methods typically only provide an embedding of the input data and cannot extend naturally to new data points. Autoencoders have also become popular for representation learning. While they naturally compute feature extractors that are extendable to new data and invertible (i.e., reconstructing original features from latent representation), they often fail at representing the intrinsic data geometry compared to kernel-based manifold learning. We present a new method for integrating both approaches by incorporating a geometric regularization term in the bottleneck of the autoencoder. This regularization encourages the learned latent representation to follow the intrinsic data geometry, similar to manifold learning algorithms, while still enabling faithful extension to new data and preserving invertibility. We compare our approach to autoencoder models for manifold learning to provide qualitative and quantitative evidence of our advantages in preserving intrinsic structure, out of sample extension, and reconstruction. Our method is easily implemented for big-data applications, whereas other methods are limited in this regard.

2023-06-01

IEEE Transactions on Pattern Analysis and Machine Intelligence (publié)

doi.org

Grow-push-prune: Aligning deep discriminants for effective structural network compression

Qing Tian

Tal Arbel

James J. Clark

2023-06-01

Computer Vision and Image Understanding (publié)

doi.org

arxiv.org

Mixed-Variable PSO with Fairness on Multi-Objective Field Data Replication in Wireless Networks

Dun Yuan

Yujin Nam

Amal Feriani

Abhisek Konar

Di Wu

Seowoo Jang

Xue (Steve) Liu

Gregory Dudek

Digital twins have shown a great potential in supporting the development of wireless networks. They are virtual representations of 5G/6G sys… (voir plus)tems enabling the design of machine learning and optimization-based techniques. Field data replication is one of the critical aspects of building a simulation-based twin, where the objective is to calibrate the simulation to match field performance measurements. Since wireless networks involve a variety of key performance indicators (KPIs), the replication process becomes a multi-objective optimization problem in which the purpose is to minimize the error between the simulated and field data KPIs. Unlike previous works, we focus on designing a data-driven search method to calibrate the simulator and achieve accurate and reliable reproduction of field performance. This work proposes a search-based algorithm based on mixed-variable particle swarm optimization (PSO) to find the optimal simulation parameters. Furthermore, we extend this solution to account for potential conflicts between the KPIs using a-fairness concept to adjust the importance attributed to each KPI during the search. Experiments on field data showcase the effectiveness of our approach to (i) improve the accuracy of the replication, (ii) enhance the fairness between the different KPIs, and (iii) guarantee faster convergence compared to other methods.

2023-06-01

ICC 2023 - IEEE International Conference on Communications (publié)

doi.org

arxiv.org

Multi-Agent Attention Actor-Critic Algorithm for Load Balancing in Cellular Networks

Jikun Kang

Di Wu

Ju Wang

Ekram Hossain

Xue (Steve) Liu

Gregory Dudek

In cellular networks, User Equipment (UE) handoff from one Base Station (BS) to another, giving rise to the load balancing problem among the… (voir plus) BSs. To address this problem, BSs can work collaboratively to deliver a smooth migration (or handoff) and satisfy the UEs' service requirements. This paper formulates the load balancing problem as a Markov game and proposes a Robust Multi-agent Attention Actor-Critic (Robust-MA3C) algorithm that can facilitate collaboration among the BSs (i.e., agents). In particular, to solve the Markov game and find a Nash equilibrium policy, we embrace the idea of adopting a nature agent to model the system uncertainty. Moreover, we utilize the self-attention mechanism, which encourages high-performance BSs to assist low-performance BSs. In addition, we consider two types of schemes, which can facilitate load balancing for both active UEs and idle UEs. We carry out extensive evaluations by simulations, and simulation results illustrate that, compared to the state-of-the-art MARL methods, Robust-MA3C scheme can improve the overall performance by up to 45%.

2023-06-01

ICC 2023 - IEEE International Conference on Communications (publié)

doi.org

arxiv.org

Policy Reuse for Communication Load Balancing in Unseen Traffic Scenarios

Yi Tian Xu

Jimmy Li

Di Wu

M. Jenkin

Seowoo Jang

Xue (Steve) Liu

Gregory Dudek

With the continuous growth in communication network complexity and traffic volume, communication load balancing solutions are receiving incr… (voir plus)easing attention. Specifically, reinforcement learning (RL)-based methods have shown impressive performance compared with traditional rule-based methods. However, standard RL methods generally require an enormous amount of data to train, and generalize poorly to scenarios that are not encountered during training. We propose a policy reuse framework in which a policy selector chooses the most suitable pre-trained RL policy to execute based on the current traffic condition. Our method hinges on a policy bank composed of policies trained on a diverse set of traffic scenarios. When deploying to an unknown traffic scenario, we select a policy from the policy bank based on the similarity between the previous-day traffic of the current scenario and the traffic observed during training. Experiments demonstrate that this framework can outperform classical and adaptive rule-based methods by a large margin.

2023-06-01

ICC 2023 - IEEE International Conference on Communications (publié)

doi.org

arxiv.org

Robust Scuba Diver Tracking and Recovery in Open Water Using YOLOv7, SORT, and Spiral Search

Faraz Lotfi

Khalil Virji

Gregory Dudek

Target tracking is a classic problem in computer vision, with numerous applications in robotics. However, tracking targets underwater presen… (voir plus)ts additional complications due to the six degrees of freedom nature of the problem and the challenging visual environment. In this paper, we address the problem of robotic underwater tracking of scuba divers by partitioning it into two parts: vision and control. We propose a new approach that exploits a highly-maneuverable underwater robot to perform experiments in open water, coupling sensing and control for improved performance. To evaluate the temporal stability of different tracking paradigms, we introduce a new metric, frame-to-frame vari-ance, which is better suited to assess the smoothness of detections from the vision side. We implement PID controllers for control and a spiral search algorithm for target recovery in case of a tracking failure. Our approach only uses observations in the image plane, eliminating the need for robot localization or camera calibration. Using a tracking-by-detection paradigm that combines YOLOv7 for target detection, a tuned filtering technique for temporal stability, and a spiral search algorithm for target recovery, we demonstrate promising performance for long-term tracking. We evaluate our proposed paradigm on the VDD-C dataset and deploy it on an underwater robot for several experiments in open water. Our outcomes show consistency with the ones in the initial studies, and the spiral search algorithm demonstrates promising performance for recapturing a target after a tracking failure. Our approach delivers promising performance for robust underwater tracking, achieving successful open-water tracking scenarios in the presence of strong water currents.

2023-06-01

Canadian Conference on Computer and Robot Vision (publié)

doi.org

Self-Supervised Transformer Architecture for Change Detection in Radio Access Networks

Igor Kozlov

Dmitriy Rivkin

Wei-Di Chang

Di Wu

Xue (Steve) Liu

Gregory Dudek

Radio Access Networks (RANs) for telecommunications represent large agglomerations of interconnected hardware consisting of hundreds of thou… (voir plus)sands of transmitting devices (cells). Such networks undergo frequent and often heterogeneous changes caused by network operators, who are seeking to tune their system parameters for optimal performance. The effects of such changes are challenging to predict and will become even more so with the adoption of fifth-generation/sixth-generation (5G/6G) networks. Therefore, RAN monitoring is vital for network operators. We propose a self-supervised learning framework that leverages self-attention and self-distillation for this task. It works by detecting changes in Performance Measurement data, a collection of time-varying metrics which reflect a set of diverse measurements of the network performance at the cell level. Experimental results show that our approach outperforms the state of the art by 4% on a real-world based dataset consisting of about hundred thousands time series. It also has the merits of being scalable and generalizable. This allows it to provide deep insight into the specifics of mode of operation changes while relying minimally on expert knowledge.

2023-06-01

ICC 2023 - IEEE International Conference on Communications (publié)

doi.org

arxiv.org

Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

Salah Zaiem

Youcef Kemiche

Titouan Parcollet

Slim Essid

Mirco Ravanelli

Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled speech signals to reach impressive performance on… (voir plus) speech tasks using only small amounts of annotated data. The high number of proposed approaches fostered the need and rise of extended benchmarks that evaluate their performance on a set of downstream tasks exploring various aspects of the speech signal. However, and while the number of considered tasks has been growing, most rely upon a single decoding architecture that maps the frozen SSL representations to the downstream labels. This work investigates the robustness of such benchmarking results to changes in the decoder architecture. Interestingly, it appears that varying the architecture of the downstream decoder leads to significant variations in the leaderboards of most tasks. Concerningly, our study reveals that benchmarking using limited decoders may cause a counterproductive increase in the sizes of the developed SSL models.

2023-06-01

ArXiv (preprint)

doi.org

arxiv.org

The clinical value of Aspergillus-specific IgG antibody test in the diagnosis of nonneutropenic invasive pulmonary aspergillosis

Yajie Lu

Lulu Liu

Hongxing Li

Bilin Chen

Yu Gu

Li Wang

Chunlai Feng

Cheng Chen

Yanbin Chen

Wenkui Sun

Xuefan Cui

Min Cao

Yujian Tao

Jinjin Zhong

Huanhuan Zhong

Yueyan Ni

Yuchen Cai

Mengyue Song

Xiaoguang Liu

Yi Shi … (voir 1 de plus)

Xin Su

2023-06-01

Clinical Microbiology and Infection (publié)

doi.org

The Plausibility of Sampling as an Algorithmic Theory of Sentence Processing

Jacob Louis Hoover

Morgan Sonderegger

Steven T. Piantadosi

Timothy O'Donnell

Abstract Words that are more surprising given context take longer to process. However, no incremental parsing algorithm has been shown to di… (voir plus)rectly predict this phenomenon. In this work, we focus on a class of algorithms whose runtime does naturally scale in surprisal—those that involve repeatedly sampling from the prior. Our first contribution is to show that simple examples of such algorithms predict runtime to increase superlinearly with surprisal, and also predict variance in runtime to increase. These two predictions stand in contrast with literature on surprisal theory (Hale, 2001; Levy, 2008a) which assumes that the expected processing cost increases linearly with surprisal, and makes no prediction about variance. In the second part of this paper, we conduct an empirical study of the relationship between surprisal and reading time, using a collection of modern language models to estimate surprisal. We find that with better language models, reading time increases superlinearly in surprisal, and also that variance increases. These results are consistent with the predictions of sampling-based algorithms.

2023-06-01

Open Mind (publié)

doi.org

Wuerstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models

Pablo Pernias

Dominic Rampas

Mats Leon Richter

Christopher J. Pal

Marc Aubreville

We introduce W\"urstchen, a novel architecture for text-to-image synthesis that combines competitive performance with unprecedented cost-eff… (voir plus)ectiveness for large-scale text-to-image diffusion models. A key contribution of our work is to develop a latent diffusion technique in which we learn a detailed but extremely compact semantic image representation used to guide the diffusion process. This highly compressed representation of an image provides much more detailed guidance compared to latent representations of language and this significantly reduces the computational requirements to achieve state-of-the-art results. Our approach also improves the quality of text-conditioned image generation based on our user preference study. The training requirements of our approach consists of 24,602 A100-GPU hours - compared to Stable Diffusion 2.1's 200,000 GPU hours. Our approach also requires less training data to achieve these results. Furthermore, our compact latent representations allows us to perform inference over twice as fast, slashing the usual costs and carbon footprint of a state-of-the-art (SOTA) diffusion model significantly, without compromising the end performance. In a broader comparison against SOTA models our approach is substantially more efficient and compares favorably in terms of image quality. We believe that this work motivates more emphasis on the prioritization of both performance and computational accessibility.

2023-06-01

ArXiv (prépublication)

arxiv.org

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Publications

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Mots-clés populaires:

Publications