Publications

156. Modeling Eye Gaze to Videos Using Dynamic Trajectory Variability Analysis
Qianying Wu
Na Yeon Kim
Jasmin Turner
Umit Keles
Lynn Paul
Ralph Adolphs
ArK: Augmented Reality with Knowledge Interactive Emergent Ability
Qiuyuan Huang
J. Park
Abhinav Gupta
Pan Lu
Paul N. Bennett
Ran Gong
Subhojit Som
Baolin Peng
Owais Khan Mohammed
Yejin Choi
Jianfeng Gao
Despite the growing adoption of mixed reality and interactive AI agents, it remains challenging for these systems to generate high quality 2D/3D scenes in unseen environments. The common practice requires deploying an AI agent to collect large amounts of data for model training for every new task. This process is costly, or even impossible, for many domains. In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e.g. GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in the physical or virtual world. The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK), which leverages knowledge-memory to generate scenes in unseen physical world and virtual reality environments. The knowledge interactive emergent ability (Figure 1) is demonstrated as the observation learns i) micro-action of cross-modality: in multi-modality models to collect a large amount of relevant knowledge memory data for each interaction task (e.g., unseen scene understanding) from the physical reality; and ii) macro-behavior of reality-agnostic: in mix-reality environments to improve interactions that tailor to different characterized roles, target variables, collaborative information, and so on. We validate the effectiveness of ArK on the scene generation and editing tasks. We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes, compared to baselines, demonstrating the potential benefit of incorporating ArK in generative AI for applications such as metaverse and gaming simulation.
Bird Distribution Modelling using Remote Sensing and Citizen Science data
Mélisande Teng
Amna Elmustafa
Benjamin Akera
Combining Parameter-efficient Modules for Task-level Generalisation
Distinct Social Behavior and Inter-Brain Connectivity in Dyads with autistic individuals
Quentin Moreau
Florence Brun
Anaël Ayrolles
Jacqueline Nadel
Embracing Channel Estimation in Multi-Packet Reception of ZigBee
Zhe Wang
L. Kong
Xuemei Liu
Guihai Chen
As a low-power and low-cost wireless protocol, the promising ZigBee has been widely used in sensor networks and cyber-physical systems. Since ZigBee-based networks usually adopt a tree or cluster topology, convergecast scenarios, in which multiple transmitters send packets to one receiver, are common and lead to a severe collision problem. Conventional ZigBee adopts carrier sense multiple access with collision avoidance to avoid collisions, which introduces additional time/energy overhead. The state-of-the-art methods resolve collisions instead of avoiding them: mZig decomposes a collision by the collision itself, and reZig decodes a collision by comparing it with reference waveforms. However, mZig suffers from high decoding errors because it exploits only the signal amplitudes, while reZig incurs high computational complexity for waveform comparison. In this paper, we propose CmZig to embrace channel estimation in multiple-packet reception (MPR) of ZigBee, which effectively improves MPR via lightweight computing used for channel estimation and collision decomposition. First, CmZig enables accurate collision decomposition with low computational complexity, using estimated channel parameters that model both signal amplitudes and phases. Second, CmZig adopts reference waveform comparison only for collisions without chip-level time offsets, instead of the complex machine learning based method. We implement CmZig on USRP-N210 and establish a six-node testbed. Results show that CmZig achieves a bit error rate in the order of
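The abstract above relies on channel parameters that capture both amplitude and phase. As a rough illustration only, and not CmZig's actual estimator, the sketch below shows a generic least-squares estimate of a single complex channel gain from a known reference sequence. The function name `estimate_channel`, the preamble symbols, and the channel value are all hypothetical and chosen for the example.

```python
import cmath

def estimate_channel(known, received):
    """Least-squares estimate of one complex gain h such that received ~= h * known."""
    # Correlate the received samples with the conjugate of the known symbols.
    numerator = sum(r * k.conjugate() for r, k in zip(received, known))
    # Normalize by the energy of the known sequence.
    energy = sum(abs(k) ** 2 for k in known)
    return numerator / energy

# Hypothetical reference symbols (an assumption, not the ZigBee preamble).
preamble = [1 + 0j, -1 + 0j, 1j, -1j, 1 + 0j, 1j]

# Hypothetical flat channel: amplitude 0.7, phase pi/3, no noise.
true_h = cmath.rect(0.7, cmath.pi / 3)
rx = [true_h * s for s in preamble]

h_hat = estimate_channel(preamble, rx)
amplitude, phase = abs(h_hat), cmath.phase(h_hat)
print(f"amplitude={amplitude:.3f}, phase={phase:.3f} rad")
```

A single complex number carries both quantities at once, which is why an amplitude-only model loses information that a complex-gain model keeps.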
Embracing Channel Estimation in Multi-Packet Reception of ZigBee
Zhe Wang
Linghe Kong
Guihai Chen
As a low-power and low-cost wireless protocol, the promising ZigBee has been widely used in sensor networks and cyber-physical systems. Since ZigBee-based networks usually adopt a tree or cluster topology, convergecast scenarios, in which multiple transmitters send packets to one receiver, are common and lead to a severe collision problem. Conventional ZigBee adopts carrier sense multiple access with collision avoidance to avoid collisions, which introduces additional time/energy overhead. The state-of-the-art methods resolve collisions instead of avoiding them: mZig decomposes a collision by the collision itself, and reZig decodes a collision by comparing it with reference waveforms. However, mZig suffers from high decoding errors because it exploits only the signal amplitudes, while reZig incurs high computational complexity for waveform comparison. In this paper, we propose CmZig to embrace channel estimation in multiple-packet reception (MPR) of ZigBee, which effectively improves MPR via lightweight computing used for channel estimation and collision decomposition. First, CmZig enables accurate collision decomposition with low computational complexity, using estimated channel parameters that model both signal amplitudes and phases. Second, CmZig adopts reference waveform comparison only for collisions without chip-level time offsets, instead of the complex machine learning based method. We implement CmZig on USRP-N210 and establish a six-node testbed. Results show that CmZig achieves a bit error rate in the order of
Low-Complexity Sphere Decoding for Polar-Coded MIMO Systems
Huayi Zhou
Jian Zheng
Minhua Yang
Xiaohu You
Chuan Zhang
For polar-coded MIMO systems, separate detection and decoding (SDD) is the traditional scheme. In SDD systems, sphere decoding (SD) is one of the competitive MIMO detection schemes. However, SD may not utilize the coding information sufficiently in SDD systems, causing an error-correction performance loss. The existing joint detection and decoding scheme using breadth-first SD (BSD) improves performance over SDD, but its limited search space still causes a performance loss. In this paper, we propose joint detection and decoding based on SD (SD JDD) for polar-coded MIMO systems to reach the maximum likelihood (ML) bound. Subsequently, two approaches are further proposed to reduce the computational complexity. The first approach reduces the layers of the SD search tree by exploiting symbol synchro sets, which could accelerate the convergence of SD JDD. The second efficient approach performs multiple tree searches. A small initial radius of the sphere for the first search is assigned to reduce the search space. The ML optimality could be preserved by the following multiple tree searches with increasing radius. It is shown from the numerical results that the proposed JDD outperforms SDD by 3.1 dB at FER
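The idea of starting with a small sphere radius and repeating the tree search with a larger one can be sketched for a much simpler setting than the paper's. The code below is an illustrative assumption, not the proposed SD JDD: it handles uncoded BPSK over a real-valued, already upper-triangular channel matrix `R` (as obtained after a QR decomposition), with no polar code involved. The function names, the growth factor, and the example numbers are hypothetical.

```python
from itertools import product

def sphere_search(R, y, radius_sq, symbols=(-1.0, 1.0)):
    """Depth-first search for the x minimizing ||y - R x||^2 within radius_sq.

    R is upper triangular (list of rows). Returns (x, distance), or
    (None, inf) when no candidate lies inside the sphere.
    """
    n = len(y)
    best = [None, radius_sq]  # best point so far and the current squared radius
    x = [0.0] * n

    def descend(layer, partial):
        if layer < 0:
            # Reached a leaf: record it and shrink the sphere to its distance.
            best[0], best[1] = x[:], partial
            return
        for s in symbols:
            x[layer] = s
            resid = y[layer] - sum(R[layer][j] * x[j] for j in range(layer, n))
            dist = partial + resid * resid
            if dist <= best[1]:  # prune branches that leave the sphere
                descend(layer - 1, dist)

    descend(n - 1, 0.0)
    return best[0], (best[1] if best[0] is not None else float("inf"))

def sd_increasing_radius(R, y, initial_radius_sq=0.1, growth=2.0, max_rounds=60):
    """Repeat the tree search with a growing radius until a point is found."""
    radius_sq = initial_radius_sq
    for _ in range(max_rounds):
        x, dist = sphere_search(R, y, radius_sq)
        if x is not None:
            return x, dist
        radius_sq *= growth
    raise RuntimeError("no candidate found within the allowed number of rounds")

def brute_force_ml(R, y, symbols=(-1.0, 1.0)):
    """Exhaustive ML reference, used only to check the sphere search."""
    n = len(y)
    def dist(x):
        return sum((y[i] - sum(R[i][j] * x[j] for j in range(i, n))) ** 2
                   for i in range(n))
    return min((list(c) for c in product(symbols, repeat=n)), key=dist)

# Hypothetical 3x3 example: transmitted x = [1, -1, 1] plus small noise.
R = [[1.0, 0.5, -0.3], [0.0, 0.8, 0.4], [0.0, 0.0, 1.2]]
y = [0.25, -0.35, 1.1]
x_sd, d_sd = sd_increasing_radius(R, y, initial_radius_sq=0.001)
print(x_sd, x_sd == brute_force_ml(R, y))
```

In this sketch, any point found inside the sphere that minimizes the distance there is also the global minimizer, since every point outside the sphere is farther away. That is the sense in which restarting with a larger radius does not sacrifice ML optimality.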
MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting
Oscar Mañas
Pau Rodriguez
Saba Ahmadi
Aida Nematzadeh
Yash Goyal
Large pre-trained models have proved to be remarkable zero- and (prompt-based) few-shot learners in unimodal vision and language tasks. We propose MAPL, a simple and parameter-efficient method that reuses frozen pre-trained unimodal models and leverages their strong generalization capabilities in multimodal vision-language (VL) settings. MAPL learns a lightweight mapping between the representation spaces of unimodal models using aligned image-text data, and can generalize to unseen VL tasks from just a few in-context examples. The small number of trainable parameters makes MAPL effective at low-data and in-domain learning. Moreover, MAPL’s modularity enables easy extension to other pre-trained models. Extensive experiments on several visual question answering and image captioning benchmarks show that MAPL achieves superior or competitive performance compared to similar methods while training orders of magnitude fewer parameters. MAPL can be trained in just a few hours using modest computational resources and public datasets. We release our code and pre-trained model weights at https://github.com/oscmansan/mapl.
OC-0290 Investigation of the feasibility of selenium-75 as a viable brachytherapy source
J. Reid
Jonathan Kalinowski
J. Munro
A. Armstrong
PD-0334 Techniques to optimize auto-segmentation of small OARs in pediatric patients undergoing CSI
J. Tsui
M. Popovic
O. Ates
C. Hua
J. Schneider
S. Skamene
C. Freeman
PD-0505 Monte Carlo simulated correction factors of a novel phantom for brachytherapy dosimetry audits
K. Chelminski
R. Abdulrahim
A. Dimitriadis
E. Granizo-Roman
Jonathan Kalinowski
G. Azangwe
J. Swamidas