Publications

FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters

Yuwei Cheng

Jiannan Zhu

Mengxin Jiang

Jie Fu

Changsong Pang

Peidong Wang

Kris Sankaran

Olawale Moses Onabola

Yimin Liu

Dianbo Liu

Yoshua Bengio

Marine debris severely threatens marine life and causes sustained pollution of the whole ecosystem. To prevent waste from getting into the ocean, it is helpful to clean up floating waste in inland waters using autonomous cleaning devices such as unmanned surface vehicles. Cleaning efficiency relies on a highly accurate and robust object detection system. However, the small size of the targets, strong light reflection off the water surface, and reflections of bank-side objects all pose challenges to vision-based object detection. To promote the practical application of autonomous floating waste cleaning, we present FloW, the first dataset for floating waste detection in inland water areas. The dataset consists of an image sub-dataset, FloW-Img, and a multimodal sub-dataset, FloW-RI, which contains synchronized millimeter-wave radar data and images. Accurate annotations for images and radar data are provided, supporting floating waste detection strategies based on images, radar data, and the fusion of the two sensors. We perform several baseline experiments on our dataset, including vision-based and radar-based detection methods. The results show that detection accuracy is relatively low and that floating waste detection remains a challenging task.

2021-09-30

IEEE International Conference on Computer Vision (published)

doi.org

Generative Compositional Augmentations for Scene Graph Prediction

Boris Knyazev

Harm de Vries

Cătălina Cangea

Graham W. Taylor

Aaron Courville

Eugene Belilovsky

Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language. We consider a challenging problem of compositional generalization that emerges in this task due to a long tail data distribution. Current scene graph generation models are trained on a tiny fraction of the distribution corresponding to the most frequent compositions, e.g. . However, test images might contain zero- and few-shot compositions of objects and relationships, e.g. . Despite each of the object categories and the predicate (e.g. 'on') being frequent in the training data, the models often fail to properly understand such unseen or rare compositions. To improve generalization, it is natural to attempt increasing the diversity of the training distribution. However, in the graph domain this is non-trivial. To that end, we propose a method to synthesize rare yet plausible scene graphs by perturbing real ones. We then propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs and learn from them in a joint fashion. When evaluated on the Visual Genome dataset, our approach yields marginal, but consistent improvements in zero- and few-shot metrics. We analyze the limitations of our approach indicating promising directions for future research.

2021-09-30

2021 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Inter-Brain Synchronization: From Neurobehavioral Correlation to Causal Explanation

Guillaume Dumas

2021-09-30

International Journal of Psychophysiology (published)

doi.org

Normalizing automatic spinal cord cross-sectional area measures

S. Bédard

J. Cohen-Adad

Spinal cord cross-sectional area (CSA) is a relevant biomarker to assess spinal cord atrophy in various neurodegenerative diseases. However, the considerable inter-subject variability among healthy participants currently limits its usage. Previous studies explored factors contributing to the variability, yet the normalization models were based on a relatively limited number of participants (typically < 300 participants), required manual intervention, and were not implemented in an open-access comprehensive analysis pipeline. Another limitation is related to the imprecise prediction of the spinal levels when using vertebral levels as a reference, a question never addressed before in the search for a normalization method. In this study we implemented a method to measure CSA automatically from a spatial reference based on the central nervous system (the pontomedullary junction, PMJ), we investigated various factors to explain variability, and we developed normalization strategies on a large cohort (N=804). Cervical spinal cord CSA was computed on T1w MRI scans for 804 participants from the UK Biobank database. In addition to computing CSA at the C2-C3 vertebral disc, it was also measured at 64 mm caudal from the PMJ. The effect of various biological, demographic and anatomical factors was explored by computing Pearson’s correlation coefficients. A stepwise linear regression found significant predictors; the coefficients of the best fit model were used to normalize CSA. The correlation between CSA measured at C2-C3 and using the PMJ was y = 0.98x + 1.78 (R² = 0.97). The best normalization model included thalamus volume, brain volume, sex and interaction between brain volume and sex. With this model, the coefficient of variation went down from 10.09% (without normalization) to 8.59%, a reduction of 14.85%.
In this study we identified factors explaining inter-subject variability of spinal cord CSA over a large cohort of participants, and developed a normalization model to reduce the variability. We implemented an approach, based on the PMJ, to measure CSA to overcome limitations associated with the vertebral reference. This approach warrants further validation, especially in longitudinal cohorts. The PMJ-based method and normalization models are readily available in the Spinal Cord Toolbox.

2021-09-30

bioRxiv (preprint)

doi.org

Reward is enough

David Silver

Satinder Singh

Doina Precup

Richard S. Sutton

2021-09-30

Artificial Intelligence (published)

doi.org

Season-Based Occupancy Prediction in Residential Buildings Using Machine Learning Models

Bowen Yang

Fariborz Haghighat

Benjamin C. M. Fung

Karthik Panchabikesan

2021-09-30

e-Prime (published)

doi.org

THE EFFECT SIZE OF GENES ON COGNITIVE ABILITIES IS LINKED TO THEIR EXPRESSION ALONG THE MAJOR HIERARCHICAL GRADIENT IN THE HUMAN BRAIN

Sébastien Jacquemont

Guillaume Huguet

Elise Douard

Zohra Saci

Guillaume Dumas

Laura Almasy

David C. Glahn

2021-09-30

European Neuropsychopharmacology (published)

doi.org

Trade-off Between Accuracy and Fairness of Data-driven Building and Indoor Environment Models: A Comparative Study of Pre-processing Methods

Ying Sun

Fariborz Haghighat

Benjamin C. M. Fung

2021-09-30

Energy (published)

doi.org

Weighted automata are compact and actively learnable

Artem Kaznatcheev

Prakash Panangaden

2021-09-30

Information Processing Letters (published)

doi.org

arxiv.org

Graph Neural Networks in Natural Language Processing

Bang Liu

Lingfei Wu

Natural language processing (NLP) and understanding aim to read from unformatted text to accomplish different tasks. While word embeddings learned by deep neural networks are widely used, the underlying linguistic and semantic structures of text pieces cannot be fully exploited in these representations. Graphs are a natural way to capture the connections between different text pieces, such as entities, sentences, and documents. To overcome the limits of vector space models, researchers combine deep learning models with graph-structured representations for various tasks in NLP and text mining. Such combinations help to make full use of both the structural information in text and the representation learning ability of deep neural networks. In this chapter, we introduce the various graph representations that are extensively used in NLP, and show how different NLP tasks can be tackled from a graph perspective. We summarize recent research works on graph-based NLP, and discuss two case studies related to graph-based text clustering, matching, and multihop machine reading comprehension in detail. Finally, we provide a synthesis of the important open problems of this subfield.

2021-09-29

Deep Learning on Graphs (published)

doi.org

Data-driven approaches for genetic characterization of SARS-CoV-2 lineages

Fatima Mostefai

Isabel Gamache

Jessie Huang

Arnaud N’Guessan

Justin Pelletier

Ahmad Pesaranghader

David Hamelin

Carmen Lia Murall

Raphaël Poujol

Jean-Christophe Grenier

Martin Smith

Etienne Caron

Morgan Craig

Jesse Shapiro

Guy Wolf

Smita Krishnaswamy

Julie G. Hussin

The genome of the Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2), the pathogen that causes coronavirus disease 2019 (COVID-19), has been sequenced at an unprecedented scale, leading to a tremendous amount of viral genome sequencing data. To understand the evolution of this virus in humans, and to assist in tracing infection pathways and designing preventive strategies, we present a set of computational tools that span phylogenomics, population genetics and machine learning approaches. To illustrate the utility of this toolbox, we detail an in-depth analysis of the genetic diversity of SARS-CoV-2 in the first year of the COVID-19 pandemic, using 329,854 high-quality consensus sequences published in the GISAID database during the pre-vaccination phase. We demonstrate that, compared to standard phylogenetic approaches, haplotype networks can be computed efficiently on much larger datasets, enabling real-time analyses. Furthermore, time series change of Tajima’s D provides a powerful metric of population expansion. Unsupervised learning techniques further highlight key steps in variant detection and facilitate the study of the role of this genomic variation in the context of SARS-CoV-2 infection, with Multiscale PHATE methodology identifying fine-scale structure in the SARS-CoV-2 genetic data that underlies the emergence of key lineages. The computational framework presented here is useful for real-time genomic surveillance of SARS-CoV-2 and could be applied to any pathogen that threatens the health of worldwide populations of humans and other organisms.

2021-09-28

bioRxiv (preprint)

doi.org

Latent Attention Augmentation for Robust Autonomous Driving Policies

Ran Cheng

Christopher Agia

Florian Shkurti

David Meger

Gregory Dudek

Model-free reinforcement learning has become a viable approach for vision-based robot control. However, sample complexity and adaptability to domain shifts remain persistent challenges when operating in high-dimensional observation spaces (images, LiDAR), such as those that are involved in autonomous driving. In this paper, we propose a flexible framework by which a policy’s observations are augmented with robust attention representations in the latent space to guide the agent’s attention during training. Our method encodes local and global descriptors of the augmented state representations into a compact latent vector, and scene dynamics are approximated by a recurrent network that processes the latent vectors in sequence. We outline two approaches for constructing attention maps: a supervised pipeline leveraging semantic segmentation networks, and an unsupervised pipeline relying only on classical image processing techniques. We conduct our experiments in simulation and test the learned policy against varying seasonal effects and weather conditions. Our design decisions are supported by a series of ablation studies. The results demonstrate that our state augmentation method both improves learning efficiency and encourages robust domain adaptation when compared to common end-to-end frameworks and methods that learn directly from intermediate representations.

2021-09-26

IEEE/RSJ International Conference on Intelligent Robots and Systems (published)

doi.org
