Publications

A Remedy For Distributional Shifts Through Expected Domain Translation

Jean-Christophe Gagnon-Audet

Soroosh Shahtalebi

Frank Rudzicz

Machine learning models often fail to generalize to unseen domains due to distributional shifts. A family of such shifts, “correlation shifts,” is caused by spurious correlations in the data. It is studied under the overarching topic of “domain generalization.” In this work, we employ multi-modal translation networks to tackle the correlation shifts that appear when data is sampled out-of-distribution. Learning a generative model from training domains enables us to translate each training sample under the special characteristics of other possible domains. We show that by training a predictor solely on the generated samples, the spurious correlations in training domains average out, and the invariant features corresponding to true correlations emerge. Our proposed technique, Expected Domain Translation (EDT), is benchmarked on the Colored MNIST dataset and drastically improves the state-of-the-art classification accuracy by 38% with train-domain validation model selection.

2022-05-23

IEEE International Conference on Acoustics, Speech, and Signal Processing (published)

doi.org

Roboethics as a Design Challenge: Lessons Learned from the Roboethics to Design and Development Competition

Jimin Rhim

Cheng Lin

Alexander Werner

Brandon DeHart

Vivian Qiang

Shalaleh Rismani

AJung Moon

How do we make concrete progress towards designing robots that can navigate ethically sensitive contexts? Almost two decades after the word ‘roboethics’ was coined, translating interdisciplinary roboethics discussions into technical design still remains a daunting task. This paper describes our first attempt at addressing these challenges through a roboethics-themed design competition. The design competition setting allowed us to (a) formulate ethical considerations as an engineering design task that anyone with basic programming skills can tackle; and (b) develop a prototype evaluation scheme that incorporates diverse normative perspectives of multiple stakeholders. The initial implementation of the competition was held online at the RO-MAN 2021 conference. The competition task involved programming a simulated mobile robot (TIAGo) that delivers items for individuals in the home environment, where many of these tasks involve ethically sensitive contexts (e.g., an underage family member asks for an alcoholic drink). This paper outlines our experiences implementing the competition and the lessons we learned. We highlight design competitions as a promising mechanism to enable a new wave of roboethics research equipped with technical design solutions.

2022-05-23

2022 International Conference on Robotics and Automation (ICRA) (published)

doi.org

Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning

Martin Weyssow

Houari Sahraoui

Bang Liu

The progress made in code modeling has been tremendous in recent years thanks to the design of natural language processing learning approaches based on state-of-the-art model architectures. Nevertheless, we believe that the current state-of-the-art does not focus enough on the full potential that data may bring to a learning process in software engineering. Our vision articulates on the idea of leveraging multi-modal learning approaches to modeling the programming world. In this paper, we investigate one of the underlying ideas of our vision, whose objective, based on concept graphs of identifiers, aims at leveraging high-level relationships between domain concepts manipulated through particular language constructs. In particular, we propose to enhance an existing pretrained language model of code by joint-learning it with a graph neural network based on our concept graphs. We conducted a preliminary evaluation that shows a gain in effectiveness of the models for code search using a simple joint-learning method and prompts us to further investigate our research vision.

2022-05-22

2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER) (published)

doi.org

arxiv.org

Hardware Architecture for Guessing Random Additive Noise Decoding Markov Order (GRAND-MO)

Syed Mohsin Abbas

Marwan Jalaleddine

Warren Gross

2022-05-20

Journal of Signal Processing Systems (published)

doi.org

Privacy-Aware Compression for Federated Data Analysis

Kamalika Chaudhuri

Chuan Guo

Michael Rabbat

Federated data analytics is a framework for distributed data analysis where a server compiles noisy responses from a group of distributed low-bandwidth user devices to estimate aggregate statistics. Two major challenges in this framework are privacy, since user data is often sensitive, and compression, since the user devices have low network bandwidth. Prior work has addressed these challenges separately by combining standard compression algorithms with known privacy mechanisms. In this work, we take a holistic look at the problem and design a family of privacy-aware compression mechanisms that work for any given communication budget. We first propose a mechanism for transmitting a single real number that has optimal variance under certain conditions. We then show how to extend it to metric differential privacy for location privacy use-cases, as well as vectors, for application to federated learning. Our experiments illustrate that our mechanism can lead to better utility vs. compression trade-offs for the same privacy loss in a number of settings.

2022-05-20

auai.org/UAI/2022/Conference (poster)

doi.org

openreview.net

Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL

Akram Erraqabi

Marlos C. Machado

Harry Zhao

Mingde Zhao

Sainbayar Sukhbaatar

Alessandro Lazaric

Ludovic Denoyer

Yoshua Bengio

In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward shaping. Recently, learning the Laplacian representation has been framed as the optimization of a temporally-contrastive objective to overcome its computational limitations in large (or continuous) state spaces. However, this approach requires uniform access to all states in the state space, overlooking the exploration problem that emerges during the representation learning process. In this work, we propose an alternative method that is able to recover, in a non-uniform-prior setting, the expressiveness and the desired properties of the Laplacian representation. We do so by combining the representation learning with a skill-based covering policy, which provides a better training distribution to extend and refine the representation. We also show that a simple augmentation of the representation objective with the learned temporal abstractions improves dynamics-awareness and helps exploration. We find that our method succeeds as an alternative to the Laplacian in the non-uniform setting and scales to challenging continuous control environments. Finally, even if our method is not optimized for skill discovery, the learned skills can successfully solve difficult continuous navigation tasks with sparse rewards, where standard skill discovery approaches are not so effective.

2022-05-20

auai.org/UAI/2022/Conference (poster)

doi.org

openreview.net

Universal antigen encoding of T cell activation from high-dimensional cytokine dynamics

Sooraj R. Achar

François X. P. Bourassa

Thomas J. Rademaker

Angela Lee

Taisuke Kondo

Emanuel Salazar-Cavazos

John S. Davies

Naomi Taylor

Paul François

Grégoire Altan-Bonnet

2022-05-20

Science (published)

doi.org

Human brain anatomy reflects separable genetic and environmental components of socioeconomic status

Hyeokmoon Kweon

Gökhan Aydogan

Alain Dagher

Danilo Bzdok

Christian C. Ruff

Gideon Nave

Martha J Farah

Philipp Koellinger

Recent studies report that socioeconomic status (SES) correlates with brain structure. Yet, such findings are variable and little is known about underlying causes. We present a well-powered voxel-based analysis of grey matter volume (GMV) across levels of SES, finding many small SES effects widely distributed across the brain, including cortical, subcortical and cerebellar regions. We also construct a polygenic index of SES to control for the additive effects of common genetic variation related to SES, which attenuates observed SES-GMV relations, to different degrees in different areas. Remaining variance, which may be attributable to environmental factors, is substantially accounted for by body mass index, a marker for lifestyle related to SES. In sum, SES affects multiple brain regions through measurable genetic and environmental effects. One-sentence summary: Socioeconomic status is linked with brain anatomy through a varying balance of genetic and environmental influences.

2022-05-18

Science Advances (published)

doi.org

Multi-tract multi-symptom relationships in pediatric concussion

Guido I Guberman

Sonja Stojanovski

Eman Nishat

Alain Ptito

Danilo Bzdok

Anne L Wheeler

Maxime Descoteaux

The heterogeneity of white matter damage and symptoms in concussions has been identified as a major obstacle to therapeutic innovation. In contrast, the vast majority of diffusion MRI studies on concussion have traditionally employed group-comparison approaches. Such studies do not consider heterogeneity of damage and symptoms in concussion. To parse concussion heterogeneity, the present study combines diffusion MRI (dMRI) and multivariate statistics to investigate multi-tract multi-symptom relationships. Using dMRI data from a sample of 306 children ages 9 and 10 with a history of concussion from the Adolescent Brain Cognitive Development Study (ABCD study), we built connectomes weighted by classical and emerging diffusion measures. These measures were combined into two informative indices, the first capturing a mixture of patterns suggestive of microstructural complexity, the second representing almost exclusively axonal density. We deployed pattern-learning algorithms to jointly decompose these connectivity features and 19 behavioural measures that capture well-known symptoms of concussions. We found idiosyncratic symptom-specific multi-tract connectivity features, which would not be captured in traditional univariate analyses. Multivariable connectome-symptom correspondences were stronger than all single-tract/single-symptom associations. Multi-tract connectivity features were also expressed equally across different sociodemographic strata and their expression was not accounted for by injury-related variables. In a replication dataset, the expression of multi-tract connectivity features predicted adverse psychiatric outcomes after accounting for other psychopathology-related variables. By defining cross-demographic multi-tract multi-symptom relationships to parse concussion heterogeneity, the present study can pave the way for the development of improved stratification strategies that may contribute to the success of future clinical trials and the improvement of concussion management.

2022-05-17

eLife (published)

doi.org

Learning Representations for New Sound Classes With Continual Self-Supervised Learning

Zhepei Wang

Cem (Yusuf) Subakan

Xilin Jiang

Junkai Wu

Efthymios Tzinis

Mirco Ravanelli

Paris Smaragdis

In this article, we work on a sound recognition system that continually incorporates new sound classes. Our main goal is to develop a framework where the model can be updated without relying on labeled data. For this purpose, we propose adopting representation learning, where an encoder is trained using unlabeled data. This learning framework enables the study and implementation of a practically relevant use case where only a small amount of the labels is available in a continual learning context. We also make the empirical observation that a similarity-based representation learning method within this framework is robust to forgetting even if no explicit mechanism against forgetting is employed. We show that this approach obtains similar performance compared to several distillation-based continual learning methods when employed on self-supervised representation learning methods.

2022-05-15

ArXiv (preprint)

doi.org

arxiv.org

Homogenization of SGD in high-dimensions: Exact dynamics and generalization properties

Courtney Paquette

Elliot Paquette

Ben Adlam

Jeffrey Pennington

2022-05-14

ArXiv (preprint)

arxiv.org
