Publications

Gotta Go Fast When Generating Data with Score-Based Models

Alexia Jolicoeur-Martineau

Ke Li

Rémi Piché-Taillefer

Tal Kachman

Score-based (denoising diffusion) generative models have recently gained a lot of success in generating realistic and diverse data. These approaches define a forward diffusion process for transforming data to noise and generate data by reversing it (thereby going from noise to data). Unfortunately, current score-based models generate data very slowly due to the sheer number of score network evaluations required by numerical SDE solvers. In this work, we aim to accelerate this process by devising a more efficient SDE solver. Existing approaches rely on the Euler-Maruyama (EM) solver, which uses a fixed step size. We found that naively replacing it with other SDE solvers fares poorly: they either result in low-quality samples or become slower than EM. To get around this issue, we carefully devise an SDE solver with adaptive step sizes tailored to score-based generative models piece by piece. Our solver requires only two score function evaluations, rarely rejects samples, and leads to high-quality samples. Our approach generates data 2 to 10 times faster than EM while achieving better or equal sample quality. For high-resolution images, our method leads to significantly higher quality samples than all other methods tested. Our SDE solver has the benefit of requiring no step size tuning.

2021-05-28

ArXiv (preprint)

openreview.net

Noised Consistency Training for Text Summarization

J. Liu

Qianren Mao

Bang Liu

Hao Peng

Hongdong Zhu

Jianxin Li

Neural abstractive summarization methods often require large quantities of labeled training data. However, labeling large amounts of summarization data is often prohibitive due to time, financial, and expertise constraints, which has limited the usefulness of summarization systems to practical applications. In this paper, we argue that this limitation can be overcome by a semi-supervised approach: consistency training which is to leverage large amounts of unlabeled data to improve the performance of supervised learning over a small corpus. The consistency regularization semi-supervised learning can regularize model predictions to be invariant to small noise applied to input articles. By adding noised unlabeled corpus to help regularize consistency training, this framework obtains comparative performance without using the full dataset. In particular, we have verified that leveraging large amounts of unlabeled data decently improves the performance of supervised learning over an insufficient labeled dataset.

2021-05-28

ArXiv (preprint)

arxiv.org

Learning Brain Dynamics With Coupled Low-Dimensional Nonlinear Oscillators and Deep Recurrent Networks

Germán Abrevaya

Guillaume Dumas

Aleksandr Y. Aravkin

Peng Zheng

Jean-Christophe Gagnon-Audet

James Kozloski

Pablo Polosecki

Guillaume Lajoie

David Cox

Silvina Ponce Dawson

Guillermo Cecchi

Irina Rish

Many natural systems, especially biological ones, exhibit complex multivariate nonlinear dynamical behaviors that can be hard to capture by linear autoregressive models. On the other hand, generic nonlinear models such as deep recurrent neural networks often require large amounts of training data, not always available in domains such as brain imaging; also, they often lack interpretability. Domain knowledge about the types of dynamics typically observed in such systems, such as a certain type of dynamical systems models, could complement purely data-driven techniques by providing a good prior. In this work, we consider a class of ordinary differential equation (ODE) models known as van der Pol (VDP) oscillators and evaluate their ability to capture a low-dimensional representation of neural activity measured by different brain imaging modalities, such as calcium imaging (CaI) and fMRI, in different living organisms: larval zebrafish, rat, and human. We develop a novel and efficient approach to the nontrivial problem of parameters estimation for a network of coupled dynamical systems from multivariate data and demonstrate that the resulting VDP models are both accurate and interpretable, as VDP's coupling matrix reveals anatomically meaningful excitatory and inhibitory interactions across different brain subsystems. VDP outperforms linear autoregressive models (VAR) in terms of both the data fit accuracy and the quality of insight provided by the coupling matrices and often tends to generalize better to unseen data when predicting future brain activity, being comparable to and sometimes better than the recurrent neural networks (LSTMs). Finally, we demonstrate that our (generative) VDP model can also serve as a data-augmentation tool leading to marked improvements in predictive accuracy of recurrent neural networks. Thus, our work contributes to both basic and applied dimensions of neuroimaging: gaining scientific insights and improving brain-based predictive models, an area of potentially high practical importance in clinical diagnosis and neurotechnology.

2021-05-26

Neural Computation (published)

doi.org

Inferring global-scale temporal latent topics from news reports to predict public health interventions for COVID-19

Zhi Wen

Guido Powell

Imane Chafi

David Buckeridge

Y. K. Li

2021-05-24

Patterns (published)

doi.org

Artificial intelligence in nursing: Priorities and opportunities from an international invitational think‐tank of the Nursing and Artificial Intelligence Leadership Collaborative

Charlene Esteban Ronquillo

Laura‐Maria Peltonen

Lisiane Pruinelli

Charlene H Chu

Suzanne Bakken

Ana Beduschi

Kenrick Cato

Nicholas Hardiker

Alain Junger

Martin Michalowski

Rune Nyrup

Samira Abbasgholizadeh-Rahimi

Donald Nigel Reed

Tapio Salakoski

Sanna Salanterä

Nancy Walton

Patrick Weber

Thomas Wiegand

Maxim Topaz

2021-05-18

Journal of Advanced Nursing (published)

doi.org

Deep Discourse Analysis for Generating Personalized Feedback in Intelligent Tutor Systems

Matt Grenander

Robert Belfer

Ekaterina Kochmar

Iulian V. Serban

François St-Hilaire

Jackie Cheung

We explore creating automated, personalized feedback in an intelligent tutoring system (ITS). Our goal is to pinpoint correct and incorrect concepts in student answers in order to achieve better student learning gains. Although automatic methods for providing personalized feedback exist, they do not explicitly inform students about which concepts in their answers are correct or incorrect. Our approach involves decomposing students' answers using neural discourse segmentation and classification techniques. This decomposition yields a relational graph over all discourse units covered by the reference solutions and student answers. We use this inferred relational graph structure and a neural classifier to match student answers with reference solutions and generate personalized feedback. Although the process is completely automated and data-driven, the personalized feedback generated is highly contextual, domain-aware and effectively targets each student's misconceptions and knowledge gaps. We test our method in a dialogue-based ITS and demonstrate that our approach results in high-quality feedback and significantly improved student learning gains.

2021-05-18

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

DIBS: Diversity inducing Information Bottleneck in Model Ensembles

Samarth Sinha

Homanga Bharadhwaj

Anirudh Goyal

Hugo Larochelle

Animesh Garg

Florian Shkurti

2021-05-18

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Individual Fairness in Kidney Exchange Programs

Golnoosh Farnadi

William St-Arnaud

Behrouz Babaki

Margarida Carvalho

2021-05-18

AAAI Conference on Artificial Intelligence (published)

doi.org

Meta-learning framework with applications to zero-shot time-series forecasting

Boris Oreshkin

Dmitri Carpov

Nicolas Chapados

Yoshua Bengio

Can meta-learning discover generic ways of processing time series (TS) from a diverse dataset so as to greatly improve generalization on new TS coming from different datasets? This work provides positive evidence to this using a broad meta-learning framework which we show subsumes many existing meta-learning algorithms. Our theoretical analysis suggests that residual connections act as a meta-learning adaptation mechanism, generating a subset of task-specific parameters based on a given TS input, thus gradually expanding the expressive power of the architecture on-the-fly. The same mechanism is shown via linearization analysis to have the interpretation of a sequential update of the final linear layer. Our empirical results on a wide range of data emphasize the importance of the identified meta-learning mechanisms for successful zero-shot univariate forecasting, suggesting that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining, resulting in performance that is at least as good as that of state-of-practice univariate forecasting models.

2021-05-18

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Metrics and continuity in reinforcement learning

Charline Le Lan

Marc Gendron-Bellemare

Pablo Samuel Castro

2021-05-18

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Self-Supervised Attention-Aware Reinforcement Learning

Haiping Wu

Khimya Khetarpal

Doina Precup

Visual saliency has emerged as a major visualization tool for interpreting deep reinforcement learning (RL) agents. However, much of the existing research uses it as an analyzing tool rather than an inductive bias for policy learning. In this work, we use visual attention as an inductive bias for RL agents. We propose a novel self-supervised attention learning approach which can 1. learn to select regions of interest without explicit annotations, and 2. act as a plug for existing deep RL methods to improve the learning performance. We empirically show that the self-supervised attention-aware deep RL methods outperform the baselines in the context of both the rate of convergence and performance. Furthermore, the proposed self-supervised attention is not tied with specific policies, nor restricted to a specific scene. We posit that the proposed approach is a general self-supervised attention module for multi-task learning and transfer learning, and empirically validate the generalization ability of the proposed method. Finally, we show that our method learns meaningful object keypoints highlighting improvements both qualitatively and quantitatively.

2021-05-18

AAAI Conference on Artificial Intelligence (published)

doi.org

Variance Penalized On-Policy and Off-Policy Actor-Critic

Arushi Jain

Gandharv Patil

Ayush Jain

Khimya Khetarpal

Doina Precup

2021-05-18

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org
