Publications

S5 Framework: A Review of Self-Supervised Shared Semantic Space Optimization for Multimodal Zero-Shot Learning

Clst

Yonatan Bisk

Ari Holtzman

Jesse Thomason

Jacob

Yoshua Bengio

Joyce Chai

Angeliki Lapata

Jonathan Lazaridou

Alek- May

Nicolas sandr Nisnevich

P. PintoJoseph

Turian

Ting Chen

Simon Kornblith

Mohammad Norouzi

Yen-Chun Chen

Linjie Li

Licheng Yu

Ahmed El

Faisal Kholy

Zhe Ahmed

Yu Gan

Cheng

Zihan Dai

Hanxiao Liu

Quoc V. Le

Jia Deng

Wei Dong

Richard Socher

Li-Jia Li

K. Liu

Jacob Devlin

Ming-Wei Chang

Kenton Lee

Jesse Dodge

Maarten Sap

Ana Marasovic

Gabriel Agnew

Dirk Ilharco

Groeneveld Matt

Li Dong

Nan Yang

Wenhui Wang

Furu Wei

Yang Liu

Jianfeng Wang

Ming Gao

Zhou

Xiaoyi Dong

Jia Bao

Ting Zhang

Dongdong

Weiming Chen

Lu Zhang

Dong Yuan

Fang Chen

Da-cheng Juan

Chuntian Lu

Zhen Li

Futang Peng

Aleksei Timofeev

Yi-Ting Chen

Yaxi Gao

Tom

Andrew Duerig

Tomkins Sujith

Ravi

Lukasz Kaiser

Aidan N. Gomez

Noam M. Shazeer

Niki Vaswani

Llion Parmar

Jones Jakob

Uszko-

Alex G. Kendall

Yarin Gal

Roberto Cipolla

Salman H. Khan

Muzammal Naseer

Munawar Hayat

Waqas Zamir

Fahad Shahbaz

Khan

Ranjay Krishna

Yuke Zhu

Oliver Groth

Justin John-

Kenji Hata

Joshua Kravitz

Stephanie Chen

Mike Lewis

Yinhan Liu

Marjan Naman Goyal

Abdelrahman Ghazvininejad

Omer Mohamed

Levy

Luke Zettlemoyer

Bohan Li

Hao Zhou

Jun-Tao He

Mingxuan Wang

Liunian Harold

Mark Li

Da Yatskar

Yin

Cho-Jui

Kai-Wei Chang

Visualbert

In this review, we aim to inspire research into Self-Supervised Shared Semantic Space (S5) multimodal learning problems. We equip non-expert researchers with a framework of informed modeling decisions via an extensive literature review, an actionable modeling checklist, as well as a series of novel zero-shot evaluation tasks. The core idea for our S5 checklist lies in learning contextual multimodal interactions at various granularity levels via a shared Transformer encoder with a denoising loss term, which is also regularized by a contrastive loss term to induce a semantic alignment prior on the contextual embedding space. Essentially, we aim to model human concept understanding and thus learn to “put a name to a face”. This ultimately enables interpretable zero-shot S5 generalization on a variety of novel downstream tasks. In summary, this review provides sufficient background and actionable strategies for training cutting-edge S5 multimodal networks.

2021-12-31

(published)

www.semanticscholar.org
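The contrastive regularizer mentioned in the S5 abstract above can be illustrated with a minimal sketch. This is not the review's own formulation: it assumes an InfoNCE-style loss over paired image and text embeddings, and the function name `info_nce`, the `temperature` value, and the toy embeddings are all hypothetical. Embeddings are assumed to be non-zero vectors.

```python
import math


def info_nce(image_embs, text_embs, temperature=0.1):
    """Image-to-text InfoNCE loss for a batch of paired embeddings.

    Assumption: row i of image_embs is the positive match for row i of
    text_embs, and every other row in the batch acts as a negative.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]

    total = 0.0
    for i, image in enumerate(images):
        # Cosine similarity to every text embedding, scaled by temperature.
        logits = [sum(a * b for a, b in zip(image, text)) / temperature
                  for text in texts]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of the matching pair.
        total += log_denominator - logits[i]
    return total / len(images)


aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is lower when matching pairs point in the same direction (`aligned`) than when they are swapped (`shuffled`), which is the alignment pressure a contrastive term is meant to supply.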

FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning

Songtao Liu

Zhengkai Tu

Minkai Xu

Zuobai Zhang

Lu Lin

Rex Ying

Jian Tang

Peilin Zhao

Dinghao Wu

Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule. Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms, taking only the product as the input to predict the reactants for each planning step and ignoring valuable context information along the synthetic route. In this work, we propose a novel framework that utilizes context information for improved retrosynthetic planning. We view synthetic routes as reaction graphs and propose to incorporate context through three principled steps: encode molecules into embeddings, aggregate information over routes, and readout to predict reactants. Our approach is the first attempt to utilize in-context learning for retrosynthesis prediction in retrosynthetic planning. The entire framework can be efficiently optimized in an end-to-end fashion and produce more practical and accurate predictions. Comprehensive experiments demonstrate that by fusing in the context information over routes, our model significantly improves the performance of retrosynthetic planning over baselines that are not context-aware, especially for long synthetic routes. Code is available at https://github.com/SongtaoLiu0823/FusionRetro.

2021-12-31

arXiv.org (preprint)

doi.org

proceedings.mlr.press

GANSpiration: Balancing Targeted and Serendipitous Inspiration in User Interface Design with Style-Based Generative Adversarial Network

Mohammad Amin Mozaffari

Xinyuan Zhang

Jinghui Cheng

Jin L.C. Guo

Inspiration from design examples plays a crucial role in the creative process of user interface design. However, current tools and techniques that support inspiration usually only focus on example browsing with limited user control or similarity-based example retrieval, leading to undesirable design outcomes such as focus drift and design fixation. To address these issues, we propose the GANSpiration approach that suggests design examples for both targeted and serendipitous inspiration, leveraging a style-based Generative Adversarial Network. A quantitative evaluation revealed that the outputs of GANSpiration-based example suggestion approaches are relevant to the input design, and at the same time include diverse instances. A user study with professional UI/UX practitioners showed that the examples suggested by our approach serve as viable sources of inspiration for overall design concepts and specific design elements. Overall, our work paves the road of using advanced generative machine learning techniques in supporting the creative design practice.

2021-12-31

CHI (published)

doi.org

arxiv.org

A general class of surrogate functions for stable and efficient reinforcement learning

Sharan Vaswani

Olivier Bachem

Simone Totaro

Robert Müller

Shivam Garg

Matthieu Geist

Marlos C. Machado

Pablo Samuel Castro

Nicolas Le Roux

Common policy gradient methods rely on the maximization of a sequence of surrogate functions. In recent years, many such surrogate functions have been proposed, most without strong theoretical guarantees, leading to algorithms such as TRPO, PPO or MPO. Rather than design yet another surrogate function, we instead propose a general framework (FMA-PG) based on functional mirror ascent that gives rise to an entire family of surrogate functions. We construct surrogate functions that enable policy improvement guarantees, a property not shared by most existing surrogate functions. Crucially, these guarantees hold regardless of the choice of policy parameterization. Moreover, a particular instantiation of FMA-PG recovers important implementation heuristics (e.g., using forward vs reverse KL divergence) resulting in a variant of TRPO with additional desirable properties. Via experiments on simple bandit problems, we evaluate the algorithms instantiated by FMA-PG. The proposed framework also suggests an improved variant of PPO, whose robustness and efficiency we empirically demonstrate on the MuJoCo suite.

2021-12-31

AISTATS (published)

doi.org

proceedings.mlr.press

Generating physically-consistent high-resolution climate data with hard-constrained neural networks

Paula Harder

Qidong Yang

Venkatesh Ramesh

Prasanna Sattegeri

Alex Hernández-García

Campbell Watson

D. Szwarcman

David Rolnick

The availability of reliable, high-resolution climate and weather data is important to inform long-term decisions on climate adaptation and mitigation and to guide rapid responses to extreme events. Forecasting models are limited by computational costs and therefore often can only make coarse resolution predictions. Statistical downscaling can provide an efficient method of upsampling low-resolution data. In this field, deep learning has been applied successfully, often using image super-resolution methods from computer vision. Despite achieving visually compelling results in some cases, such models often violate conservation laws when predicting physical variables. In order to conserve important physical quantities, we develop methods that guarantee physical constraints are satisfied by a deep downscaling model while also increasing their performance according to traditional metrics. We introduce two ways of constraining the network: a renormalization layer added to the end of the neural network and a successive approach that scales with increasing upsampling factors. We show the applicability of our methods across different popular architectures and upsampling factors using ERA5 reanalysis data.

2021-12-31

arXiv.org (preprint)

doi.org
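To make the idea of a hard constraint in the climate downscaling abstract above concrete, here is a minimal sketch of one possible renormalization step. It is an illustration under stated assumptions, not the authors' layer: it assumes the conserved quantity is the mean over each `factor` x `factor` block of the high-resolution output, and it enforces that mean with an additive shift. The function name `renormalize` and the toy grids are hypothetical.

```python
def renormalize(pred, low_res, factor):
    """Shift each factor x factor block of `pred` so its mean equals the
    corresponding low-resolution value.

    Assumptions: `pred` has shape (rows * factor, cols * factor) where
    `low_res` has shape (rows, cols), both given as nested lists.
    """
    rows, cols = len(low_res), len(low_res[0])
    out = [row[:] for row in pred]  # copy so the input stays untouched
    for i in range(rows):
        for j in range(cols):
            cells = [(r, c)
                     for r in range(i * factor, (i + 1) * factor)
                     for c in range(j * factor, (j + 1) * factor)]
            block_mean = sum(pred[r][c] for r, c in cells) / len(cells)
            # The same additive correction is applied to every cell in the block.
            shift = low_res[i][j] - block_mean
            for r, c in cells:
                out[r][c] = pred[r][c] + shift
    return out


raw_prediction = [[1.0, 2.0], [3.0, 4.0]]  # unconstrained network output
coarse_input = [[2.0]]                     # low-resolution value to conserve
constrained = renormalize(raw_prediction, coarse_input, factor=2)
```

Because the correction is applied after the network output, the block mean matches the coarse input exactly regardless of what the network predicted, while the spatial pattern inside each block is preserved.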

GitHub repositories with links to academic papers: Public access, traceability, and evolution

Supatsara Wattanakriengkrai

Bodin Chinthanet

Hideaki Hata

Raula Gaikovina Kula

Christoph Treude

Jin L.C. Guo

Kenichi Matsumoto

2021-12-31

Journal of Systems and Software (published)

doi.org

arxiv.org

Goal-driven optimization of single-neuron properties in artificial networks reveals regularization role of neural diversity and adaptation in the brain

Neurons in the brain have rich and adaptive input-output properties. Features such as diverse f-I curves and spike frequency adaptation are known to place single neurons in optimal coding regimes when facing changing stimuli. Yet, it is still unclear how brain circuits exploit single neuron flexibility, and how network-level requirements may have shaped such cellular function. To answer this question, a multi-scaled approach is needed where the computations of single neurons and of neural circuits must be considered as a complete system. In this work, we use artificial neural networks to systematically investigate single neuron input-output adaptive mechanisms, optimized in an end-to-end fashion. Throughout the optimization process, each neuron has the liberty to modify its nonlinear activation function, parametrized to mimic f-I curves of biological neurons, and to learn adaptation strategies to modify activation functions in real-time during a task. We find that such networks show much-improved robustness to noise and changes in input statistics. Importantly, we find that this procedure recovers precise coding strategies found in biological neurons, such as gain scaling and fractional order differentiation/integration. Using tools from dynamical systems theory, we analyze the role of these emergent single neuron properties and argue that neural diversity and adaptation play an active regularization role that enables neural circuits to optimally propagate information across time.

2021-12-31

(published)

www.semanticscholar.org

Gradient Descent Is Optimal Under Lower Restricted Secant Inequality And Upper Error Bound

Charles Guille-Escuret

Baptiste Goujaud

Adam Ibrahim

Ioannis Mitliagkas

The study of first-order optimization is sensitive to the assumptions made on the objective functions. These assumptions induce complexity classes which play a key role in worst-case analysis, including the fundamental concept of algorithm optimality. Recent work argues that strong convexity and smoothness, popular assumptions in literature, lead to a pathological definition of the condition number (Guille-Escuret et al., 2021). Motivated by this result, we focus on the class of functions satisfying a lower restricted secant inequality and an upper error bound. On top of being robust to the aforementioned pathological behavior and including some non-convex functions, this pair of conditions displays interesting geometrical properties. In particular, the necessary and sufficient conditions to interpolate a set of points and their gradients within the class can be separated into simple conditions on each sampled gradient. This allows the performance estimation problem (PEP, Drori and Teboulle (2012)) to be solved analytically, leading to a lower bound on the convergence rate that proves gradient descent to be exactly optimal on this class of functions among all first-order algorithms.

2021-12-31

NeurIPS (published)

doi.org

openreview.net
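For readers unfamiliar with the two conditions named in the gradient descent abstract above, the following sketch states them in one common form and shows the standard one-step contraction argument for gradient descent. The notation is an assumption rather than a quotation from the paper: $X^*$ is the set of minimizers, $x_p$ is the projection of $x$ onto $X^*$, and $\mu, L > 0$ are the constants of the two conditions. Only the upper bound is derived here; the matching lower bound is the paper's contribution and is not reproduced.

```latex
% Assumed form of the two conditions, for all x:
\begin{align*}
\text{(lower restricted secant inequality)}\quad
  & \langle \nabla f(x),\, x - x_p \rangle \ge \mu \,\lVert x - x_p \rVert^2, \\
\text{(upper error bound)}\quad
  & \lVert \nabla f(x) \rVert \le L \,\lVert x - x_p \rVert .
\end{align*}

% One gradient step x^+ = x - \gamma \nabla f(x) with \gamma \ge 0:
\begin{align*}
\operatorname{dist}(x^+, X^*)^2
  &\le \lVert x^+ - x_p \rVert^2 \\
  &= \lVert x - x_p \rVert^2
     - 2\gamma \,\langle \nabla f(x),\, x - x_p \rangle
     + \gamma^2 \lVert \nabla f(x) \rVert^2 \\
  &\le \left( 1 - 2\gamma\mu + \gamma^2 L^2 \right) \lVert x - x_p \rVert^2 .
\end{align*}

% The factor is minimized at \gamma = \mu / L^2, which gives
\[
\operatorname{dist}(x^+, X^*)^2
  \le \left( 1 - \frac{\mu^2}{L^2} \right) \operatorname{dist}(x, X^*)^2 .
\]
```

The first inequality uses that $x_p \in X^*$, the second applies the two conditions term by term, and the last line uses $\lVert x - x_p \rVert = \operatorname{dist}(x, X^*)$.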

Graph-Based Active Machine Learning Method for Diverse and Novel Antimicrobial Peptides Generation and Selection

Bonaventure F. P. Dossou

Dianbo Liu

Xu Ji

Moksh Jain

Almer M. van der Sloot

Roger Palou

Michael Tyers

Yoshua Bengio

As antibiotic-resistant bacterial strains are rapidly spreading worldwide, infections caused by these strains are emerging as a global crisis causing the death of millions of people every year. Antimicrobial Peptides (AMPs) are one of the candidates to tackle this problem because of their potential diversity and their ability to favorably modulate the host immune response. However, large-scale screening of new AMP candidates is expensive, time-consuming, and not affordable in developing countries, which need the treatments the most. In this work, we propose a novel active machine learning-based framework that statistically minimizes the number of wet-lab experiments needed to design new AMPs, while ensuring a high diversity and novelty of generated AMP sequences, in multi-round wet-lab AMP screening settings. Combining recurrent neural network models and a graph-based filter (GraphCC), our proposed approach delivers novel and diverse candidates and demonstrates better performance according to our defined metrics.

2021-12-31

arXiv (preprint)

doi.org

arxiv.org

GrowSpace: Learning How to Shape Plants

Mark Lefsrud

Plants are dynamic systems that are integral to our existence and survival. Plants face environment changes and adapt over time to their surrounding conditions. We argue that plant responses to an environmental stimulus are a good example of a real-world problem that can be approached within a reinforcement learning (RL) framework. With the objective of controlling a plant by moving the light source, we propose GrowSpace, as a new RL benchmark. The back-end of the simulator is implemented using the Space Colonisation Algorithm, a plant growing model based on competition for space. Compared to video game RL environments, this simulator addresses a real-world problem and serves as a test bed to visualize plant growth and movement in a faster way than physical experiments. GrowSpace is composed of a suite of challenges that tackle several problems such as control, multi-stage learning, fairness and multi-objective learning. We provide agent baselines alongside case studies to demonstrate the difficulty of the proposed benchmark.

2021-12-31

AAAI.org/2022/Workshop/AIAFS (published)

openreview.net

Harvesting Mature Relation Extraction Models from Limited Seed Knowledge: A Self-Development Framework for DS Rule Expansion

Raphael Hoffmann

Congle Zhang

Xiao Ling

Yankai Lin

Shiqi Shen

Zhiyuan Liu

Huanbo Luan

Christopher D Manning

M. Surdeanu

John Bauer

Adriana Romero

Pietro Lio’

Yoshua Bengio

Xuanhui Wang

Cheng Li

Nadav Golbandi

Bendersky Marc

Najork

Wentao Wu

Hongsong Li

Haixun Wang

Distantly-supervised relation extraction (DSRE) is an effective method to scale relation extraction (RE) to large unlabeled corpora with the utilization of knowledge bases (KBs), but suffers from the scale of KBs and the introduced noise. To alleviate the above two problems, we propose a novel framework called Self-develOpment rUle exPansion (SOUP), which starts from a limited amount of labeled data and continuously produces low-noise labels on large-scale unlabeled data by a growing learnable logical rule set. Specifically, SOUP achieves a mutual enhancement of the RE model and the logical rule set: first, an RE model is trained on the labeled data to summarize the knowledge; then, the knowledge is utilized to explore candidate rules from unlabeled data; finally, high-quality candidates are selected in a graph-based ranking manner to extend the logical rule set, and new rule-labeled data are provided for better RE model training. Experiments on the wiki20 dataset demonstrate that, with limited seed knowledge from small-scale manually labeled data, SOUP achieves significant improvement compared to baselines by producing continuous growth of both logical rules and the RE model, and that the labeling noise of SOUP is much less than that of DS. Furthermore, an RE model enhanced by SOUP with 1.6k logical rules learned from prior knowledge could produce an equivalent performance to the model trained on data labeled in DS manner by 72k relational facts of KBs.

2021-12-31

(published)

www.semanticscholar.org

Heterogeneous Supervised Topic Models

Dhanya Sridhar

Hal Daumé III

David Blei

Researchers in the social sciences are often interested in the relationship between text and an outcome of interest, where the goal is to both uncover latent patterns in the text and predict outcomes for unseen texts. To this end, this paper develops the heterogeneous supervised topic model (HSTM), a probabilistic approach to text analysis and prediction. HSTMs posit a joint model of text and outcomes to find heterogeneous patterns that help with both text analysis and prediction. The main benefit of HSTMs is that they capture heterogeneity in the relationship between text and the outcome across latent topics. To fit HSTMs, we develop a variational inference algorithm based on the auto-encoding variational Bayes framework. We study the performance of HSTMs on eight datasets and find that they consistently outperform related methods, including fine-tuned black-box models. Finally, we apply HSTMs to analyze news articles labeled with pro- or anti-tone. We find evidence of differing language used to signal a pro- and anti-tone.

2021-12-31

Transactions of the Association for Computational Linguistics (published)

doi.org
