Publications

Building Robust Ensembles via Margin Boosting

Dinghuai Zhang

Hongyang R. Zhang

Pradeep Ravikumar

Arun Sai Suggala

In the context of adversarial robustness, a single model does not usually have enough power to defend against all possible adversarial attac… (see more)ks, and as a result, has sub-optimal robustness. Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks. In this work, we take a principled approach towards building robust ensembles. We view this problem from the perspective of margin-boosting and develop an algorithm for learning an ensemble with maximum margin. Through extensive empirical evaluation on benchmark datasets, we show that our algorithm not only outperforms existing ensembling techniques, but also large models trained in an end-to-end fashion. An important byproduct of our work is a margin-maximizing cross-entropy (MCE) loss, which is a better alternative to the standard cross-entropy (CE) loss. Empirically, we show that replacing the CE loss in state-of-the-art adversarial training techniques with our MCE loss leads to significant performance improvement.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

doi.org

arxiv.org

Direct Behavior Specification via Constrained Reinforcement Learning

Julien Roy

Roger Girgis

Joshua Romoff

Pierre-Luc Bacon

Chris Pal

Chris J Pal

The standard formulation of Reinforcement Learning lacks a practical way of specifying what are admissible and forbidden behaviors. Most oft… (see more)en, practitioners go about the task of behavior specification by manually engineering the reward function, a counter-intuitive process that requires several iterations and is prone to reward hacking by the agent. In this work, we argue that constrained RL, which has almost exclusively been used for safe RL, also has the potential to significantly reduce the amount of work spent for reward specification in applied RL projects. To this end, we propose to specify behavioral preferences in the CMDP framework and to use Lagrangian methods to automatically weigh each of these behavioral constraints. Specifically, we investigate how CMDPs can be adapted to solve goal-based tasks while adhering to several constraints simultaneously. We evaluate this framework on a set of continuous control tasks relevant to the application of Reinforcement Learning for NPC design in video games.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

arxiv.org

Estimating Social Influence from Observational Data

Dhanya Sridhar

Caterina De Bacco

David Blei

We consider the problem of estimating social influence, the effect that a person's behavior has on the future behavior of their peers. The k… (see more)ey challenge is that shared behavior between friends could be equally explained by influence or by two other confounding factors: 1) latent traits that caused people to both become friends and engage in the behavior, and 2) latent preferences for the behavior. This paper addresses the challenges of estimating social influence with three contributions. First, we formalize social influence as a causal effect, one which requires inferences about hypothetical interventions. Second, we develop Poisson Influence Factorization (PIF), a method for estimating social influence from observational data. PIF fits probabilistic factor models to networks and behavior data to infer variables that serve as substitutes for the confounding latent traits. Third, we develop assumptions under which PIF recovers estimates of social influence. We empirically study PIF with semi-synthetic and real data from Last.fm, and conduct a sensitivity analysis. We find that PIF estimates social influence most accurately compared to related methods and remains robust under some violations of its assumptions.

2022-06-28

Proceedings of the First Conference on Causal Learning and Reasoning (published)

doi.org

openreview.net

A Generalized Bootstrap Target for Value-Learning, Efficiently Combining Value and Feature Predictions

Anthony GX-Chen

Veronica Chelu

Blake Richards

Joelle Pineau

2022-06-28

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Generative Flow Networks for Discrete Probabilistic Modeling

Dinghuai Zhang

Nikolay Malkin

Zhen Liu

Alexandra Volokhova

Aaron Courville

Yoshua Bengio

We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data. Buil… (see more)ding upon the theory of generative flow networks (GFlowNets), we model the generation process by a stochastic data construction policy and thus amortize expensive MCMC exploration into a fixed number of actions sampled from a GFlowNet. We show how GFlowNets can approximately perform large-block Gibbs sampling to mix between modes. We propose a framework to jointly train a GFlowNet with an energy function, so that the GFlowNet learns to sample from the energy distribution, while the energy learns with an approximate MLE objective with negative samples from the GFlowNet. We demonstrate EB-GFN's effectiveness on various probabilistic modeling tasks. Code is publicly available at https://github.com/zdhNarsil/EB_GFN.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

arxiv.org

Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning

Utku Evci

Vincent Dumoulin

Hugo Larochelle

Michael Curtis Mozer

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

arxiv.org

3D Infomax improves GNNs for Molecular Property Prediction

Hannes Stärk

Dominique Beaini

Gabriele Corso

Prudencio Tossou

Christian Dallago

Stephan Günnemann

Pietro Lio

Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D mol… (see more)ecular structure as input to learned models improves their predictions for many molecular properties. However, this information is infeasible to compute at the scale required by most real-world applications. We propose pre-training a model to understand the geometry of molecules given only their 2D molecular graph. Using methods from self-supervised learning, we maximize the mutual information between a 3D summary vector and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still generates implicit 3D information and can use it to inform downstream tasks. We show that 3D pre-training provides significant improvements for a wide range of molecular properties, such as a 22% average MAE reduction on eight quantum mechanical properties. Crucially, the learned representations can be effectively transferred between datasets with vastly different molecules.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

Learning To Cut By Looking Ahead: Cutting Plane Selection via Imitation Learning

Max B. Paulus

Giulia Zarpellon

Andreas Krause

Laurent Charlin

Chris J. Maddison

Cutting planes are essential for solving mixed-integer linear problems (MILPs), because they facilitate bound improvements on the optimal so… (see more)lution value. For selecting cuts, modern solvers rely on manually designed heuristics that are tuned to gauge the potential effectiveness of cuts. We show that a greedy selection rule explicitly looking ahead to select cuts that yield the best bound improvement delivers strong decisions for cut selection - but is too expensive to be deployed in practice. In response, we propose a new neural architecture (NeuralCut) for imitation learning on the lookahead expert. Our model outperforms standard baselines for cut selection on several synthetic MILP benchmarks. Experiments with a B&C solver for neural network verification further validate our approach, and exhibit the potential of learning methods in this setting.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

doi.org

arxiv.org

Multi-scale Feature Learning Dynamics: Insights for Double Descent

Mohammad Pezeshki

Amartya Mitra

Yoshua Bengio

Guillaume Lajoie

A key challenge in building theoretical foundations for deep learning is the complex optimization dynamics of neural networks, resulting fro… (see more)m the high-dimensional interactions between the large number of network parameters. Such non-trivial interactions lead to intriguing model behaviors such as the phenomenon of "double descent" of the generalization error. The more commonly studied aspect of this phenomenon corresponds to model-wise double descent where the test error exhibits a second descent with increasing model complexity, beyond the classical U-shaped error curve. In this work, we investigate the origins of the less studied epoch-wise double descent in which the test error undergoes two non-monotonous transitions, or descents as the training time increases. We study a linear teacher-student setup exhibiting epoch-wise double descent similar to that in deep neural networks. In this setting, we derive closed-form analytical expressions for the evolution of generalization error over training. We find that double descent can be attributed to distinct features being learned at different scales: as fast-learning features overfit, slower-learning features start to fit, resulting in a second descent in test error. We validate our findings through numerical experiments where our theory accurately predicts empirical findings and remains consistent with observations in deep neural networks.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

Only tails matter: Average-Case Universality and Robustness in the Convex Regime

Leonardo Cunha

Gauthier Gidel

Fabian Pedregosa

Damien Scieur

Courtney Paquette

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

doi.org

openreview.net

Proving theorems using Incremental Learning and Hindsight Experience Replay

Maxwell Crouse

Eser Aygün

Laurent Orseau

Bassem Makni

Vernon Ralph Austel

Ankit Anand

Xavier Glorot

Cristina Cornelio

Shajith Ikbal

Stephen M Mcaleer

Vlad Firoiu

Pavan Kapanipathi

Lei M Zhang

Ndivhuwo Makondo

Doina Precup

Shibl Mourad

The highest performing ATP systems (e.g., [7, 18]) in first order logic have been evolving for decades and have grown to use an increasing n… (see more)umber of manually designed heuristics mixed with some machine learning, to obtain a large number of search strategies that are tried sequentially or in parallel. Some recent works [5, 13, 19] build on top of these provers, using modern machine learning techniques to augment, select or prioritize their already existing heuristics, with some success. Other recent works do not build on top of other provers, but still require existing proof examples as input (e.g., [9, 23]). Such machine-learning-based ATP systems can struggle to solve difficult problems when the training dataset does not provide problems of sufficiently diverse difficulties. In this paper, we propose an approach which can build a strong theorem prover without relying on existing domain-specific heuristics or on prior input data (in the form of proofs) to prime the learning. We strive to design a learning methodology for ATP that allows a system to improve even when there are large gaps in the difficulty of given set of theorems. In particular, given a set of conjectures without proofs, our system trains itself, based on its own attempts and (dis)proves an increasing number of conjectures, an approach which can be viewed as a form of incremental learning. Additionally, all the previous approaches [19, 1, 13] learn exclusively on successful proof attempts. When no new theorem can be proven, the learner may not be able to improve anymore and thus the system may not be able to obtain more training data. This could in principle happen even at the very start of training, if all the theorems available are too hard. To tackle this challenge, we adapt the idea of hindsight experience replay (HER) [3] to ATP: Clauses reached during proof attempts (whether successful or not) are turned into goals in hindsight, producing a large amount of ‘auxiliary’ theorems with proofs of varied difficulties for the learner, even in principle when no theorem from the original set can be proven initially. This leads to a smoother learning regime and a constantly improving learner. We evaluate our approach on two popular benchmarks: MPTP2078 [2] and M2k [17] and compare it both with TRAIL [1], a recent machine learning prover as well as with E prover [24, 7], one of the leading heuristic provers. Our proposed approach substantially outperforms TRAIL [1] on both datasets, surpasses E in the auto configuration with a 100s time limit, and is competitive with E in the autoschedule configuration with a 7 days time limit. In addition, our approach almost always (99.5% of cases) finds shorter proofs than E.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

The Primacy Bias in Deep Reinforcement Learning

Evgenii Nikishin

Max Schwarzer

Pierluca D'Oro

Pierre-Luc Bacon

Aaron Courville

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)