Portrait of Dinghuai Zhang

Dinghuai Zhang

PhD - Université de Montréal
Supervisor
Co-supervisor
Research Topics
Generative Models
Probabilistic Models

Publications

Biological Sequence Design with GFlowNets
Moksh J. Jain
Alex Hernandez-Garcia
Bonaventure F. P. Dossou
Chanakya Ajit Ekbote
Jie Fu
Micheal Kilgour
Lena Simine
Payel Das
Building Robust Ensembles via Margin Boosting
Hongyang R. Zhang
Pradeep Ravikumar
Arun Sai Suggala
In the context of adversarial robustness, a single model does not usually have enough power to defend against all possible adversarial attac… (see more)ks, and as a result, has sub-optimal robustness. Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks. In this work, we take a principled approach towards building robust ensembles. We view this problem from the perspective of margin-boosting and develop an algorithm for learning an ensemble with maximum margin. Through extensive empirical evaluation on benchmark datasets, we show that our algorithm not only outperforms existing ensembling techniques, but also large models trained in an end-to-end fashion. An important byproduct of our work is a margin-maximizing cross-entropy (MCE) loss, which is a better alternative to the standard cross-entropy (CE) loss. Empirically, we show that replacing the CE loss in state-of-the-art adversarial training techniques with our MCE loss leads to significant performance improvement.
Generative Flow Networks for Discrete Probabilistic Modeling
We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data. Buil… (see more)ding upon the theory of generative flow networks (GFlowNets), we model the generation process by a stochastic data construction policy and thus amortize expensive MCMC exploration into a fixed number of actions sampled from a GFlowNet. We show how GFlowNets can approximately perform large-block Gibbs sampling to mix between modes. We propose a framework to jointly train a GFlowNet with an energy function, so that the GFlowNet learns to sample from the energy distribution, while the energy learns with an approximate MLE objective with negative samples from the GFlowNet. We demonstrate EB-GFN's effectiveness on various probabilistic modeling tasks. Code is publicly available at https://github.com/zdhNarsil/EB_GFN.
Unifying Likelihood-free Inference with Black-box Optimization and Beyond
Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on th… (see more)e pharmaceutical industry. In this work, we propose to unify two seemingly distinct worlds: likelihood-free inference and black-box optimization, under one probabilistic framework. In tandem, we provide a recipe for constructing various sequence design methods based on this framework. We show how previous optimization approaches can be"reinvented"in our framework, and further propose new probabilistic black-box optimization algorithms. Extensive experiments on sequence design application illustrate the benefits of the proposed methodology.
Can Subnetwork Structure be the Key to Out-of-Distribution Generalization?
Kartik Ahuja
Yilun Xu
Yisen Wang
Can models with particular structure avoid being biased towards spurious correlation in out-of-distribution (OOD) generalization? Peters et … (see more)al. (2016) provides a positive answer for linear cases. In this paper, we use a functional modular probing method to analyze deep model structures under OOD setting. We demonstrate that even in biased models (which focus on spurious correlation) there still exist unbiased functional subnetworks. Furthermore, we articulate and demonstrate the functional lottery ticket hypothesis: full network contains a subnetwork that can achieve better OOD performance. We then propose Modular Risk Minimization to solve the subnetwork selection problem. Our algorithm learns the subnetwork structure from a given dataset, and can be combined with any other OOD regularization methods. Experiments on various OOD generalization tasks corroborate the effectiveness of our method.
Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization
Kartik Ahuja
Jean-Christophe Gagnon-Audet
The invariance principle from causality is at the heart of notable approaches such as invariant risk minimization (IRM) that seek to address… (see more) out-of-distribution (OOD) generalization failures. Despite the promising theory, invariance principle-based approaches fail in common classification tasks, where invariant (causal) features capture all the information about the label. Are these failures due to the methods failing to capture the invariance? Or is the invariance principle itself insufficient? To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD. In contrast to the linear regression tasks, we show that for linear classification tasks we need much stronger restrictions on the distribution shifts, or otherwise OOD generalization is impossible. Furthermore, even with appropriate restrictions on distribution shifts in place, we show that the invariance principle alone is insufficient. We prove that a form of the information bottleneck constraint along with invariance helps address key failures when invariant features capture all the information about the label and also retains the existing success when they do not. We propose an approach that incorporates both of these principles and demonstrate its effectiveness in several experiments.
Neural Approximate Sufficient Statistics for Implicit Models
Yanzhi Chen
Michael U. Gutmann
Zhanxing Zhu
We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation… (see more) of likelihood function is intractable but sampling / simulating data from the model is possible. The idea is to frame the task of constructing sufficient statistics as learning mutual information maximizing representation of the data. This representation is computed by a deep neural network trained by a joint statistic-posterior learning strategy. We apply our approach to both traditional approximate Bayesian computation (ABC) and recent neural likelihood approaches, boosting their performance on a range of tasks.
Unifying Likelihood-free Inference with Black-box Sequence Design and Beyond