Publications

Learn-To-Design: Reinforcement Learning-Assisted Chemical Process Optimization
Eslam G. Al-Sakkari
Mohamed Ali
Daria C. Boffito
Mouloud Amazouz
This paper proposes an AI-assisted approach aimed at accelerating chemical process design through causal incremental reinforcement learning … (voir plus)(CIRL) where an intelligent agent is interacting iteratively with a process simulation environment (e.g., Aspen HYSYS, DWSIM, etc.). The proposed approach is based on an incremental learnable optimizer capable of guiding multi-objective optimization towards optimal design variable configurations, depending on several factors including the problem complexity, selected RL algorithm and hyperparameters tuning. One advantage of this approach is that the agent-simulator interaction significantly reduces the vast search space of design variables, leading to an accelerated and optimized design process. This is a generic causal approach that enables the exploration of new process configurations and provides actionable insights to designers to improve not only the process design but also the design process across various applications. The approach was validated on industrial processes including an absorption-based carbon capture, considering the economic and technological uncertainties of different capture processes, such as energy price, production cost, and storage capacity. It achieved a cost reduction of up to 5.5% for the designed capture process, after a few iterations, while also providing the designer with actionable insights. From a broader perspective, the proposed approach paves the way for accelerating the adoption of decarbonization technologies (CCUS value chains, clean fuel production, etc.) at a larger scale, thus catalyzing climate change mitigation.
Adaptive Accompaniment with ReaLchords
Adam Roberts
Ian Simon
Alexander Scarlatos
Chris Donahue
Cassie Tarakajian
Shayegan Omidshafiei
Natasha Jaques
Cheng-Zhi Anna Huang
Jamming requires coordination, anticipation, and collaborative creativity between musicians. Current generative models of music produce expr… (voir plus)essive output but are not able to generate in an \emph{online} manner, meaning simultaneously with other musicians (human or otherwise). We propose ReaLchords, an online generative model for improvising chord accompaniment to user melody. We start with an online model pretrained by maximum likelihood, and use reinforcement learning to finetune the model for online use. The finetuning objective leverages both a novel reward model that provides feedback on both harmonic and temporal coherency between melody and chord, and a divergence term that implements a novel type of distillation from a teacher model that can see the future melody. Through quantitative experiments and listening tests, we demonstrate that the resulting model adapts well to unfamiliar input and produce fitting accompaniment. ReaLchords opens the door to live jamming, as well as simultaneous co-creation in other modalities.
All-in-one simulation-based inference
Manuel Gloeckler
Michael Deistler
Christian Dietrich Weilbach
Frank N. Wood
Jakob H. Macke
Autoformalizing Euclidean Geometry
Logan Murphy
Kaiyu Yang
Jialiang Sun
Zhaoyu Li
Animashree Anandkumar
Autoformalization involves automatically translating informal math into formal theorems and proofs that are machine-verifiable. Euclidean ge… (voir plus)ometry provides an interesting and controllable domain for studying autoformalization. In this paper, we introduce a neuro-symbolic framework for autoformalizing Euclidean geometry, which combines domain knowledge, SMT solvers, and large language models (LLMs). One challenge in Euclidean geometry is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize. To address this issue, we use theorem provers to fill in such diagrammatic information automatically, so that the LLM only needs to autoformalize the explicit textual steps, making it easier for the model. We also provide automatic semantic evaluation for autoformalized theorem statements. We construct LeanEuclid, an autoformalization benchmark consisting of problems from Euclid’s Elements and the UniGeo dataset formalized in the Lean proof assistant. Experiments with GPT-4 and GPT-4V show the capability and limitations of state-of-the-art LLMs on autoformalizing geometry problems. The data and code are available at https://github.com/loganrjmurphy/LeanEuclid.
CKGConv: General Graph Convolution with Continuous Kernels
Soumyasundar Pal
Jiaming Zhou
Yingxue Zhang
Mark J. Coates
The existing definitions of graph convolution, either from spatial or spectral perspectives, are inflexible and not unified. Defining a gene… (voir plus)ral convolution operator in the graph domain is challenging due to the lack of canonical coordinates, the presence of irregular structures, and the properties of graph symmetries. In this work, we propose a novel and general graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding. We name this Continuous Kernel Graph Convolution (CKGConv). Theoretically, we demonstrate that CKGConv is flexible and expressive. CKGConv encompasses many existing graph convolutions, and exhibits a stronger expressiveness, as powerful as graph transformers in terms of distinguishing non-isomorphic graphs. Empirically, we show that CKGConv-based Networks outperform existing graph convolutional networks and perform comparably to the best graph transformers across a variety of graph datasets. The code and models are publicly available at https://github.com/networkslab/CKGConv.
A Computational Framework for Solving Wasserstein Lagrangian Flows
Kirill Neklyudov
Rob Brekelmans
Qiang Liu
Alireza Makhzani
The dynamical formulation of the optimal transport can be extended through various choices of the underlying geometry (kinetic energy), and … (voir plus)the regularization of density paths (potential energy). These combinations yield different variational problems (Lagrangians), encompassing many variations of the optimal transport problem such as the Schrödinger bridge, unbalanced optimal transport, and optimal transport with physical constraints, among others. In general, the optimal density path is unknown, and solving these variational problems can be computationally challenging. We propose a novel deep learning based framework approaching all of these problems from a unified perspective. Leveraging the dual formulation of the Lagrangians, our method does not require simulating or backpropagating through the trajectories of the learned dynamics, and does not need access to optimal couplings. We showcase the versatility of the proposed framework by outperforming previous approaches for the single-cell trajectory inference, where incorporating prior knowledge into the dynamics is crucial for correct predictions.
Consistent Adversarially Robust Linear Classification: Non-Parametric Setting
Elvis Dopgima Dohmatob
For binary classification in …
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
Tianyue H. Zhang
Jose Gallego-Posada
Constrained optimization offers a powerful framework to prescribe desired behaviors in neural network models. Typically, constrained problem… (voir plus)s are solved via their min-max Lagrangian formulations, which exhibit unstable oscillatory dynamics when optimized using gradient descent-ascent. The adoption of constrained optimization techniques in the machine learning community is currently limited by the lack of reliable, general-purpose update schemes for the Lagrange multipliers. This paper proposes the
Discovering Environments with XRM
Diane Bouchacourt
Mark Ibrahim
P Vincent
David Lopez-Paz
Environment annotations are essential for the success of many out-of-distribution (OOD) generalization methods. Unfortunately, these are cos… (voir plus)tly to obtain and often limited by human annotators’ biases. To achieve robust generalization, it is essential to develop algorithms for automatic environment discovery within datasets. Current proposals, which divide examples based on their training error, suffer from one fundamental problem. These methods introduce hyper-parameters and early-stopping criteria, which require a validation set with human-annotated environments, the very information subject to discovery. In this paper, we propose Cross-Risk Minimization (XRM) to address this issue. XRM trains twin networks, each learning from one random half of the training data, while imitating confident held-out mistakes made by its sibling. XRM provides a recipe for hyper-parameter tuning, does not require early-stopping, and can discover environments for all training and validation data. Algorithms built on top of XRM environments achieve oracle worst-group-accuracy, addressing a long-standing challenge in OOD generalization.
Don't be so Negative! Score-based Generative Modeling with Oracle-assisted Guidance
Saeid Naderiparizi
Xiaoxuan Liang
Berend Zwartsenberg
Setareh Cohan
Frank N. Wood
EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time
Shengyao Lu
Keith G Mills
Jiao He
Di Niu
Estimating Unknown Population Sizes Using the Hypergeometric Distribution
The multivariate hypergeometric distribution describes sampling without replacement from a discrete population of elements divided into mult… (voir plus)iple categories. Addressing a gap in the literature, we tackle the challenge of estimating discrete distributions when both the total population size and the sizes of its constituent categories are unknown. Here, we propose a novel solution using the hypergeometric likelihood to solve this estimation challenge, even in the presence of severe under-sampling. We develop our approach to account for a data generating process where the ground-truth is a mixture of distributions conditional on a continuous latent variable, such as with collaborative filtering, using the variational autoencoder framework. Empirical data simulation demonstrates that our method outperforms other likelihood functions used to model count data, both in terms of accuracy of population size estimate and in its ability to learn an informative latent space. We demonstrate our method's versatility through applications in NLP, by inferring and estimating the complexity of latent vocabularies in text excerpts, and in biology, by accurately recovering the true number of gene transcripts from sparse single-cell genomics data.