Publications

Questions Are All You Need to Train a Dense Passage Retriever
Devendra Singh Sachan
Mike Lewis
Dani Yogatama
Luke Zettlemoyer
Manzil Zaheer
We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g., questions and potential answer passages). It uses a new passage-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence passages, and (2) the passages are then used to compute the probability of reconstructing the original question. Training for retrieval based on question reconstruction enables effective unsupervised learning of both passage and question encoders, which can be later incorporated into complete Open QA systems without any further finetuning. Extensive experiments demonstrate that ART obtains state-of-the-art results on multiple QA retrieval benchmarks with only generic initialization from a pre-trained language model, removing the need for labeled data and task-specific losses. Our code and model checkpoints are available at: https://github.com/DevSinghSachan/art.
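A minimal sketch of the training signal the abstract describes: retrieval scores over a set of retrieved passages are pushed toward a "teacher" distribution given by how well each passage lets a frozen language model reconstruct the question. The encoder and scorer names below are hypothetical placeholders, not the released ART implementation.

```python
# Hedged sketch of ART-style training, assuming placeholder encoders and a
# frozen LM scorer `reconstruction_loglik(question_tokens, passage_tokens)`.
import torch
import torch.nn.functional as F

def art_step(question_encoder, passage_encoder, reconstruction_loglik,
             question_tokens, passage_tokens_list, temperature=1.0):
    q = question_encoder(question_tokens)                                # (d,)
    p = torch.stack([passage_encoder(t) for t in passage_tokens_list])   # (k, d)

    # Retriever distribution over the k retrieved passages.
    log_retriever = F.log_softmax(p @ q, dim=-1)

    # Teacher distribution: log P(question | passage) from a frozen LM,
    # computed without gradients.
    with torch.no_grad():
        recon = torch.stack([reconstruction_loglik(question_tokens, t)
                             for t in passage_tokens_list])              # (k,)
        teacher = F.softmax(recon / temperature, dim=-1)

    # Train the question and passage encoders by matching the two distributions.
    return F.kl_div(log_retriever, teacher, reduction="sum")
```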
ROSA: Random Orthogonal Subspace Adaptation
Marawan Gamal
Aristides Milios
Towards Out-of-Distribution Adversarial Robustness
Adam Ibrahim
Charles Guille-Escuret
Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fails to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different
BatchGFN: Generative Flow Networks for Batch Active Learning
Shreshth A Malik
Salem Lahlou
Andrew Jesson
Moksh J. Jain
Nikolay Malkin
Tristan Deleu
Yarin Gal
We introduce BatchGFN—a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward. With an appropriate reward function to quantify the utility of acquiring a batch, such as the joint mutual information between the batch and the model parameters, BatchGFN is able to construct highly informative batches for active learning in a principled way. We show our approach enables sampling near-optimal utility batches at inference time with a single forward pass per point in the batch in toy regression problems. This alleviates the computational complexity of batch-aware algorithms and removes the need for greedy approximations to find maximizers for the batch reward. We also present early results for amortizing training across acquisition steps, which will enable scaling to real-world tasks.
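A toy illustration of the target behaviour the abstract describes: sampling a batch of pool indices with probability proportional to a batch reward. The exhaustive enumeration below stands in for the amortized GFlowNet sampler and is feasible only for tiny pools; `batch_reward` is a hypothetical utility, e.g. a joint-information proxy.

```python
# Hedged toy sketch: sample batches proportional to a batch reward by brute
# force, in place of BatchGFN's amortized sampler.
import itertools
import random

def sample_batch(pool, batch_size, batch_reward, rng=random):
    candidates = list(itertools.combinations(range(len(pool)), batch_size))
    weights = [batch_reward([pool[i] for i in batch]) for batch in candidates]
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(candidates, weights=probs, k=1)[0]

# Example: a reward favouring spread-out batches of 1-D points.
pool = [0.0, 0.1, 0.5, 0.9, 1.0]
spread = lambda batch: max(batch) - min(batch) + 1e-6
print(sample_batch(pool, batch_size=2, batch_reward=spread))
```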
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation
Chris Emezue
Tristan Deleu
Stefan Bauer
CATS: A Computation-Aware Transaction Processing System with Proactive Unlocking
Bolun Zhu
Yu Hua
Ziyin Long
With the increasing complexity of network applications and high demands for QoS, transaction processing systems have received more attention due to their salient features of simplicity and atomicity. Computation operations play an important role in transaction processing systems. However, conventional QoS-based mechanisms become inefficient due to limited concurrency support for computation operations, causing high time consumption on the critical path of concurrency control. To efficiently support concurrent computations, we propose CATS, a Computation-Aware Transaction processing System, to mitigate the performance impact of computation operations. CATS further leverages program semantics to defer the execution of transaction operations to the commit phase, alleviating unnecessary conflicts caused by computations. Extensive evaluation results demonstrate that CATS significantly outperforms state-of-the-art designs, including 2PL- and OCC-based transaction processing systems, on highly contended and computation-intensive workloads. We have released the open-source code on GitHub for public use.
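A much-simplified illustration of the deferral idea in the abstract: a transaction records its computation operations and applies them only at commit time, so per-record locks are held for the short apply step rather than for the whole computation. The class and method names are invented for illustration and are not taken from the CATS codebase.

```python
# Hedged sketch of deferring computation operations to the commit phase.
import threading

class Record:
    def __init__(self, value=0):
        self.value = value
        self.lock = threading.Lock()

class DeferredTxn:
    def __init__(self):
        self.pending = []          # (record, function) pairs, applied at commit

    def compute(self, record, fn):
        # Defer: do not lock or mutate the record yet.
        self.pending.append((record, fn))

    def commit(self):
        # Lock each record only while its deferred update is applied.
        for record, fn in self.pending:
            with record.lock:
                record.value = fn(record.value)
        self.pending.clear()

# Example: an expensive computation is queued, then applied atomically at commit.
r = Record(10)
txn = DeferredTxn()
txn.compute(r, lambda v: v * v + 1)   # stands in for a costly computation
txn.commit()
print(r.value)                        # 101
```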
Causal Discovery with Language Models as Imperfect Experts
Stephanie Long
Alexandre Piché
Valentina Zantedeschi
Tibor Schuster
Understanding the causal relationships that underlie a system is a fundamental prerequisite to accurate decision-making. In this work, we explore how expert knowledge can be used to improve the data-driven identification of causal graphs, beyond Markov equivalence classes. In doing so, we consider a setting where we can query an expert about the orientation of causal relationships between variables, but where the expert may provide erroneous information. We propose strategies for amending such expert knowledge based on consistency properties, e.g., acyclicity and conditional independencies in the equivalence class. We then report a case study, on real data, where a large language model is used as an imperfect expert.
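A sketch of one consistency check the abstract mentions: an expert-proposed edge orientation is kept only if it does not create a directed cycle with the edges already oriented. The graph representation and function names below are illustrative, not the paper's implementation.

```python
# Hedged sketch: reject expert orientations that would violate acyclicity.
def reaches(adj, src, dst):
    """True if dst is reachable from src via the directed edges in adj."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj.get(node, ()))
    return False

def accept_orientation(adj, u, v):
    """Expert proposes u -> v; reject if v already reaches u (a cycle would form)."""
    if reaches(adj, v, u):
        return False
    adj.setdefault(u, set()).add(v)
    return True

# Example: with A -> B -> C already oriented, "C -> A" is rejected.
directed = {"A": {"B"}, "B": {"C"}}
print(accept_orientation(directed, "C", "A"))   # False
print(accept_orientation(directed, "A", "C"))   # True
```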
Development of a hydrated electron dosimeter for radiotherapy applications: A proof of concept.
Julien Mégrourèche
H. Bekerat
Jingyi Bian
Alaina Bui
Jack C Sankey
Lilian Childress
BACKGROUND Hydrated electrons, which are short-lived products of radiolysis in water, increase the optical absorption of water, providing a pathway toward near-tissue-equivalent clinical radiation dosimeters. This has been demonstrated in high-dose-per-pulse radiochemistry research, but, owing to the weak absorption signal, its application in existing low-dose-per-pulse radiotherapy provided by clinical linear accelerators (linacs) has yet to be investigated. PURPOSE The aims of this study were to measure the optical absorption associated with hydrated electrons produced by clinical linacs and to assess the suitability of the technique for radiotherapy (⩽ 1 cGy per pulse) applications. METHODS 40 mW of 660-nm laser light was passed five times through deionized water contained in a 10 × 4 ×
An Empirical Study of the Effectiveness of Using a Replay Buffer on Mode Discovery in GFlowNets
Nikhil Murali Vemgal
Elaine Lau
Reinforcement Learning (RL) algorithms aim to learn an optimal policy by iteratively sampling actions to learn how to maximize the total expected return,
Exploring Exchangeable Dataset Amortization for Bayesian Posterior Inference
Sarthak Mittal
Niels Leif Bracher
Priyank Jaini
Marcus A Brubaker
Bayesian inference provides a natural way of incorporating uncertainties and different underlying theories when making predictions or analyzing complex systems. However, it requires computationally expensive routines for approximation, which have to be re-run when new data is observed and are thus infeasible to efficiently scale and reuse. In this work, we look at the problem from the perspective of amortized inference to obtain posterior parameter distributions for known probabilistic models. We propose a neural network-based approach that can handle exchangeable observations and amortize over datasets to convert the problem of Bayesian posterior inference into a single forward pass of a network. Our empirical analyses explore various design choices for amortized inference by comparing: (a) our proposed variational objective with forward KL minimization, (b) permutation-invariant architectures like Transformers and DeepSets, and (c) parameterizations of posterior families like diagonal Gaussian and Normalizing Flows. Through our experiments, we successfully apply amortization techniques to estimate the posterior distributions for different domains solely through inference.
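A minimal sketch of the amortization setup the abstract describes: a permutation-invariant (DeepSets-style) network maps an exchangeable dataset to the mean and log-std of a diagonal Gaussian over model parameters, so posterior inference on new data is a single forward pass. Layer sizes and names are illustrative choices, not the paper's architecture.

```python
# Hedged sketch of dataset-amortized posterior inference with a diagonal
# Gaussian posterior family and mean pooling for permutation invariance.
import torch
import torch.nn as nn

class AmortizedPosterior(nn.Module):
    def __init__(self, x_dim, theta_dim, hidden=128):
        super().__init__()
        self.point_encoder = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * theta_dim))        # mean and log-std

    def forward(self, dataset):                      # dataset: (n_points, x_dim)
        pooled = self.point_encoder(dataset).mean(dim=0)   # order-invariant
        mean, log_std = self.head(pooled).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_std.exp())

# One forward pass yields an approximate posterior for a fresh dataset.
net = AmortizedPosterior(x_dim=2, theta_dim=3)
posterior = net(torch.randn(50, 2))
print(posterior.mean, posterior.stddev)
```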
GFlowNets for Causal Discovery: an Overview
Dragos Cristian Manta
Edward J Hu