Publications

Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis
Jia Lin Hau
Mohammad Ghavamzadeh
Marek Petrik
In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences for certain outcomes. This paper proposes a new Q-learning algorithm for quantile optimization in MDPs with strong convergence and performance guarantees. The algorithm leverages a new, simple dynamic program (DP) decomposition for quantile MDPs. Compared with prior work, our DP decomposition requires neither known transition probabilities nor solving complex saddle point equations and serves as a suitable foundation for other model-free RL algorithms. Our numerical results in tabular domains show that our Q-learning algorithm converges to its DP variant and outperforms earlier algorithms.
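As a point of reference for the risk measure named in this abstract, the following is a minimal sketch of Value-at-Risk for a discrete return distribution. It is not the paper's Q-learning algorithm or its DP decomposition. The function name `value_at_risk` and the lower-quantile convention (the smallest outcome whose cumulative probability reaches `alpha`) are assumptions made for illustration.

```python
def value_at_risk(outcomes, probs, alpha):
    """Lower alpha-quantile of a discrete distribution (assumed convention).

    Returns the smallest outcome x such that P(X <= x) >= alpha.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if len(outcomes) != len(probs) or not outcomes:
        raise ValueError("outcomes and probs must be non-empty and equal length")

    # Walk the outcomes in increasing order, accumulating probability mass.
    pairs = sorted(zip(outcomes, probs))
    cumulative = 0.0
    for x, p in pairs:
        cumulative += p
        # Small tolerance guards against floating-point round-off.
        if cumulative >= alpha - 1e-12:
            return x
    return pairs[-1][0]


# Hypothetical returns of a policy: a rare large loss, a common zero, a gain.
returns = [-10.0, 0.0, 5.0]
probabilities = [0.1, 0.6, 0.3]
var_median = value_at_risk(returns, probabilities, 0.5)
```

A quantile objective of this kind ranks policies by a chosen quantile of the return rather than by its mean, which is why two policies with equal expected return can be ordered differently.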
Recovering undersampled single-cell transcriptomes with HyperCell
Single-cell transcriptomic technology has now matured, allowing quantification of mRNA transcripts corresponding to tens of thousands of genes within a cell. However, only a small fraction of these mRNAs is captured and measured by today’s single-cell assays. There are likely hundreds of thousands of mRNA copies present within a typical human cell, yet these assays omit a majority of the transcripts that are actually present. This introduces technical noise, especially non-biological variability and excessive sparsity, which frustrates downstream analysis and potentially skews biological conclusions. To overcome these challenges, we here develop HyperCell, a probabilistic deep learning approach that explicitly models this undersampling to produce estimates of each cell’s original gene transcript abundances across the whole transcriptome. We demonstrate that our framework offers benefits in various mRNA modeling settings, by i) correctly differentiating between spurious sampling-induced and real biological zeros, outperforming existing approaches, ii) estimating the total mRNA content of cells across states to reduce contamination due to background transcripts, iii) reducing contamination due to background transcripts, and iv) helping to counteract biases that may appear during typical differential gene expression analyses using widespread normalization approaches. Our approach to correcting for the technical noise introduced by the single-cell experimental process brings us closer to studying biology, starting from the true transcriptome of cells.
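The undersampling effect described in this abstract can be illustrated with a small simulation. This is only a sketch under a stated assumption: each transcript is captured independently with the same probability (a binomial capture model chosen here for illustration). It is not HyperCell's model, and `undersample` and `capture_rate` are hypothetical names.

```python
import random


def undersample(true_counts, capture_rate, rng):
    """Simulate an assay that captures each transcript independently.

    true_counts: per-gene transcript counts actually present in the cell.
    capture_rate: assumed probability that any single transcript is measured.
    rng: a random.Random instance, passed in for reproducibility.
    """
    if not 0.0 <= capture_rate <= 1.0:
        raise ValueError("capture_rate must lie in [0, 1]")
    observed = []
    for n in true_counts:
        # Each of the n transcripts is kept with probability capture_rate.
        observed.append(sum(1 for _ in range(n) if rng.random() < capture_rate))
    return observed


# Hypothetical cell: the first gene is truly unexpressed (a biological zero);
# the others are expressed at increasing levels.
true_counts = [0, 3, 50, 400]
observed = undersample(true_counts, capture_rate=0.1, rng=random.Random(0))
```

Under this toy model a gene with only a few true copies will often be observed as zero, and such sampling-induced zeros are indistinguishable from biological zeros in the observed counts alone.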

Representation Learning via Non-Contrastive Mutual Information
Zhaohan Daniel Guo
Bernardo Avila Pires
Dale Schuurmans
Bo Dai
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
Zhangzhi Peng
Zachary Quinn
Michael Bronstein
Pranam Chatterjee
Avishek Joey Bose
Generative modeling of discrete data underlies important applications spanning text-based agents like ChatGPT to the design of the very building blocks of life in protein sequences. However, application domains need to exert control over the generated data by steering the generative process - typically via RLHF - to satisfy a specified property, reward, or affinity metric. In this paper, we study the problem of steering Masked Diffusion Models (MDMs), a recent class of discrete diffusion models that offer a compelling alternative to traditional autoregressive models. We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference by learning to sample from a target Bayesian posterior. Our DDPP framework leads to a family of three novel objectives that are all simulation-free, and thus scalable while applying to general non-differentiable reward functions. Empirically, we instantiate DDPP by steering MDMs to perform class-conditional pixel-level image modeling, RLHF-based alignment of MDMs using text-based rewards, and finetuning protein language models to generate more diverse secondary structures and shorter proteins. We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
On the Identifiability of Causal Abstractions
Sékou-Oumar Kaba
Causal representation learning (CRL) enhances machine learning models' robustness and generalizability by learning structural causal models associated with data-generating processes. We focus on a family of CRL methods that uses contrastive data pairs in the observable space, generated before and after a random, unknown intervention, to identify the latent causal model. Brehmer et al. (2022) showed that this is indeed possible, given that all latent variables can be intervened on individually. However, this is a highly restrictive assumption in many systems. In this work, we instead assume interventions on arbitrary subsets of latent variables, which is more realistic. We introduce a theoretical framework that calculates the degree to which we can identify a causal model, given a set of possible interventions, up to an abstraction that describes the system at a higher level of granularity.
The Superposition of Diffusion Models Using the Itô Density Estimator
Avishek Joey Bose
Kirill Neklyudov
The Cambrian explosion of easily accessible pre-trained diffusion models suggests a demand for methods that combine multiple different pre-trained diffusion models without incurring the significant computational burden of re-training a larger combined model. In this paper, we cast the problem of combining multiple pre-trained diffusion models at the generation stage under a novel proposed framework termed superposition. Theoretically, we derive superposition from rigorous first principles stemming from the celebrated continuity equation and design two novel algorithms tailor-made for combining diffusion models in SuperDiff. SuperDiff leverages a new scalable Itô density estimator for the log likelihood of the diffusion SDE which incurs no additional overhead compared to the well-known Hutchinson's estimator needed for divergence calculations. We demonstrate that SuperDiff is scalable to large pre-trained diffusion models as superposition is performed solely through composition during inference, and also enjoys painless implementation as it combines different pre-trained vector fields through an automated re-weighting scheme. Notably, we show that SuperDiff is efficient during inference time, and mimics traditional composition operators such as the logical OR and the logical AND. We empirically demonstrate the utility of using SuperDiff for generating more diverse images on CIFAR-10, more faithful prompt conditioned image editing using Stable Diffusion, as well as improved conditional molecule generation and unconditional de novo structure design of proteins. https://github.com/necludov/super-diffusion
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Thomas Schmied
Jordi Grau-Moya
Markus Wulfmeier
Neural Kinematic Bases for Fluids
Yibo Liu
Paul Kry
Kenny Erleben
Sune Darkner
Teseo Schneider
Cortical differences across psychiatric disorders and associated common and rare genetic variants
Kuldeep Kumar
Zhijie Liao
Clara Moreau
Christopher R. K. Ching
Claudia Modenato
Will Snyder
Sayeh Kazem
Charles-Olivier Martin
C.O. Martin
Anne-Marie Bélanger
Valérie K. Fontaine
Khadije Jizi
Rune Boen
Zohra Saci
Leila Kushan
Ana I. Silva
Marianne B.M. van den Bree
David E.J. Linden
Michael J. Owen
Jeremy Hall
Sarah Lippé
Bogdan Draganski
Laura Almasy
Sophia I. Thomopoulos
Neda Jahanshad
Ida E. Sønderby
Ole A. Andreassen
David C. Glahn
Armin Raznahan
Carrie Bearden
Tomáš Paus
Paul M. Thompson
Sébastien Jacquemont
Deep Learning Unlocks the True Potential of Organ Donation after Circulatory Death with Accurate Prediction of Time-to-Death
Xingzhi Sun
Edward De Brouwer
Chen Liu
Ramesh Batra
Increasing the number of organ donations after circulatory death (DCD) has been identified as one of the most important ways of addressing the ongoing organ shortage. While recent technological advances in organ transplantation have increased their success rate, a substantial challenge in increasing the number of DCD donations resides in the uncertainty regarding the timing of cardiac death after terminal extubation, impacting the risk of prolonged ischemic organ injury, and negatively affecting post-transplant outcomes. In this study, we trained and externally validated an ODE-RNN model, which combines a recurrent neural network with neural ordinary differential equations and excels in processing irregularly sampled time series data. The model is designed to predict time-to-death following terminal extubation in the intensive care unit (ICU) using the last 24 hours of clinical observations. Our model was trained on a cohort of 3,238 patients from Yale New Haven Hospital, and validated on an external cohort of 1,908 patients from six hospitals across Connecticut. The model achieved accuracies of 95.3 ± 1.0% and 95.4 ± 0.7% for predicting whether death would occur in the first 30 and 60 minutes, respectively, with a calibration error of 0.024 ± 0.009. Heart rate, respiratory rate, mean arterial blood pressure (MAP), oxygen saturation (SpO2), and Glasgow Coma Scale (GCS) scores were identified as the most important predictors. Surpassing existing clinical scores, our model sets the stage for reduced organ acquisition costs and improved post-transplant outcomes.
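The abstract reports a calibration error without stating how it was computed. As a generic illustration only, the sketch below computes a binned expected calibration error (ECE) for binary predictions. The equal-width binning, the default of 10 bins, and the name `expected_calibration_error` are assumptions and may differ from the metric used in the study.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary outcomes (illustrative definition).

    probs: predicted probabilities of the positive class, each in [0, 1].
    labels: observed outcomes, each 0 or 1.
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")

    # Assign each prediction to an equal-width probability bin.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        # Weight each bin's gap by the share of predictions it contains.
        ece += (len(members) / total) * abs(mean_confidence - observed_rate)
    return ece


# Hypothetical predictions: two confident calls, of which only one was right.
example_ece = expected_calibration_error([0.9, 0.9], [1, 0])
```

A value near zero means that, within each bin, the predicted probability of the event matches the observed frequency of the event.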
Impact of Reducing Time Lived With Colostomies on Social Stigma Affecting Children With Anorectal Malformations in Southwestern Uganda.
Felix Oyania
Caroline Q. Stephens
Sarah Ullrich
Meera Kotagal
Amy M. Shui
Caleb Tuhumwire
Godfrey Zari Rukundo
Joseph Ngonzi
Ava Yap
Francis Bajunirwe
Doruk Ozgediz
BACKGROUND The social stigma of families of children living with colostomies due to anorectal malformation (ARM) is significant in low-income countries (LICs). Improved access to pediatric surgery has resulted in more 1-stage ARM procedures in Southwestern Uganda, avoiding colostomy creation, but the impact on social stigma experienced by families is unknown. We hypothesized that this change would decrease the social stigma experienced by families. METHODS A single-center mixed retrospective and prospective cohort study with combined qualitative data of families of children with ARM who underwent corrective surgery compared the stigma experienced by those with colostomies to those without. The Kilifi Stigma Scale of Epilepsy (KSSE) was used to assess social stigma. Multivariable regression analysis assessed differences in the stigma experienced, controlling for age at diagnosis, rurality, distance traveled, sex, and parental education. Subgroup analysis assessed the impact of colostomy duration on stigma, stratified over parental education. RESULTS A total of 238 patient/family dyads with ARM were included; 177 (74%) received a colostomy. Most patients were male (51%), lived in rural areas (71%), and had parents with primary school education (65%). For those without a colostomy, the median KSSE was 0 (Q1-Q3 0-0), compared to 11 (Q1-Q3 3-20) for those with a colostomy. On multivariable analysis, after controlling for age at diagnosis, rurality, distance traveled, sex, and parental education attainment, families of patients with ARM who received a colostomy had a median KSSE score 7.8 points higher than those who did not receive a colostomy (coefficient 7.78, 95% CI: 3.14-12.43, p = 0.001). When the duration of colostomy (in years) was examined, the median KSSE score increased by 1.58 points for each additional year for a patient who had a colostomy (IRR 1.58, 95% CI: 0.76-2.40, p < 0.001).
CONCLUSION Adopting a 1-stage ARM repair for the select types, which avoids colostomy creation, significantly reduces the exper
Learning from Stochastic Teacher Representations Using Student-Guided Knowledge Distillation
Muhammad Haseeb Aslam
Clara Martinez
Alessandro Lameiras Koerich
Ali Etemad
Eric Granger
Advances in self-distillation have shown that when knowledge is distilled from a teacher to a student using the same deep learning (DL) architecture, the student performance can surpass the teacher, particularly when the network is overparameterized and the teacher is trained with early stopping. Alternatively, ensemble learning also improves performance, although training, storing, and deploying multiple models becomes impractical as the number of models grows. Even distilling an ensemble to a single student model or weight averaging methods first requires training of multiple teacher models and does not fully leverage the inherent stochasticity for generating and distilling diversity in DL models. These constraints are particularly prohibitive in resource-constrained or latency-sensitive applications such as wearable devices. This paper proposes to train only one model and generate multiple diverse teacher representations using distillation-time dropout. However, generating these representations stochastically leads to noisy representations that are misaligned with the learned task. To overcome this problem, a novel stochastic self-distillation (SSD) training strategy is introduced for filtering and weighting teacher representations to distill from task-relevant representations only, using student-guided knowledge distillation (SGKD). The student representation at each distillation step is used as authority to guide the distillation process. Experimental results on real-world affective computing, wearable/biosignal datasets from the UCR Archive, the HAR dataset, and image classification datasets show that the proposed SSD method can outperform state-of-the-art methods without increasing the model size at both training and testing time, and incurs negligible computational complexity compared to state-of-the-art ensemble learning and weight averaging methods.
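The general idea of drawing several stochastic teacher representations with dropout and letting the student decide how much to trust each one can be sketched as follows. The abstract does not give the actual filtering or weighting rule, so everything here is an assumption made for illustration: cosine similarity as the agreement score, a hard `threshold` for filtering, and a softmax with `temperature` for weighting. All function names are hypothetical.

```python
import math
import random


def dropout(vec, rate, rng):
    """Inverted dropout: zero each entry with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must lie in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vec]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def student_guided_weights(student, teachers, threshold=0.0, temperature=1.0):
    """Filter teachers by agreement with the student, then softmax-weight the rest.

    Teachers whose similarity to the student is at or below `threshold`
    receive weight 0 (assumed filtering rule, not taken from the paper).
    """
    sims = [cosine(student, t) for t in teachers]
    scores = [math.exp(s / temperature) if s > threshold else 0.0 for s in sims]
    total = sum(scores)
    return [s / total for s in scores] if total else [0.0] * len(teachers)


# Hypothetical usage: one teacher representation, five dropout samples of it.
rng = random.Random(0)
teacher_repr = [0.8, -0.2, 0.5, 0.1]
student_repr = [0.7, -0.1, 0.4, 0.0]
samples = [dropout(teacher_repr, rate=0.3, rng=rng) for _ in range(5)]
weights = student_guided_weights(student_repr, samples)
```

The resulting weights could then scale a per-sample distillation loss, so that dropout samples that disagree with the student contribute little or nothing.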