Publications

Adaptive Dynamic Programming for Energy-Efficient Base Station Cell Switching
Junliang Luo
Yi Tian Xu
Di Wu
M. Jenkin
Energy saving in wireless networks is growing in importance due to increasing demand for evolving new-gen cellular networks, environmental and regulatory concerns, and potential energy crises arising from geopolitical tensions. In this work, we propose an approximate dynamic programming (ADP)-based method coupled with online optimization to switch on/off the cells of base stations to reduce network power consumption while maintaining adequate Quality of Service (QoS) metrics. We use a multilayer perceptron (MLP) given each state-action pair to predict the power consumption to approximate the value function in ADP for selecting the action with optimal expected power saved. To save the largest possible power consumption without deteriorating QoS, we include another MLP to predict QoS and a long short-term memory (LSTM) for predicting handovers, incorporated into an online optimization algorithm producing an adaptive QoS threshold for filtering cell switching actions based on the overall QoS history. The performance of the method is evaluated using a practical network simulator with various real-world scenarios with dynamic traffic patterns.
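The action-selection idea in this abstract (filter candidate switching actions by predicted QoS, then pick the one with the largest predicted power saving) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `select_action`, the action names, and the lookup-table predictors are hypothetical stand-ins for the MLP predictors, and the threshold is a fixed number here rather than the adaptive one the abstract describes.

```python
def select_action(actions, predict_power_saved, predict_qos, qos_threshold):
    """Return the feasible action with the largest predicted power saving.

    An action is feasible when its predicted QoS is at least qos_threshold.
    Returns None when no action passes the filter.
    """
    feasible = [a for a in actions if predict_qos(a) >= qos_threshold]
    if not feasible:
        return None
    return max(feasible, key=predict_power_saved)


# Hypothetical lookup tables standing in for the learned predictors.
saved = {"all_on": 0.0, "cell_2_off": 3.0, "cells_2_3_off": 5.0}
qos = {"all_on": 0.99, "cell_2_off": 0.96, "cells_2_3_off": 0.90}

best = select_action(list(saved), saved.get, qos.get, qos_threshold=0.95)
```

With these toy numbers the most aggressive action is filtered out because its predicted QoS falls below the threshold, so the second-best power saving is chosen.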
Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems
Trang Nguyen
Alexander Tong
Kanika Madan
Dianbo Liu
Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.
Improved baselines for vision-language pre-training
Enrico Fini
Pietro Astolfi
Jakob Verbeek
Michal Drozdzal
Local Search GFlowNets
Minsu Kim
Taeyoung Yun
Emmanuel Bengio
Dinghuai Zhang
Sungsoo Ahn
Jinkyoo Park
Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-rewarded sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.
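The backtrack-and-reconstruct idea in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the linked repository: objects are fixed-length bit tuples, the "backward policy" simply undoes the last construction step, the "forward policy" appends a uniformly random bit, and a candidate is accepted only if its reward is strictly higher (a greedy acceptance rule chosen here for simplicity).

```python
import random


def backward_step(seq, rng):
    # Hypothetical backward policy: undo the most recent construction step.
    return seq[:-1]


def forward_step(seq, rng):
    # Hypothetical forward policy: append a uniformly random bit.
    return seq + (rng.randint(0, 1),)


def reward(seq):
    # Toy reward: number of ones, plus one so it stays positive.
    return sum(seq) + 1


def local_search_step(seq, k, rng):
    """Backtrack k steps, rebuild k steps, keep the candidate if it is better."""
    partial = seq
    for _ in range(k):
        partial = backward_step(partial, rng)
    candidate = partial
    for _ in range(k):
        candidate = forward_step(candidate, rng)
    return candidate if reward(candidate) > reward(seq) else seq


def refine(seq, k, n_rounds, seed=0):
    """Apply several local search steps with a seeded random generator."""
    rng = random.Random(seed)
    for _ in range(n_rounds):
        seq = local_search_step(seq, k, rng)
    return seq
```

Because a candidate replaces the current sample only when its reward improves, the reward of the refined sample never drops below that of the starting sample, which is the biasing effect toward high-reward solutions that the abstract describes.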
Searching for High-Value Molecules Using Reinforcement Learning and Transformers
Raj Ghugare
Santiago Miret
Adriana Hugessen
Mariano Phielipp
Reinforcement learning (RL) over text representations can be effective for finding high-value policies that can search over graphs. However, RL requires careful structuring of the search space and algorithm design to be effective in this challenge. Through extensive experiments, we explore how different design choices for text grammar and algorithmic choices for training can affect an RL policy's ability to generate molecules with desired properties. We arrive at a new RL-based molecular design algorithm (ChemRLformer) and perform a thorough analysis using 25 molecule design tasks, including computationally complex protein docking simulations. From this analysis, we discover unique insights in this problem space and show that ChemRLformer achieves state-of-the-art performance while being more straightforward than prior work by demystifying which design choices are actually helpful for text-based molecule design.
Sensing Wellbeing in the Workplace, Why and For Whom? Envisioning Impacts with Organizational Stakeholders
Anna Kawakami
Shreya Chowdhary
Shamsi T. Iqbal
Q. Vera Liao
Jina Suh
Koustuv Saha
With the heightened digitization of the workplace, alongside the rise of remote and hybrid work prompted by the pandemic, there is growing corporate interest in using passive sensing technologies for workplace wellbeing. Existing research on these technologies often focuses on understanding or improving interactions between an individual user and the technology. Workplace settings can, however, introduce a range of complexities that challenge the potential impact and in-practice desirability of wellbeing sensing technologies. Today, there is an inadequate empirical understanding of how everyday workers, including those who are impacted by and those who impact the deployment of workplace technologies, envision their broader socio-ecological impacts. In this study, we conduct storyboard-driven interviews with 33 participants across three stakeholder groups: organizational governors, AI builders, and worker data subjects. Overall, our findings surface how workers envisioned wellbeing sensing technologies may lead to cascading impacts on their broader organizational culture, interpersonal relationships with colleagues, and individual day-to-day lives. Participants anticipated harms arising from ambiguity and misalignment around scaled notions of "worker wellbeing," underlying technical limitations to workplace-situated sensing, and assumptions regarding how social structures and relationships may shape the impacts and use of these technologies. Based on our findings, we discuss implications for designing worker-centered data-driven wellbeing technologies.
SUMMIT: Scaffolding Open Source Software Issue Discussion Through Summarization
Saskia Gilmer
Avinash Bhat
Shuvam Shah
Kevin Cherry
Jinghui Cheng
The neuroanatomical substrates of autism and ADHD and their link to putative genomic underpinnings
Lisa M. Berg
Caroline Gurr
Johanna Leyhausen
Hanna Seelemeyer
Anke Bletsch
Tim Schaefer
Charlotte M. Pretzsch
Beth Oakley
Eva Loth
Dorothea L. Floris
Jan K. Buitelaar
Christian Beckmann
Tobias Banaschewski
Tony Charman
Emily J. H. Jones
Julian Tillmann
Chris H. Chatham
Thomas Bourgeron
Jumana Ahmad
Sara Ambrosino
Bonnie Auyeung
Simon Baron-Cohen
Sarah Baumeister
Sven Bölte
Carsten Bours
Michael Brammer
Daniel Brandeis
Claudia Brogna
Yvette de Bruijn
Bhismadev Chakrabarti
Ineke Cornelissen
Daisy Crawley
Flavio Dell’Acqua
Sarah Durston
Jessica Faulkner
Vincent Frouin
Pilar Garcés
David Goyard
Lindsay Ham
Hannah Hayward
Joerg F. Hipp
Rosemary Holt
Mark Johnson
Prantik Kundu
Meng-Chuan Lai
Xavier Liogier D’ardhuy
Michael V. Lombardo
David J. Lythgoe
René Mandl
Andre Marquand
Luke Mason
Maarten Mennes
Andreas Meyer-Lindenberg
Carolin Moessnang
Nico Bast
Laurence O’Dwyer
Marianne Oldehinkel
Bob Oranje
Gahan Pandina
Antonio Persico
Barbara Ruggeri
Amber N. V. Ruigrok
Jessica Sabet
Roberto Sacco
Antonia San José Cáceres
Emily Simonoff
Will Spooren
Roberto Toro
Heike Tost
Jack Waldman
Steve C. R. Williams
Caroline Wooldridge
Marcel P. Zwiers
Declan Murphy
Christine Ecker
Data Cleaning and Machine Learning: A Systematic Literature Review
Pierre-Olivier Côté
Amin Nikanjam
Nafisa Ahmed
Dmytro Humeniuk
Context: Machine Learning (ML) is integrated into a growing number of systems for various applications. Because the performance of an ML model is highly dependent on the quality of the data it has been trained on, there is a growing interest in approaches to detect and repair data errors (i.e., data cleaning). Researchers are also exploring how ML can be used for data cleaning; hence creating a dual relationship between ML and data cleaning. To the best of our knowledge, there is no study that comprehensively reviews this relationship. Objective: This paper's objectives are twofold. First, it aims to summarize the latest approaches for data cleaning for ML and ML for data cleaning. Second, it provides future work recommendations. Method: We conduct a systematic literature review of the papers published between 2016 and 2022 inclusively. We identify different types of data cleaning activities with and for ML: feature cleaning, label cleaning, entity matching, outlier detection, imputation, and holistic data cleaning. Results: We summarize the content of 101 papers covering various data cleaning activities and provide 24 future work recommendations. Our review highlights many promising data cleaning techniques that can be further extended. Conclusion: We believe that our review of the literature will help the community develop better approaches to clean data.
Differential Chromatin Architecture and Risk Variants in Deep Layer Excitatory Neurons and Grey Matter Microglia Contribute to Major Depressive Disorder
Anjali Chawla
Doruk Cakmakci
Wenmin Zhang
Malosree Maitra
Reza Rahimian
Haruka Mitsuhashi
MA Davoli
Jenny Yang
Gary Gang Chen
Ryan Denniston
Deborah Mash
Naguib Mechawar
Matthew Suderman
Corina Nagy
Gustavo Turecki
Learning Reliable Logical Rules with SATNet
Zhaoyu Li
Jinpei Guo
Yuhe Jiang
Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks
Luca Scimeca
Alexander Rubinstein
Armand Nicolicioiu
Damien Teney
Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to shortcut learning phenomena, where a model may rely on erroneous, easy-to-learn cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs). We discover that DPMs have the inherent capability to represent multiple visual cues independently, even when they are largely correlated in the training data. We leverage this characteristic to encourage model diversity and empirically show the efficacy of the approach with respect to several diversification objectives. We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.