Publications

Local Search GFlowNets

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their re… (voir plus)wards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-rewarded sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.

2023-10-04

ArXiv (prépublication)

doi.org

arxiv.org

Local Search GFlowNets

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

2023-10-04

ArXiv (prépublication)

doi.org

arxiv.org

Local Search GFlowNets

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

2023-10-04

ArXiv (prépublication)

doi.org

arxiv.org

Local Search GFlowNets

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their re… (voir plus)wards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-rewarded sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.

2023-10-04

ArXiv (prépublication)

doi.org

arxiv.org

Local Search GFlowNets

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their re… (voir plus)wards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search, which focuses on exploiting high-rewarded sample space to resolve this issue. Our main idea is to explore the local neighborhood via backtracking and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme, which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: https://github.com/dbsxodud-11/ls_gfn.

2023-10-04

ArXiv (prépublication)

doi.org

arxiv.org

Local Search GFlowNets

Minsu Kim

Sungsoo Ahn

Jinkyoo Park

2023-10-04

ArXiv (prépublication)

doi.org

arxiv.org

Searching for High-Value Molecules Using Reinforcement Learning and Transformers

Raj Ghugare

Santiago Miret

Adriana Hugessen

Mariano Phielipp

Glen Berseth

Reinforcement learning (RL) over text representations can be effective for finding high-value policies that can search over graphs. However,… (voir plus) RL requires careful structuring of the search space and algorithm design to be effective in this challenge. Through extensive experiments, we explore how different design choices for text grammar and algorithmic choices for training can affect an RL policy's ability to generate molecules with desired properties. We arrive at a new RL-based molecular design algorithm (ChemRLformer) and perform a thorough analysis using 25 molecule design tasks, including computationally complex protein docking simulations. From this analysis, we discover unique insights in this problem space and show that ChemRLformer achieves state-of-the-art performance while being more straightforward than prior work by demystifying which design choices are actually helpful for text-based molecule design.

2023-10-04

ArXiv (prépublication)

doi.org

arxiv.org

Sensing Wellbeing in the Workplace, Why and For Whom? Envisioning Impacts with Organizational Stakeholders

Anna Kawakami

Shreya Chowdhary

Shamsi T. Iqbal

Q. Vera Liao

Alexandra Olteanu

Jina Suh

Koustuv Saha

With the heightened digitization of the workplace, alongside the rise of remote and hybrid work prompted by the pandemic, there is growing c… (voir plus)orporate interest in using passive sensing technologies for workplace wellbeing. Existing research on these technologies often focus on understanding or improving interactions between an individual user and the technology. Workplace settings can, however, introduce a range of complexities that challenge the potential impact and in-practice desirability of wellbeing sensing technologies. Today, there is an inadequate empirical understanding of how everyday workers---including those who are impacted by, and impact the deployment of workplace technologies--envision its broader socio-ecological impacts. In this study, we conduct storyboard-driven interviews with 33 participants across three stakeholder groups: organizational governors, AI builders, and worker data subjects. Overall, our findings surface how workers envisioned wellbeing sensing technologies may lead to cascading impacts on their broader organizational culture, interpersonal relationships with colleagues, and individual day-to-day lives. Participants anticipated harms arising from ambiguity and misalignment around scaled notions of "worker wellbeing,'' underlying technical limitations to workplace-situated sensing, and assumptions regarding how social structures and relationships may shape the impacts and use of these technologies. Based on our findings, we discuss implications for designing worker-centered data-driven wellbeing technologies.

2023-10-04

Proceedings of the ACM on Human-Computer Interaction (publié)

doi.org

arxiv.org

SUMMIT: Scaffolding Open Source Software Issue Discussion Through Summarization

Saskia Gilmer

Avinash Bhat

Shuvam Shah

Kevin Cherry

Jinghui Cheng

Jin Guo

2023-10-04

Proceedings of the ACM on Human-Computer Interaction (publié)

doi.org

arxiv.org

The neuroanatomical substrates of autism and ADHD and their link to putative genomic underpinnings

Lisa M. Berg

Caroline Gurr

Johanna Leyhausen

Hanna Seelemeyer

Anke Bletsch

Tim Schaefer

Charlotte M. Pretzsch

Beth Oakley

Eva Loth

Dorothea L. Floris

Jan K. Buitelaar

Christian Beckmann

Tobias Banaschewski

Tony Charman

Emily J. H. Jones

Julian Tillmann

Chris H. Chatham

Thomas Bourgeron

Jumana Sara Bonnie Simon Sarah Sven Carsten Michael Daniel Claudia Yvette Bhismadev Ineke Daisy Flavio Guillaume Sarah Jessica Vincent Pilar David Lindsay Hannah Joerg Rosemary Mark H. Prantik Meng-Chuan Xavier Liogier Michael V. David J. René Andre Luke Maarten Andreas Carolin Nico Laurence Marianne Bob Gahan Antonio M. Barbara Amber Jessica Roberto Antonia San José Emily Will Roberto Heike Jack Steve C. R. Caroline Marcel P. Ahmad

Jumana Sara Bonnie Simon Sarah Sven Carsten Michael Danie Ahmad Ambrosino Auyeung Baron-Cohen Baumeister Böl … (voir 58 de plus)

Jumana Ahmad

Sara Ambrosino

Bonnie Auyeung

Simon Baron-Cohen

Sarah Baumeister

Sven Bölte

Carsten Bours

Michael Brammer

Daniel Brandeis

Claudia Brogna

Yvette de Bruijn

Bhismadev Chakrabarti

Ineke Cornelissen

Daisy Crawley

Flavio Dell’Acqua

Guillaume Dumas

Sarah Durston

Jessica Faulkner

Vincent Frouin

Pilar Garcés

David Goyard

Lindsay Ham

Hannah Hayward

Joerg F. Hipp

Rosemary Holt

Mark Johnson

Prantik Kundu

Meng-Chuan Lai

Xavier Liogier D’ardhuy

Michael V. Lombardo

David J. Lythgoe

René Mandl

Andre Marquand

Luke Mason

Maarten Mennes

Andreas Meyer-Lindenberg

Carolin Moessnang

Nico Bast

Laurence O’Dwyer

Marianne Oldehinkel

Bob Oranje

Gahan Pandina

Antonio Persico

Barbara Ruggeri

Amber N. V. Ruigrok

Jessica Sabet

Roberto Sacco

Antonia San José Cáceres

Emily Simonoff

Will Spooren

Roberto Toro

Heike Tost

Jack Waldman

Steve C. R. Williams

Caroline Wooldridge

Marcel P. Zwiers

Declan Murphy

Christine Ecker

2023-10-04

Molecular Autism (publié)

doi.org

Data Cleaning and Machine Learning: A Systematic Literature Review

Pierre-Olivier Cot'e

Amin Nikanjam

Nafisa Ahmed

Dmytro Humeniuk

Foutse Khomh

Context: Machine Learning (ML) is integrated into a growing number of systems for various applications. Because the performance of an ML mod… (voir plus)el is highly dependent on the quality of the data it has been trained on, there is a growing interest in approaches to detect and repair data errors (i.e., data cleaning). Researchers are also exploring how ML can be used for data cleaning; hence creating a dual relationship between ML and data cleaning. To the best of our knowledge, there is no study that comprehensively reviews this relationship. Objective: This paper's objectives are twofold. First, it aims to summarize the latest approaches for data cleaning for ML and ML for data cleaning. Second, it provides future work recommendations. Method: We conduct a systematic literature review of the papers published between 2016 and 2022 inclusively. We identify different types of data cleaning activities with and for ML: feature cleaning, label cleaning, entity matching, outlier detection, imputation, and holistic data cleaning. Results: We summarize the content of 101 papers covering various data cleaning activities and provide 24 future work recommendations. Our review highlights many promising data cleaning techniques that can be further extended. Conclusion: We believe that our review of the literature will help the community develop better approaches to clean data.

2023-10-03

ArXiv (prépublication)

doi.org

arxiv.org

Differential Chromatin Architecture and Risk Variants in Deep Layer Excitatory Neurons and Grey Matter Microglia Contribute to Major Depressive Disorder

Anjali Chawla

Doruk Cakmakci

Wenmin Zhang

Malosree Maitra

Reza Rahimian

Haruka Mitsuhashi

MA Davoli

Jenny Yang

Gary Gang Chen

Ryan Denniston

Deborah Mash

Naguib Mechawar

Matthew Suderman

Yue Li

Corina Nagy

Gustavo Turecki

2023-10-03

bioRxiv (prépublication)

doi.org

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Publications

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Mots-clés populaires:

Publications