Publications

Block-State Transformers

2023-06-15

ArXiv (prépublication)

arxiv.org

Block-State Transformers

2023-06-15

ArXiv (prépublication)

arxiv.org

Block-State Transformers

2023-06-15

ArXiv (prépublication)

arxiv.org

Block-State Transformers

State space models (SSMs) have shown impressive results on tasks that require modeling long-range dependencies and efficiently scale to long… (voir plus) sequences owing to their subquadratic runtime complexity. Originally designed for continuous signals, SSMs have shown superior performance on a plethora of tasks, in vision and audio; however, SSMs still lag Transformer performance in Language Modeling tasks. In this work, we propose a hybrid layer named Block-State Transformer (BST), that internally combines an SSM sublayer for long-range contextualization, and a Block Transformer sublayer for short-term representation of sequences. We study three different, and completely parallelizable, variants that integrate SSMs and block-wise attention. We show that our model outperforms similar Transformer-based architectures on language modeling perplexity and generalizes to longer sequences. In addition, the Block-State Transformer demonstrates more than tenfold increase in speed at the layer level compared to the Block-Recurrent Transformer when model parallelization is employed.

2023-06-15

ArXiv (prépublication)

arxiv.org

GEANT4-DNA simulation of temperature-dependent and pH-dependent yields of chemical radiolytic species

Jingyi Bian

Juan Duran

Wook-Geun Shin

Jose Ramos-Méndez

Jack C Sankey

Lilian Childress

Jan Seuntjens

Shirin A. Enger

2023-06-15

Physics in Medicine & Biology (publié)

doi.org

A solution algorithm for chance-constrained problems with integer second-stage recourse decisions

Andrea Lodi

Enrico Malaguti

Michele Monaci

Giacomo Nannicini

Paolo

Paronuzzi

2023-06-15

Mathematical programming (publié)

doi.org

A2CiD2: Accelerating Asynchronous Communication in Decentralized Deep Learning

Adel Nabli

Eugene Belilovsky

Edouard Oyallon

2023-06-14

ArXiv (prépublication)

doi.org

arxiv.org

Who Controlled the Evidence? Question Answering for Disclosure Information Extraction

Hardy Hardy

Derek Ruths

Nicholas B King

Conflict of interest (COI) disclosure statements provide rich information to support trans-parency and reduce bias in research. We introduce… (voir plus) a novel task to identify relationships between sponsoring entities and the research studies they sponsor from the disclosure statement. This task is challenging due to the complexity of recognizing all potential relationship patterns and the hierarchical nature of identifying entities first and then extracting their relationships to the study. To overcome these challenges, in this paper, we also constructed a new annotated dataset and proposed a Question Answering-based method to recognize entities and extract relationships. Our method has demonstrated robustness in handling diverse relationship patterns, and it remains effective even when trained on a low-resource dataset.

2023-06-13

Proceedings of the Conference on Health, Inference, and Learning (publié)

proceedings.mlr.press

Benchmarking Neural Network Training Algorithms

George E. Dahl

Frank Schneider

Zachary Nado

Naman Agarwal

Chandramouli Shama Sastry

Philipp Hennig

Sourabh Medapati

Runa Eschenhagen

Priya Kasimbeg

Daniel Suo

Juhan Bae

Justin M. Gilmer

A. L. Peirson

Bilal Muhammad Khan

Rohan Anil

Michael Rabbat

Shankar Krishnan

Daniel Snider

Ehsan Amid

Kongtao Chen … (voir 5 de plus)

Chris J. Maddison

R. Vasudev

Michal Badura

Ankush Garg

Peter Mattson

2023-06-12

ArXiv (prépublication)

doi.org

arxiv.org

Harms from Increasingly Agentic Algorithmic Systems

Alan Chan

Rebecca Salganik

Alva Markelius

Chris Pang

Nitarshan Rajkumar

Dmitrii Krasheninnikov

Lauro Langosco

Zhonghao He

Yawen Duan

Micah Carroll

Michelle Lin

Alex Mayhew

Katherine Collins

Maryam Molamohammadi

John Burden

Wanru Zhao

Shalaleh Rismani

Konstantinos Voudouris

Umang Bhatt

Adrian Weller … (voir 2 de plus)

David Scott Krueger

Tegan Maharaj

Research in Fairness, Accountability, Transparency, and Ethics (FATE)1 has established many sources and forms of algorithmic harm, in domain… (voir plus)s as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed, typically without strong regulatory barriers, threatening the perpetuation of the same harms and the creation of novel ones. In response, the FATE community has emphasized the importance of anticipating harms, rather than just responding to them. Anticipation of harms is especially important given the rapid pace of developments in machine learning (ML). Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify 4 key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms which arise from increasing agency – notably, these include systemic and/or long-range impacts, often on marginalized or unconsidered stakeholders. We emphasize that recognizing agency of algorithmic systems does not absolve or shift the human responsibility for algorithmic harms. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on implications of our work for anticipating algorithmic harms from emerging systems.

2023-06-12

2023 ACM Conference on Fairness, Accountability, and Transparency (publié)

doi.org

arxiv.org

A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods

Unsupervised Domain Adaptation (UDA) aims at classifying unlabeled target images leveraging source labeled ones. In the case of an extreme l… (voir plus)abel shift scenario between the source and target domains, where we have extra source classes not present in the target domain, the UDA problem becomes a harder problem called Partial Domain Adaptation (PDA). While different methods have been developed to solve the PDA problem, most successful algorithms use model selection strategies that rely on target labels to find the best hyper-parameters and/or models along training. These strategies violate the main assumption in PDA: only unlabeled target domain samples are available. In addition, there are also experimental inconsistencies between developed methods - different architectures, hyper-parameter tuning, number of runs - yielding unfair comparisons. The main goal of this work is to provide a realistic evaluation of PDA methods under different model selection strategies and a consistent evaluation protocol. We evaluate 6 state-of-the-art PDA algorithms on 2 different real-world datasets using 7 different model selection strategies. Our two main findings are: (i) without target labels for model selection, the accuracy of the methods decreases up to 30 percentage points; (ii) only one method and model selection pair performs well on both datasets. Experiments were performed with our PyTorch framework, BenchmarkPDA, which we open source.

2023-06-11

TMLR (accepté)

doi.org

openreview.net

Conditions for indexability of restless bandits and an algorithm to compute whittle index – CORRIGENDUM

Nima Akbarzadeh

Aditya Mahajan

2023-06-09

Advances in Applied Probability (publié)

doi.org