Publications

Benchmarking Neural Network Training Algorithms
George E. Dahl
Frank Schneider
Zachary Nado
Naman Agarwal
Chandramouli Shama Sastry
Philipp Hennig
Sourabh Medapati
Runa Eschenhagen
Priya Kasimbeg
Daniel Suo
Juhan Bae
Justin M. Gilmer
A. L. Peirson
Bilal Muhammad Khan
Rohan Anil
Michael G. Rabbat
Shankar Krishnan
Daniel Snider
Ehsan Amid
Kongtao Chen
Chris J. Maddison
R. Vasudev
Michal Badura
Ankush Garg
Peter Mattson
Harms from Increasingly Agentic Algorithmic Systems
Alva Markelius
Chris Pang
Dmitrii Krasheninnikov
Lauro Langosco
Zhonghao He
Yawen Duan
Micah Carroll
Alex Mayhew
Katherine Collins
John Burden
Wanru Zhao
Konstantinos Voudouris
Umang Bhatt
Adrian Weller …
David Krueger
Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established many sources and forms of algorithmic harm, in domains as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed which threaten the perpetuation of the same harms and the creation of novel ones. In response, the FATE community has emphasized the importance of anticipating harms. Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify 4 key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms which arise from increasing agency -- notably, these include systemic and/or long-range impacts, often on marginalized stakeholders. We emphasize that recognizing the agency of algorithmic systems does not absolve or shift the human responsibility for algorithmic harms. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on the implications of our work for anticipating algorithmic harms from emerging systems.
A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods
Unsupervised Domain Adaptation (UDA) aims at classifying unlabeled target images leveraging source labeled ones. In the case of an extreme label shift scenario between the source and target domains, where we have extra source classes not present in the target domain, the UDA problem becomes a harder problem called Partial Domain Adaptation (PDA). While different methods have been developed to solve the PDA problem, most successful algorithms use model selection strategies that rely on target labels to find the best hyper-parameters and/or models along training. These strategies violate the main assumption in PDA: only unlabeled target domain samples are available. In addition, there are also experimental inconsistencies between developed methods - different architectures, hyper-parameter tuning, number of runs - yielding unfair comparisons. The main goal of this work is to provide a realistic evaluation of PDA methods under different model selection strategies and a consistent evaluation protocol. We evaluate 6 state-of-the-art PDA algorithms on 2 different real-world datasets using 7 different model selection strategies. Our two main findings are: (i) without target labels for model selection, the accuracy of the methods decreases up to 30 percentage points; (ii) only one method and model selection pair performs well on both datasets. Experiments were performed with our PyTorch framework, BenchmarkPDA, which we open source.
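To make the model-selection issue concrete, here is a minimal sketch of one common label-free criterion: ranking trained checkpoints by mean prediction entropy on unlabeled target data, with lower entropy taken as better. This illustrates the general idea only; the seven strategies the paper actually evaluates and the BenchmarkPDA API may differ, and `model`, `checkpoints`, and `target_loader` are hypothetical names.
```python
# Minimal sketch of a label-free model-selection criterion for PDA:
# score each checkpoint by mean prediction entropy on unlabeled target data.
# `model` and `target_loader` are hypothetical, not BenchmarkPDA API.
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_target_entropy(model, target_loader, device="cpu"):
    model.eval()
    total, n = 0.0, 0
    for x in target_loader:  # yields unlabeled target batches; labels never used
        probs = F.softmax(model(x.to(device)), dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
        total += entropy.sum().item()
        n += x.size(0)
    return total / n

# checkpoints = [model_epoch_1, model_epoch_2, ...]
# best = min(checkpoints, key=lambda m: mean_target_entropy(m, target_loader))
```
Entropy-based selection favors checkpoints that are confident on the target domain, which can fail when a model is confidently wrong; this fragility is exactly why comparing selection strategies, as the paper does, matters.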
Conditions for indexability of restless bandits and an algorithm to compute Whittle index – CORRIGENDUM
Distinctive Whole-brain Cell-Types Predict Tissue Damage Patterns in Thirteen Neurodegenerative Conditions
Veronika Pak
Quadri Adewale
Mahsa Dadar
Yashar Zeighami
Yasser Iturria-Medina
For over a century, the brain research narrative has mainly centered on neurons. Accordingly, most whole-brain neurodegenerative studies focus on neuronal dysfunction and selective vulnerability, while we lack comprehensive analyses of the contributions of other major cell types. By unifying spatial gene expression, structural MRI, and cell deconvolution, here we describe how the human brain distribution of canonical cell-types extensively predicts tissue damage in eleven neurodegenerative disorders, including early- and late-onset Alzheimer's disease, Parkinson's disease, dementia with Lewy bodies, amyotrophic lateral sclerosis, frontotemporal dementia, and tauopathies. We reconstructed comprehensive whole-brain reference maps of cellular abundance for six major cell-types and identified characteristic axes of spatial overlap with atrophy. Our results support the strong mediating role of non-neuronal cells, primarily microglia and astrocytes, in spatial vulnerability to tissue loss in neurodegeneration, with pathomechanisms both distinct to and shared across disorders. These observations provide critical insights into the multicellular pathophysiology underlying spatiotemporal advance in neurodegeneration. Notably, they also emphasize the need to move beyond the current neuro-centric view of brain diseases, supporting the imperative for cell-specific therapeutic targets in neurodegeneration.
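As a toy illustration of the core statistical idea -- predicting regional tissue damage from regional cell-type abundance -- the sketch below regresses an atrophy map on six cell-type abundance maps sampled at the same atlas regions. All data and names here are synthetic stand-ins, not the paper's deconvolution pipeline or imaging data.
```python
# Illustrative sketch (not the paper's pipeline): regress regional atrophy
# on six cell-type abundance maps evaluated at the same brain regions.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_regions, n_cell_types = 100, 6                     # atlas regions x cell types
abundance = rng.random((n_regions, n_cell_types))    # stand-in for deconvolved maps
atrophy = rng.random(n_regions)                      # stand-in for MRI-derived damage

model = LinearRegression()
r2 = cross_val_score(model, abundance, atrophy, cv=5, scoring="r2")
model.fit(abundance, atrophy)
# Coefficients hint at which cell types (e.g., microglia, astrocytes)
# covary most with tissue loss; sign and magnitude would vary per disorder.
print(r2.mean(), model.coef_)
```
On real data, the fitted coefficients and cross-validated fit would quantify how much each cell-type map explains a disorder's atrophy pattern, which is the kind of mediation the abstract reports for microglia and astrocytes.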
Value function estimation using conditional diffusion models for control
Walter Talbott
Miguel Ángel Bautista
R Devon Hjelm
Alexander T Toshev
Joshua M. Susskind
Invariant Causal Set Covering Machines
Baptiste Bauvin
Pascal Germain
J. Corbeil
A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection
Eduardo Dadalto Câmara Gomes
Pierre Colombo
Guillaume Staerman
Nathan Noiry
The Stack: 3 TB of permissively licensed source code
Denis Kocetkov
Raymond Li
Loubna Ben Allal
Jia Li
Chenghao Mou
Carlos Muñoz Ferrandis
Yacine Jernite
Margaret Mitchell
Sean Hughes
Thomas Wolf
Leandro von Werra
Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI) -- not only for natural language processing but also for code understanding and generation. To stimulate open and responsible research on LLMs for code, we introduce The Stack, a 3.1 TB dataset consisting of permissively licensed source code in 30 programming languages. We describe how we collect the full dataset, construct a permissively licensed subset, present a data governance plan, discuss limitations, and show promising results on text2code benchmarks by training 350M-parameter decoders on different Python subsets. We find that (1) near-deduplicating the data significantly boosts performance across all experiments, and (2) it is possible to match previously reported HumanEval and MBPP performance using only permissively licensed data. We make the dataset available at https://hf.co/BigCode, provide a tool called "Am I in The Stack" (https://hf.co/spaces/bigcode/in-the-stack) for developers to search The Stack for copies of their code, and provide a process for code to be removed from the dataset by following the instructions at https://www.bigcode-project.org/docs/about/the-stack/.
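Since the abstract credits near-deduplication for consistent performance gains, here is a minimal sketch of MinHash-based near-deduplication using the `datasketch` library. This shows the general technique only; The Stack's actual pipeline, tokenization, and similarity threshold may well differ.
```python
# Minimal near-deduplication sketch using MinHash LSH (datasketch).
# The Stack's real pipeline and thresholds may differ; this shows the idea.
from datasketch import MinHash, MinHashLSH

def minhash(text, num_perm=128):
    m = MinHash(num_perm=num_perm)
    for token in set(text.split()):        # crude tokenization, for illustration
        m.update(token.encode("utf-8"))
    return m

def near_dedup(files, threshold=0.85, num_perm=128):
    """files: dict of path -> source text. Returns the paths to keep."""
    lsh = MinHashLSH(threshold=threshold, num_perm=num_perm)
    keep = []
    for path, text in files.items():
        m = minhash(text, num_perm)
        if not lsh.query(m):               # no near-duplicate indexed so far
            lsh.insert(path, m)
            keep.append(path)
    return keep

files = {
    "a.py": "def add(a, b):\n    return a + b\n",
    "b.py": "def add(a, b):\n    return a + b\n",   # verbatim copy of a.py
    "c.py": "class Tree:\n    pass\n",
}
print(near_dedup(files))   # -> ['a.py', 'c.py']
```
LSH keeps the pass roughly linear in the number of files, which is what makes this style of dedup feasible at multi-terabyte scale.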
AHA!: Facilitating AI Impact Assessment by Generating Examples of Harms
Zana Buçinca
Chau Minh Pham
Maurice Jakesch
Marco Túlio Ribeiro
Alexandra Olteanu
Saleema Amershi
While demands for change and accountability for harmful AI consequences mount, foreseeing the downstream effects of deploying AI systems remains a challenging task. We developed AHA! (Anticipating Harms of AI), a generative framework to assist AI practitioners and decision-makers in anticipating potential harms and unintended consequences of AI systems prior to development or deployment. Given an AI deployment scenario, AHA! generates descriptions of possible harms for different stakeholders. To do so, AHA! systematically considers the interplay between common problematic AI behaviors as well as their potential impacts on different stakeholders, and narrates these conditions through vignettes. These vignettes are then filled in with descriptions of possible harms by prompting crowd workers and large language models. By examining 4113 harms surfaced by AHA! for five different AI deployment scenarios, we found that AHA! generates meaningful examples of harms, with different problematic AI behaviors resulting in different types of harms. Prompting both crowds and a large language model with the vignettes resulted in more diverse examples of harms than those generated by either the crowd or the model alone. To gauge AHA!'s potential practical utility, we also conducted semi-structured interviews with responsible AI professionals (N=9). Participants found AHA!'s systematic approach to surfacing harms important for ethical reflection and discovered meaningful stakeholders and harms they believed they would not have thought of otherwise. Participants, however, differed in their opinions about whether AHA! should be used upfront or as a secondary check, and noted that AHA! may shift harm anticipation from an ideation problem to a potentially demanding review problem. Drawing on our results, we discuss design implications of building tools to help practitioners envision possible harms.
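The following sketch illustrates the vignette-filling structure the abstract describes: crossing problematic AI behaviors with stakeholders to produce harm-elicitation prompts. The behaviors, stakeholders, and template here are invented for illustration and are not AHA!'s actual taxonomy; dispatching each prompt to crowd workers or an LLM is left abstract.
```python
# Sketch of AHA!-style vignette prompting. Structure inferred from the
# abstract; all lists and the template below are hypothetical placeholders.
from itertools import product

BEHAVIORS = [
    "produces less accurate outputs for minority dialects",
    "is overconfident in ambiguous cases",
]
STAKEHOLDERS = ["job applicants", "content moderators"]

VIGNETTE = ("An AI system used for {scenario} {behavior}. "
            "Describe a concrete harm this could cause for {stakeholder}.")

def generate_harm_prompts(scenario):
    # One vignette per (behavior, stakeholder) pair, mirroring the
    # systematic crossing the abstract describes.
    for behavior, stakeholder in product(BEHAVIORS, STAKEHOLDERS):
        yield VIGNETTE.format(scenario=scenario, behavior=behavior,
                              stakeholder=stakeholder)

for prompt in generate_harm_prompts("resume screening"):
    print(prompt)   # each prompt would go to crowd workers or an LLM
```
The value of the crossing is coverage: every behavior is considered against every stakeholder, which is how a tool like this surfaces harms a practitioner might not think of unprompted.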
PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations
Louis Fortier-Dubois
Gaël Letarte
François Laviolette
Pascal Germain
Spatial variations in aromatic hydrocarbon emission in a dust-rich galaxy
Justin S. Spilker
Kedar A. Phadke
Manuel Aravena
Melanie Archipley
Matthew B. Bayliss
Jack E. Birkin
Matthieu Béthermin
James Burgoyne
Jared Cathey
Scott C. Chapman
Håkon Dahle
Anthony H. Gonzalez
Gayathri Gururajan
Christopher C. Hayward
Yashar D. Hezaveh
Ryley Hill
Taylor A. Hutchison
Keunho J. Kim
Seonwoo Kim
David Law
Matthew A. Malkan
Daniel P. Marrone
Eric J. Murphy
Desika Narayanan
Alex Navarre
Grace M. Olivier
Jeffrey A. Rich
Jane R. Rigby
Cassie Reuter
James E. Rhoads
Keren Sharon
J.D. T. Smith
Manuel Solimano
Nikolaus Sulzenauer
Joaquin D. Vieira
Axel Weiß
Katherine E. Whitaker
Dust grains absorb half of the radiation emitted by stars throughout the history of the universe, re-emitting this energy at infrared wavelengths. Polycyclic aromatic hydrocarbons (PAHs) are large organic molecules that trace millimeter-size dust grains and regulate the cooling of the interstellar gas within galaxies. Observations of PAH features in very distant galaxies have been difficult due to the limited sensitivity and wavelength coverage of previous infrared telescopes. Here we present JWST observations that detect the 3.3 μm PAH feature in a galaxy observed less than 1.5 billion years after the Big Bang. The high equivalent width of the PAH feature indicates that star formation, rather than black hole accretion, dominates the infrared emission throughout the galaxy. The light from PAH molecules, from large dust grains, and from stars and hot dust traces spatially distinct regions, leading to order-of-magnitude variations in the PAH equivalent width and the ratio of PAH to total infrared luminosity across the galaxy. The spatial variations we observe suggest either a physical offset between the PAHs and large dust grains or wide variations in the local ultraviolet radiation field. Our observations demonstrate that differences in the emission from PAH molecules and large dust grains are a complex result of localized processes within early galaxies.
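For readers unfamiliar with the term, the equivalent width the abstract relies on measures a spectral feature's strength relative to the underlying continuum: EW = ∫ (F_λ − F_cont) / F_cont dλ. The toy computation below uses a synthetic Gaussian feature at 3.3 μm, not the actual JWST spectrum.
```python
# Equivalent width (EW) of an emission feature relative to its continuum:
# EW = integral of (F_lambda - F_cont) / F_cont over wavelength.
# Synthetic spectrum for illustration only, not the JWST data.
import numpy as np

wave = np.linspace(3.1, 3.5, 400)                        # wavelength, microns
cont = np.full_like(wave, 1.0)                           # flat continuum (toy)
feature = 0.8 * np.exp(-0.5 * ((wave - 3.3) / 0.02)**2)  # Gaussian PAH band
flux = cont + feature

integrand = (flux - cont) / cont
# trapezoidal integration, written out for NumPy-version portability
ew = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(wave))
print(f"EW = {ew:.4f} micron")  # larger EW -> feature dominates the continuum
```
Because an accreting black hole boosts the hot-dust continuum without adding PAH emission, it drives the EW down; a high EW is therefore the abstract's signature of star formation dominating the infrared output.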