Publications

Benchmarking Neural Network Training Algorithms

George Edward Dahl

Frank Schneider

Zachary Nado

Naman Agarwal

Chandramouli Shama Sastry

Philipp Hennig

Sourabh Medapati

Runa Eschenhagen

Priya Kasimbeg

Daniel Suo

Juhan Bae

Justin M. Gilmer

A. L. Peirson

Bilal Muhammad Khan

Rohan Anil

Michael Rabbat

Shankar Krishnan

Daniel Snider

Ehsan Amid

Kongtao Chen … (and 5 more)

Chris J. Maddison

R. Vasudev

Michal Badura

Ankush Garg

Peter Mattson

2023-06-12

ArXiv (preprint)

doi.org

arxiv.org

Harms from Increasingly Agentic Algorithmic Systems

Alan Chan

Rebecca Salganik

Alva Markelius

Chris Pang

Nitarshan Rajkumar

Dmitrii Krasheninnikov

Lauro Langosco

Zhonghao He

Yawen Duan

Micah Carroll

Michelle Lin

Alex Mayhew

Katherine Collins

Maryam Molamohammadi

John Burden

Wanru Zhao

Shalaleh Rismani

Konstantinos Voudouris

Umang Bhatt

Adrian Weller … (and 2 more)

David Scott Krueger

Tegan Maharaj

Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established many sources and forms of algorithmic harm, in domains as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed, typically without strong regulatory barriers, threatening the perpetuation of the same harms and the creation of novel ones. In response, the FATE community has emphasized the importance of anticipating harms, rather than just responding to them. Anticipation of harms is especially important given the rapid pace of developments in machine learning (ML). Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify 4 key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms which arise from increasing agency; notably, these include systemic and/or long-range impacts, often on marginalized or unconsidered stakeholders. We emphasize that recognizing agency of algorithmic systems does not absolve or shift the human responsibility for algorithmic harms. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on implications of our work for anticipating algorithmic harms from emerging systems.

2023-06-12

2023 ACM Conference on Fairness, Accountability, and Transparency (published)

doi.org

arxiv.org

A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods

Tiago Salvador

Kilian FATRAS

Ioannis Mitliagkas

Adam M. Oberman

Unsupervised Domain Adaptation (UDA) aims at classifying unlabeled target images leveraging source labeled ones. In the case of an extreme label shift scenario between the source and target domains, where we have extra source classes not present in the target domain, the UDA problem becomes a harder problem called Partial Domain Adaptation (PDA). While different methods have been developed to solve the PDA problem, most successful algorithms use model selection strategies that rely on target labels to find the best hyper-parameters and/or models along training. These strategies violate the main assumption in PDA: only unlabeled target domain samples are available. In addition, there are also experimental inconsistencies between developed methods (different architectures, hyper-parameter tuning, number of runs), yielding unfair comparisons. The main goal of this work is to provide a realistic evaluation of PDA methods under different model selection strategies and a consistent evaluation protocol. We evaluate 6 state-of-the-art PDA algorithms on 2 different real-world datasets using 7 different model selection strategies. Our two main findings are: (i) without target labels for model selection, the accuracy of the methods decreases by up to 30 percentage points; (ii) only one method and model selection pair performs well on both datasets. Experiments were performed with our PyTorch framework, BenchmarkPDA, which we open source.

2023-06-11

TMLR (accepted)

doi.org

openreview.net

Conditions for indexability of restless bandits and an algorithm to compute whittle index – CORRIGENDUM

Nima Akbarzadeh

Aditya Mahajan

2023-06-09

Advances in Applied Probability (published)

doi.org

Distinctive whole-brain cell types predict tissue damage patterns in thirteen neurodegenerative conditions

Veronika Pak

Quadri Adewale

Danilo Bzdok

Mahsa Dadar

Yashar Zeighami

Yasser Iturria-Medina

For over a century, brain research narrative has mainly centered on neuron cells. Accordingly, most neurodegenerative studies focus on neuronal dysfunction and their selective vulnerability, while we lack comprehensive analyses of other major cell types’ contribution. By unifying spatial gene expression, structural MRI, and cell deconvolution, here we describe how the human brain distribution of canonical cell types extensively predicts tissue damage in thirteen neurodegenerative conditions, including early- and late-onset Alzheimer’s disease, Parkinson’s disease, dementia with Lewy bodies, amyotrophic lateral sclerosis, mutations in presenilin-1, and three clinical variants of frontotemporal lobar degeneration (behavioural variant, semantic and non-fluent primary progressive aphasia) along with associated 3-repeat and 4-repeat tauopathies and TDP43 proteinopathies types A and C. We reconstructed comprehensive whole-brain reference maps of cellular abundance for six major cell types and identified characteristic axes of spatial overlapping with atrophy. Our results support the strong mediating role of non-neuronal cells, primarily microglia and astrocytes, in spatial vulnerability to tissue loss in neurodegeneration, with distinct and shared across-disorders pathomechanisms. These observations provide critical insights into the multicellular pathophysiology underlying spatiotemporal advance in neurodegeneration. Notably, they also emphasize the need to exceed the current neuro-centric view of brain diseases, supporting the imperative for cell-specific therapeutic targets in neurodegeneration.

2023-06-09

bioRxiv (preprint)

doi.org

Robust Data-driven Prescriptiveness Optimization

Mehran Poursoltani

Érick Delage

Angelos Georghiou

The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information to provide more anticipative decisions. The wide range of methods and contexts of application have motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.

2023-06-09

ArXiv (preprint)

doi.org

arxiv.org

Value function estimation using conditional diffusion models for control

Bogdan Mazoure

Walter Talbott

Miguel Ángel Bautista

(Rex) Devon Hjelm

Alexander T Toshev

Joshua M. Susskind

2023-06-09

ArXiv (preprint)

doi.org

openreview.net

Dynamic Routing and Wavelength Assignment with Reinforcement Learning

Peyman Kafaei

Quentin Cappart

Nicolas Chapados

Hamed Pouya

Louis-Martin Rousseau

With the rapid developments in communication systems, and considering their dynamic nature, all-optical networks are becoming increasingly complex. This study proposes a novel method based on deep reinforcement learning for the routing and wavelength assignment problem in all-optical wavelength-division-multiplexing networks. We consider dynamic incoming requests whose arrival and holding times are not known in advance. The objective is to devise a strategy that minimizes the number of rejected packages due to the lack of resources in the long term. We use graph neural networks to capture crucial latent information from the graph-structured input to develop the optimal strategy. The proposed deep reinforcement learning algorithm selects a route and a wavelength simultaneously for each incoming traffic connection as it arrives. The results demonstrate that the learned agent outperforms the methods used in practice and can be generalized on network topologies that did not participate in training.

2023-06-08

INFORMS Journal on Optimization (published)

doi.org

Invariant Causal Set Covering Machines

Thibaud Godon

Baptiste Bauvin

Pascal Germain

Jacques Corbeil

Alexandre Drouin

2023-06-07

ArXiv (preprint)

doi.org

arxiv.org

Beyond Gaussian Noise: A Generalized Approach to Likelihood Analysis with Non-Gaussian Noise

Ronan Legin

Alexandre Adam

Yashar Hezaveh

Laurence Perreault-Levasseur

2023-06-06

The Astrophysical Journal Letters (published)

doi.org

arxiv.org

A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection

Eduardo Dadalto Câmara Gomes

Pierre Colombo

Guillaume Staerman

Nathan Noiry

Pablo Piantanida

2023-06-06

ArXiv (preprint)

doi.org

arxiv.org

The Stack: 3 TB of permissively licensed source code

Denis Kocetkov

Raymond Li

Loubna Ben allal

Jia LI

Chenghao Mou

Carlos Muñoz Ferrandis

Yacine Jernite

Margaret Mitchell

Sean Hughes

Thomas Wolf

Dzmitry Bahdanau

Leandro Von Werra

Harm de Vries

Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI), not only for natural language processing but also for code understanding and generation. To stimulate open and responsible research on LLMs for code, we introduce The Stack, a 3.1 TB dataset consisting of permissively licensed source code in 30 programming languages. We describe how we collect the full dataset, construct a permissively licensed subset, present a data governance plan, discuss limitations, and show promising results on text2code benchmarks by training 350M-parameter decoders on different Python subsets. We find that (1) near-deduplicating the data significantly boosts performance across all experiments, and (2) it is possible to match previously reported HumanEval and MBPP performance using only permissively licensed data. We make the dataset available at https://hf.co/BigCode, provide a tool called "Am I in The Stack" (https://hf.co/spaces/bigcode/in-the-stack) for developers to search The Stack for copies of their code, and provide a process for code to be removed from the dataset by following the instructions at https://www.bigcode-project.org/docs/about/the-stack/.

2023-06-06

TMLR (accepted)

doi.org

openreview.net
