David Scott Krueger

Core Academic Member

Assistant professor, Université de Montréal, Department of Computer Science and Operations Research (DIRO)

Biography

David Krueger is an Assistant Professor in Robust, Reasoning and Responsible AI in the Department of Computer Science and Operations Research (DIRO) at the University of Montreal, and a Core Academic Member at Mila - Quebec Artificial Intelligence Institute, UC Berkeley's Center for Human-Compatible AI (CHAI), and the Center for the Study of Existential Risk (CSER). His work focuses on reducing the risk of human extinction from artificial intelligence (AI x-risk) through technical research as well as education, outreach, governance and advocacy.

His research spans many areas of Deep Learning, AI Alignment, AI Safety and AI Ethics, including alignment failure modes, algorithmic manipulation, interpretability, robustness, and understanding how AI systems learn and generalize. He has been featured in media outlets including ITV's Good Morning Britain, Al Jazeera's Inside Story, France 24, New Scientist and the Associated Press.

David completed his graduate studies at the University of Montreal and Mila - Quebec Artificial Intelligence Institute, working with Yoshua Bengio, Roland Memisevic, and Aaron Courville.

Current Students

Alan Chan

PhD - Université de Montréal

Principal supervisor:

Nicolas Le Roux

alan.chan@mila.quebec

Website

Publications

Towards Out-of-Distribution Adversarial Robustness

Adam Ibrahim

Charles Guille-Escuret

Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fails to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different…

2023-06-20

ICML.cc/2023/Workshop/AdvML-Frontiers (published)

doi.org

openreview.net

Harms from Increasingly Agentic Algorithmic Systems

Alan Chan

Rebecca Salganik

Alva Markelius

Chris Pang

Nitarshan Rajkumar

Dmitrii Krasheninnikov

Lauro Langosco

Zhonghao He

Yawen Duan

Micah Carroll

Michelle Lin

Alex Mayhew

Katherine Collins

Maryam Molamohammadi

John Burden

Wanru Zhao

Shalaleh Rismani

Konstantinos Voudouris

Umang Bhatt

Adrian Weller

David Scott Krueger

Tegan Maharaj

Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established many sources and forms of algorithmic harm, in domains as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed, typically without strong regulatory barriers, threatening the perpetuation of the same harms and the creation of novel ones. In response, the FATE community has emphasized the importance of anticipating harms, rather than just responding to them. Anticipation of harms is especially important given the rapid pace of developments in machine learning (ML). Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify 4 key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms which arise from increasing agency – notably, these include systemic and/or long-range impacts, often on marginalized or unconsidered stakeholders. We emphasize that recognizing agency of algorithmic systems does not absolve or shift the human responsibility for algorithmic harms. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on implications of our work for anticipating algorithmic harms from emerging systems.

2023-06-12

2023 ACM Conference on Fairness, Accountability, and Transparency (published)

doi.org

arxiv.org

The Flag and the Cross: White Christian Nationalism and the Threat to American Democracy by Philip S. Gorski and Samuel L. Perry (review)

David Scott Krueger

2023-04-21

(published)

www.semanticscholar.org

Investigating the Nature of 3D Generalization in Deep Neural Networks

Shoaib Ahmed Siddiqui

David Scott Krueger

Thomas M. Breuel

2023-04-19

ArXiv (preprint)

doi.org

arxiv.org

Facing AI extinction

David Scott Krueger

2023-04-01

New Scientist (published)

doi.org

Out-of-context Meta-learning in Large Language Models

Dmitrii Krasheninnikov

Egor Krasheninnikov

David Scott Krueger

Brown et al. (2020) famously introduced the phenomenon of in-context meta-learning in large language models (LLMs). Our work establishes the existence of a phenomenon we call out-of-context meta-learning via carefully designed synthetic experiments with large language models. We argue that out-of-context meta-learning is an important and surprising capability of LLMs, which may lead them to more readily "internalize" the semantic content of text that is, or appears to be, broadly useful (such as true statements, or text from authoritative sources) and apply it in appropriate contexts. We also raise the question of how this phenomenon emerges, and discuss two possible explanations: one relying on the way LLMs store knowledge in their parameters, and another suggesting that the implicit gradient alignment bias of gradient-descent-based methods may be responsible. Finally, we reflect on what our results might imply about capabilities of future AI systems, and discuss potential risks.

2023-03-04

ICLR.cc/2023/Workshop/ME-FoMo (poster)

openreview.net

Blockwise self-supervised learning with Barlow Twins

Shoaib Ahmed Siddiqui

David Scott Krueger

Yann LeCun

Stephane Deny

Current state-of-the-art deep networks are all powered by backpropagation. In this paper, we explore alternatives to full backpropagation in the form of blockwise learning rules, leveraging the latest developments in self-supervised learning. Notably, we show that a blockwise pretraining procedure consisting of training independently the 4 main blocks of layers of a ResNet-50 with Barlow Twins loss function at each block performs almost as well as end-to-end backpropagation on ImageNet: a linear probe trained on top of our blockwise pretrained model obtains a top-1 classification accuracy of 70.48%, only 1.1% below the accuracy of an end-to-end pretrained network (71.57% accuracy). We perform extensive experiments to understand the impact of different components within our method and explore a variety of adaptations of self-supervised learning to the blockwise paradigm, building an exhaustive understanding of the critical avenues for scaling local learning rules to large networks, with implications ranging from hardware design to neuroscience.

2023-02-01

ICLR.cc/2023/Conference (rejected)

openreview.net

Broken Neural Scaling Laws

Ethan Caballero

Kshitij Gupta

Irina Rish

David Scott Krueger

We present a smoothly broken power law functional form (that we refer to as a Broken Neural Scaling Law (BNSL)) that accurately models and extrapolates the scaling behaviors of deep neural networks (i.e. how the evaluation metric of interest varies as amount of compute used for training (or inference), number of model parameters, training dataset size, model input size, number of training steps, or upstream performance varies) for various architectures and for each of various tasks within a large and diverse set of upstream and downstream tasks, in zero-shot, prompted, and finetuned settings. This set includes large-scale vision, language, audio, video, diffusion, generative modeling, multimodal learning, contrastive learning, AI alignment, AI capabilities, robotics, out-of-distribution (OOD) generalization, continual learning, transfer learning, uncertainty estimation / calibration, OOD detection, adversarial robustness, distillation, sparsity, retrieval, quantization, pruning, fairness, molecules, computer programming/coding, math word problems, "emergent phase transitions", arithmetic, supervised learning, unsupervised/self-supervised learning, and reinforcement learning (single agent and multi-agent). When compared to other functional forms for neural scaling, this functional form yields extrapolations of scaling behavior that are considerably more accurate on this set. Moreover, this functional form accurately models and extrapolates scaling behavior that other functional forms are incapable of expressing such as the nonmonotonic transitions present in the scaling behavior of phenomena such as double descent and the delayed, sharp inflection points present in the scaling behavior of tasks such as arithmetic. Lastly, we use this functional form to glean insights about the limit of the predictability of scaling behavior. Code is available at https://github.com/ethancaballero/broken_neural_scaling_laws

2023-02-01

ICLR.cc/2023/Conference (poster)

doi.org

openreview.net

Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics

Shoaib Ahmed Siddiqui

Nitarshan Rajkumar

Tegan Maharaj

David Scott Krueger

Sara Hooker

Modern machine learning research relies on relatively few carefully curated datasets. Even in these datasets, and typically in 'untidy' or raw data, practitioners are faced with significant issues of data quality and diversity which can be prohibitively labor intensive to address. Existing methods for dealing with these challenges tend to make strong assumptions about the particular issues at play, and often require a priori knowledge or metadata such as domain labels. Our work is orthogonal to these methods: we instead focus on providing a unified and efficient framework for Metadata Archaeology -- uncovering and inferring metadata of examples in a dataset. We curate different subsets of data that might exist in a dataset (e.g. mislabeled, atypical, or out-of-distribution examples) using simple transformations, and leverage differences in learning dynamics between these probe suites to infer metadata of interest. Our method is on par with far more sophisticated mitigation methods across different tasks: identifying and correcting mislabeled examples, classifying minority-group samples, prioritizing points relevant for training and enabling scalable human auditing of relevant examples.

2023-02-01

ICLR.cc/2023/Conference (notable)

doi.org

openreview.net

Mechanistic Mode Connectivity

Ekdeep Singh Lubana

Eric J Bigelow

Robert P. Dick

David Scott Krueger

Hidenori Tanaka

We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved via training on a dataset are connected via simple paths of low loss. Specifically, we ask the following question: are minimizers that rely on different mechanisms for making their predictions connected via simple paths of low loss? We provide a definition of mechanistic similarity as shared invariances to input transformations and demonstrate that lack of linear connectivity between two models implies they use dissimilar mechanisms for making their predictions. Relevant to practice, this result helps us demonstrate that naive fine-tuning on a downstream dataset can fail to alter a model's mechanisms, e.g., fine-tuning can fail to eliminate a model's reliance on spurious attributes. Our analysis also motivates a method for targeted alteration of a model's mechanisms, named connectivity-based fine-tuning (CBFT), which we analyze using several synthetic datasets for the task of reducing a model's reliance on spurious attributes.

2023-01-01

ICML (published)

doi.org

openreview.net

Noisy Pairing and Partial Supervision for Stylized Opinion Summarization

Reinald Kim

Mirella Lapata

David Scott Krueger

Emmanuel Bengio

Maxinder S. Kanwal

Tegan Maharaj

Asja Fischer

Aaron Courville

Somnath Basu Roy Chowdhury

Chao Zhao

Tanya Goyal

Junyi Jessy Li

Jiacheng Xu

Ivor Wai-hung Tsang

James T. Kwok

Neil Houlsby

Andrei Giurgiu

Stanisław Jastrzębski

Bruna Morrone

Quentin de Laroussilhe

Mona Gesmundo

Attariyan Sylvain

Gelly

Thomas Wolf

Lysandre Debut

Victor Sanh

Julien Chaumond

Clement Delangue

Anthony Moi

Pierric Cistac

Tim Rault

Rémi Louf

Morgan Funtowicz

Joe Davison

Sam Shleifer

Patrick von Platen

Clara Ma

Yacine Jernite

Julien Plu

Canwen Xu

Opinion summarization research has primarily focused on generating summaries reflecting important opinions from customer reviews without paying much attention to the writing style. In this paper, we propose the stylized opinion summarization task, which aims to generate a summary of customer reviews in the desired (e.g., professional) writing style. To tackle the difficulty in collecting customer and professional review pairs, we develop a non-parallel training framework, Noisy Pairing and Partial Supervision (NAPA), which trains a stylized opinion summarization system from non-parallel customer and professional review sets. We create a benchmark ProSum by collecting customer and professional reviews from Yelp and Michelin. Experimental results on ProSum and FewSum demonstrate that our non-parallel training framework consistently improves both automatic and human evaluations, successfully building a stylized opinion summarization model that can generate professionally-written summaries from customer reviews.

2023-01-01

(published)
