Portrait de David Scott Krueger

David Scott Krueger

Membre académique principal
Professeur adjoint, Université de Montréal, Département d'informatique et de recherche opérationnelle (DIRO)
Sujets de recherche
Apprentissage de représentations
Apprentissage profond

Biographie

David Krueger est professeur adjoint en IA robuste, raisonnable et responsable au département d'informatique et de recherche opérationnelle (DIRO) et un membre académique principal à Mila - Institut québécois d'intelligence artificielle, au Center for Human-Compatible AI (CHAI) de l'université de Berkeley et au Center for the Study of Existential Risk (CSER). Ses travaux portent sur la réduction du risque d'extinction de l'humanité par l'intelligence artificielle (x-risque IA) par le biais de la recherche technique ainsi que de l'éducation, de la sensibilisation, de la gouvernance et de la défense des droits humains.

Ses recherches couvrent de nombreux domaines de l'apprentissage profond, de l'alignement de l'IA, de la sécurité de l'IA et de l'éthique de l'IA, notamment les modes de défaillance de l'alignement, la manipulation algorithmique, l'interprétabilité, la robustesse et la compréhension de la manière dont les systèmes d'IA apprennent et se généralisent. Il a été présenté dans les médias, notamment dans l'émission Good Morning Britain d'ITV, Inside Story d'Al Jazeera, France 24, New Scientist et l'Associated Press.

David a terminé ses études supérieures à l'Université de Montréal et à Mila - Institut québécois d'intelligence artificielle, où il a travaillé avec Yoshua Bengio, Roland Memisevic et Aaron Courville.

Étudiants actuels

Doctorat - UdeM
Superviseur⋅e principal⋅e :

Publications

The Flag and the Cross: White Christian Nationalism and the Threat to American Democracy by Philip S. Gorski and Samuel L. Perry (review)
Investigating the Nature of 3D Generalization in Deep Neural Networks
Shoaib Ahmed Siddiqui
Thomas M. Breuel
Facing AI extinction
Out-of-context Meta-learning in Large Language Models
Dmitrii Krasheninnikov
Egor Krasheninnikov
Brown et al. (2020) famously introduced the phenomenon of in-context meta-learning in large language models (LLMs). Our work establishes the… (voir plus) existence of a phenomenon we call out-of-context meta-learning via carefully designed synthetic experiments with large language models. We argue that out-of-context meta-learning is an important and surprising capability of LLMs, which may lead them to more readily "internalize" the semantic content of text that is, or appears to be, broadly useful (such as true statements, or text from authoritative sources) and apply it in appropriate contexts. We also raise the question of how this phenomenon emerges, and discuss two possible explanations: one relying on the way LLMs store knowledge in their parameters, and another suggesting that the implicit gradient alignment bias of gradient-descent-based methods may be responsible. Finally, we reflect on what our results might imply about capabilities of future AI systems, and discuss potential risks.
Blockwise self-supervised learning with Barlow Twins
Shoaib Ahmed Siddiqui
Yann LeCun
Stephane Deny
Current state-of-the-art deep networks are all powered by backpropagation. In this paper, we explore alternatives to full backpropagation in… (voir plus) the form of blockwise learning rules, leveraging the latest developments in self-supervised learning. Notably, we show that a blockwise pretraining procedure consisting of training independently the 4 main blocks of layers of a ResNet-50 with Barlow Twins loss function at each block performs almost as well as end-to-end backpropagation on ImageNet: a linear probe trained on top of our blockwise pretrained model obtains a top-1 classification accuracy of 70.48\%, only 1.1\% below the accuracy of an end-to-end pretrained network (71.57\% accuracy). We perform extensive experiments to understand the impact of different components within our method and explore a variety of adaptations of self-supervised learning to the blockwise paradigm, building an exhaustive understanding of the critical avenues for scaling local learning rules to large networks, with implications ranging from hardware design to neuroscience.
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
We present a smoothly broken power law functional form (that we refer to as a Broken Neural Scaling Law (BNSL)) that accurately models&extra… (voir plus)polates the scaling behaviors of deep neural networks (i.e. how the evaluation metric of interest varies as amount of compute used for training (or inference), number of model parameters, training dataset size, model input size, number of training steps, or upstream performance varies) for various architectures&for each of various tasks within a large&diverse set of upstream&downstream tasks, in zero-shot, prompted,&finetuned settings. This set includes large-scale vision, language, audio, video, diffusion, generative modeling, multimodal learning, contrastive learning, AI alignment, AI capabilities, robotics, out-of-distribution (OOD) generalization, continual learning, transfer learning, uncertainty estimation / calibration, OOD detection, adversarial robustness, distillation, sparsity, retrieval, quantization, pruning, fairness, molecules, computer programming/coding, math word problems,"emergent phase transitions", arithmetic, supervised learning, unsupervised/self-supervised learning,&reinforcement learning (single agent&multi-agent). When compared to other functional forms for neural scaling, this functional form yields extrapolations of scaling behavior that are considerably more accurate on this set. Moreover, this functional form accurately models&extrapolates scaling behavior that other functional forms are incapable of expressing such as the nonmonotonic transitions present in the scaling behavior of phenomena such as double descent&the delayed, sharp inflection points present in the scaling behavior of tasks such as arithmetic. Lastly, we use this functional form to glean insights about the limit of the predictability of scaling behavior. Code is available at https://github.com/ethancaballero/broken_neural_scaling_laws
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Sara Hooker
Modern machine learning research relies on relatively few carefully curated datasets. Even in these datasets, and typically in `untidy' or r… (voir plus)aw data, practitioners are faced with significant issues of data quality and diversity which can be prohibitively labor intensive to address. Existing methods for dealing with these challenges tend to make strong assumptions about the particular issues at play, and often require a priori knowledge or metadata such as domain labels. Our work is orthogonal to these methods: we instead focus on providing a unified and efficient framework for Metadata Archaeology -- uncovering and inferring metadata of examples in a dataset. We curate different subsets of data that might exist in a dataset (e.g. mislabeled, atypical, or out-of-distribution examples) using simple transformations, and leverage differences in learning dynamics between these probe suites to infer metadata of interest. Our method is on par with far more sophisticated mitigation methods across different tasks: identifying and correcting mislabeled examples, classifying minority-group samples, prioritizing points relevant for training and enabling scalable human auditing of relevant examples.
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J Bigelow
Robert P. Dick
Hidenori Tanaka
We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved … (voir plus)via training on a dataset are connected via simple paths of low loss. Specifically, we ask the following question: are minimizers that rely on different mechanisms for making their predictions connected via simple paths of low loss? We provide a definition of mechanistic similarity as shared invariances to input transformations and demonstrate that lack of linear connectivity between two models implies they use dissimilar mechanisms for making their predictions. Relevant to practice, this result helps us demonstrate that naive fine-tuning on a downstream dataset can fail to alter a model's mechanisms, e.g., fine-tuning can fail to eliminate a model's reliance on spurious attributes. Our analysis also motivates a method for targeted alteration of a model's mechanisms, named connectivity-based fine-tuning (CBFT), which we analyze using several synthetic datasets for the task of reducing a model's reliance on spurious attributes.
Noisy Pairing and Partial Supervision for Stylized Opinion Summarization
Reinald Kim
Mirella Lapata. 2020
Un-611
Emmanuel Bengio
Maxinder S. Kan-620
Asja Fischer
Somnath Basu
Roy Chowdhury
Chao Zhao
Tanya Goyal
Junyi Jiacheng Xu
Jessy Li
Ivor Wai-hung Tsang
James T. Kwok
Neil Houlsby
Andrei Giurgiu
Stanisław Jastrzębski … (voir 22 de plus)
Bruna Morrone
Quentin de Laroussilhe
Mona Gesmundo
Attariyan Sylvain
Gelly
Thomas Wolf
Lysandre Debut
Julien Victor Sanh
Clement Chaumond
Anthony Delangue
Pier-339 Moi
Tim ric Cistac
R´emi Rault
Morgan Louf
Funtow-900 Joe
Sam Davison
Patrick Shleifer
Von Platen
Clara Ma
Yacine Jernite
Julien Plu
Canwen Xu
Opinion summarization research has primar-001 ily focused on generating summaries reflect-002 ing important opinions from customer reviews 0… (voir plus)03 without paying much attention to the writing 004 style. In this paper, we propose the stylized 005 opinion summarization task, which aims to 006 generate a summary of customer reviews in 007 the desired (e.g., professional) writing style. 008 To tackle the difficulty in collecting customer 009 and professional review pairs, we develop a 010 non-parallel training framework, Noisy Pair-011 ing and Partial Supervision ( NAPA ), which 012 trains a stylized opinion summarization sys-013 tem from non-parallel customer and profes-014 sional review sets. We create a benchmark P RO - 015 S UM by collecting customer and professional 016 reviews from Yelp and Michelin. Experimental 017 results on P RO S UM and FewSum demonstrate 018 that our non-parallel training framework con-019 sistently improves both automatic and human 020 evaluations, successfully building a stylized 021 opinion summarization model that can gener-022 ate professionally-written summaries from cus-023 tomer reviews. 024
On The Fragility of Learned Reward Functions
Lev E McKinney
Yawen Duan
Adam Gleave
Reward functions are notoriously difficult to specify, especially for tasks with complex goals. Reward learning approaches attempt to infer … (voir plus)reward functions from human feedback and preferences. Prior works on reward learning have mainly focused on the performance of policies trained alongside the reward function. This practice, however, may fail to detect learned rewards that are not capable of training new policies from scratch and thus do not capture the intended behavior. Our work focuses on demonstrating and studying the causes of these relearning failures in the domain of preference-based reward learning. We demonstrate with experiments in tabular and continuous control environments that the severity of relearning failures can be sensitive to changes in reward model design and the trajectory dataset composition. Based on our findings, we emphasize the need for more retraining-based evaluations in the literature.
Training Equilibria in Reinforcement Learning
Lauro Langosco
Adam Gleave
In partially observable environments, reinforcement learning algorithms such as policy gradient and Q-learning may have multiple equilibria-… (voir plus)--policies that are stable under further training---and can converge to equilibria that are strictly suboptimal. Prior work blames insufficient exploration, but suboptimal equilibria can arise despite full exploration and other favorable circumstances like a flexible policy parametrization. We show theoretically that the core problem is that in partially observed environments, an agent's past actions induce a distribution on hidden states. Equipping the policy with memory helps it model the hidden state and leads to convergence to a higher reward equilibrium, \emph{even when there exists a memoryless optimal policy}. Experiments show that policies with insufficient memory tend to learn to use the environment as auxiliary memory, and parameter noise helps policies escape suboptimal equilibria.