Portrait de David Scott Krueger

David Scott Krueger

Membre académique principal
Professeur adjoint, Université de Montréal, Département d'informatique et de recherche opérationnelle (DIRO)
Sujets de recherche
Apprentissage de représentations
Apprentissage profond

Biographie

David Krueger est professeur adjoint en IA robuste, raisonnable et responsable au département d'informatique et de recherche opérationnelle (DIRO) et un membre académique principal à Mila - Institut québécois d'intelligence artificielle, au Center for Human-Compatible AI (CHAI) de l'université de Berkeley et au Center for the Study of Existential Risk (CSER). Ses travaux portent sur la réduction du risque d'extinction de l'humanité par l'intelligence artificielle (x-risque IA) par le biais de la recherche technique ainsi que de l'éducation, de la sensibilisation, de la gouvernance et de la défense des droits humains.

Ses recherches couvrent de nombreux domaines de l'apprentissage profond, de l'alignement de l'IA, de la sécurité de l'IA et de l'éthique de l'IA, notamment les modes de défaillance de l'alignement, la manipulation algorithmique, l'interprétabilité, la robustesse et la compréhension de la manière dont les systèmes d'IA apprennent et se généralisent. Il a été présenté dans les médias, notamment dans l'émission Good Morning Britain d'ITV, Inside Story d'Al Jazeera, France 24, New Scientist et l'Associated Press.

David a terminé ses études supérieures à l'Université de Montréal et à Mila - Institut québécois d'intelligence artificielle, où il a travaillé avec Yoshua Bengio, Roland Memisevic et Aaron Courville.

Étudiants actuels

Doctorat - UdeM
Superviseur⋅e principal⋅e :
Collaborateur·rice de recherche

Publications

Managing AI Risks in an Era of Rapid Progress
Geoffrey Hinton
Andrew Yao
Dawn Song
Pieter Abbeel
Yuval Noah Harari
Ya-Qin Zhang
Lan Xue
Shai Shalev-Shwartz
Gillian K. Hadfield
Jeff Clune
Frank Hutter
Atilim Güneş Baydin
Sheila McIlraith
Qiqi Gao
Ashwin Acharya
Anca Dragan
Philip Torr … (voir 4 de plus)
Stuart Russell
Daniel Kahneman
Jan Brauner
Sören Mindermann
In this short consensus paper, we outline risks from upcoming, advanced AI systems. We examine large-scale social harms and malicious uses, … (voir plus)as well as an irreversible loss of human control over autonomous AI systems. In light of rapid and continuing AI progress, we propose priorities for AI R&D and governance.
Managing AI Risks in an Era of Rapid Progress
Geoffrey Hinton
Andrew Yao
Dawn Song
Pieter Abbeel
Yuval Noah Harari
Trevor Darrell
Ya-Qin Zhang
Lan Xue
Shai Shalev-Shwartz
Gillian K. Hadfield
Jeff Clune
Frank Hutter
Atilim Güneş Baydin
Sheila McIlraith
Qiqi Gao
Ashwin Acharya
Anca Dragan … (voir 5 de plus)
Philip Torr
Stuart Russell
Daniel Kahneman
Jan Brauner
Sören Mindermann
Managing AI Risks in an Era of Rapid Progress
Geoffrey Hinton
Andrew Yao
Dawn Song
Pieter Abbeel
Yuval Noah Harari
Ya-Qin Zhang
Lan Xue
Shai Shalev-Shwartz
Gillian K. Hadfield
Jeff Clune
Frank Hutter
Atilim Güneş Baydin
Sheila McIlraith
Qiqi Gao
Ashwin Acharya
Anca Dragan
Philip Torr … (voir 4 de plus)
Stuart Russell
Daniel Kahneman
Jan Brauner
Sören Mindermann
In this short consensus paper, we outline risks from upcoming, advanced AI systems. We examine large-scale social harms and malicious uses, … (voir plus)as well as an irreversible loss of human control over autonomous AI systems. In light of rapid and continuing AI progress, we propose priorities for AI R&D and governance.
Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models
Alan Chan
Benjamin Bucknall
Herbie Bradley
Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models
Alan Chan
Benjamin Bucknall
Herbie Bradley
Meta- (out-of-context) learning in neural networks
Dmitrii Krasheninnikov
Egor Krasheninnikov
Bruno Mlodozeniec
Brown et al. (2020) famously introduced the phenomenon of in-context learning in large language models (LLMs). We establish the existence of… (voir plus) a phenomenon we call meta-out-of-context learning (meta-OCL) via carefully designed synthetic experiments with LLMs. Our results suggest that meta-OCL leads LLMs to more readily"internalize"the semantic content of text that is, or appears to be, broadly useful (such as true statements, or text from authoritative sources) and use it in appropriate circumstances. We further demonstrate meta-OCL in a synthetic computer vision setting, and propose two hypotheses for the emergence of meta-OCL: one relying on the way models store knowledge in their parameters, and another suggesting that the implicit gradient alignment bias of gradient-descent-based optimizers may be responsible. Finally, we reflect on what our results might imply about capabilities of future AI systems, and discuss potential risks. Our code can be found at https://github.com/krasheninnikov/internalization.
Thinker: Learning to Plan and Act
Stephen Chung
Ivan Anokhin
We propose the Thinker algorithm, a novel approach that enables reinforcement learning agents to autonomously interact with and utilize a le… (voir plus)arned world model. The Thinker algorithm wraps the environment with a world model and introduces new actions designed for interacting with the world model. These model-interaction actions enable agents to perform planning by proposing alternative plans to the world model before selecting a final action to execute in the environment. This approach eliminates the need for handcrafted planning algorithms by enabling the agent to learn how to plan autonomously and allows for easy interpretation of the agent's plan with visualization. We demonstrate the algorithm's effectiveness through experimental results in the game of Sokoban and the Atari 2600 benchmark, where the Thinker algorithm achieves state-of-the-art performance and competitive results, respectively. Visualizations of agents trained with the Thinker algorithm demonstrate that they have learned to plan effectively with the world model to select better actions. Thinker is the first work showing that an RL agent can learn to plan with a learned world model in complex environments.
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J Bigelow
Robert P. Dick
Hidenori Tanaka
Towards Out-of-Distribution Adversarial Robustness
Adam Ibrahim
Charles Guille-Escuret
Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fail… (voir plus)s to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different
Harms from Increasingly Agentic Algorithmic Systems
Alan Chan
Rebecca Salganik
Alva Markelius
Chris Pang
Nitarshan Rajkumar
Dmitrii Krasheninnikov
Lauro Langosco
Zhonghao He
Yawen Duan
Micah Carroll
Michelle Lin
Alex Mayhew
Katherine Collins
Maryam Molamohammadi
John Burden
Wanru Zhao
Shalaleh Rismani
Konstantinos Voudouris
Umang Bhatt
Adrian Weller … (voir 2 de plus)
Research in Fairness, Accountability, Transparency, and Ethics (FATE)1 has established many sources and forms of algorithmic harm, in domain… (voir plus)s as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed, typically without strong regulatory barriers, threatening the perpetuation of the same harms and the creation of novel ones. In response, the FATE community has emphasized the importance of anticipating harms, rather than just responding to them. Anticipation of harms is especially important given the rapid pace of developments in machine learning (ML). Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify 4 key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms which arise from increasing agency – notably, these include systemic and/or long-range impacts, often on marginalized or unconsidered stakeholders. We emphasize that recognizing agency of algorithmic systems does not absolve or shift the human responsibility for algorithmic harms. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on implications of our work for anticipating algorithmic harms from emerging systems.
The Flag and the Cross: White Christian Nationalism and the Threat to American Democracy by Philip S. Gorski and Samuel L. Perry (review)
The Flag and the Cross: White Christian Nationalism and the Threat to American Democracy by Philip S. Gorski and Samuel L. Perry (review)