
Gael Varoquaux

Alumni

Publications

International AI Safety Report Second Key Update: Technical Safeguards and Risk Management
Stephen Clare
Carina Prunkl
Maksym Andriushchenko
Ben Bucknall
Philip Fox
Nestor Maslej
Conor McGlynn
Malcolm Murray
Stephen Casper
Jessica Newman
Daniel Privitera
Daron Acemoglu
Thomas G. Dietterich
Fredrik Heintz
Geoffrey Hinton
Nick Jennings
Susan Leavy …
Teresa Ludermir
Vidushi Marda
Helen Margetts
John McDermid
Jane Munga
Arvind Narayanan
Alondra Nelson
Clara Neppel
Sarvapali D. (Gopal) Ramchurn
Stuart Russell
Marietje Schaake
Bernhard Schölkopf
Alvaro Soto
Lee Tiedrich
Andrew Yao
Ya-Qin Zhang
This is the Second Key Update to the 2025 International AI Safety Report. The First Key Update (1) discussed developments in the capabilities of general-purpose AI models and systems and associated risks. This Key Update covers how various actors, including researchers, companies, and governments, are approaching risk management and technical mitigations for AI. The past year has seen important developments in AI risk management, including better techniques for training safer models and monitoring their outputs. While this represents tangible progress, significant gaps remain. It is often uncertain how effective current measures are at preventing harms, and effectiveness varies across time and applications. There are many opportunities to further strengthen existing safeguard techniques and to develop new ones. This Key Update provides a concise overview of critical developments in risk management practices and technical risk mitigation since the publication of the 2025 AI Safety Report in January. It highlights where progress is being made and where gaps remain. Above all, it aims to support policymakers, researchers, and the public in navigating a rapidly changing environment, helping them to make informed and timely decisions about the governance of general-purpose AI. Professor Yoshua Bengio, Université de Montréal / LawZero / Mila – Quebec AI Institute & Chair
International AI Safety Report: First Key Update, Capabilities and Risk Implications
Prof. Yoshua Bengio
Stephen Clare
Carina Prunkl
Maksym Andriushchenko
Ben Bucknall
Philip Fox
Tiancheng Hu
Cameron Jones
Sam Manning
Nestor Maslej
Vasilios Mavroudis
Conor McGlynn
Malcolm Murray
Charlotte Stix
Lucia Velasco
Nicole Wheeler
Daniel Privitera
Daron Acemoglu …
Thomas G. Dietterich
Fredrik Heintz
Geoffrey Hinton
Nick Jennings
Susan Leavy
Teresa Ludermir
Vidushi Marda
Helen Margetts
John McDermid
Jane Munga
Arvind Narayanan
Alondra Nelson
Clara Neppel
Sarvapali D. (Gopal) Ramchurn
Stuart Russell
Marietje Schaake
Bernhard Schölkopf
Alvaro Soto
Lee Tiedrich
Andrew Yao
Ya-Qin Zhang
Lambrini Das
Claire Dennis
Arianna Dini
Freya Hempleman
Samuel Kenny
Patrick King
Hannah Merchant
Jamie-Day Rawal
Rose Woolhouse
The field of AI is moving too quickly for a single yearly publication to keep pace. Significant changes can occur on a timescale of months, sometimes weeks. This is why we are releasing Key Updates: shorter, focused reports that highlight the most important developments between full editions of the International AI Safety Report. With these updates, we aim to provide policymakers, researchers, and the public with up-to-date information to support wise decisions about AI governance. This first Key Update focuses on areas where especially significant changes have occurred since January 2025: advances in general-purpose AI systems' capabilities, and the implications for several critical risks. New training techniques have enabled AI systems to reason step-by-step and operate autonomously for longer periods, allowing them to tackle more kinds of work. However, these same advances create new challenges across biological risks, cyber security, and oversight of AI systems themselves. The International AI Safety Report is intended to help readers assess, anticipate, and manage risks from general-purpose AI systems. These Key Updates ensure that critical developments receive timely attention as the field rapidly evolves.
Advancing science- and evidence-based AI policy.
Rishi Bommasani
Sanjeev Arora
Jennifer Chayes
Yejin Choi
Mariano-Florentino Cuéllar
Li Fei-Fei
Daniel E. Ho
Dan Jurafsky
Sanmi Koyejo
Hima Lakkaraju
Arvind Narayanan
Alondra Nelson
Emma Pierson
Scott Singer
Suresh Venkatasubramanian
Ion Stoica
Percy Liang
Dawn Song
International AI Safety Report
Bronwyn Fox
André Carlos Ponce de Leon Ferreira de Carvalho
Mona Nemer
Raquel Pezoa Rivera
Yi Zeng
Juha Heikkilä
Guillaume Avrin
Antonio Krüger
Balaraman Ravindran
Hammam Riza
Ciarán Seoighe
Ziv Katzir
Andrea Monti
Hiroaki Kitano
Nusu Mwamanzi
Fahad Albalawi
José Ramón López Portillo
Haroon Sheikh
Gill Jolly …
Olubunmi Ajala
Jerry Sheehan
Dominic Vincent Ligot
Kyoung Mu Lee
Crystal Rugege
Denise Wong
Nuria Oliver
Christian Busch
Ahmet Halit Hatip
Oleksii Molchanovskyi
Marwan Alserkal
Chris Johnson
Amandeep Singh Gill
Saif M. Khan
Daniel Privitera
Tamay Besiroglu
Rishi Bommasani
Stephen Casper
Yejin Choi
Philip Fox
Ben Garfinkel
Danielle Goldfarb
Hoda Heidari
Anson Ho
Sayash Kapoor
Leila Khalatbari
Shayne Longpre
Sam Manning
Vasilios Mavroudis
Mantas Mazeika
Julian Michael
Jessica Newman
Kwan Yee Ng
Chinasa T. Okolo
Deborah Raji
Girish Sastry
Elizabeth Seger
Theodora Skeadas
Tobin South
Daron Acemoglu
Olubayo Adekanmbi
David Dalrymple
Thomas G. Dietterich
Edward W. Felten
Pascale Fung
Pierre-Olivier Gourinchas
Fredrik Heintz
Geoffrey Hinton
Nick Jennings
Andreas Krause
Susan Leavy
Percy Liang
Teresa Ludermir
Vidushi Marda
Emma Strubell
Florian Tramèr
Lucia Velasco
Nicole Wheeler
Helen Margetts
John McDermid
Jane Munga
Arvind Narayanan
Alondra Nelson
Clara Neppel
Alice Oh
Gopal Ramchurn
Stuart Russell
Marietje Schaake
Bernhard Schölkopf
Dawn Song
Alvaro Soto
Lee Tiedrich
Andrew Yao
Ya-Qin Zhang
Baran Acar
Ben Clifford
Lambrini Das
Claire Dennis
Freya Hempleman
Hannah Merchant
Rian Overy
Ben Snodin
Benjamin Prud’homme
The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, representing diverse perspectives and disciplines. Led by the report's Chair, these independent experts collectively had full discretion over the report's content.
Individual Brain Charting dataset extension, third release for movie watching and retinotopy data
Ana Luísa Pinho
Hugo Richard
Ana Fernanda Ponce
Michael Eickenberg
Alexis Amadon
Elvis Dopgima Dohmatob
Isabelle Denghien
Juan Jesús Torre
Swetha Shankar
Himanshu Aggarwal
Alexis Thual
Thomas Chapalain
Chantal Ginisty
Séverine Becuwe-Desmidt
Séverine Roger
Yann Lecomte
Valérie Berland
Laurence Laurier
Véronique Joly-Testault
Gaëlle Médiouni-Cloarec …
Christine Doublé
Bernadette Martins
Stanislas Dehaene
Lucie Hertz-Pannier
Bertrand Thirion
Metrics reloaded: recommendations for image analysis validation.
Lena Maier-Hein
Annika Reinke
Evangelia Christodoulou
Ben Glocker
Patrick Godau
Fabian Isensee
Jens Kleesiek
Michal Kozubek
Mauricio Reyes
Michael A. Riegler
Manuel Wiesenfarth
Michael Baumgartner
Matthias Eisenmann
Doreen Heckmann-Nötzel
A. Emre Kavur
Tim Rädsch
Minu Dietlinde Tizabi
Laura Acion
Michela Antonelli
Spyridon Bakas
Peter Bankhead
Arriel Benis
M. Jorge Cardoso
Veronika Cheplygina
Beth A. Cimini
Gary S. Collins
Keyvan Farahani
Bram van Ginneken
Daniel A. Hashimoto
Michael M. Hoffman
Merel Huisman
Pierre Jannin
Charles E. Kahn
Alexandros Karargyris
Alan Karthikesalingam
H. Kenngott
Annette Kopp-Schneider
Anna Kreshuk
Tahsin Kurc
Bennett Landman
Geert Litjens
Amin Madani
Klaus Maier-Hein
Anne L. Martel
Peter Mattson
Erik Meijering
Bjoern Menze
David Moher
Karel G.M. Moons
Henning Müller
Felix Nickel
Jens Petersen
Nasir Rajpoot
Nicola Rieke
Julio Saez-Rodriguez
Clarisa Sánchez Gutiérrez
Shravya Shetty
M. Smeden
Carole H. Sudre
Ronald M. Summers
Abdel Aziz Taha
Sotirios A. Tsaftaris
Ben Van Calster
Paul F. Jäger
Understanding metric-related pitfalls in image analysis validation
Annika Reinke
Minu D. Tizabi
Michael Baumgartner
Matthias Eisenmann
Doreen Heckmann-Nötzel
A. Emre Kavur
Tim Rädsch
Carole H. Sudre
Laura Acion
Michela Antonelli
Spyridon Bakas
Arriel Benis
Matthew B. Blaschko
Florian Buettner
M. Jorge Cardoso
Veronika Cheplygina
Jianxu Chen
Evangelia Christodoulou …
Beth A. Cimini
Keyvan Farahani
Luciana Ferrer
Gary S. Collins
Adrian Galdran
Bram van Ginneken
Ben Glocker
Patrick Godau
Daniel A. Hashimoto
Michael M. Hoffman
Robert Haase
Merel Huisman
Fabian Isensee
Pierre Jannin
Charles E. Kahn
Dagmar Kainmueller
Bernhard Kainz
Alexandros Karargyris
Jens Kleesiek
Florian Kofler
Thijs Kooi
Annette Kopp-Schneider
Alan Karthikesalingam
Hannes Kenngott
Michal Kozubek
Anna Kreshuk
Tahsin Kurc
Bennett A. Landman
Geert Litjens
Amin Madani
Klaus Maier-Hein
Anne L. Martel
Erik Meijering
Bjoern Menze
Karel G.M. Moons
Henning Müller
Felix Nickel
Peter Mattson
Jens Petersen
Susanne M. Rafelski
Nasir Rajpoot
Mauricio Reyes
Michael A. Riegler
Nicola Rieke
Julio Saez-Rodriguez
Clara I. Sánchez
Shravya Shetty
Ronald M. Summers
Abdel A. Taha
Aleksei Tiulpin
Sotirios A. Tsaftaris
Ben Van Calster
Ziv R. Yaniv
Paul F. Jäger
Lena Maier-Hein
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.
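One of the classic pitfalls this line of work catalogues is reporting plain accuracy on a class-imbalanced task. The toy sketch below (hypothetical data, not taken from the paper) shows a trivial majority-class predictor scoring high accuracy while balanced accuracy exposes its failure.

```python
# Illustrative example of a metric pitfall: accuracy under class imbalance.
# A predictor that always outputs the majority class looks strong on accuracy
# but is useless, which balanced accuracy (mean of per-class recalls) reveals.

y_true = [1] * 90 + [0] * 10   # 90% positive class, 10% negative
y_pred = [1] * 100             # degenerate predictor: always "positive"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(cls):
    """Fraction of class `cls` samples that were correctly predicted."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    total = sum(1 for t in y_true if t == cls)
    return hits / total

balanced_accuracy = (recall(0) + recall(1)) / 2

print(accuracy)            # 0.9
print(balanced_accuracy)   # 0.5
```

The 0.9 accuracy is an artifact of class prevalence; the 0.5 balanced accuracy shows the predictor is no better than chance on a class-balanced view of the problem.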
Metrics Reloaded - A new recommendation framework for biomedical image analysis validation
Annika Reinke
Lena Maier-Hein
Evangelia Christodoulou
Ben Glocker
Patrick Scholz
Fabian Isensee
Jens Kleesiek
Michal Kozubek
Mauricio Reyes
Michael Alexander Riegler
Manuel Wiesenfarth
Michael Baumgartner
Matthias Eisenmann
Doreen Heckmann-Nötzel
Ali Emre Kavur
Tim Rädsch
Minu D. Tizabi
Laura Acion
Michela Antonelli
Spyridon Bakas
Peter Bankhead
Arriel Benis
M. Jorge Cardoso
Veronika Cheplygina
Beth A Cimini
Gary S. Collins
Keyvan Farahani
Bram van Ginneken
Fred A Hamprecht
Daniel A. Hashimoto
Michael M. Hoffman
Merel Huisman
Pierre Jannin
Charles Kahn
Alexandros Karargyris
Alan Karthikesalingam
Hannes Kenngott
Annette Kopp-Schneider
Anna Kreshuk
Tahsin Kurc
Bennett Landman
Geert Litjens
Amin Madani
Klaus Maier-Hein
Anne Martel
Peter Mattson
Erik Meijering
Bjoern Menze
David Moher
Karel G.M. Moons
Henning Müller
Felix Nickel
Jens Petersen
Nasir Rajpoot
Nicola Rieke
Julio Saez-Rodriguez
Clara I. Sánchez
Shravya Shetty
Maarten van Smeden
Carole H. Sudre
Ronald M. Summers
Abdel A. Taha
Sotirios A. Tsaftaris
Ben Van Calster
Paul F. Jäger
Meaningful performance assessment of biomedical image analysis algorithms depends on objective and appropriate performance metrics. There are major shortcomings in the current state of the art. Yet, so far, limited attention has been paid to the practical pitfalls associated with using particular metrics for image analysis tasks. Therefore, a number of international initiatives have collaborated to offer researchers guidance and tools for selecting performance metrics in a problem-aware manner. In our proposed framework, the characteristics of the given biomedical problem are first captured in a problem fingerprint, which identifies properties related to domain interests, the target structure(s), the input datasets, and algorithm output. In the second step, a problem category-specific mapping matches fingerprints to metrics that reflect domain requirements. Based on input from experts from more than 60 institutions worldwide, we believe our metric recommendation framework will be useful to the MIDL community and will enhance the quality of biomedical image analysis algorithm validation.
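The two-step idea (capture problem properties in a fingerprint, then map the fingerprint to candidate metrics) can be sketched as follows. This is a minimal illustration with hypothetical names and a toy mapping, not the authors' implementation or their actual metric rules.

```python
# Sketch of a fingerprint -> metric recommendation flow (hypothetical).
from dataclasses import dataclass

@dataclass(frozen=True)
class ProblemFingerprint:
    task: str                # e.g. "segmentation" or "classification"
    class_imbalance: bool    # is the class distribution skewed?
    small_structures: bool   # are target structures small relative to the image?

def recommend_metrics(fp: ProblemFingerprint) -> list[str]:
    """Toy category-specific mapping from a fingerprint to candidate metrics."""
    if fp.task == "segmentation":
        metrics = ["Dice"]
        if fp.small_structures:
            # Overlap metrics alone penalize small structures harshly;
            # add a boundary-based complement.
            metrics.append("Normalized Surface Distance")
        return metrics
    if fp.task == "classification":
        return ["Balanced accuracy"] if fp.class_imbalance else ["Accuracy"]
    return []

fp = ProblemFingerprint(task="segmentation", class_imbalance=True, small_structures=True)
print(recommend_metrics(fp))  # ['Dice', 'Normalized Surface Distance']
```

The value of the design is that metric choice becomes a function of explicitly stated problem properties rather than an ad hoc decision made per paper.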
Population modeling with machine learning can enhance measures of mental health
Kamalaker Dadi
Josselin Houenou
Bertrand Thirion
Denis Engemann
We applied machine learning to more than 10,000 individuals from the general population to define empirical approximations of health-related psychological measures that do not require human judgment. We found that machine learning enriched the given psychological measures via approximation from brain and sociodemographic data: the resulting proxy measures related as well as or better to real-world health behavior than the original measures. Model comparisons showed that sociodemographic information contributed most to characterizing psychological traits beyond aging.
Accounting for Variance in Machine Learning Benchmarks
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization, and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in light of this variance. We show a counter-intuitive result: adding more sources of variation to an imperfect estimator better approximates the ideal estimator, at a 51 times reduction in compute cost. Building on these results, we study the error rate of detecting improvements on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.
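The core recommendation (compare score distributions over re-randomized trials, not single runs) can be sketched as below. All names are illustrative; `simulate_run` is a stand-in for a real train-and-evaluate job whose noise models run-to-run variation from data sampling, initialization, and hyperparameter draws.

```python
# Sketch: benchmark two algorithms as distributions over randomized trials
# rather than as one number each (hypothetical simulation, not the paper's code).
import random
import statistics

def simulate_run(true_quality: float, rng: random.Random) -> float:
    # Stand-in for training + evaluation: underlying quality plus noise
    # from data split, parameter initialization, and hyperparameter choice.
    return true_quality + rng.gauss(0, 0.02) + rng.gauss(0, 0.01)

def benchmark(true_quality: float, n_trials: int = 20, seed: int = 0) -> list[float]:
    rng = random.Random(seed)  # fixed seed so the benchmark is reproducible
    return [simulate_run(true_quality, rng) for _ in range(n_trials)]

scores_a = benchmark(0.80)  # algorithm A
scores_b = benchmark(0.81)  # algorithm B, slightly better on average

# Report mean and spread; a difference smaller than the spread is weak evidence.
print(statistics.mean(scores_a), statistics.stdev(scores_a))
print(statistics.mean(scores_b), statistics.stdev(scores_b))
```

With the trial-to-trial standard deviation in hand, one can ask whether the observed gap between A and B exceeds what randomization alone would produce, which is the question single-run comparisons cannot answer.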
Prediction, Not Association, Paves the Road to Precision Medicine
Ewout W. Steyerberg
Patterns of autism symptoms: hidden structure in the ADOS and ADI-R instruments
Jeremy Lefort-Besnard
Kai Vogeley
Leonhard Schilbach
Bertrand Thirion