Gaétan Marceau Caron

Senior Director, Applied Machine Learning Research

Biography

Gaétan Marceau Caron is the Senior Director of the applied machine learning research team at Mila – Quebec Artificial Intelligence Institute. His goal is to foster a team of researchers who use artificial intelligence, in collaboration with industry, to tackle difficult scientific problems whose solutions will bring high-value benefits to Canadian society.

He has more than 12 years of experience in transferring artificial intelligence knowledge through collaborative applied research projects. He holds engineering degrees from Polytechnique Montréal, ENSTA Paris (Institut Polytechnique de Paris), and Université Pierre-et-Marie-Curie (Sorbonne Universités), as well as a doctorate in scientific research from Université Paris-Saclay. This background gives him dual expertise: analyzing industrial systems and creating innovative solutions that address increasingly complex societal needs.

Over the past 9 years at Mila, he has served as a scientific consultant on more than 25 industry projects and as an instructor in 6 editions of the deep learning school co-organized with IVADO.

Publications

OpenFake: An Open Dataset and Platform Toward Real-World Deepfake Detection
Deepfakes, synthetic media created using advanced AI techniques, pose a growing threat to information integrity, particularly in politically sensitive contexts. This challenge is amplified by the increasing realism of modern generative models, which our human perception study confirms are often indistinguishable from real images. Yet, existing deepfake detection benchmarks rely on outdated generators or narrowly scoped datasets (e.g., single-face imagery), limiting their utility for real-world detection. To address these gaps, we present OpenFake, a large politically grounded dataset specifically crafted for benchmarking against modern generative models with high realism, and designed to remain extensible through an innovative crowdsourced adversarial platform that continually integrates new hard examples. OpenFake comprises nearly four million total images: three million real images paired with descriptive captions and almost one million synthetic counterparts from state-of-the-art proprietary and open-source models. Detectors trained on OpenFake achieve near-perfect in-distribution performance, strong generalization to unseen generators, and high accuracy on a curated in-the-wild social media test set, significantly outperforming models trained on existing datasets. Overall, we demonstrate that with high-quality and continually updated benchmarks, automatic deepfake detection is both feasible and effective in real-world settings.
COVI-AgentSim: an Agent-based Model for Evaluating Methods of Digital Contact Tracing
Prateek Gupta
Nasim Rahaman
Hannah Alsdurf
Abhinav Sharma
Nanor Minoyan
Soren Harnois Leblanc
Pierre-Luc St. Charles
Akshay Patel
Joumana Ghosn
Yang Zhang
Bernhard Schölkopf
Christopher Pal
Joanna Merckx
The rapid global spread of COVID-19 has led to an unprecedented demand for effective methods to mitigate the spread of the disease, and various digital contact tracing (DCT) methods have emerged as a component of the solution. In order to make informed public health choices, there is a need for tools which allow evaluation and comparison of DCT methods. We introduce an agent-based compartmental simulator we call COVI-AgentSim, integrating detailed consideration of virology, disease progression, social contact networks, and mobility patterns, based on parameters derived from empirical research. We verify by comparing to real data that COVI-AgentSim is able to reproduce realistic COVID-19 spread dynamics, and perform a sensitivity analysis to verify that the relative performance of contact tracing methods is consistent across a range of settings. We use COVI-AgentSim to perform cost-benefit analyses comparing no DCT to: 1) standard binary contact tracing (BCT) that assigns binary recommendations based on binary test results; and 2) a rule-based method for feature-based contact tracing (FCT) that assigns a graded level of recommendation based on diverse individual features. We find all DCT methods consistently reduce the spread of the disease, and that the advantage of FCT over BCT is maintained over a wide range of adoption rates. Feature-based methods of contact tracing avert more disability-adjusted life years (DALYs) per socioeconomic cost (measured by productive hours lost). Our results suggest any DCT method can help save lives, support re-opening of economies, and prevent second-wave outbreaks, and that FCT methods are a promising direction for enriching BCT using self-reported symptoms, yielding earlier warning signals and a significantly reduced spread of the virus per socioeconomic cost.