Gaétan Marceau Caron

Senior Director, Applied Machine Learning Research

Biography

Gaétan Marceau Caron is the Director of Applied Machine Learning Research at Mila – Quebec Artificial Intelligence Research Institute. He promotes Mila’s team of researchers, who collaborate with industry to address difficult scientific problems using AI. The spin-offs of these projects will ultimately benefit all of Canadian society.

He has more than twelve years’ experience in knowledge transfer in the field of AI through working on numerous collaborative projects in applied research. He has dual expertise in engineering and scientific research, having completed engineering degrees at Polytechnique Montréal and the Institut Polytechnique de Paris (ENSTA Paris), and science degrees at Université Pierre-et-Marie-Curie (Sorbonne) and Université Paris-Saclay (PhD level). This enables him to analyze industrial systems and find innovative solutions to the increasingly complex challenges facing society.

In his seven years at Mila, he has served as scientific adviser on more than twenty-five projects with industry and taught in six editions of the annual Mila/IVADO Deep Learning School.

Publications

OpenFake: An Open Dataset and Platform Toward Large-Scale Deepfake Detection
Deepfakes, synthetic media created using advanced AI techniques, have intensified the spread of misinformation, particularly in politically sensitive contexts. Existing deepfake detection datasets are often limited, relying on outdated generation methods, low realism, or single-face imagery, restricting the effectiveness for general synthetic image detection. By analyzing social media posts, we identify multiple modalities through which deepfakes propagate misinformation. Furthermore, our human perception study demonstrates that recently developed proprietary models produce synthetic images increasingly indistinguishable from real ones, complicating accurate identification by the general public. Consequently, we present a comprehensive, politically-focused dataset specifically crafted for benchmarking detection against modern generative models. This dataset contains three million real images paired with descriptive captions, which are used for generating 963k corresponding high-quality synthetic images from a mix of proprietary and open-source models. Recognizing the continual evolution of generative techniques, we introduce an innovative crowdsourced adversarial platform, where participants are incentivized to generate and submit challenging synthetic images. This ongoing community-driven initiative ensures that deepfake detection methods remain robust and adaptive, proactively safeguarding public discourse from sophisticated misinformation threats.