Portrait de Jean-François Godbout

Jean-François Godbout

Membre académique associé
Professeur titulaire, Université de Montréal, Département de science politique
Alumni
Sujets de recherche
Désinformation
Modèles génératifs
Sécurité de l'IA

Biographie

Jean-François Godbout est professeur au département de science politique de l'Université de Montréal et membre académique associé à Mila - Institut québécois d'intelligence artificielle. Ses recherches portent principalement sur les sciences sociales computationnelles, la sécurité de l'IA et l'impact de l'IA générative sur la société. Il est actuellement directeur du programme de premier cycle en analyse des données en sciences humaines à l'Université de Montréal et chercheur à IVADO.

Étudiants actuels

Maîtrise recherche - UdeM
Stagiaire de recherche - UdeM
Co-superviseur⋅e :
Doctorat - McGill
Co-superviseur⋅e :
Postdoctorat - UdeM
Doctorat - UdeM
Maîtrise recherche - UdeM
Co-superviseur⋅e :
Maîtrise recherche - UdeM
Co-superviseur⋅e :

Publications

Online Influence Campaigns: Strategies and Vulnerabilities
Ethan Kosak-Hine
Tom Gibbs
U. Montr'eal
Ivado
M. University
In order to combat the creation and spread of harmful content online, this paper defines and contextualizes the concept of inauthentic, soci… (voir plus)etal-scale manipulation by malicious actors. We review the literature on societally harmful content and how it proliferates to analyze the manipulation strategies used by such actors and the vulnerabilities they target. We also provide an overview of three case studies of extensive manipulation campaigns to emphasize the severity of the problem. We then address the role that Artificial Intelligence plays in the development and dissemination of harmful content, and how its evolution presents new threats to societal cohesion for countries across the globe. Our survey aims to increase our understanding of not just particular aspects of these threats, but also the strategies underlying their deployment, so we can effectively prepare for the evolving cybersecurity landscape.
OpenFake: An Open Dataset and Platform Toward Real-World Deepfake Detection
Deepfakes, synthetic media created using advanced AI techniques, pose a growing threat to information integrity, particularly in politically… (voir plus) sensitive contexts. This challenge is amplified by the increasing realism of modern generative models, which our human perception study confirms are often indistinguishable from real images. Yet, existing deepfake detection benchmarks rely on outdated generators or narrowly scoped datasets (e.g., single-face imagery), limiting their utility for real-world detection. To address these gaps, we present OpenFake, a large politically grounded dataset specifically crafted for benchmarking against modern generative models with high realism, and designed to remain extensible through an innovative crowdsourced adversarial platform that continually integrates new hard examples. OpenFake comprises nearly four million total images: three million real images paired with descriptive captions and almost one million synthetic counterparts from state-of-the-art proprietary and open-source models. Detectors trained on OpenFake achieve near-perfect in-distribution performance, strong generalization to unseen generators, and high accuracy on a curated in-the-wild social media test set, significantly outperforming models trained on existing datasets. Overall, we demonstrate that with high-quality and continually updated benchmarks, automatic deepfake detection is both feasible and effective in real-world settings.
A Simulation System Towards Solving Societal-Scale Manipulation
Austin Welch
Gayatri K
Dan Zhao
Hao Yu
Ethan Kosak-Hine
Tom Gibbs
Busra Tugce Gurbuz
The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet, studying these effects in real-w… (voir plus)orld settings at scale is ethically and logistically impractical, highlighting a need for simulation tools that can model these dynamics in controlled settings to enable experimentation with possible defenses. We present a simulation environment designed to address this. We elaborate upon the Concordia framework that simulates offline, `real life' activity by adding online interactions to the simulation through social media with the integration of a Mastodon server. We improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys. We demonstrate the simulator with a tailored example in which we track agents' political positions and show how partisan manipulation of agents can affect election results.
Epistemic Integrity in Large Language Models
Large language models are increasingly relied upon as sources of information, but their propensity for generating false or misleading statem… (voir plus)ents with high confidence poses risks for users and society. In this paper, we confront the critical problem of epistemic miscalibration—where a model's linguistic assertiveness fails to reflect its true internal certainty. We introduce a new human-labeled dataset and a novel method for measuring the linguistic assertiveness of Large Language Models which cuts error rates by over 50% relative to previous benchmarks. Validated across multiple datasets, our method reveals a stark misalignment between how confidently models linguistically present information and their actual accuracy. Further human evaluations confirm the severity of this miscalibration. This evidence underscores the urgent risk of the overstated certainty Large Language Models hold which may mislead users on a massive scale. Our framework provides a crucial step forward in diagnosing and correcting this miscalibration, offering a path to safer and more trustworthy AI across domains.
Simulation System Towards Solving Societal-Scale Manipulation
Austin Welch
Gayatri K
Dan Zhao
Hao Yu
Tom Gibbs
Ethan Kosak-Hine
Busra Tugce Gurbuz
The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet, studying these effects in real-w… (voir plus)orld settings at scale is ethically and logistically impractical, highlighting a need for simulation tools that can model these dynamics in controlled settings to enable experimentation with possible defenses. We present a simulation environment designed to address this. We elaborate upon the Concordia framework that simulates offline, `real life' activity by adding online interactions to the simulation through social media with the integration of a Mastodon server. Through a variety of means we then improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys of the agents' political positions. We demonstrate the simulator with a tailored example of how partisan manipulation of agents can affect election results.
Web Retrieval Agents for Evidence-Based Misinformation Detection
Regional and Temporal Patterns of Partisan Polarization during the COVID-19 Pandemic in the United States and Canada
C'ecile Amadoro
Gabrielle Desrosiers-Brisebois
Sacha Lévy
Public health measures were among the most polarizing topics debated online during the COVID-19 pandemic. Much of the discussion surrounded … (voir plus)specific events, such as when and which particular interventions came into practise. In this work, we develop and apply an approach to measure subnational and event-driven variation of partisan polarization and explore how these dynamics varied both across and within countries. We apply our measure to a dataset of over 50 million tweets posted during late 2020, a salient period of polarizing discourse in the early phase of the pandemic. In particular, we examine regional variations in both the United States and Canada, focusing on three specific health interventions: lockdowns, masks, and vaccines. We find that more politically conservative regions had higher levels of partisan polarization in both countries, especially in the US where a strong negative correlation exists between regional vaccination rates and degree of polarization in vaccine related discussions. We then analyze the timing, context, and profile of spikes in polarization, linking them to specific events discussed on social media across different regions in both countries. These typically last only a few days in duration, suggesting that online discussions reflect and could even drive changes in public opinion, which in the context of pandemic response impacts public health outcomes across different regions and over time.
Political Dynasties in Canada
Alex B. Rivard
Marc André Bodet
Using a unique dataset of legislators' electoral and biographical data in the Canadian provinces of Ontario, Quebec, New Brunswick, Nova Sco… (voir plus)tia and the federal parliament, this article analyses the extent to which family dynasties affected the career development of legislators since the mid-18th century. We find that the prevalence of dynasties was higher in provincial legislatures than it was in the federal parliament, that the number of dynasties in the Senate increased until the mid-20th century, and that the proportion of dynastic legislators at the subnational level was similar to the numbers seen in the United Kingdom during the early 19th century. Our results confirm the existence of a clear career benefit in terms of cabinet and senate appointments. In contrast to the American case and in line with the United Kingdom experience, we find no causal relationship between a legislator's tenure length and the presence of a dynasty.
A Comprehensive Dataset of Four Provincial Legislative Assembly Members
Alex B. Rivard
Marc André Bodet
Éric Montigny
This research note reports on a new dataset about legislators in four Canadian provinces since the establishment of their colonial assemblie… (voir plus)s in the eighteenth century. Over 7,000 legislators from Ontario, Quebec, New Brunswick, and Nova Scotia are included, with consolidated information drawn from multiple sources about parliamentarians’ years of birth and death, religion, electoral performance, kinship, and several other biographical indicators. We also illustrate the utility of such data with the help of a few descriptive examples drawn from the four provinces. We believe this consolidated dataset offers several opportunities for future research on representation, legislative activities and party politics.
Combining Confidence Elicitation and Sample-based Methods for Uncertainty Quantification in Misinformation Mitigation
Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation
Recent large language models (LLMs) have been shown to be effective for misinformation detection. However, the choice of LLMs for experiment… (voir plus)s varies widely, leading to uncertain conclusions. In particular, GPT-4 is known to be strong in this domain, but it is closed source, potentially expensive, and can show instability between different versions. Meanwhile, alternative LLMs have given mixed results. In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5. This provides the research community with a solid open-source option and shows open-source models are gradually catching up on this task. We then highlight how GPT-3.5 exhibits unstable performance, such that this very widely used model could provide misleading results in misinformation detection. Finally, we validate new tools including approaches to structured output and the latest version of GPT-4 (Turbo), showing they do not compromise performance, thus unlocking them for future research and potentially enabling more complex pipelines for misinformation mitigation.
Uncertainty Resolution in Misinformation Detection