Publications

Online Influence Campaigns: Strategies and Vulnerabilities

Andreea Musulan

Veronica Xia

Ethan Kosak-Hine

Tom Gibbs

Vidya Sujaya

Reihaneh Rabbany

Jean-François Godbout

Kellin Pelrine

U. Montréal

Ivado

M. University

In order to combat the creation and spread of harmful content online, this paper defines and contextualizes the concept of inauthentic, societal-scale manipulation by malicious actors. We review the literature on societally harmful content and how it proliferates to analyze the manipulation strategies used by such actors and the vulnerabilities they target. We also provide an overview of three case studies of extensive manipulation campaigns to emphasize the severity of the problem. We then address the role that Artificial Intelligence plays in the development and dissemination of harmful content, and how its evolution presents new threats to societal cohesion for countries across the globe. Our survey aims to increase our understanding of not just particular aspects of these threats, but also the strategies underlying their deployment, so we can effectively prepare for the evolving cybersecurity landscape.

2024-12-31

arXiv (preprint)

doi.org

arxiv.org

Open Technical Problems in Open-Weight AI Model Risk Management

Stephen Casper

Kyle O'Brien

Shayne Longpre

Elizabeth Seger

Kevin Klyman

Rishi Bommasani

Aniruddha Nrusimha

Ilia Shumailov

Sören Mindermann

Steven Basart

Frank Rudzicz

Kellin Pelrine

Avijit Ghosh

Andrew Strait

Robert Kirk

Dan Hendrycks

Peter Henderson

J. Zico Kolter

Geoffrey Irving

Yarin Gal

Yoshua Bengio

Dylan Hadfield-Menell

2024-12-31

SSRN Electronic Journal (accepted)

doi.org

OpenFake: An Open Dataset and Platform Toward Real-World Deepfake Detection

Akshatha Arodi

Gaétan Marceau Caron

Jean-François Godbout

Reihaneh Rabbany

Deepfakes, synthetic media created using advanced AI techniques, pose a growing threat to information integrity, particularly in politically sensitive contexts. This challenge is amplified by the increasing realism of modern generative models, which our human perception study confirms are often indistinguishable from real images. Yet, existing deepfake detection benchmarks rely on outdated generators or narrowly scoped datasets (e.g., single-face imagery), limiting their utility for real-world detection. To address these gaps, we present OpenFake, a large politically grounded dataset specifically crafted for benchmarking against modern generative models with high realism, and designed to remain extensible through an innovative crowdsourced adversarial platform that continually integrates new hard examples. OpenFake comprises nearly four million total images: three million real images paired with descriptive captions and almost one million synthetic counterparts from state-of-the-art proprietary and open-source models. Detectors trained on OpenFake achieve near-perfect in-distribution performance, strong generalization to unseen generators, and high accuracy on a curated in-the-wild social media test set, significantly outperforming models trained on existing datasets. Overall, we demonstrate that with high-quality and continually updated benchmarks, automatic deepfake detection is both feasible and effective in real-world settings.

2024-12-31

arXiv.org (preprint)

doi.org

arxiv.org

PairBench: Are Vision-Language Models Reliable at Comparing What They See?

Aarash Feizi

Sai Rajeswar

Adriana Romero

Reihaneh Rabbany

Spandana Gella

Valentina Zantedeschi

Joao Monteiro

Understanding how effectively large vision language models (VLMs) compare visual inputs is crucial across numerous applications, yet this fundamental capability remains insufficiently assessed. While VLMs are increasingly deployed for tasks requiring comparative judgment, including automated evaluation, re-ranking, and retrieval-augmented generation, no systematic framework exists to measure their performance in these scenarios. We present PairBench, a simple framework that evaluates VLMs as customizable similarity tools using widely available image datasets. Our approach introduces four key metrics for reliable comparison: alignment with human annotations, consistency across pair ordering, distribution smoothness, and controllability through prompting. Our analysis reveals that no model consistently excels across all metrics, with each demonstrating distinct strengths and weaknesses. Most concerning is the widespread inability of VLMs to maintain symmetric similarity scores. Interestingly, we demonstrate that performance on our benchmark strongly correlates with popular benchmarks used for more complex tasks, while providing additional metrics into controllability, smoothness and ordering. This makes PairBench a unique and comprehensive framework to evaluate the performance of VLMs for automatic evaluation depending on the task.
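To make the idea of consistency across pair ordering concrete, here is a minimal sketch of one way an order-asymmetry score could be computed. It is not PairBench's actual metric or code: the function `order_asymmetry` and both toy scorers are hypothetical stand-ins for a VLM-based similarity function.

```python
from itertools import combinations
from typing import Callable, Sequence


def order_asymmetry(items: Sequence[str], score: Callable[[str, str], float]) -> float:
    """Mean absolute gap between score(a, b) and score(b, a) over all unordered pairs.

    A perfectly symmetric scorer yields 0.0; larger values mean the score
    depends more strongly on the order in which the pair is presented.
    """
    gaps = [abs(score(a, b) - score(b, a)) for a, b in combinations(items, 2)]
    return sum(gaps) / len(gaps) if gaps else 0.0


def symmetric_scorer(a: str, b: str) -> float:
    # Hypothetical scorer that ignores argument order.
    return 1.0 if a == b else 0.5


def biased_scorer(a: str, b: str) -> float:
    # Hypothetical scorer with an order-dependent bonus, mimicking an asymmetric model.
    return 0.5 + (0.25 if a < b else 0.0)


images = ["cat", "dog", "fox"]  # placeholders for image inputs
sym_gap = order_asymmetry(images, symmetric_scorer)
biased_gap = order_asymmetry(images, biased_scorer)
```

In this toy setup the symmetric scorer has zero asymmetry, while the biased scorer shows a constant gap for every pair.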

2024-12-31

arXiv.org (preprint)

doi.org

arxiv.org

PEACE: Prompt Engineering Automation for CLIPSeg Enhancement for Safe-Landing Zone Segmentation

Haechan Mark Bong

Rongge Zhang

Antoine Robillard

Giovanni Beltrame

Safe landing is essential in robotics applications, from industrial settings to space exploration. As artificial intelligence advances, we have developed PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), a system that automatically generates and refines prompts for identifying landing zones in changing environments. Traditional approaches using fixed prompts for open-vocabulary models struggle with environmental changes and can lead to dangerous outcomes when conditions are not represented in the predefined prompts. PEACE addresses this limitation by dynamically adapting to shifting data distributions. Our key innovation is the dual segmentation of safe and unsafe landing zones, allowing the system to refine the results by removing unsafe areas from potential landing sites. Using only monocular cameras and image segmentation, PEACE can safely guide descent operations from 100 meters to altitudes as low as 20 meters. The testing shows that PEACE significantly outperforms the standard CLIP and CLIPSeg prompting methods, improving the successful identification of safe landing zones from 57% to 92%. We have also demonstrated enhanced performance when replacing CLIPSeg with FastSAM. The complete source code is available as open-source software.
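The dual-segmentation step described above, removing unsafe areas from candidate landing sites, can be sketched as a mask subtraction. This is an illustration under assumptions, not the PEACE implementation: the masks are small hand-written boolean grids standing in for segmentation outputs, and `refine_landing_mask` is a hypothetical helper name.

```python
from typing import List

Mask = List[List[bool]]  # row-major grid; True marks a pixel belonging to the class


def refine_landing_mask(safe: Mask, unsafe: Mask) -> Mask:
    """Keep only pixels segmented as safe that are not also segmented as unsafe."""
    if len(safe) != len(unsafe) or any(len(s) != len(u) for s, u in zip(safe, unsafe)):
        raise ValueError("safe and unsafe masks must have the same shape")
    return [
        [s and not u for s, u in zip(safe_row, unsafe_row)]
        for safe_row, unsafe_row in zip(safe, unsafe)
    ]


# Toy masks (assumed values) standing in for the output of a segmentation model.
safe_mask = [
    [True, True, False],
    [True, True, True],
]
unsafe_mask = [
    [False, True, False],
    [False, False, True],
]
refined = refine_landing_mask(safe_mask, unsafe_mask)
```

Pixels flagged by both prompts are dropped, so a region is proposed for landing only when the safe segmentation includes it and the unsafe segmentation does not.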

2024-12-31

IROS (published)

doi.org

arxiv.org

Performance modulations phase-locked to action depend on internal state

Tommaso Tosato

Guillaume Dumas

Gustavo Rohenkohl

Pascal Fries

Several studies have probed perceptual performance at different times after a self-paced motor action and found frequency-specific modulations of perceptual performance phase-locked to the action. Such action-related modulation has been reported for various frequencies and modulation strengths. In an attempt to establish a basic effect at the population level, we had a relatively large number of participants (n=50) perform a self-paced button press followed by a detection task at threshold, and we applied both fixed- and random-effects tests. The combined data of all trials and participants surprisingly did not show any significant action-related modulation. However, based on previous studies, we explored the possibility that such modulation depends on the participant’s internal state. Indeed, when we split trials based on performance in neighboring trials, then trials in periods of low performance showed an action-related modulation at ≈17 Hz. When we split trials based on the performance in the preceding trial, we found that trials following a “miss” showed an action-related modulation at ≈17 Hz. Finally, when we split participants based on their false-alarm rate, we found that participants with no false alarms showed an action-related modulation at ≈17 Hz. All these effects were significant in random-effects tests, supporting an inference on the population. Together, these findings indicate that action-related modulations are not always detectable. However, the results suggest that specific internal states such as lower attentional engagement and/or higher decision criterion are characterized by a modulation in the beta-frequency range.

2024-12-31

iScience (published)

doi.org

Personalized Negative Reservoir for Incremental Learning in Recommender Systems

Antonios Valkanas

Yuening Wang

Yingxue Zhang

Mark J. Coates

2024-12-31

Trans. Mach. Learn. Res. (published)

doi.org

openreview.net

PETRA: Parallel End-to-End Training of Reversible Architectures

Stéphane Rivaud

Louis Fournier

Thomas Pumir

Eugene Belilovsky

Mickael Eickenberg

Edouard Oyallon

Reversible architectures have been shown to be capable of performing on par with their non-reversible counterparts, being applied in deep learning for memory savings and generative modeling. In this work, we show how reversible architectures can solve challenges in parallelizing deep model training. We introduce PETRA, a novel alternative to backpropagation for parallelizing gradient computations. PETRA facilitates effective model parallelism by enabling stages (i.e., a set of layers) to compute independently on different devices, while only needing to communicate activations and gradients between each other. By decoupling the forward and backward passes and keeping a single updated version of the parameters, the need for weight stashing is also removed. We develop a custom autograd-like training framework for PETRA, and we demonstrate its effectiveness on CIFAR-10, ImageNet32, and ImageNet, achieving competitive accuracies comparable to backpropagation using ResNet-18, ResNet-34, and ResNet-50 models.

2024-12-31

ICLR (published)

doi.org

openreview.net

Pixels Under Pressure: Exploring Fine-Tuning Paradigms for Foundation Models in High-Resolution Medical Imaging

Zahra Tehrani Nasab

Amar Kumar

Tal Arbel

Advancements in diffusion-based foundation models have improved text-to-image generation, yet most efforts have been limited to low-resolution settings. As high-resolution image synthesis becomes increasingly essential for various applications, particularly in medical imaging domains, fine-tuning emerges as a crucial mechanism for adapting these powerful pre-trained models to task-specific requirements and data distributions. In this work, we present a systematic study, examining the impact of various fine-tuning techniques on image generation quality when scaling to high resolution 512x512 pixels. We benchmark a diverse set of fine-tuning methods, including full fine-tuning strategies and parameter-efficient fine-tuning (PEFT). We dissect how different fine-tuning methods influence key quality metrics, including Fréchet Inception Distance (FID), Vendi score, and prompt-image alignment. We also evaluate the utility of generated images in a downstream classification task under data-scarce conditions, demonstrating that specific fine-tuning strategies improve both generation fidelity and downstream performance when synthetic images are used for classifier training and evaluation on real images. Our code is accessible through the project website - https://tehraninasab.github.io/PixelUPressure/.

2024-12-31

ELAMI@MICCAI (published)

doi.org

arxiv.org

Position: Evaluating Generative AI Systems is a Social Science Measurement Challenge

Hanna Wallach

Meera Desai

A. Feder Cooper

Angelina Wang

Chad Atalla

Solon Barocas

Su Lin Blodgett

Alexandra Chouldechova

Emily Corvi

P. A. Dow

Jean Garcia-Gathright

A.R. Olteanu

Nicholas Pangakis

Stefanie Reed

Emily Sheng

Dan Vann

Jennifer Wortman Vaughan

Matthew Vogel

Hannah Washington

Abigail Z. Jacobs

The measurement tasks involved in evaluating generative AI (GenAI) systems are especially difficult, leading to what has been described as "a tangle of sloppy tests [and] apples-to-oranges comparisons" (Roose, 2024). In this position paper, we argue that the ML community would benefit from learning from and drawing on the social sciences when developing and using measurement instruments for evaluating GenAI systems. Specifically, our position is that evaluating GenAI systems is a social science measurement challenge. We present a four-level framework, grounded in measurement theory from the social sciences, for measuring concepts related to the capabilities, behaviors, and impacts of GenAI. This framework has two important implications for designing and evaluating evaluations: First, it can broaden the expertise involved in evaluating GenAI systems by enabling stakeholders with different perspectives to participate in conceptual debates. Second, it brings rigor to both conceptual and operational debates by offering a set of lenses for interrogating the validity of measurement instruments and their resulting measurements.

2024-12-31

ICML (Position Papers) (published)

doi.org

proceedings.mlr.press

Ex Post Conditions for the Exactness of Optimal Power Flow Conic Relaxations

Jean-Luc Lupien

Antoine Lesage-Landry

Convex relaxations of the optimal power flow (OPF) problem provide an efficient alternative to solving the intractable alternating current (AC) optimal power flow. The conic subset of OPF convex relaxations, in particular, greatly accelerates resolution while leading to high-quality approximations that are exact in several scenarios. However, the sufficient conditions guaranteeing exactness are stringent, e.g., requiring radial topologies. In this short communication, we present two equivalent ex post conditions for the exactness of any conic relaxation of the OPF. These rely on obtaining either a rank-1 voltage matrix or self-coherent cycles. Instead of relying on sufficient conditions a priori, satisfying one of the presented ex post conditions acts as an exactness certificate for the computed solution. The operator can therefore obtain an optimality guarantee when solving a conic relaxation even when a priori exactness requirements are not met. Finally, we present numerical examples from the MATPOWER library where the ex post conditions hold even though the exactness sufficient conditions do not, thereby illustrating the use of the conditions.
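For readers unfamiliar with the rank-1 voltage matrix condition, it can be illustrated with the textbook semidefinite relaxation of the OPF. This is a sketch of that standard special case only, not the paper's general statement for arbitrary conic relaxations; the symbols $v$, $W$, and $W^\star$ are notation assumed here.

```latex
% Lifted variable: v \in \mathbb{C}^n collects the complex bus voltages.
W = v v^{\mathsf{H}}
\quad\Longleftrightarrow\quad
W \succeq 0 \ \text{ and } \ \operatorname{rank}(W) = 1 .

% The semidefinite relaxation keeps W \succeq 0 and drops the rank constraint.
% Ex post check on a computed optimum W^\star of the relaxation:
\operatorname{rank}(W^\star) = 1
\;\Longrightarrow\;
W^\star = v^\star (v^\star)^{\mathsf{H}}
\ \text{ for some } v^\star \in \mathbb{C}^n .
```

When the check passes, $v^\star$ is feasible for the original AC OPF and attains the relaxation's lower bound, so it is optimal for the original problem. In floating-point practice the rank condition would be tested numerically, for instance by checking that the second-largest eigenvalue of $W^\star$ is negligible relative to the largest.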

2024-12-31

Electric Power Systems Research (published)

doi.org

arxiv.org

Predicting Vessel Speed Over Ground: A Machine Learning Approach for Enhancing Maritime Transport

Ismail Bourzak

Loubna Benabbou

Sara El Mekkaoui

Abdelaziz Berrado

Stéphane Caron

2024-12-31

IFAC-PapersOnLine (published)

doi.org
