Publications

Quantifying Likeness: A Simple Machine Learning Approach to Identifying Copyright Infringement in (AI-Generated) Artwork

Michaela Drouillard

Ryan Spencer

Nikée Nantambu-Allen

Tegan Maharaj

This study proposes an approach aligned with the legal process to quantify copyright infringement, via stylistic similarity, in AI-generated… (voir plus) artwork. In contrast to typical work in this field, and more in line with a realistic legal setting, our approach quantifies the similarity of a set of potentially-infringing “defendant” artworks to a set of copyrighted “plaintiff" artworks. We frame this as an image classification task, using a fine-tuned ResNet trained on small, customized datasets relevant to each use case. Softmax-normalized probabilities from the model serve as similarity scores for potentially infringing “defendant” artworks, and saliency maps and features visualizations complement the score by highlighting key features and allowing for interpretability. This straightforward image classification approach can be accomplished in a quite simple, low-resource setting, making it accessible for real-world applications. We present a case study using Mickey Mouse as the plaintiff, performing thorough hyperparameter tuning and robustness analysis. Our experiments include optimizing batch size, weight decay, and learning rate, as well as exploring the impact of additional distractor classes. We employ data augmentation, cross-validation, and a linear decay learning rate scheduler to improve model performance, along with conducting scaling experiments with different types of distractor classes. The aims of this work are to illustrate the potential of the approach, and identify settings which generalize well, such that it is as "plug and play" as possible for users to apply with their own plaintiff sets of artworks.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

openreview.net

Rejecting Hallucinated State Targets during Planning

Mingde Zhao

Tristan Sylvain

Romain Laroche

Doina Precup

Yoshua Bengio

In planning processes of computational decision-making agents, generative or predictive models are often used as "generators" to propose "ta… (voir plus)rgets" representing sets of expected or desirable states. Unfortunately, learned models inevitably hallucinate infeasible targets that can cause delusional behaviors and safety concerns. We first investigate the kinds of infeasible targets that generators can hallucinate. Then, we devise a strategy to identify and reject infeasible targets by learning a target feasibility evaluator. To ensure that the evaluator is robust and non-delusional, we adopted a design choice combining off-policy compatible learning rule, distributional architecture, and data augmentation based on hindsight relabeling. Attaching to a planning agent, the designed evaluator learns by observing the agent's interactions with the environment and the targets produced by its generator, without the need to change the agent or its generator. Our controlled experiments show significant reductions in delusional behaviors and performance improvements for various kinds of existing agents.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

doi.org

proceedings.mlr.press

Simulation System Towards Solving Societal-Scale Manipulation

Maximilian Puelma Touzel

Sneheel Sarangi

Austin Welch

Gayatri K

Dan Zhao

Zachary Yang

Hao Yu

Tom Gibbs

Ethan Kosak-Hine

Andreea Musulan

Camille Thibault

Busra Tugce Gurbuz

Reihaneh Rabbany

Jean-François Godbout

Kellin Pelrine

The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet, studying these effects in real-w… (voir plus)orld settings at scale is ethically and logistically impractical, highlighting a need for simulation tools that can model these dynamics in controlled settings to enable experimentation with possible defenses. We present a simulation environment designed to address this. We elaborate upon the Concordia framework that simulates offline, `real life' activity by adding online interactions to the simulation through social media with the integration of a Mastodon server. Through a variety of means we then improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys of the agents' political positions. We demonstrate the simulator with a tailored example of how partisan manipulation of agents can affect election results.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

openreview.net

The Structural Safety Generalization Problem

Tom Gibbs

Julius Broomfield

George Ingebretsen

Ethan Kosak-Hine

Tia Nasir

Jason Zhang

Reihaneh Iranmanesh

Sara Pieri

Reihaneh Rabbany

Kellin Pelrine

It is widely known that AI is vulnerable to adversarial examples, from pixel perturbations to jailbreaks. We propose that there is a key, ea… (voir plus)sier class of problems that is also still unsolved: failures of safety to generalize over structure, despite semantic equivalence. We demonstrate this vulnerability by showing how recent AI systems are differently vulnerable both to multi-turn and multi-image attacks, compared to their single-turn and single-image counterparts with equivalent meaning. We suggest this is the same class of vulnerability as that found in yet unconnected threads of the literature: vulnerabilities to low-resource languages and indefensibility of strongly superhuman Go AIs to cyclic attacks. When viewed together, these reveal a common picture: models that are not only vulnerable to attacks, but vulnerable to attacks with near identical meaning in their benign and harmful components both, and only different in structure. In contrast to attacks with identical benign input (e.g., pictures that look like cats) but unknown semanticity of the harmful component (e.g., diverse noise that is all unintelligible to humans), these represent a class of attacks where semantic understanding and defense against one version should guarantee defense against others—yet current AI safety measures do not. This vulnerability represents a necessary but not sufficient condition towards defending against attacks whose harmful component has arbitrary semanticity. Consequently, by building on the data and approaches we highlight, we frame an intermediate problem for AI safety to solve, that represents a critical checkpoint towards safe AI while being far more tractable than trying to solve it directly and universally.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

openreview.net

Unlearning in- vs. out-of-distribution data in LLMs under gradient-based methods

Teodora Baluta

Pascal Lamblin

Daniel Tarlow

Fabian Pedregosa

Gintare Karolina Dziugaite

Machine unlearning aims to solve the problem of removing the influence of selected training examples from a learned model. Despite the incre… (voir plus)asing attention to this problem, it remains an open research question how to evaluate unlearning in large language models (LLMs), and what are the critical properties of the data to be unlearned that affect the quality and efficiency of unlearning. This work formalizes a metric to evaluate unlearning quality in generative models, and uses it to assess the trade-offs between unlearning quality and performance. We demonstrate that unlearning out-of-distribution examples requires more unlearning steps but overall presents a better trade-off overall. For in-distribution examples, however, we observe a rapid decay in performance as unlearning progresses. We further evaluate how example's memorization and difficulty affect unlearning under a classical gradient ascent-based approach.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

openreview.net

Efficient line search for optimizing Area Under the ROC Curve in gradient descent

Jadon Fowler

Toby Dylan Hocking

Receiver Operating Characteristic (ROC) curves are useful for evaluation in binary classification and changepoint detection, but difficult t… (voir plus)o use for learning since the Area Under the Curve (AUC) is piecewise constant (gradient zero almost everywhere). Recently the Area Under Min (AUM) of false positive and false negative rates has been proposed as a differentiable surrogate for AUC. In this paper we study the piecewise linear/constant nature of the AUM/AUC, and propose new efficient path-following algorithms for choosing the learning rate which is optimal for each step of gradient descent (line search), when optimizing a linear model. Remarkably, our proposed line search algorithm has the same log-linear asymptotic time complexity as gradient descent with constant step size, but it computes a complete representation of the AUM/AUC as a function of step size. In our empirical study of binary classification problems, we verify that our proposed algorithm is fast and exact; in changepoint detection problems we show that the proposed algorithm is just as accurate as grid search, but faster.

2024-10-10

ArXiv (prépublication)

doi.org

arxiv.org

Finite Sample Complexity Analysis of Binary Segmentation

Toby Dylan Hocking

Binary segmentation is the classic greedy algorithm which recursively splits a sequential data set by optimizing some loss or likelihood fun… (voir plus)ction. Binary segmentation is widely used for changepoint detection in data sets measured over space or time, and as a sub-routine for decision tree learning. In theory it should be extremely fast for

2024-10-10

ArXiv (prépublication)

doi.org

arxiv.org

MATES: a deep learning-based model for locus-specific quantification of transposable elements in single cell

Ruohan Wang

Yumin Zheng

Zijian Zhang

Kailu Song

Erxi Wu

Xiaopeng Zhu

Tao P. Wu

Jun Ding

Transposable elements (TEs) are crucial for genetic diversity and gene regulation. Current single-cell quantification methods often align mu… (voir plus)lti-mapping reads to either ‘best-mapped’ or ‘random-mapped’ locations and categorize them at the subfamily levels, overlooking the biological necessity for accurate, locus-specific TE quantification. Moreover, these existing methods are primarily designed for and focused on transcriptomics data, which restricts their adaptability to single-cell data of other modalities. To address these challenges, here we introduce MATES, a deep-learning approach that accurately allocates multi-mapping reads to specific loci of TEs, utilizing context from adjacent read alignments flanking the TE locus. When applied to diverse single-cell omics datasets, MATES shows improved performance over existing methods, enhancing the accuracy of TE quantification and aiding in the identification of marker TEs for identified cell populations. This development facilitates the exploration of single-cell heterogeneity and gene regulation through the lens of TEs, offering an effective transposon quantification tool for the single-cell genomics community.

2024-10-10

Nature Communications (publié)

doi.org

Physical Simulation for Multi-agent Multi-machine Tending

Abdalwhab Abdalwhab

Giovanni Beltrame

David St-Onge

2024-10-10

ArXiv (prépublication)

doi.org

arxiv.org

PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning

Tingchen Fu

Mrinank Sharma

Philip Torr

Yonadav G Shavit

Shay B. Cohen

David M. Krueger

Fazl Barez

Preference learning is a central component for aligning current LLMs, but this process can be vulnerable to data poisoning attacks. To addre… (voir plus)ss this concern, we introduce PoisonBench, a benchmark for evaluating large language models' susceptibility to data poisoning during preference learning. Data poisoning attacks can manipulate large language model responses to include hidden malicious content or biases, potentially causing the model to generate harmful or unintended outputs while appearing to function normally. We deploy two distinct attack types across eight realistic scenarios, assessing 21 widely-used models. Our findings reveal concerning trends: (1) Scaling up parameter size does not inherently enhance resilience against poisoning attacks; (2) There exists a log-linear relationship between the effects of the attack and the data poison ratio; (3) The effect of data poisoning can generalize to extrapolated triggers that are not included in the poisoned data. These results expose weaknesses in current preference learning techniques, highlighting the urgent need for more robust defenses against malicious models and data manipulation.

2024-10-10

ArXiv (prépublication)

doi.org

openreview.net

SOAK: Same/Other/All K-fold cross-validation for estimating similarity of patterns in data subsets

Toby Dylan Hocking

Gabrielle Thibault

C. S. Bodine

Paul Nelson Arellano

Alexander F. Shenkin

Olivia Jasmine Lindly

In many real-world applications of machine learning, we are interested to know if it is possible to train on the data that we have gathered … (voir plus)so far, and obtain accurate predictions on a new test data subset that is qualitatively different in some respect (time period, geographic region, etc). Another question is whether data subsets are similar enough so that it is beneficial to combine subsets during model training. We propose SOAK, Same/Other/All K-fold cross-validation, a new method which can be used to answer both questions. SOAK systematically compares models which are trained on different subsets of data, and then used for prediction on a fixed test subset, to estimate the similarity of learnable/predictable patterns in data subsets. We show results of using SOAK on six new real data sets (with geographic/temporal subsets, to check if predictions are accurate on new subsets), 3 image pair data sets (subsets are different image types, to check that we get smaller prediction error on similar images), and 11 benchmark data sets with predefined train/test splits (to check similarity of predefined splits).

2024-10-10

ArXiv (prépublication)

doi.org

arxiv.org

"I Am the One and Only, Your Cyber BFF": Understanding the Impact of GenAI Requires Understanding the Impact of Anthropomorphic AI

Myra Cheng

Alicia DeVrio

Lisa Egede

Su Lin Blodgett

A.R. Olteanu

Many state-of-the-art generative AI (GenAI) systems are increasingly prone to anthropomorphic behaviors, i.e., to generating outputs that ar… (voir plus)e perceived to be human-like. While this has led to scholars increasingly raising concerns about possible negative impacts such anthropomorphic AI systems can give rise to, anthropomorphism in AI development, deployment, and use remains vastly overlooked, understudied, and underspecified. In this perspective, we argue that we cannot thoroughly map the social impacts of generative AI without mapping the social impacts of anthropomorphic AI, and outline a call to action.

2024-10-10

ArXiv (prépublication)

doi.org

arxiv.org

Mila sur Udemy

Publications du Fellowship en politiques de l'IA

La plateforme Mila Ventures

Publications

Mila sur Udemy

Publications du Fellowship en politiques de l'IA

La plateforme Mila Ventures

Mots-clés populaires:

Publications