Publications

Adversarial Training with Synthesized Data: A Path to Robust and Generalizable Neural Networks

Adversarial Training (AT) is a well-known framework designed to mitigate adversarial vulnerabilities in neural networks. Recent research ind… (voir plus)icates that incorporating adversarial examples (AEs) in training can enhance models' generalization capabilities. To understand the impact of AEs on learning dynamics, we study AT through the lens of sample difficulty methodologies. Our findings show that AT leads to more stable learning dynamics compared to Natural Training (NT), resulting in gradual performance improvements and less overconfident predictions. This suggests that AT steers training away from learning easy, perturbable spurious features toward more resilient and generalizable ones. However, a trade-off exists between adversarial robustness and generalization gains, due to robust overfitting, limiting practical deployment. To address this, we propose using synthesized data to bridge this gap. Our results demonstrate that AT benefits significantly from synthesized data, whereas NT does not, enhancing generalization without compromising robustness and offering new avenues for developing robust and generalizable models.

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

openreview.net

Decomposed evaluations of geographic disparities in text-to-image models

Abhishek Sureddy

Dishant Padalia

Nandhinee Periyakaruppan

Oindrila Saha

Adina Williams

Adriana Romero Soriano

Megan Richards

Polina Kirichenko

Melissa Hall

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

doi.org

openreview.net

Economic evaluation of the effect of needle and syringe programs on skin, soft tissue, and vascular infections in people who inject drugs: a microsimulation modelling approach

Jihoon Lim

W Alton Russell

Mariam El-Sheikh

David Buckeridge

Dimitra Panagiotoglou

2024-06-28

Harm Reduction Journal (publié)

doi.org

Exploring Scaling Trends in LLM Robustness

Nikolaus H. R. Howe

Michał Zając

Ian R. McKenzie

Oskar John Hollinsworth

Tom Tseng

Aaron David Tucker

Pierre-Luc Bacon

Adam Gleave

Language model capabilities predictably improve from scaling a model's size and training data. Motivated by this, increasingly large languag… (voir plus)e models have been trained, yielding an array of impressive capabilities. Yet these models are vulnerable to adversarial prompts, such as"jailbreaks"that hijack models to perform undesired behaviors, posing a significant risk of misuse. Prior work indicates that computer vision models become more robust with model and data scaling, raising the question: does language model robustness also improve with scale? We study this question empirically, finding that larger models respond substantially better to adversarial training, but there is little to no benefit from model scale in the absence of explicit defenses.

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

doi.org

openreview.net

Game On, Hate Off: A Study of Toxicity in Online Multiplayer Environments

Zachary Yang

Nicolas Grenon-Godbout

Reihaneh Rabbany

2024-06-28

Games Res. Pract. (publié)

doi.org

In-Context Learning, Can It Break Safety?

David Dobre

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

openreview.net

Predicting the Population Risk of Suicide Using Routinely Collected Health Administrative Data in Quebec, Canada: Model-Based Synthetic Estimation Study

JianLi Wang

Fatemeh Gholi Zadeh Kharrat

Geneviève Gariépy

Christian Gagné

Jean-François Pelletier

Victoria Massamba

Pascale Lévesque

Mada Mohammed

Alain Lesage

Background Suicide is a significant public health issue. Many risk prediction tools have been developed to estimate an individual’s risk o… (voir plus)f suicide. Risk prediction models can go beyond individual risk assessment; one important application of risk prediction models is population health planning. Suicide is a result of the interaction among the risk and protective factors at the individual, health care system, and community levels. Thus, policy and decision makers can play an important role in suicide prevention. However, few prediction models for the population risk of suicide have been developed. Objective This study aims to develop and validate prediction models for the population risk of suicide using health administrative data, considering individual-, health system–, and community-level predictors. Methods We used a case-control study design to develop sex-specific risk prediction models for suicide, using the health administrative data in Quebec, Canada. The training data included all suicide cases (n=8899) that occurred from January 1, 2002, to December 31, 2010. The control group was a 1% random sample of living individuals in each year between January 1, 2002, and December 31, 2010 (n=645,590). Logistic regression was used to develop the prediction models based on individual-, health care system–, and community-level predictors. The developed model was converted into synthetic estimation models, which concerted the individual-level predictors into community-level predictors. The synthetic estimation models were directly applied to the validation data from January 1, 2011, to December 31, 2019. We assessed the performance of the synthetic estimation models with four indicators: the agreement between predicted and observed proportions of suicide, mean average error, root mean square error, and the proportion of correctly identified high-risk regions. Results The sex-specific models based on individual data had good discrimination (male model: C=0.79; female model: C=0.85) and calibration (Brier score for male model 0.01; Brier score for female model 0.005). With the regression-based synthetic models applied in the validation data, the absolute differences between the synthetic risk estimates and observed suicide risk ranged from 0% to 0.001%. The root mean square errors were under 0.2. The synthetic estimation model for males correctly predicted 4 of 5 high-risk regions in 8 years, and the model for females correctly predicted 4 of 5 high-risk regions in 5 years. Conclusions Using linked health administrative databases, this study demonstrated the feasibility and the validity of developing prediction models for the population risk of suicide, incorporating individual-, health system–, and community-level variables. Synthetic estimation models built on routinely collected health administrative data can accurately predict the population risk of suicide. This effort can be enhanced by timely access to other critical information at the population level.

2024-06-28

JMIR Public Health and Surveillance (publié)

doi.org

Predicting the Population Risk of Suicide Using Routinely Collected Health Administrative Data in Quebec, Canada: Model-Based Synthetic Estimation Study

JianLi Wang

Fatemeh Gholi Zadeh Kharrat

Geneviève Gariépy

Christian Gagné

Jean-François Pelletier

Victoria Massamba

Pascale Lévesque

Mada Mohammed

Alain Lesage

2024-06-28

JMIR Public Health and Surveillance (publié)

doi.org

A Randomized Controlled Simulation Trial of a Neonatal Resuscitation Digital Game Simulator for Labour and Delivery Room Staff

Christiane Bilodeau

Georg M. Schmölzer

Maria Cutumisu

2024-06-28

Children (publié)

doi.org

Robust Knowledge Unlearning via Mechanistic Localizations

Phillip Huang Guo

Aaquib Syed

Abhay Sheshadri

Aidan Ewart

Gintare Karolina Dziugaite

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

openreview.net

Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques

Daniel Z Kaplan

Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they becoming increasingly pr… (voir plus)evalent, ensuring their robustness against adversarial attacks is paramount. This work systematically investigates the impact of model design choices on the adversarial robustness of VLMs against image-based attacks. Additionally, we introduce novel, cost-effective approaches to enhance robustness through prompt formatting. By rephrasing questions and suggesting potential adversarial perturbations, we demonstrate substantial improvements in model robustness against strong image-based attacks such as Auto-PGD. Our findings provide important guidelines for developing more robust VLMs, particularly for deployment in safety-critical environments.

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

doi.org

openreview.net

Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects

Orevaoghene Ahia

Aremu Anuoluwapo

Diana Abagyan

Hila Gonen

David Ifeoluwa Adelani

Daud Abolade

Noah A. Smith

Yulia Tsvetkov

Yoruba—an African language with roughly 47 million speakers—encompasses a continuum with several dialects. Recent efforts to develop NLP… (voir plus) technologies for African languages have focused on their standard dialects, resulting in disparities for dialects and varieties for which there are little to no resources or tools. We take steps towards bridging this gap by introducing a new high-quality parallel text and speech corpus; YORULECT across three domains and four regional yoruba dialects. To develop this corpus, we engaged native speakers, traveling to communities where these dialects are spoken, to collect text and speech data. Using our newly created corpus, we conducted extensive experiments on (text) machine translation, automatic speech recognition, and speech-to-text translation. Our results reveal substantial performance disparities between standard yoruba and the other dialects across all tasks. However, we also show that with dialect-adaptive finetuning, we are able to narrow this gap. We believe our dataset and experimental analysis will contribute greatly to developing NLP tools for Yoruba and its dialects, and potentially for other African languages, by improving our understanding of existing challenges and offering a high-quality dataset for further development. We will release YORULECT dataset and models publicly under an open license.

2024-06-27

ArXiv (prépublication)

doi.org

arxiv.org

Hackathon | Créer une IA plus sécuritaire pour la santé mentale des jeunes

Communauté de pratique de Mila : Sécurité en IA

Éclaireurs autochtones en IA

Avantage IA

Publications

Hackathon | Créer une IA plus sécuritaire pour la santé mentale des jeunes

Communauté de pratique de Mila : Sécurité en IA

Éclaireurs autochtones en IA

Avantage IA

Mots-clés populaires:

Publications