Adversarial Training with Synthesized Data: A Path to Robust and Generalizable Neural Networks
Reza Bayat
Adversarial Training (AT) is a well-known framework designed to mitigate adversarial vulnerabilities in neural networks. Recent research indicates that incorporating adversarial examples (AEs) in training can enhance models' generalization capabilities. To understand the impact of AEs on learning dynamics, we study AT through the lens of sample-difficulty methodologies. Our findings show that AT leads to more stable learning dynamics than Natural Training (NT), resulting in gradual performance improvements and less overconfident predictions. This suggests that AT steers training away from learning easy, perturbable spurious features and toward more resilient and generalizable ones. However, robust overfitting creates a trade-off between adversarial robustness and generalization gains, limiting practical deployment. To address this, we propose using synthesized data to bridge the gap. Our results demonstrate that AT benefits significantly from synthesized data, whereas NT does not, enhancing generalization without compromising robustness and offering new avenues for developing robust and generalizable models.
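As a rough illustration of the adversarial training loop the abstract refers to, the sketch below trains a PyTorch classifier on PGD-perturbed inputs; the attack budget, loss, and data loader are illustrative assumptions rather than the paper's actual setup, and synthesized data would simply be mixed into the training loader alongside the real samples.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft L-infinity PGD adversarial examples (illustrative budget)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball and valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer, device="cuda"):
    """One epoch of AT: the model is updated on adversarial, not clean, inputs."""
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```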
Decomposed evaluations of geographic disparities in text-to-image models
Abhishek Sureddy
Dishant Padalia
Nandhinee Periyakaruppan
Oindrila Saha
Adina Williams
Megan Richards
Polina Kirichenko
Melissa Hall
Economic evaluation of the effect of needle and syringe programs on skin, soft tissue, and vascular infections in people who inject drugs: a microsimulation modelling approach
Jihoon Lim
W Alton Russell
Mariam El-Sheikh
Dimitra Panagiotoglou
Exploring Scaling Trends in LLM Robustness
Nikolaus H. R. Howe
Michał Zając
Ian R. McKenzie
Oskar John Hollinsworth
Tom Tseng
Aaron David Tucker
Adam Gleave
Language model capabilities predictably improve from scaling a model's size and training data. Motivated by this, increasingly large language models have been trained, yielding an array of impressive capabilities. Yet these models are vulnerable to adversarial prompts, such as "jailbreaks" that hijack models to perform undesired behaviors, posing a significant risk of misuse. Prior work indicates that computer vision models become more robust with model and data scaling, raising the question: does language model robustness also improve with scale? We study this question empirically, finding that larger models respond substantially better to adversarial training, but there is little to no benefit from model scale in the absence of explicit defenses.
Game On, Hate Off: A Study of Toxicity in Online Multiplayer Environments
Zachary Yang
Nicolas Grenon-Godbout
In-Context Learning, Can It Break Safety?
Sophie Xhonneux
David Dobre
Michael Noukhovitch
Predicting the Population Risk of Suicide Using Routinely Collected Health Administrative Data in Quebec, Canada: Model-Based Synthetic Estimation Study
JianLi Wang
Fatemeh Gholi Zadeh Kharrat
Geneviève Gariépy
Jean-François Pelletier
Victoria Massamba
Pascale Lévesque
Mada Mohammed
Alain Lesage
Background: Suicide is a significant public health issue. Many risk prediction tools have been developed to estimate an individual's risk of suicide. Risk prediction models can go beyond individual risk assessment; one important application is population health planning. Suicide results from the interaction among risk and protective factors at the individual, health care system, and community levels. Thus, policy and decision makers can play an important role in suicide prevention. However, few prediction models for the population risk of suicide have been developed. Objective: This study aims to develop and validate prediction models for the population risk of suicide using health administrative data, considering individual-, health system–, and community-level predictors. Methods: We used a case-control study design to develop sex-specific risk prediction models for suicide, using health administrative data from Quebec, Canada. The training data included all suicide cases (n=8899) that occurred from January 1, 2002, to December 31, 2010. The control group was a 1% random sample of living individuals in each year between January 1, 2002, and December 31, 2010 (n=645,590). Logistic regression was used to develop the prediction models based on individual-, health care system–, and community-level predictors. The developed model was then converted into synthetic estimation models by transforming the individual-level predictors into community-level predictors. The synthetic estimation models were applied directly to the validation data from January 1, 2011, to December 31, 2019. We assessed the performance of the synthetic estimation models with four indicators: the agreement between predicted and observed proportions of suicide, mean absolute error, root mean square error, and the proportion of correctly identified high-risk regions. Results: The sex-specific models based on individual data had good discrimination (male model: C=0.79; female model: C=0.85) and calibration (Brier score of 0.01 for the male model and 0.005 for the female model). With the regression-based synthetic models applied to the validation data, the absolute differences between the synthetic risk estimates and the observed suicide risk ranged from 0% to 0.001%. The root mean square errors were under 0.2. The synthetic estimation model for males correctly predicted 4 of 5 high-risk regions in 8 years, and the model for females correctly predicted 4 of 5 high-risk regions in 5 years. Conclusions: Using linked health administrative databases, this study demonstrates the feasibility and validity of developing prediction models for the population risk of suicide that incorporate individual-, health system–, and community-level variables. Synthetic estimation models built on routinely collected health administrative data can accurately predict the population risk of suicide. This effort can be enhanced by timely access to other critical information at the population level.
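As a minimal sketch of the discrimination and calibration checks the abstract reports (C statistic and Brier score), the following scikit-learn snippet fits a logistic regression on placeholder case-control data; the features, sample sizes, and random values stand in for the study's Quebec administrative records, which are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss

# Placeholder case-control data: rows are individuals, columns are
# illustrative individual-, health-system-, and community-level predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)          # 1 = suicide case, 0 = control

model = LogisticRegression(max_iter=1000).fit(X, y)
p = model.predict_proba(X)[:, 1]

c_statistic = roc_auc_score(y, p)          # discrimination (C statistic)
brier = brier_score_loss(y, p)             # calibration (Brier score)
print(f"C={c_statistic:.2f}, Brier={brier:.3f}")
```

The synthetic estimation step described in the abstract would then aggregate predicted probabilities to region-level rates and compare them against observed proportions (mean absolute error, root mean square error, high-risk region identification).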
A Randomized Controlled Simulation Trial of a Neonatal Resuscitation Digital Game Simulator for Labour and Delivery Room Staff
Christiane Bilodeau
Georg M. Schmölzer
Robust Knowledge Unlearning via Mechanistic Localizations
Phillip Huang Guo
Aaquib Syed
Abhay Sheshadri
Aidan Ewart
Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Rishika Bhagwatkar
Shravan Nayak
Reza Bayat
Alexis Roger
Daniel Z Kaplan
Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they become increasingly prevalent, ensuring their robustness against adversarial attacks is paramount. This work systematically investigates the impact of model design choices on the adversarial robustness of VLMs against image-based attacks. Additionally, we introduce novel, cost-effective approaches to enhance robustness through prompt formatting. By rephrasing questions and suggesting potential adversarial perturbations, we demonstrate substantial improvements in model robustness against strong image-based attacks such as Auto-PGD. Our findings provide important guidelines for developing more robust VLMs, particularly for deployment in safety-critical environments.
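A minimal sketch of the prompt-formatting idea described above, assuming a generic vision-language model exposing a generate(image, prompt) interface; the wrapper function, its wording, and the vlm object are hypothetical, not the paper's templates or API.

```python
def robust_prompt(question: str) -> str:
    """Rephrase the question and warn the model that the image may be
    adversarially perturbed (illustrative wording, not the paper's templates)."""
    return (
        "The following image may contain small adversarial perturbations. "
        "Answer based on the overall visual content. "
        f"Question (rephrased): {question}"
    )

# Hypothetical usage with a generic vision-language model wrapper:
# answer = vlm.generate(image=img, prompt=robust_prompt("What animal is shown?"))
```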
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects
Orevaoghene Ahia
Aremu Anuoluwapo
Diana Abagyan
Hila Gonen
Daud Abolade
Noah A. Smith
Yulia Tsvetkov
Yorùbá, an African language with roughly 47 million speakers, encompasses a continuum of several dialects. Recent efforts to develop NLP technologies for African languages have focused on their standard dialects, resulting in disparities for dialects and varieties for which there are little to no resources or tools. We take steps towards bridging this gap by introducing YORULECT, a new high-quality parallel text and speech corpus spanning three domains and four regional Yorùbá dialects. To develop this corpus, we engaged native speakers, traveling to communities where these dialects are spoken, to collect text and speech data. Using our newly created corpus, we conducted extensive experiments on (text) machine translation, automatic speech recognition, and speech-to-text translation. Our results reveal substantial performance disparities between standard Yorùbá and the other dialects across all tasks. However, we also show that with dialect-adaptive fine-tuning, we are able to narrow this gap. We believe our dataset and experimental analysis will contribute greatly to developing NLP tools for Yorùbá and its dialects, and potentially for other African languages, by improving our understanding of existing challenges and offering a high-quality dataset for further development. We will release the YORULECT dataset and models publicly under an open license.
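As a loose sketch of the dialect-adaptive fine-tuning mentioned above, the snippet below fine-tunes a generic Hugging Face seq2seq checkpoint on dialect-to-English pairs; the checkpoint name, data file, and field names (dialect_text, english_text) are placeholders and do not reflect the released YORULECT format.

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

# Placeholder checkpoint and data file; the released corpus and baselines may differ.
checkpoint = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

dialect_data = load_dataset("json", data_files="yorulect_dialect_pairs.json")["train"]

def preprocess(batch):
    # Tokenize dialect sentences as inputs and English sentences as targets.
    model_inputs = tokenizer(batch["dialect_text"], truncation=True, max_length=256)
    labels = tokenizer(text_target=batch["english_text"], truncation=True, max_length=256)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dialect_data.map(preprocess, batched=True,
                             remove_columns=dialect_data.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="yo-dialect-ft", num_train_epochs=3,
                                  per_device_train_batch_size=8, learning_rate=2e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```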