Publications

Game On, Hate Off: A Study of Toxicity in Online Multiplayer Environments

Zachary Yang

Nicolas Grenon-Godbout

Reihaneh Rabbany

2024-06-28

Games: Research and Practice (published)

doi.org

In-Context Learning, Can It Break Safety?

Sophie Xhonneux

David Dobre

Michael Noukhovitch

Jian Tang

Gauthier Gidel

Dhanya Sridhar

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

openreview.net

Predicting the Population Risk of Suicide Using Routinely Collected Health Administrative Data in Quebec, Canada: Model-Based Synthetic Estimation Study

JianLi Wang

Fatemeh Gholi Zadeh Kharrat

Geneviève Gariépy

Christian Gagné

Jean-François Pelletier

Victoria Massamba

Pascale Lévesque

Mada Mohammed

Alain Lesage

2024-06-28

JMIR Public Health and Surveillance (published)

doi.org

Robust Knowledge Unlearning via Mechanistic Localizations

Phillip Huang Guo

Aaquib Syed

Abhay Sheshadri

Aidan Ewart

Gintare Karolina Dziugaite

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

openreview.net

Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques

Rishika Bhagwatkar

Shravan Nayak

Reza Bayat

Alexis Roger

Daniel Z Kaplan

Pouya Bashivan

Irina Rish

Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they become increasingly prevalent, ensuring their robustness against adversarial attacks is paramount. This work systematically investigates the impact of model design choices on the adversarial robustness of VLMs against image-based attacks. Additionally, we introduce novel, cost-effective approaches to enhance robustness through prompt formatting. By rephrasing questions and suggesting potential adversarial perturbations, we demonstrate substantial improvements in model robustness against strong image-based attacks such as Auto-PGD. Our findings provide important guidelines for developing more robust VLMs, particularly for deployment in safety-critical environments.

2024-06-28

ICML.cc/2024/Workshop/NextGenAISafety (poster)

doi.org

openreview.net

A Context-Driven Approach for Co-Auditing Smart Contracts with The Support of GPT-4 code interpreter

Mohamed Salah Bouafif

Chen Zheng

Ilham Qasse

Ed Zulkoski

Mohammad Hamdaqa

Foutse Khomh

The surge in the adoption of smart contracts necessitates rigorous auditing to ensure their security and reliability. Manual auditing, although comprehensive, is time-consuming and heavily reliant on the auditor's expertise. With the rise of Large Language Models (LLMs), there is growing interest in leveraging them to assist auditors in the auditing process (co-auditing). However, the effectiveness of LLMs in smart contract co-auditing is contingent upon the design of the input prompts, especially in terms of context description and code length. This paper introduces a novel context-driven prompting technique for smart contract co-auditing. Our approach employs three techniques for context scoping and augmentation, encompassing code scoping to chunk long code into self-contained code segments based on code inter-dependencies, assessment scoping to enhance context description based on the target assessment goal, thereby limiting the search space, and reporting scoping to force a specific format for the generated response. Through empirical evaluations on publicly available vulnerable contracts, our method demonstrated a detection rate of 96% for vulnerable functions, outperforming the native prompting approach, which detected only 53%. To assess the reliability of our prompting approach, manual analysis of the results was conducted by expert auditors from our partner, Quantstamp, a world-leading smart contract auditing company. The experts' analysis indicates that, in unlabeled datasets, our proposed approach enhances the proficiency of the GPT-4 code interpreter in detecting vulnerabilities.

2024-06-26

ArXiv (preprint)

doi.org

arxiv.org

Data harmonization for Advancing research on Personalized Rehabilitation Interventions for Patients with Traumatic Brain Injury and Stroke: A proof of concept

Dorra Rakia Allegue

Despoina Petsani

Nathalie Ponthon

Evdokimos Konstantinidis

Panagiotis Bamidis

Eva Kehayia

Audrey Durand

Sara Ahmed

Stroke and traumatic brain injury (TBI) are leading causes of morbidity and mortality, affecting survivors’ mobility and social participation. Although personalized interventions could positively impact survivors' recovery, the effectiveness of such interventions remains unclear. Open-access data repositories can provide access to multiple shared data which could help uncover new evidence of effective interventions; however, harmonizing data between different studies requires many steps to make it possible given the various methods of data collection, intervention characteristics and population sociodemographic profile. This proof-of-concept study aimed to describe the steps and anchors that contributed to the development of guiding frameworks to harmonize data across different studies. Data were extracted from the Federal Interagency Traumatic Brain Injury Research (FITBIR) repository and stored on an online cloud platform. The outcome measures were mapped to mobility determinants using the International Classification of Functioning, Disability, and Health (ICF) and Webber framework. The intervention's effect was categorized according to the Minimal Clinically Important Differences (MCIDs) of the measures administered. The study proposed a novel framework for intervention features, which aims to enhance our understanding of the mechanisms of action and potential impact of rehabilitation interventions. The framework classified interventions based on their nature, context, specific body systems, dosage, caregiver assistance, and behaviour change strategies. In conclusion, this study demonstrated the feasibility of harmonizing data extracted from different sources in the FITBIR repository. Leveraging existing open databases offers tremendous opportunities to advance research on personalized interventions for patients with TBI and stroke and inform decision-making during transitions.

2024-06-26

Petra (published)

doi.org

Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers

Jonas Ngnawé

Sabyasachi Sahoo

Yann Pequignot

Frederic Precioso

Christian Gagné

2024-06-26

ArXiv (preprint)

doi.org

arxiv.org

Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers

Jonas Ngnawé

Sabyasachi Sahoo

Yann Batiste Pequignot

Frédéric Precioso

Christian Gagné

Despite extensive research on adversarial training strategies to improve robustness, the decisions of even the most robust deep learning models can still be quite sensitive to imperceptible perturbations, creating serious risks when deploying them for high-stakes real-world applications. While detecting such cases may be critical, evaluating a model's vulnerability at a per-instance level using adversarial attacks is computationally too intensive and unsuitable for real-time deployment scenarios. The input space margin is the exact score to detect non-robust samples and is intractable for deep neural networks. This paper introduces the concept of margin consistency -- a property that links the input space margins and the logit margins in robust models -- for efficient detection of vulnerable samples. First, we establish that margin consistency is a necessary and sufficient condition to use a model's logit margin as a score for identifying non-robust samples. Next, through comprehensive empirical analysis of various robustly trained models on CIFAR10 and CIFAR100 datasets, we show that they indicate strong margin consistency with a strong correlation between their input space margins and the logit margins. Then, we show that we can effectively use the logit margin to confidently detect brittle decisions with such models and accurately estimate robust accuracy on an arbitrarily large test set by estimating the input margins only on a small subset. Finally, we address cases where the model is not sufficiently margin-consistent by learning a pseudo-margin from the feature representation. Our findings highlight the potential of leveraging deep representations to efficiently assess adversarial vulnerability in deployment scenarios.
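
As a rough illustration of using a logit margin as a detection score, the sketch below assumes the common definition of the logit margin (largest logit minus second-largest logit), which the abstract does not spell out. The function names and the threshold value are hypothetical and are not taken from the paper.

```python
def logit_margin(logits):
    """Gap between the largest and second-largest logit (assumed definition)."""
    if len(logits) < 2:
        raise ValueError("need at least two class logits")
    top, runner_up = sorted(logits, reverse=True)[:2]
    return top - runner_up


def flag_brittle(batch_logits, threshold):
    """Return indices of samples whose logit margin is below the threshold.

    The threshold is a hypothetical tuning knob; in practice it would be
    calibrated on held-out data.
    """
    return [i for i, logits in enumerate(batch_logits)
            if logit_margin(logits) < threshold]


# Two toy samples: the first is confidently classified, the second is not.
batch = [[4.0, 0.5, -1.0], [2.1, 2.0, 0.3]]
print(flag_brittle(batch, threshold=0.5))
```

A small margin only signals a brittle decision when the model is margin-consistent, which is the condition the abstract describes.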

2024-06-26

ArXiv (preprint)

doi.org

arxiv.org

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Alexander Khazatsky

Karl Pertsch

Suraj Nair

Ashwin Balakrishna

Sudeep Dasari

Siddharth Karamcheti

Soroush Nasiriany

Mohan Kumar Srirama

Lawrence Yunliang Chen

Kirsty Ellis

Peter David Fagan

Joey Hejna

Masha Itkina

Marion Lepert

Yecheng Jason Ma

Ye Ma

Patrick Tree Miller

Jimmy Wu

Suneel Belkhale

Shivin Dass

Huy Ha

Arhan Jain

Abraham Lee

Youngwoon Lee

Marius Memmel

Sungjae Park

Ilija Radosavovic

Kaiyuan Wang

Albert Zhan

Kevin Black

Cheng Chi

Kyle Beltran Hatch

Shan Lin

Jingpei Lu

Jean Mercat

Abdul Rehman

Pannag R Sanketi

Archit Sharma

Cody Simpson

Quan Vuong

Homer Rich Walke

Blake Wulfe

Ted Xiao

Jonathan Heewon Yang

Arefeh Yavary

Tony Z. Zhao

Christopher Agia

Rohan Baijal

Mateo Guaman Castro

Daphne Chen

Qiuyu Chen

Trinity Chung

Jaimyn Drake

Ethan Paul Foster

Jensen Gao

David Antonio Herrera

Minho Heo

Kyle Hsu

Jiaheng Hu

Donovon Jackson

Charlotte Le

Yunshuang Li

K. Lin

Roy Lin

Zehan Ma

Abhiram Maddukuri

Suvir Mirchandani

Daniel Morton

Tony Khuong Nguyen

Abigail O'Neill

Rosario Scalise

Derick Seale

Victor Son

Stephen Tian

Emi Tran

Andrew E. Wang

Yilin Wu

Annie Xie

Jingyun Yang

Patrick Yin

Yunchu Zhang

Osbert Bastani

Glen Berseth

Jeannette Bohg

Ken Goldberg

Abhinav Gupta

Abhishek Gupta

Dinesh Jayaraman

Joseph J Lim

Jitendra Malik

Roberto Martín-Martín

Subramanian Ramamoorthy

Dorsa Sadigh

Shuran Song

Jiajun Wu

Michael C. Yip

Yuke Zhu

Thomas Kollar

Sergey Levine

Chelsea Finn

The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.

2024-06-26

roboticsfoundation.org/RSS/2024/Workshop/DGR (poster)

doi.org

openreview.net

Implicit Diffusion: Efficient Optimization through Stochastic Sampling

Pierre Marion

Anna Korba

Peter Bartlett

Mathieu Blondel

Valentin De Bortoli

Arnaud Doucet

Felipe Llinares-López

Courtney Paquette

Quentin Berthet

2024-06-26

ICML.cc/2024/Workshop/Differentiable_Almost_Everything (published)

doi.org

openreview.net

Learning to Design Data-structures: A Case Study of Nearest Neighbor Search

Omar Salemohamed

Vatsal Sharan

Shivam Garg

Laurent Charlin

Gregory Valiant

We propose a general framework for automating data-structure design and apply it to the problem of nearest neighbor search. Our model adapts to the underlying data distribution and provides fine-grained control over query and space complexity, enabling the discovery of solutions tailored to problem-specific constraints. We are able to reverse-engineer learned algorithms in several settings. In 1D, the model discovers optimal distribution (in)dependent algorithms such as binary search and variants of interpolation search. In higher dimensions, the model learns solutions that resemble K-d trees in some regimes, while in others, have elements of locality-sensitive hashing.
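
For readers unfamiliar with the 1D baseline mentioned above, here is a minimal sketch of nearest neighbor search over a sorted list using classical binary search. It is the textbook algorithm, not the learned model from the paper; the function name `nearest_1d` is hypothetical.

```python
from bisect import bisect_left


def nearest_1d(sorted_points, query):
    """Return the element of a sorted list closest to query.

    Uses O(log n) comparisons via binary search; ties go to the smaller point.
    """
    if not sorted_points:
        raise ValueError("sorted_points must be non-empty")
    i = bisect_left(sorted_points, query)  # first index with value >= query
    if i == 0:
        return sorted_points[0]
    if i == len(sorted_points):
        return sorted_points[-1]
    left, right = sorted_points[i - 1], sorted_points[i]
    return left if query - left <= right - query else right


print(nearest_1d([1, 4, 9, 16], 10))
```

Binary search makes no assumption about how the points are distributed, whereas interpolation search guesses the position from the values themselves and can be faster when the data is close to uniform.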

2024-06-26

ICML.cc/2024/Workshop/Differentiable_Almost_Everything (published)

openreview.net
