SpeechBrain-MOABB: An open-source Python library for benchmarking deep neural networks applied to EEG signals
Davide Borra
Francesco Paissan
Tracing Optimization for Performance Modeling and Regression Detection
Kaveh Shahedi
Heng Li
Maxime Lamothe
Software performance modeling plays a crucial role in developing and maintaining software systems. A performance model analytically describes the relationship between the performance of a system and its runtime activities. This process typically examines various aspects of a system's runtime behavior, such as the execution frequency of functions or methods, to forecast performance metrics like program execution time. By using performance models, developers can predict expected performance and thereby effectively identify and address unexpected performance regressions when actual performance deviates from the model's predictions. One common and precise method for capturing performance behavior is software tracing, which involves instrumenting the execution of a program, either at the kernel level (e.g., system calls) or application level (e.g., function calls). However, due to the nature of tracing, it can be highly resource-intensive, making it impractical for production environments where resources are limited. In this work, we propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions, particularly application-level functions, from tracing while still building accurate performance models that can capture performance degradations. By selecting an optimal set of functions to be traced, we can construct optimized performance models that achieve an R² score of up to 99% and, sometimes, outperform full tracing models (models using non-optimized tracing data), while significantly reducing the tracing overhead by more than 80% in most cases. Our optimized performance models can also capture performance regressions in our studied programs effectively, demonstrating their usefulness in real-world scenarios. Our approach is fully automated, making it ready to be used in production environments with minimal human effort.
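To make the idea of a performance model concrete, here is a minimal sketch, not the authors' method: it fits a one-feature least-squares model from function call counts to execution time and reports the R² score. The trace data, the function names (`fit_line`, `r2_score`, `is_regression`) and the 10% tolerance are all assumptions made for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y ~ a * x + b with a single feature."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def r2_score(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def is_regression(observed_ms, predicted_ms, tolerance=0.10):
    """Flag a run whose time exceeds the prediction by more than `tolerance` (hypothetical rule)."""
    return observed_ms > predicted_ms * (1 + tolerance)

# Synthetic traces (made up): call counts of two functions over five runs.
hot_calls = [10, 20, 30, 40, 50]    # performance-sensitive function
cold_calls = [7, 3, 9, 1, 5]        # performance-insensitive function
exec_time_ms = [25.1, 44.9, 65.2, 84.8, 105.0]

# Model built only from the performance-sensitive function.
a_hot, b_hot = fit_line(hot_calls, exec_time_ms)
r2_hot = r2_score(exec_time_ms, [a_hot * x + b_hot for x in hot_calls])

# Model built only from the insensitive function, for comparison.
a_cold, b_cold = fit_line(cold_calls, exec_time_ms)
r2_cold = r2_score(exec_time_ms, [a_cold * x + b_cold for x in cold_calls])

print(f"R2 (hot function only):  {r2_hot:.4f}")
print(f"R2 (cold function only): {r2_cold:.4f}")

# A new run with 30 hot calls that takes 90 ms deviates from the prediction.
predicted = a_hot * 30 + b_hot
print("regression flagged:", is_regression(90.0, predicted))
```

In this toy data the insensitive function carries almost no information about execution time, so dropping it from tracing costs the model essentially nothing; how the paper actually selects functions is not specified in the abstract.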
Unsupervised Object Discovery: A Comprehensive Survey and Unified Taxonomy
José-Fabian Villa-Vásquez
Unsupervised object discovery is commonly interpreted as the task of localizing and/or categorizing objects in visual data without the need for labeled examples. While current object recognition methods have proven highly effective for practical applications, the ongoing demand for annotated data in real-world scenarios drives research into unsupervised approaches. Furthermore, existing literature in object discovery is both extensive and diverse, posing a significant challenge for researchers that aim to navigate and synthesize this knowledge. Motivated by the evidenced interest in this avenue of research, and the lack of comprehensive studies that could facilitate a holistic understanding of unsupervised object discovery, this survey conducts an in-depth exploration of the existing approaches and systematically categorizes this compendium based on the tasks addressed and the families of techniques employed. Additionally, we present an overview of common datasets and metrics, highlighting the challenges of comparing methods due to varying evaluation protocols. This work intends to provide practitioners with an insightful perspective on the domain, with the hope of inspiring new ideas and fostering a deeper understanding of object discovery approaches.
Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation
Shambhavi Mishra
Julio Silva-Rodríguez
Ismail Ben Ayed
Jose Dolz
Vision-language foundation models, such as CLIP, have shown unprecedented zero-shot performance across a wide range of tasks. Nevertheless, these models may be unreliable under distributional shifts, as their performance is significantly degraded. In this work, we explore how to efficiently leverage class text information to mitigate these distribution drifts encountered by large pre-trained vision-language models (VLMs) during test-time inference. In particular, we propose to generate pseudo-labels for the test-time samples by exploiting generic class text embeddings as fixed centroids of a label assignment problem, which is efficiently solved with Optimal Transport. Furthermore, the proposed adaptation method (CLIP-OT) integrates a multiple template knowledge distillation approach, which replicates multi-view contrastive learning strategies in unsupervised representation learning but without incurring additional computational complexity. Extensive experiments on multiple popular test-time adaptation benchmarks presenting diverse complexity empirically show the superiority of CLIP-OT, achieving performance gains of up to 7% over recent state-of-the-art methods, yet being computationally and memory efficient.
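The label assignment step described above can be sketched with a plain Sinkhorn iteration. This is an illustrative sketch under stated assumptions, not the CLIP-OT implementation: the uniform marginals, the temperature `epsilon`, the function name `sinkhorn_pseudo_labels` and the 2-D toy embeddings are all hypothetical.

```python
import math

def sinkhorn_pseudo_labels(image_embs, text_embs, epsilon=0.1, n_iters=50):
    """Assign each sample to a class by entropic optimal transport.

    Class text embeddings act as fixed centroids. Uniform marginals are
    assumed: every sample carries mass 1/n and every class receives 1/k.
    """
    n, k = len(image_embs), len(text_embs)
    # Similarity between every sample and every class centroid (dot product).
    sim = [[sum(a * b for a, b in zip(img, txt)) for txt in text_embs]
           for img in image_embs]
    # Gibbs kernel: high similarity means low transport cost.
    plan = [[math.exp(s / epsilon) for s in row] for row in sim]
    for _ in range(n_iters):
        # Scale columns so that each class receives total mass 1/k.
        col_sums = [sum(plan[i][j] for i in range(n)) for j in range(k)]
        plan = [[plan[i][j] / (col_sums[j] * k) for j in range(k)]
                for i in range(n)]
        # Scale rows so that each sample carries total mass 1/n.
        row_sums = [sum(row) for row in plan]
        plan = [[v / (row_sums[i] * n) for v in row]
                for i, row in enumerate(plan)]
    # Pseudo-label: the class that receives most of the sample's mass.
    labels = [max(range(k), key=lambda j: plan[i][j]) for i in range(n)]
    return labels, plan

# Toy 2-D "embeddings" (made up): two classes, four test samples.
text_embs = [(1.0, 0.0), (0.0, 1.0)]
image_embs = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.8), (0.1, 0.9)]

labels, plan = sinkhorn_pseudo_labels(image_embs, text_embs)
print(labels)
```

Because the class marginal is fixed, the transport plan discourages all samples from collapsing onto a single class, which is one common motivation for using optimal transport rather than a per-sample argmax.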
Improving Adversarial Transferability via Model Alignment
Avery Ma
Amir-massoud Farahmand
Yangchen Pan
Philip Torr
Jindong Gu
Neural networks are susceptible to adversarial perturbations that are transferable across different models. In this paper, we introduce a novel model alignment technique aimed at improving a given source model's ability in generating transferable adversarial perturbations. During the alignment process, the parameters of the source model are fine-tuned to minimize an alignment loss. This loss measures the divergence in the predictions between the source model and another, independently trained model, referred to as the witness model. To understand the effect of model alignment, we conduct a geometric analysis of the resulting changes in the loss landscape. Extensive experiments on the ImageNet dataset, using a variety of model architectures, demonstrate that perturbations generated from aligned source models exhibit significantly higher transferability than those from the original source model.
Soft Condorcet Optimization for Ranking of General Agents
Marc Lanctot
Kate Larson
Michael Kaisers
Quentin Berthet
Ian Gemp
Manfred Diaz
Roberto-Rafael Maura-Rivero
Yoram Bachrach
Anna Koop
A common way to drive progress of AI models and agents is to compare their performance on standardized benchmarks. Comparing the performance of general agents requires aggregating their individual performances across a potentially wide variety of different tasks. In this paper, we describe a novel ranking scheme inspired by social choice frameworks, called Soft Condorcet Optimization (SCO), to compute the optimal ranking of agents: the one that makes the fewest mistakes in predicting the agent comparisons in the evaluation data. This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground truth ranking, a solution to Condorcet's original voting system criteria. SCO ratings are maximal for Condorcet winners when they exist, which we show is not necessarily true for the classical rating system Elo. We propose three optimization algorithms to compute SCO ratings and evaluate their empirical performance. When serving as an approximation to the Kemeny-Young voting method, SCO rankings are on average 0 to 0.043 away from the optimal ranking in normalized Kendall-tau distance across 865 preference profiles from the PrefLib open ranking archive. In a simulated noisy tournament setting, SCO achieves accurate approximations to the ground truth ranking and the best among several baselines when 59% or more of the preference data is missing. Finally, SCO ranking provides the best approximation to the optimal ranking, measured on held-out test sets, in a problem containing 52,958 human players across 31,049 games of the classic seven-player game of Diplomacy.
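The notion of a ranking that makes few mistakes on pairwise comparisons can be illustrated with a smooth surrogate. The sketch below is an assumption-laden toy, not the SCO algorithms from the paper: it minimizes a sigmoid relaxation of the pairwise disagreement count by gradient descent, and the names `soft_ranking_ratings` and `condorcet_winner`, the temperature, the learning rate and the votes are all hypothetical.

```python
import math
from itertools import combinations

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def condorcet_winner(votes):
    """Return the agent that beats every other agent in pairwise majority, or None."""
    agents = sorted({a for vote in votes for a in vote})
    for a in agents:
        if all(
            sum(v.index(a) < v.index(b) for v in votes) > len(votes) / 2
            for b in agents if b != a
        ):
            return a
    return None

def soft_ranking_ratings(votes, temperature=1.0, lr=0.1, n_steps=200):
    """Gradient descent on a sigmoid relaxation of pairwise ranking mistakes.

    For each pair where `winner` is ranked above `loser` in a vote, the loss
    term is sigmoid((theta[loser] - theta[winner]) / temperature).
    """
    agents = sorted({a for vote in votes for a in vote})
    theta = {a: 0.0 for a in agents}
    for _ in range(n_steps):
        grad = {a: 0.0 for a in agents}
        for vote in votes:
            # combinations() keeps vote order, so the first element ranks higher.
            for winner, loser in combinations(vote, 2):
                s = sigmoid((theta[loser] - theta[winner]) / temperature)
                g = s * (1.0 - s) / temperature
                grad[winner] -= g
                grad[loser] += g
        for a in agents:
            theta[a] -= lr * grad[a]
    return theta

# Toy evaluation data (made up): each vote ranks three agents, best first.
votes = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["C", "A", "B"],
]

ratings = soft_ranking_ratings(votes)
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking, condorcet_winner(votes))
```

In this toy profile the highest-rated agent coincides with the Condorcet winner; the sketch makes no claim about the general guarantees stated in the abstract.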
Beyond Causal Discovery for Astronomy: Learning Meaningful Representations with Independent Component Analysis
Zehao Jin
Mario Pasquato
Benjamin L. Davis
Andrea Maccio
General Causal Imputation via Synthetic Interventions
Marco Jiralerspong
Thomas Jiralerspong
Vedant Shah
Given two sets of elements (such as cell types and drug compounds), researchers typically only have access to a limited subset of their interactions. The task of causal imputation involves using this subset to predict unobserved interactions. Squires et al. (2022) have proposed two estimators for this task based on the synthetic interventions (SI) estimator: SI-A (for actions) and SI-C (for contexts). We extend their work and introduce a novel causal imputation estimator, generalized synthetic interventions (GSI). We prove the identifiability of this estimator for data generated from a more complex latent factor model. On synthetic and real data we show empirically that it recovers or outperforms their estimators.
AugmenToxic: Leveraging Reinforcement Learning to Optimize LLM Instruction Fine-Tuning for Data Augmentation to Enhance Toxicity Detection
Arezo Bodaghi
Ketra A. Schmitt
From Silos to Systems: Process-Oriented Hazard Analysis for AI Systems
Shalaleh Rismani
Roel Dobbe
Injury and violence in the context of sustainable development
Kidist Bartolomeos
Ryan Lett
Respicious Boniface
Victoria Munthali
Tarek Razek
Dan Deckelbaum
David Bracco
Ermiyas Belay
Fitsum Kifle
David Ulrich Dalle
Celestin Bilong Mbangtang
Arsene Daniel Nyalundja
Jondre Macaraeg
Irene Dzirasa
Ulrick Sidney Kanmounye
Delanyo Dovlo
Kwadwo Koram
Eugene Nyarko
Desmond T. Jumbam
Emnet Tesfay Shimber
Taylor Jaraczewski
Maria Sgro
Ajiel Mae Basmayor
Asegid Ergete
Mary Schroeder
Adam Gyedu
Emmanuel Nakua
Peter Donkor
Charles Mock
Atalel Awedew
Halid Melkamu
Sisay Bekele
Berhanu Hailemariam
Enku Shiferaw
Yishak Shiferaw
Wubetie Yirdaw
Debojit Basak
Deepa Kizhakke Veetil
Nobhojit Roy
Martin Gerdin Wärnberg
Santosh Rath
Mohammed A.S Abdullahi
Kefas Mbaya
Abubakar Kakasanda
Stephanie Danjuma
Hector Olasoji
Alemayehu Bedada
Mpapho Joseph Motsumi
Shimelis Genna Hamda
Demuma Amdisa
Getachew Tilahun
Matthew Boroditsky
Mark Hill
Roy Hilzenrat
Rachel Livergant
Jayd Adams
Catherine Binda
Allison Chhor
Helen Hsiao
Faizal Haji
Esther Chin
Felix Oyania
Caroline Q. Stephens
Sarah Ullrich
Meera Kotagal
Francis Bajunirwe
Doruk Ozgediz
Dionysia Kravarioti
Lye-Yeng Wong
Tsegazeab Laeke Teklemariam
Abenezer Tirsit
Tewodros Liyew
Mark Ferguson
Timothy Plackett
Jaymie Claire Henry
Meseret Abeza
Seye Mesfin Minas
Maryse Bouchard
Dimuthu Tennakoon
Rahul Burra
Fleming Mathew
Annabelle Jones
Sargun Virk
Shlok Patel
Tanaz Vaghaiwalla
James Hudspeth
Tracy Rabin
Virginia Rowthorn
Raymond R. Price
Nakul Raykar
Gilgamesh Eamer
Stephen Mutiso
Yvette Kisaka
Gladwell Gathecha
Ronald Lett
Chibuike Onu
Emmanuel Ameh
Matthias Igoche
Paschal Anyanwu
Eunice Onuh
Oikeh Ojeamen
Edith Terna Yawe
Amina Abubakar
Yakubu Ashoms
Hadiza Suleiman
Naomi Musa
Daniel Kisitu Kyengera
Netsanet Abebe
Richard Gardener
Nebyou Seyoum Abebe
Henok T/Silasie Zeleke
Kacylia Roy Proulx
Shreenik Kundu
Boaz Laor
Riya Sawhney
Taylor Wurdeman
Fabio Botelho
Ayla Gerk
Elena Guadagno
Mengistu Ayele
Azarias Kassahun
Tsegazeab Laeke
Mestet Yibeltal
Bereket Hailu
Ermias Fikru
Shemsedin Ibro
Abdeta Workineh
Fikadu Balcha
Fira Abamecha
Sheka Shemsi
Abdullah Saleh Alruwaili
Gabriel Rodriguez
Anna Jose
Shahd Ebied
Samuel Girma
Abigael Abiy
Hussien Endris Assen
Kalab Tesfaye
Kassaye Demeke
Aklilu Yiheyis
Khalid Jemal
Demeke Yilkal
Ashenafi Amsalu
Lema Derseh
Yophtahe W/Gerima
Tadesse Belayneh
Mekuanint Tiruneh
Almaw Bitew
Sewbesew Yitayih
Tadesse Awoke
Chanyalew Worku
Anissa Mohammed
Mohammed Alemu
Mohammed Yesuf
Fantu Mamo
Kegnie Shitu
Biks Liyew
Ayenew Gucho
Gezahegn Tilahun
Timothy Love
Andrew Chew
Brian Kasagga
Berjo Takoutsing
Obuku Ekwaro
Emmanuel Elobu
Degisew Dersso Mengistu
Alex Zhuang
Bethlehem Shiferew
Gelila Mengistu
Ayalew Zewdie
Nahom Tadelle
Alegnta Gebreyesus
Elise Presser
Katie Iverson
Christopher Dodgion
Thomas G. Weiser
Rachel Koch
Nichole Starr
Davy Lau
Irena Zivkovic
Shahrzad Joharifard
Emilie Joos
Naisan Garraway
Francesca Vituci
Eric O’Flynn
Ines Péric
Léa Simon
Geoffrey Ibbotson
Tsion Seyoum
Aklilu Azazh
Lemlem Beza
Ifeanyichukwu Onah
Chijioke Chukwuma
Dagim Berhanu
Jason Shenoi
Nick Sears
Yoseph Bedore
Richard Caplan
Wongel Tena Shale