Publications

Open-Source Conversational AI with SpeechBrain 1.0
Adel Moumen
Sylvain de Langen
Yingzhi Wang
Zeyu Zhao
Shucong Zhang
Georgios Karakasidis
Pierre Champion
Aku Rouhe
Rudolf Braun
Florian Mai
Juan Zuluaga-Gomez
Seyed Mahed Mousavi
Andreas Nautsch
Xuechen Liu
Sangeet Sagar
Jarod Duret
Salima Mdhaffar
Gaëlle Laperrière
Yannick Estève
SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face. SpeechBrain 1.0 introduces new technologies to support diverse learning modalities, Large Language Model (LLM) integration, and advanced decoding strategies, along with novel models, tasks, and modalities. It also includes a new benchmark repository, offering researchers a unified platform for evaluating models across diverse tasks.
Operational Research: Methods and Applications
Fotios Petropoulos
Gilbert Laporte
Emel Aktas
Sibel A. Alumur
Claudia Archetti
Hayriye Ayhan
Maria Battarra
Julia A. Bennell
Jean-Marie Bourjolly
John E. Boylan
Michele Breton
David Canca
Bo Chen
Cihan Tugrul Cicek
Louis Anthony Cox, Jr
Christine S.M. Currie
Erik Demeulemeester
Li Ding
Stephen M. Disney
Matthias Ehrgott
Martin J. Eppler
Gunes Erdogan
Bernard Fortz
L. Alberto Franco
Jens Frische
Salvatore Greco
Amanda J. Gregory
Raimo P. Hamalainen
Willy Herroelen
Mike Hewitt
Jan Holmstrom
John N. Hooker
Tugce Isik
Jill Johnes
Bahar Y. Kara
Ozlem Karsu
Katherine Kent
Charlotte Kohler
Martin Kunc
Yong-Hong Kuo
Judit Lienert
Adam N. Letchford
Janny Leung
Dong Li
Haitao Li
Ivana Ljubic
Andrea Lodi
Sebastian Lozano
Virginie Lurkin
Silvano Martello
Ian G. McHale
Gerald Midgley
John D.W. Morecroft
Akshay Mutha
Ceyda Oguz
Sanja Petrovic
Ulrich Pferschy
Harilaos N. Psaraftis
Sam Rose
Lauri Saarinen
Said Salhi
Jing-Sheng Song
Dimitrios Sotiros
Kathryn E. Stecke
Arne K. Strauss
Istenc Tarhan
Clemens Thielen
Paolo Toth
Greet Vanden Berghe
Christos Vasilakis
Vikrant Vaze
Daniele Vigo
Kai Virtanen
Xun Wang
Rafał Weron
Leroy White
Tom Van Woensel
Mike Yearworth
E. Alper Yıldırım
Georges Zaccour
Xuying Zhao
Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first-port-of-call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.
Optimal Approximate Minimization of One-Letter Weighted Finite Automata
In this paper, we study the approximate minimization problem of weighted finite automata (WFAs): to compute the best possible approximation of a WFA given a bound on the number of states. By reformulating the problem in terms of Hankel matrices, we leverage classical results on the approximation of Hankel operators, namely the celebrated Adamyan-Arov-Krein (AAK) theory. We solve the optimal spectral-norm approximate minimization problem for irredundant WFAs with real weights, defined over a one-letter alphabet. We present a theoretical analysis based on AAK theory, and bounds on the quality of the approximation in the spectral norm and
Optimal Zero-Shot Detector for Multi-Armed Attacks
Federica Granese
Marco Romanelli
This paper explores a scenario in which a malicious actor employs a multi-armed attack strategy to manipulate data samples, offering them various avenues to introduce noise into the dataset. Our central objective is to protect the data by detecting any alterations to the input. We approach this defensive strategy with utmost caution, operating in an environment where the defender possesses significantly less information compared to the attacker. Specifically, the defender is unable to utilize any data samples for training a defense model or verifying the integrity of the channel. Instead, the defender relies exclusively on a set of pre-existing detectors readily available "off the shelf". To tackle this challenge, we derive an innovative information-theoretic defense approach that optimally aggregates the decisions made by these detectors, eliminating the need for any training data. We further explore a practical use-case scenario for empirical evaluation, where the attacker possesses a pre-trained classifier and launches well-known adversarial attacks against it. Our experiments highlight the effectiveness of our proposed solution, even in scenarios that deviate from the optimal setup.
Parameter Efficient Fine-tuning of Transformer-Based Language Models Using Dataset Pruning
Sayed Mohammadreza Tayaranian Hosseini
Seyyed Hasan Mozafari
Brett Meyer
The widespread use of transformer-based language models is in part owed to their ease of adaptation to various tasks. Fine-tuning is a method of adapting pre-trained language models to a downstream task. The resource requirements for fine-tuning, although still less than those of pre-training, have been increasing due to the significant growth in the number of parameters of language models. Parameter efficient fine-tuning methods limit the set of model parameters that are updated during fine-tuning, leading to reductions in both memory usage and fine-tuning time. Dataset pruning is another method of efficient fine-tuning which removes training data points, thus reducing training time, while maintaining the evaluation performance of the fine-tuned model. In this work, we apply dataset pruning on top of parameter efficient fine-tuning to further reduce the hardware requirements of fine-tuning. Our approach benefits from the lower memory usage of parameter efficient methods while addressing their long fine-tuning time with dataset pruning. On average, our proposed method uses 22% of the fine-tuning dataset while updating only 0.5% of model parameters. As a result, while achieving an evaluation performance similar to full fine-tuning, our method reduces the peak memory usage of fine-tuning by 40% and its wall clock time by 83%.
Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers
Umberto Cappellazzo
Daniele Falavigna
Alessio Brutti
Mirco Ravanelli
The common modus operandi of fine-tuning large pre-trained Transformer models entails the adaptation of all their parameters (i.e., full fine-tuning). While achieving striking results on multiple tasks, this approach becomes unfeasible as the model size and the number of downstream tasks increase. In natural language processing and computer vision, parameter-efficient approaches like prompt-tuning and adapters have emerged as solid alternatives by fine-tuning only a small number of extra parameters, without sacrificing performance accuracy. For audio classification tasks, the Audio Spectrogram Transformer model shows impressive results. However, surprisingly, how to efficiently adapt it to several downstream tasks has not been tackled before. In this paper, we bridge this gap and present a detailed investigation of common parameter-efficient methods, revealing that adapters and LoRA consistently outperform the other methods across four benchmarks. Whereas adapters prove to be more efficient in few-shot learning settings, LoRA turns out to scale better as we increase the number of learnable parameters. We finally carry out ablation studies to find the best configuration for adapters and LoRA.
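For readers unfamiliar with LoRA, the following is a minimal sketch of the general low-rank update idea, not the configuration studied in the paper. A frozen weight matrix W is augmented with a trainable product B·A of rank r, with B initialised to zero so that the adapted layer starts out identical to the frozen one. The dimensions, the scale factor, and the helper names (matvec, lora_forward) are illustrative assumptions.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, scale):
    """Frozen path W x plus the scaled low-rank path B (A x)."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scale * l for b, l in zip(base, low_rank)]


random.seed(0)
d_in, d_out, rank = 8, 6, 2  # hypothetical sizes, chosen only for illustration

# Frozen pre-trained weight (random here, standing in for a real layer).
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors: A starts small and random, B starts at zero.
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]

x = [random.gauss(0, 1) for _ in range(d_in)]
y = lora_forward(W, A, B, x, scale=1.0 / rank)

trainable = rank * (d_in + d_out)  # parameters in A and B
frozen = d_in * d_out              # parameters in W
```

With these toy sizes the low-rank factors hold 28 trainable parameters against 48 frozen ones; the saving grows as the layer dimensions increase relative to the rank.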
Performance reserves in brain-imaging-based phenotype prediction
Marc-Andre Schulz
Stefan Haufe
John-Dylan Haynes
Kerstin Ritter
This study examines the impact of sample size on predicting cognitive and mental health phenotypes from brain imaging via machine learning. Our analysis shows a 3- to 9-fold improvement in prediction performance when sample size increases from 1,000 to 1 M participants. However, despite this increase, the data suggest that prediction accuracy remains worryingly low and far from fully exploiting the predictive potential of brain imaging data. Additionally, we find that integrating multiple imaging modalities boosts prediction accuracy, often equivalent to doubling the sample size. Interestingly, the most informative imaging modality often varied with increasing sample size, emphasizing the need to consider multiple modalities. Despite significant performance reserves for phenotype prediction, achieving substantial improvements may necessitate prohibitively large sample sizes, thus casting doubt on the practical or clinical utility of machine learning in some areas of neuroimaging.
PhAST: Physics-Aware, Scalable, and Task-Specific GNNs for Accelerated Catalyst Design
Mitigating the climate crisis requires a rapid transition towards lower-carbon energy. Catalyst materials play a crucial role in the electrochemical reactions involved in numerous industrial processes key to this transition, such as renewable energy storage and electrofuel synthesis. To reduce the energy spent on such activities, we must quickly discover more efficient catalysts to drive electrochemical reactions. Machine learning (ML) holds the potential to efficiently model materials properties from large amounts of data, accelerating electrocatalyst design. The Open Catalyst Project OC20 dataset was constructed to that end. However, ML models trained on OC20 are still neither scalable nor accurate enough for practical applications. In this paper, we propose task-specific innovations applicable to most architectures, enhancing both computational efficiency and accuracy. This includes improvements in (1) the graph creation step, (2) atom representations, (3) the energy prediction head, and (4) the force prediction head. We describe these contributions, referred to as PhAST, and evaluate them thoroughly on multiple architectures. Overall, PhAST improves energy MAE by 4 to 42
PID Accelerated Temporal Difference Algorithms
Mark Bedaywi
Amin Rakhsha
Long-horizon tasks, which have a large discount factor, pose a challenge for most conventional reinforcement learning (RL) algorithms. Algorithms such as Value Iteration and Temporal Difference (TD) learning have a slow convergence rate and become inefficient in these tasks. When the transition distributions are given, PID VI was recently introduced to accelerate the convergence of Value Iteration using ideas from control theory. Inspired by this, we introduce PID TD Learning and PID Q-Learning algorithms for the RL setting, in which only samples from the environment are available. We give a theoretical analysis of the convergence of PID TD Learning and its acceleration compared to the conventional TD Learning. We also introduce a method for adapting PID gains in the presence of noise and empirically verify its effectiveness.
Policy Gradient Methods in the Presence of Symmetries and State Abstractions
Reinforcement learning (RL) on high-dimensional and complex problems relies on abstraction for improved efficiency and generalization. In this paper, we study abstraction in the continuous-control setting, and extend the definition of Markov decision process (MDP) homomorphisms to the setting of continuous state and action spaces. We derive a policy gradient theorem on the abstract MDP for both stochastic and deterministic policies. Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization. Based on these theorems, we propose a family of actor-critic algorithms that are able to learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. Finally, we introduce a series of environments with continuous symmetries to further demonstrate the ability of our algorithm for action abstraction in the presence of such symmetries. We demonstrate the effectiveness of our method on our environments, as well as on challenging visual control tasks from the DeepMind Control Suite. Our method's ability to utilize MDP homomorphisms for representation learning leads to improved performance, and the visualizations of the latent space clearly demonstrate the structure of the learned abstraction.
Population Monte Carlo With Normalizing Flow
Soumyasundar Pal
Mark J. Coates
Adaptive importance sampling (AIS) methods provide a useful alternative to Markov Chain Monte Carlo (MCMC) algorithms for performing inference of intractable distributions. Population Monte Carlo (PMC) algorithms constitute a family of AIS approaches which adapt the proposal distributions iteratively to improve the approximation of the target distribution. Recent work in this area primarily focuses on ameliorating the proposal adaptation procedure for high-dimensional applications. However, most of the AIS algorithms use simple proposal distributions for sampling, which might be inadequate in exploring target distributions with intricate geometries. In this work, we construct expressive proposal distributions in the AIS framework using normalizing flow, an appealing approach for modeling complex distributions. We use an iterative parameter update rule to enhance the approximation of the target distribution. Numerical experiments show that in high-dimensional settings, the proposed algorithm offers significantly improved performance compared to the existing techniques.
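To make the iterative proposal adaptation concrete, here is a minimal sketch of a plain PMC loop with simple Gaussian proposals, i.e. the kind of baseline the abstract contrasts against, not the normalizing-flow method of the paper. It assumes a one-dimensional toy target N(3, 1), uses mixture (deterministic-mixture style) importance weights, and adapts the proposal means by resampling. The function names and all settings are hypothetical.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of a toy 1-D target, N(3, 1) (an assumption)."""
    return -0.5 * (x - 3.0) ** 2


def log_gauss(x, mean, std):
    """Log-density of a Gaussian proposal."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))


def log_mixture(x, means, std):
    """Log-density of the equally weighted mixture of all proposals."""
    logs = [log_gauss(x, m, std) for m in means]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs)) - math.log(len(means))


def pmc(num_proposals=100, num_iters=10, std=1.0, seed=0):
    rng = random.Random(seed)
    # Start with proposal means spread widely over the space.
    means = [rng.uniform(-10.0, 10.0) for _ in range(num_proposals)]
    estimate = None
    for _ in range(num_iters):
        # 1. Draw one sample from each proposal.
        samples = [rng.gauss(m, std) for m in means]
        # 2. Importance weights: target density over mixture-of-proposals density.
        log_w = [log_target(x) - log_mixture(x, means, std) for x in samples]
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        # 3. Self-normalised estimate of the target mean.
        estimate = sum(w * x for w, x in zip(weights, samples))
        # 4. Adapt: resample the weighted samples to get the next proposal means.
        means = rng.choices(samples, weights=weights, k=num_proposals)
    return estimate, means


estimate, means = pmc()
```

In this sketch the adaptation step is plain multinomial resampling; the estimate of the target mean should settle near 3 as the proposal means concentrate around the target.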
Precise Accuracy / Robustness Tradeoffs in Regression: Case of General Norms
Elvis Dopgima Dohmatob
Meyer Scetbon