Publications

Open-Source Conversational AI with SpeechBrain 1.0

Mirco Ravanelli

Titouan Parcollet

Adel Moumen

Sylvain de Langen

Yingzhi Wang

Zeyu Zhao

Shucong Zhang

Georgios Karakasidis

Sung-Lin Yeh

Pierre Champion

Aku Rouhe

Rudolf Braun

Florian Mai

Juan Zuluaga-Gomez

Seyed Mahed Mousavi

Andreas Nautsch

Xuechen Liu

Sangeet Sagar

Jarod Duret

Salima Mdhaffar

Gaëlle Laperrière

Renato De Mori

Yannick Estève

SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face. SpeechBrain 1.0 introduces new technologies to support diverse learning modalities, Large Language Model (LLM) integration, and advanced decoding strategies, along with novel models, tasks, and modalities. It also includes a new benchmark repository, offering researchers a unified platform for evaluating models across diverse tasks.

2023-12-31

arXiv (preprint)

doi.org

arxiv.org

Operational Research: Methods and Applications

Fotios Petropoulos

Gilbert Laporte

Emel Aktas

Sibel A. Alumur

Claudia Archetti

Hayriye Ayhan

Maria Battarra

Julia A. Bennell

Jean-Marie Bourjolly

John E. Boylan

Michele Breton

David Canca

Laurent Charlin

Bo Chen

Cihan Tugrul Cicek

Louis Anthony Cox, Jr

Christine S.M. Currie

Erik Demeulemeester

Li Ding

Stephen M. Disney

Matthias Ehrgott

Martin J. Eppler

Gunes Erdogan

Bernard Fortz

L. Alberto Franco

Jens Frische

Salvatore Greco

Amanda J. Gregory

Raimo P. Hamalainen

Willy Herroelen

Mike Hewitt

Jan Holmstrom

John N. Hooker

Tugce Isik

Jill Johnes

Bahar Y. Kara

Ozlem Karsu

Katherine Kent

Charlotte Kohler

Martin Kunc

Yong-Hong Kuo

Judit Lienert

Adam N. Letchford

Janny Leung

Dong Li

Haitao Li

Ivana Ljubic

Andrea Lodi

Sebastian Lozano

Virginie Lurkin

Silvano Martello

Ian G. McHale

Gerald Midgley

John D.W. Morecroft

Akshay Mutha

Ceyda Oguz

Sanja Petrovic

Ulrich Pferschy

Harilaos N. Psaraftis

Sam Rose

Lauri Saarinen

Said Salhi

Jing-Sheng Song

Dimitrios Sotiros

Kathryn E. Stecke

Arne K. Strauss

Istenc Tarhan

Clemens Thielen

Paolo Toth

Greet Vanden Berghe

Christos Vasilakis

Vikrant Vaze

Daniele Vigo

Kai Virtanen

Xun Wang

Rafał Weron

Leroy White

Tom Van Woensel

Mike Yearworth

E. Alper Yıldırım

Georges Zaccour

Xuying Zhao

Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first-port-of-call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.

2023-12-31

J. Oper. Res. Soc. (published)

doi.org

arxiv.org

Optimal Approximate Minimization of One-Letter Weighted Finite Automata

Clara Lacroce

Borja Balle

Prakash Panangaden

Guillaume Rabusseau

In this paper, we study the approximate minimization problem of weighted finite automata (WFAs): to compute the best possible approximation of a WFA given a bound on the number of states. By reformulating the problem in terms of Hankel matrices, we leverage classical results on the approximation of Hankel operators, namely the celebrated Adamyan-Arov-Krein (AAK) theory. We solve the optimal spectral-norm approximate minimization problem for irredundant WFAs with real weights, defined over a one-letter alphabet. We present a theoretical analysis based on AAK theory, and bounds on the quality of the approximation in the spectral norm and

2023-12-31

Mathematical Structures in Computer Science (published)

doi.org

openreview.net
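
The Hankel-matrix reformulation mentioned in the abstract above can be illustrated with a small sketch. For a one-letter WFA with initial vector alpha, transition matrix A and final vector beta, the computed function is f(n) = alpha^T A^n beta, and the Hankel matrix has entries H[i][j] = f(i + j); its rank is at most the number of states. The two-state automaton and all helper names below are illustrative assumptions, not taken from the paper, and the sketch does not implement the AAK-based approximation itself.

```python
from fractions import Fraction


def wfa_values(alpha, A, beta, count):
    """Return [f(0), ..., f(count - 1)] with f(n) = alpha^T A^n beta."""
    values = []
    vec = list(beta)  # holds A^n beta for the current n
    for _ in range(count):
        values.append(sum(a * v for a, v in zip(alpha, vec)))
        vec = [sum(row[j] * vec[j] for j in range(len(vec))) for row in A]
    return values


def hankel(values, size):
    """Build the size x size truncated Hankel matrix H[i][j] = f(i + j)."""
    return [[values[i + j] for j in range(size)] for i in range(size)]


def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


# Hypothetical two-state, one-letter WFA (values chosen for illustration).
alpha = [1, 0]
A = [[Fraction(1, 2), Fraction(1, 4)],
     [Fraction(0), Fraction(1, 3)]]
beta = [1, 1]

values = wfa_values(alpha, A, beta, 7)  # f(0) .. f(6) fill a 4 x 4 block
H = hankel(values, 4)
hankel_rank = rank(H)
print(hankel_rank)
```

Exact rational arithmetic is used so that the rank computation is not affected by floating-point round-off; the 4 x 4 block of this example has rank 2, matching the two states of the automaton.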

Optimal Zero-Shot Detector for Multi-Armed Attacks

Federica Granese

Marco Romanelli

Pablo Piantanida

This paper explores a scenario in which a malicious actor employs a multi-armed attack strategy to manipulate data samples, offering them various avenues to introduce noise into the dataset. Our central objective is to protect the data by detecting any alterations to the input. We approach this defensive strategy with utmost caution, operating in an environment where the defender possesses significantly less information compared to the attacker. Specifically, the defender is unable to utilize any data samples for training a defense model or verifying the integrity of the channel. Instead, the defender relies exclusively on a set of pre-existing detectors readily available "off the shelf". To tackle this challenge, we derive an innovative information-theoretic defense approach that optimally aggregates the decisions made by these detectors, eliminating the need for any training data. We further explore a practical use-case scenario for empirical evaluation, where the attacker possesses a pre-trained classifier and launches well-known adversarial attacks against it. Our experiments highlight the effectiveness of our proposed solution, even in scenarios that deviate from the optimal setup.

2023-12-31

AISTATS (published)

doi.org

proceedings.mlr.press

Parameter Efficient Fine-tuning of Transformer-Based Language Models Using Dataset Pruning

Sayed Mohammadreza Tayaranian Hosseini

Seyyed Hasan Mozafari

James Clark

Brett Meyer

Warren Gross

The widespread use of transformer-based language models is in part owed to their ease of adaptation to various tasks. Fine-tuning is a method of adapting pre-trained language models to a downstream task. The resource requirements for fine-tuning, although still less than pre-training, have been increasing due to the significant growth in the number of parameters of language models. Parameter efficient fine-tuning methods limit the set of model parameters that are updated during fine-tuning, leading to reductions in both memory usage and fine-tuning time. Dataset pruning is another method of efficient fine-tuning which removes training data points, thus reducing training time, while maintaining the evaluation performance of the fine-tuned model. In this work, we apply dataset pruning on top of parameter efficient fine-tuning to further reduce the hardware requirements of the fine-tuning. Our approach benefits from lower memory usage of parameter efficient methods while addressing their long fine-tuning time with dataset pruning. On average, our proposed method uses 22% of the fine-tuning dataset while updating only 0.5% of model parameters. As a result, while achieving an evaluation performance similar to full fine-tuning, our method reduces the peak memory usage of the fine-tuning by 40% and its wall clock time by 83%.

2023-12-31

Asilomar Conference on Signals, Systems, and Computers (published)

doi.org

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

Umberto Cappellazzo

Daniele Falavigna

Alessio Brutti

Mirco Ravanelli

The common modus operandi of fine-tuning large pre-trained Transformer models entails the adaptation of all their parameters (i.e., full fine-tuning). While achieving striking results on multiple tasks, this approach becomes unfeasible as the model size and the number of downstream tasks increase. In natural language processing and computer vision, parameter-efficient approaches like prompt-tuning and adapters have emerged as solid alternatives by fine-tuning only a small number of extra parameters, without sacrificing performance accuracy. For audio classification tasks, the Audio Spectrogram Transformer model shows impressive results. However, surprisingly, how to efficiently adapt it to several downstream tasks has not been tackled before. In this paper, we bridge this gap and present a detailed investigation of common parameter-efficient methods, revealing that adapters and LoRA consistently outperform the other methods across four benchmarks. Whereas adapters prove to be more efficient in few-shot learning settings, LoRA turns out to scale better as we increase the number of learnable parameters. We finally carry out ablation studies to find the best configuration for adapters and LoRA.

2023-12-31

MLSP (published)

doi.org

arxiv.org
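
As a rough illustration of why LoRA trains only "a small number of extra parameters", the sketch below compares parameter counts for one weight matrix. LoRA keeps the pre-trained matrix W (d_out x d_in) frozen and learns a low-rank update B A, with B of shape d_out x r and A of shape r x d_in. The layer width of 768 and the rank of 8 are assumptions chosen for illustration, not values reported in the paper.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is fine-tuned."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a LoRA update B @ A (B: d_out x r, A: r x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical square projection layer; width and rank are assumed values.
d_in = d_out = 768
rank = 8

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(f"full fine-tuning: {full} weights")
print(f"LoRA (rank {rank}): {lora} weights ({100 * lora / full:.2f}% of full)")
```

Because the count grows linearly in the rank, the number of learnable parameters can be scaled up or down by changing a single hyperparameter.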

Performance reserves in brain-imaging-based phenotype prediction

Marc-Andre Schulz

Danilo Bzdok

Stefan Haufe

John-Dylan Haynes

Kerstin Ritter

This study examines the impact of sample size on predicting cognitive and mental health phenotypes from brain imaging via machine learning. Our analysis shows a 3- to 9-fold improvement in prediction performance when sample size increases from 1,000 to 1 M participants. However, despite this increase, the data suggest that prediction accuracy remains worryingly low and far from fully exploiting the predictive potential of brain imaging data. Additionally, we find that integrating multiple imaging modalities boosts prediction accuracy, often equivalent to doubling the sample size. Interestingly, the most informative imaging modality often varied with increasing sample size, emphasizing the need to consider multiple modalities. Despite significant performance reserves for phenotype prediction, achieving substantial improvements may necessitate prohibitively large sample sizes, thus casting doubt on the practical or clinical utility of machine learning in some areas of neuroimaging.

2023-12-31

Cell Reports (published)

doi.org

PhAST: Physics-Aware, Scalable, and Task-Specific GNNs for Accelerated Catalyst Design

Alexandre Duval

Victor Schmidt

Santiago Miret

Yoshua Bengio

Alex Hernandez-Garcia

David Rolnick

Mitigating the climate crisis requires a rapid transition towards lower-carbon energy. Catalyst materials play a crucial role in the electrochemical reactions involved in numerous industrial processes key to this transition, such as renewable energy storage and electrofuel synthesis. To reduce the energy spent on such activities, we must quickly discover more efficient catalysts to drive electrochemical reactions. Machine learning (ML) holds the potential to efficiently model materials properties from large amounts of data, accelerating electrocatalyst design. The Open Catalyst Project OC20 dataset was constructed to that end. However, ML models trained on OC20 are still neither scalable nor accurate enough for practical applications. In this paper, we propose task-specific innovations applicable to most architectures, enhancing both computational efficiency and accuracy. This includes improvements in (1) the graph creation step, (2) atom representations, (3) the energy prediction head, and (4) the force prediction head. We describe these contributions, referred to as PhAST, and evaluate them thoroughly on multiple architectures. Overall, PhAST improves energy MAE by 4 to 42

2023-12-31

J. Mach. Learn. Res. (published)

doi.org

openreview.net

PID Accelerated Temporal Difference Algorithms

Mark Bedaywi

Amin Rakhsha

Amir-massoud Farahmand

Long-horizon tasks, which have a large discount factor, pose a challenge for most conventional reinforcement learning (RL) algorithms. Algorithms such as Value Iteration and Temporal Difference (TD) learning have a slow convergence rate and become inefficient in these tasks. When the transition distributions are given, PID VI was recently introduced to accelerate the convergence of Value Iteration using ideas from control theory. Inspired by this, we introduce PID TD Learning and PID Q-Learning algorithms for the RL setting, in which only samples from the environment are available. We give a theoretical analysis of the convergence of PID TD Learning and its acceleration compared to the conventional TD Learning. We also introduce a method for adapting PID gains in the presence of noise and empirically verify its effectiveness.

2023-12-31

RLJ (published)

doi.org

arxiv.org

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

Prakash Panangaden

Sahand Rezaei-Shoshtari

Rosie Zhao

David Meger

Doina Precup

Reinforcement learning (RL) on high-dimensional and complex problems relies on abstraction for improved efficiency and generalization. In this paper, we study abstraction in the continuous-control setting, and extend the definition of Markov decision process (MDP) homomorphisms to the setting of continuous state and action spaces. We derive a policy gradient theorem on the abstract MDP for both stochastic and deterministic policies. Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization. Based on these theorems, we propose a family of actor-critic algorithms that are able to learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. Finally, we introduce a series of environments with continuous symmetries to further demonstrate the ability of our algorithm for action abstraction in the presence of such symmetries. We demonstrate the effectiveness of our method on our environments, as well as on challenging visual control tasks from the DeepMind Control Suite. Our method's ability to utilize MDP homomorphisms for representation learning leads to improved performance, and the visualizations of the latent space clearly demonstrate the structure of the learned abstraction.

2023-12-31

J. Mach. Learn. Res. (published)

doi.org

arxiv.org

Population Monte Carlo With Normalizing Flow

Soumyasundar Pal

Antonios Valkanas

Mark J. Coates

Adaptive importance sampling (AIS) methods provide a useful alternative to Markov Chain Monte Carlo (MCMC) algorithms for performing inference of intractable distributions. Population Monte Carlo (PMC) algorithms constitute a family of AIS approaches which adapt the proposal distributions iteratively to improve the approximation of the target distribution. Recent work in this area primarily focuses on ameliorating the proposal adaptation procedure for high-dimensional applications. However, most of the AIS algorithms use simple proposal distributions for sampling, which might be inadequate in exploring target distributions with intricate geometries. In this work, we construct expressive proposal distributions in the AIS framework using normalizing flow, an appealing approach for modeling complex distributions. We use an iterative parameter update rule to enhance the approximation of the target distribution. Numerical experiments show that in high-dimensional settings, the proposed algorithm offers significantly improved performance compared to the existing techniques.

2023-12-31

IEEE Signal Processing Letters (published)

doi.org

arxiv.org

Precise Accuracy / Robustness Tradeoffs in Regression: Case of General Norms

Elvis Dopgima Dohmatob

Meyer Scetbon

2023-12-31

International Conference on Machine Learning (published)

proceedings.mlr.press
