Sung-Lin Yeh

Alumni

Publications

Open-Source Conversational AI with SpeechBrain 1.0

Mirco Ravanelli

Titouan Parcollet

Adel Moumen

Sylvain de Langen

Yingzhi Wang

Zeyu Zhao

Shucong Zhang

Georgios Karakasidis

Sung-Lin Yeh

Pierre Champion

Aku Rouhe

Rudolf Braun … (see 13 more)

Florian Mai

Juan Zuluaga-Gomez

Seyed Mahed Mousavi

Andreas Nautsch

Ha Nguyen

Xuechen Liu

Sangeet Sagar

Jarod Duret

Salima Mdhaffar

Gaëlle Laperrière

Mickael Rouvier

Renato De Mori

Yannick Estève

SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech rec… (see more)ognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face. SpeechBrain 1.0 introduces new technologies to support diverse learning modalities, Large Language Model (LLM) integration, and advanced decoding strategies, along with novel models, tasks, and modalities. It also includes a new benchmark repository, offering researchers a unified platform for evaluating models across diverse tasks.

2024-06-28

arXiv (preprint)

doi.org

arxiv.org

SpeechBrain: A General-Purpose Speech Toolkit

Mirco Ravanelli

Titouan Parcollet

Peter Plantinga

Aku Rouhe

Samuele Cornell

Chien-Feng Liao

Elena Rastorgueva

François Grondin

William Aris

Hwidong Na

Yan Gao

Renato De Mori … (see 1 more)

Yoshua Bengio

SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the research and development of neural speech proc… (see more)essing technologies by being simple, flexible, user-friendly, and well-documented. This paper describes the core architecture designed to support several tasks of common interest, allowing users to naturally conceive, compare and share novel speech processing pipelines. SpeechBrain achieves competitive or state-of-the-art performance in a wide range of speech benchmarks. It also provides training recipes, pretrained models, and inference scripts for popular speech datasets, as well as tutorials which allow anyone with basic Python proficiency to familiarize themselves with speech technologies.

2020-12-31

arXiv (preprint)

doi.org

arxiv.org

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Sung-Lin Yeh

Publications

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Popular keywords:

Sung-Lin Yeh

Publications