Publications

Audio Editing with Non-Rigid Text Prompts

Francesco Paissan

Zhepei Wang

Paris Smaragdis

In this paper, we explore audio-editing with non-rigid text edits. We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio. We explore text prompts that perform addition, style transfer, and in-painting. We quantitatively and qualitatively show that the edits are able to obtain results which outperform Audio-LDM, a recently released text-prompted audio generation model. Qualitative inspection of the results points out that the edits given by our approach remain more faithful to the input audio in terms of keeping the original onsets and offsets of the audio events.

2024-09-01

Interspeech 2024 (published)

doi.org

arxiv.org

Clinical Care Trajectory Assessment of Children with Congenital Diaphragmatic Hernia and Neurodevelopmental Impairment

Alexandra Dimmer

Gabriel Altit

Sabrina Beauseigle

Elena Guadagno

Louise Koclas

Katryn Paquette

Ana Sant’Anna

Adam Shapiro

Dan Poenaru

Pramod Puligandla

2024-09-01

Journal of Pediatric Surgery (published)

doi.org

Data Privacy for Record Linkage and Beyond

Shurong Lin

Eric Kolaczyk

In a data-driven world, two prominent research problems are record linkage and data privacy, among others. Record linkage is essential for improving decision-making by integrating information of the same entities from different sources. On the other hand, data privacy research seeks to balance the need to extract accurate insights from data with the imperative to protect the privacy of the entities involved. Inevitably, data privacy issues arise in the context of record linkage. This article identifies two complementary aspects at the intersection of these two fields: (1) how to ensure privacy during record linkage and (2) how to mitigate privacy risks when releasing the analysis results after record linkage. We specifically discuss privacy-preserving record linkage, differentially private regression, and related topics.

2024-09-01

Social Science Research Network (published)

doi.org

Do machine learning methods make better predictions in pharmacoepidemiology?

Ana Paula Pena-Gralle

Mireille E. Schnitzer

Sofia-Nada Boureguaa

Félix Morin

Marc-André Legault

Caroline Sirois

Alice Dragomir

Lucie Blais

2024-09-01

Annals of Epidemiology (published)

doi.org

Predicting Five-Year All-Cause Mortality in COPD Patients Using Machine Learning

Ana Paula Pena-Gralle

Amélie Forget

Sofia-Nada Boureguaa

Marc-André Legault

Lucie Blais

2024-09-01

Annals of Epidemiology (published)

doi.org

Virtual Reality for Pediatric Trauma Education - A Preliminary Face and Content Validation Study

Fabio Botelho

Said Ashkar

Shreenik Kundu

Tj Matthews

Elena Guadagno

Dan Poenaru

2024-09-01

Journal of Pediatric Surgery (published)

doi.org

Herbarium collections remain essential in the age of community science

Isaac Eckert

Anne Bruneau

D. Metsger

Simon Joly

T. Dickinson

Laura J. Pollock

2024-08-31

Nature Communications (published)

doi.org

Progres: Prompted Generative Rescoring on ASR N-Best

Ada Defne Tur

Adel Moumen

Mirco Ravanelli

Large Language Models (LLMs) have shown their ability to improve the performance of speech recognizers by effectively rescoring the n-best hypotheses generated during the beam search process. However, the best way to exploit recent generative instruction-tuned LLMs for hypothesis rescoring is still unclear. This paper proposes a novel method that uses instruction-tuned LLMs to dynamically expand the n-best speech recognition hypotheses with new hypotheses generated through appropriately-prompted LLMs. Specifically, we introduce a new zero-shot method for ASR n-best rescoring, which combines confidence scores, LLM sequence scoring, and prompt-based hypothesis generation. We compare Llama-3-Instruct, GPT-3.5 Turbo, and GPT-4 Turbo as prompt-based generators with Llama-3 as sequence scorer LLM. We evaluated our approach using different speech recognizers and observed significant relative improvement in the word error rate (WER) ranging from 5% to 25%.

2024-08-30

arXiv (preprint)

doi.org

arxiv.org

Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration

Rongge Zhang

Haechan Mark Bong

Giovanni Beltrame

Exploration in unknown and unstructured environments is a pivotal requirement for robotic applications. A robot’s exploration behavior can be inherently affected by the performance of its Simultaneous Localization and Mapping (SLAM) subsystem, although SLAM and exploration are generally studied separately. In this paper, we formulate exploration as an active mapping problem and extend it with semantic information. We introduce a novel active metric-semantic SLAM approach, leveraging recent research advances in information theory and spectral graph theory: we combine semantic mutual information and the connectivity metrics of the underlying pose graph of the SLAM subsystem. We use the resulting utility function to evaluate different trajectories to select the most favorable strategy during exploration. Exploration and SLAM metrics are analyzed in experiments. Running our algorithm on the Habitat dataset, we show that, while maintaining efficiency close to the state-of-the-art exploration methods, our approach effectively increases the performance of metric-semantic SLAM with a 21% reduction in average map error and a 9% improvement in average semantic classification accuracy.

2024-08-27

arXiv (preprint)

doi.org

arxiv.org
