Portrait de Francesco Paissan n'est pas disponible

Francesco Paissan

Visiteur de recherche indépendant
Superviseur⋅e principal⋅e

Publications

Listenable Maps for Audio Classifiers
Audio Editing with Non-Rigid Text Prompts
Francesco Paissan
Zhepei Wang
Paris Smaragdis
In this paper, we explore audio-editing with non-rigid text edits. We show that the proposed editing pipeline is able to create audio edits … (voir plus)that remain faithful to the input audio. We explore text prompts that perform addition, style transfer, and in-painting. We quantitatively and qualitatively show that the edits are able to obtain results which outperform Audio-LDM, a recently released text-prompted audio generation model. Qualitative inspection of the results points out that the edits given by our approach remain more faithful to the input audio in terms of keeping the original onsets and offsets of the audio events.
Posthoc Interpretation via Quantization
In this paper, we introduce a new approach, called Posthoc Interpretation via Quantization (PIQ), for interpreting decisions made by trained… (voir plus) classifiers. Our method utilizes vector quantization to transform the representations of a classifier into a discrete, class-specific latent space. The class-specific codebooks act as a bottleneck that forces the interpreter to focus on the parts of the input data deemed relevant by the classifier for making a prediction. Our model formulation also enables learning concepts by incorporating the supervision of pretrained annotation models such as state-of-the-art image segmentation models. We evaluated our method through quantitative and qualitative studies involving black-and-white images, color images, and audio. As a result of these studies we found that PIQ generates interpretations that are more easily understood by participants to our user studies when compared to several other interpretation methods in the literature.