Publications

Ctrl-V: Higher Fidelity Autonomous Vehicle Video Generation with Bounding-Box Controlled Object Motion

Ge Ya Luo

Zhi Hao Luo

Anthony Gosselin

Alexia Jolicoeur-Martineau

Chris Pal

2025-01-01

Trans. Mach. Learn. Res. (published)

openreview.net

A Decomposition-Based Framework for Large-Scale Multi-Period Log-Truck Routing and Scheduling: A Case Study in Canadian Forestry

Abdelhakim Abdellaoui

François Aubé

Loubna Benabbou

I. E. Hallaoui

Mouloud Amazouz

2025-01-01

IFAC-PapersOnLine (published)

doi.org

Deep Clustering with Self-Supervision using Pairwise Similarities

Mohammadreza Sadeghi

Sareh Soleimani

Narges Armanfard

Deep clustering incorporates embedding into clustering to find a lower-dimensional space appropriate for clustering. In this paper, we propose a novel deep clustering framework with self-supervision using pairwise similarities (DCSS). The proposed method consists of two successive phases. In the first phase, we propose to form hypersphere-like groups of similar data points, i.e. one hypersphere per cluster, employing an autoencoder that is trained using cluster-specific losses. The hyper-spheres are formed in the autoencoder's latent space. In the second phase, we propose to employ pairwise similarities to create a

2025-01-01

IEEE Access (published)

doi.org

arxiv.org

Deflated Dynamics Value Iteration

Jongmin Lee

Amin Rakhsha

Ernest K. Ryu

Amir-massoud Farahmand

The Value Iteration (VI) algorithm is an iterative procedure to compute the value function of a Markov decision process, and is the basis of many reinforcement learning (RL) algorithms as well. As the error convergence rate of VI as a function of iteration

2025-01-01

Trans. Mach. Learn. Res. (published)

doi.org

arxiv.org

Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems

Myra Cheng

Su Lin Blodgett

Alicia DeVrio

Lisa Egede

Alexandra Olteanu

As text generation systems' outputs are increasingly anthropomorphic -- perceived as human-like -- scholars have also raised increasing concerns about how such outputs can lead to harmful outcomes, such as users over-relying or developing emotional dependence on these systems. How to intervene on such system outputs to mitigate anthropomorphic behaviors and their attendant harmful outcomes, however, remains understudied. With this work, we aim to provide empirical and theoretical grounding for developing such interventions. To do so, we compile an inventory of interventions grounded both in prior literature and a crowdsourced study where participants edited system outputs to make them less human-like. Drawing on this inventory, we also develop a conceptual framework to help characterize the landscape of possible interventions, articulate distinctions between different types of interventions, and provide a theoretical basis for evaluating the effectiveness of different interventions.

2025-01-01

ACL (1) (published)

doi.org

arxiv.org

Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints

Lingkai Kong

Yuanqi Du

Wenhao Mu

Kirill Neklyudov

Valentin De Bortoli

Haorui Wang

Dongxia Wu

Aaron Ferber

Yi-An Ma

Carla P. Gomes

Chao Zhang

Addressing real-world optimization problems becomes particularly challenging when analytic objective functions or constraints are unavailable. While numerous studies have addressed the issue of unknown objectives, limited research has focused on scenarios where feasibility constraints are not given explicitly. Overlooking these constraints can lead to spurious solutions that are unrealistic in practice. To deal with such unknown constraints, we propose to perform optimization within the data manifold using diffusion models. To constrain the optimization process to the data manifold, we reformulate the original optimization problem as a sampling problem from the product of the Boltzmann distribution defined by the objective function and the data distribution learned by the diffusion model. Depending on the differentiability of the objective function, we propose two different sampling methods. For differentiable objectives, we propose a two-stage framework that begins with a guided diffusion process for warm-up, followed by a Langevin dynamics stage for further correction. For non-differentiable objectives, we propose an iterative importance sampling strategy using the diffusion model as the proposal distribution. Comprehensive experiments on a synthetic dataset, six real-world black-box optimization datasets, and a multi-objective molecule optimization dataset show that our method achieves better or comparable performance with previous state-of-the-art baselines.

2025-01-01

AISTATS (published)

doi.org

arxiv.org

Diffusion Tree Sampling: Scalable inference-time alignment of diffusion models

2025-01-01

arXiv.org (preprint)

doi.org

Diffusion-Based Adversarial Purification for Intrusion Detection

Mohamed Amine Merzouk

Erwan Beurier

Reda Yaich

N. Cuppens-Boulahia

Frédéric Cuppens

Foutse Khomh

2025-01-01

Database Security (published)

doi.org

Discrete Audio Tokens: More Than a Survey!

Pooneh Mousavi

Gallil Maimon

Adel Moumen

Darius Petermann

Jiatong Shi

Haibin Wu

Haici Yang

Anastasia Kuznetsova

Artem Ploujnikov

Ricard Marxer

Bhuvana Ramabhadran

Benjamin Elizalde

Loren Lugosch

Jinyu Li

Cem Subakan

Phil Woodland

Minje Kim

Hung-yi Lee

Shinji Watanabe

Yossi Adi

Mirco Ravanelli

Discrete audio tokens are compact representations that aim to preserve perceptual quality, phonetic content, and speaker characteristics while enabling efficient storage and inference, as well as competitive performance across diverse downstream tasks. They provide a practical alternative to continuous features, enabling the integration of speech and audio into modern large language models (LLMs). As interest in token-based audio processing grows, various tokenization methods have emerged, and several surveys have reviewed the latest progress in the field. However, existing studies often focus on specific domains or tasks and lack a unified comparison across various benchmarks. This paper presents a systematic review and benchmark of discrete audio tokenizers, covering three domains: speech, music, and general audio. We propose a taxonomy of tokenization approaches based on encoder-decoder, quantization techniques, training paradigm, streamability, and application domains. We evaluate tokenizers on multiple benchmarks for reconstruction, downstream performance, and acoustic language modeling, and analyze trade-offs through controlled ablation studies. Our findings highlight key limitations, practical considerations, and open challenges, providing insight and guidance for future research in this rapidly evolving area. For more information, including our main results and tokenizer database, please refer to our website: https://poonehmousavi.github.io/dates-website/.

2025-01-01

Trans. Mach. Learn. Res. (published)

doi.org

openreview.net

A Distributed ADMM-Based Deep Learning Approach for Thermal Control in Multi-Zone Buildings Under Demand Response Events

Vincent Taboga

Hanane Dagdougui

2025-01-01

IEEE Trans Autom. Sci. Eng. (published)

doi.org

arxiv.org

A Distributed ADMM-based Deep Learning Approach for Thermal Control in Multi-Zone Buildings

Vincent Taboga

Hanane Dagdougui

The surge in electricity use, coupled with the dependency on intermittent renewable energy sources, poses significant hurdles to effectively managing power grids, particularly during times of peak demand. Demand Response programs and energy conservation measures are essential to operate energy grids while ensuring a responsible use of our resources. This research combines distributed optimization using ADMM with Deep Learning models to plan indoor temperature setpoints effectively. A two-layer hierarchical structure is used, with a central building coordinator at the upper layer and local controllers at the thermal zone layer. The coordinator must limit the building's maximum power by translating the building's total power to local power targets for each zone. Local controllers can modify the temperature setpoints to meet the local power targets. The resulting control algorithm, called Distributed Planning Networks, is designed to be both adaptable and scalable to many types of buildings, tackling two of the main challenges in the development of such systems. The proposed approach is tested on an 18-zone building modeled in EnergyPlus. The algorithm successfully manages Demand Response peak events.

2025-01-01

IEEE Transactions on Automation Science and Engineering (published)

doi.org

arxiv.org

Does Language Matter for Early Detection of Parkinson's Disease from Speech?

Peter Plantinga

Briac Cordelle

Dominique Louër

Mirco Ravanelli

Denise Klein

Using speech samples as a biomarker is a promising avenue for detecting and monitoring the progression of Parkinson's disease (PD), but there is considerable disagreement in the literature about how best to collect and analyze such data. Early research in detecting PD from speech used a sustained vowel phonation (SVP) task, while some recent research has explored recordings of more cognitively demanding tasks. To assess the role of language in PD detection, we tested pretrained models with varying data types and pretraining objectives and found that (1) text-only models match the performance of vocal-feature models, (2) multilingual Whisper outperforms self-supervised models whereas monolingual Whisper does worse, and (3) AudioSet pretraining improves performance on SVP but not spontaneous speech. These findings together highlight the critical role of language for the early detection of Parkinson's disease.

2025-01-01

2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP) (published)

doi.org
