Publications
Ctrl-V: Higher Fidelity Autonomous Vehicle Video Generation with Bounding-Box Controlled Object Motion
Deep clustering incorporates embedding into clustering to find a lower-dimensional space appropriate for clustering. In this paper, we propose a novel deep clustering framework with self-supervision using pairwise similarities (DCSS). The proposed method consists of two successive phases. In the first phase, we propose to form hypersphere-like groups of similar data points, i.e. one hypersphere per cluster, employing an autoencoder that is trained using cluster-specific losses. The hyperspheres are formed in the autoencoder's latent space. In the second phase, we propose to employ pairwise similarities to create a
The Value Iteration (VI) algorithm is an iterative procedure to compute the value function of a Markov decision process, and is the basis of many reinforcement learning (RL) algorithms as well. As the error convergence rate of VI as a function of iteration
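For readers unfamiliar with the procedure, here is a minimal sketch of standard tabular Value Iteration, not the variant studied in the paper. The function name `value_iteration`, the table layout (`P[s][a][t]` for transition probabilities, `R[s][a]` for rewards), the discount `gamma`, and the two-state toy MDP are all assumptions chosen for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Tabular Value Iteration on a finite MDP (illustrative sketch).

    P[s][a][t] : probability of moving from state s to state t under action a
    R[s][a]    : expected immediate reward for taking action a in state s
    Returns the estimated optimal value function and a greedy policy.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * E[V(s')]
        Q = [
            [
                R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        V_new = [max(q_row) for q_row in Q]
        delta = max(abs(new - old) for new, old in zip(V_new, V))
        V = V_new
        if delta < tol:  # stop once successive iterates barely change
            break
    policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: state 1 is absorbing and pays reward 1 per step;
# in state 0, action 0 stays put and action 1 moves to state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # transitions from state 0 under actions 0 and 1
    [[0.0, 1.0], [0.0, 1.0]],  # state 1 is absorbing under both actions
]
R = [
    [0.0, 0.0],
    [1.0, 1.0],
]

V, policy = value_iteration(P, R, gamma=0.9)
```

With `gamma = 0.9`, the values approach 1 / (1 - 0.9) = 10 in the absorbing state and 0.9 × 10 = 9 in state 0, where the greedy action is to move. Because each backup is a contraction with factor `gamma`, the error shrinks geometrically with the iteration count.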
As text generation systems' outputs are increasingly anthropomorphic (perceived as human-like), scholars have also raised increasing concerns about how such outputs can lead to harmful outcomes, such as users over-relying or developing emotional dependence on these systems. How to intervene on such system outputs to mitigate anthropomorphic behaviors and their attendant harmful outcomes, however, remains understudied. With this work, we aim to provide empirical and theoretical grounding for developing such interventions. To do so, we compile an inventory of interventions grounded both in prior literature and a crowdsourced study where participants edited system outputs to make them less human-like. Drawing on this inventory, we also develop a conceptual framework to help characterize the landscape of possible interventions, articulate distinctions between different types of interventions, and provide a theoretical basis for evaluating the effectiveness of different interventions.
Addressing real-world optimization problems becomes particularly challenging when analytic objective functions or constraints are unavailable. While numerous studies have addressed the issue of unknown objectives, limited research has focused on scenarios where feasibility constraints are not given explicitly. Overlooking these constraints can lead to spurious solutions that are unrealistic in practice. To deal with such unknown constraints, we propose to perform optimization within the data manifold using diffusion models. To constrain the optimization process to the data manifold, we reformulate the original optimization problem as a sampling problem from the product of the Boltzmann distribution defined by the objective function and the data distribution learned by the diffusion model. Depending on the differentiability of the objective function, we propose two different sampling methods. For differentiable objectives, we propose a two-stage framework that begins with a guided diffusion process for warm-up, followed by a Langevin dynamics stage for further correction. For non-differentiable objectives, we propose an iterative importance sampling strategy using the diffusion model as the proposal distribution. Comprehensive experiments on a synthetic dataset, six real-world black-box optimization datasets, and a multi-objective molecule optimization dataset show that our method achieves better or comparable performance with previous state-of-the-art baselines.
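The reformulation described in this abstract can be written out as follows. This is a sketch under stated assumptions, not the paper's exact notation: the objective $f$ is taken to be maximized, $\beta > 0$ is an assumed inverse temperature, $p_{\text{data}}$ denotes the distribution learned by the diffusion model, and $\eta$ is an assumed Langevin step size.

```latex
% Target: product of the Boltzmann distribution of f and the data distribution
p_{\text{target}}(x) \;\propto\; p_{\text{data}}(x)\, \exp\!\big(\beta f(x)\big)

% Differentiable f: one (unadjusted) Langevin dynamics step on the target
x_{k+1} = x_k
  + \eta \big( \nabla_x \log p_{\text{data}}(x_k) + \beta \nabla_x f(x_k) \big)
  + \sqrt{2\eta}\, \varepsilon_k ,
\qquad \varepsilon_k \sim \mathcal{N}(0, I)

% Non-differentiable f: importance weights when proposing x_i \sim p_{\text{data}}
w(x_i) \;\propto\; \frac{p_{\text{target}}(x_i)}{p_{\text{data}}(x_i)}
       \;\propto\; \exp\!\big(\beta f(x_i)\big)
```

In the importance-sampling case the data density cancels, so the weights need only evaluations of $f$, which is why no gradient of the objective is required there. For a minimization problem the sign of $\beta f$ would flip.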
Discrete audio tokens are compact representations that aim to preserve perceptual quality, phonetic content, and speaker characteristics while enabling efficient storage and inference, as well as competitive performance across diverse downstream tasks. They provide a practical alternative to continuous features, enabling the integration of speech and audio into modern large language models (LLMs). As interest in token-based audio processing grows, various tokenization methods have emerged, and several surveys have reviewed the latest progress in the field. However, existing studies often focus on specific domains or tasks and lack a unified comparison across various benchmarks. This paper presents a systematic review and benchmark of discrete audio tokenizers, covering three domains: speech, music, and general audio. We propose a taxonomy of tokenization approaches based on encoder-decoder, quantization techniques, training paradigm, streamability, and application domains. We evaluate tokenizers on multiple benchmarks for reconstruction, downstream performance, and acoustic language modeling, and analyze trade-offs through controlled ablation studies. Our findings highlight key limitations, practical considerations, and open challenges, providing insight and guidance for future research in this rapidly evolving area. For more information, including our main results and tokenizer database, please refer to our website: https://poonehmousavi.github.io/dates-website/.
The surge in electricity use, coupled with the dependency on intermittent renewable energy sources, poses significant hurdles to effectively managing power grids, particularly during times of peak demand. Demand Response programs and energy conservation measures are essential to operate energy grids while ensuring a responsible use of our resources. This research combines distributed optimization using ADMM with Deep Learning models to plan indoor temperature setpoints effectively. A two-layer hierarchical structure is used, with a central building coordinator at the upper layer and local controllers at the thermal zone layer. The coordinator must limit the building's maximum power by translating the building's total power to local power targets for each zone. Local controllers can modify the temperature setpoints to meet the local power targets. The resulting control algorithm, called Distributed Planning Networks, is designed to be both adaptable and scalable to many types of buildings, tackling two of the main challenges in the development of such systems. The proposed approach is tested on an 18-zone building modeled in EnergyPlus. The algorithm successfully manages Demand Response peak events.
2025-01-01
IEEE Transactions on Automation Science and Engineering (published)
Using speech samples as a biomarker is a promising avenue for detecting and monitoring the progression of Parkinson's disease (PD), but there is considerable disagreement in the literature about how best to collect and analyze such data. Early research in detecting PD from speech used a sustained vowel phonation (SVP) task, while some recent research has explored recordings of more cognitively demanding tasks. To assess the role of language in PD detection, we tested pretrained models with varying data types and pretraining objectives and found that (1) text-only models match the performance of vocal-feature models, (2) multilingual Whisper outperforms self-supervised models whereas monolingual Whisper does worse, and (3) AudioSet pretraining improves performance on SVP but not spontaneous speech. These findings together highlight the critical role of language for the early detection of Parkinson's disease.
2025-01-01
2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP) (published)