Développez des compétences fondamentales en intelligence artificielle (IA) responsable grâce à des cours autodirigés, animés par des expert·e·s de Mila reconnu·e·s à l’échelle internationale.
Le Fellowship Mila en politiques de l'IA transforme l'expertise approfondie en IA en politiques rigoureuses d'intérêt public. Découvrez la dernière publication Combler la disparité en matière d’expertise : mécanismes de transfert des connaissances pour la réglementation de l’IA par Moritz von Knebel.
Ce programme soutient les startups spécialisées en IA à tout moment de l'année. Bénéficiez de ressources de pointe et d'un accompagnement sur mesure pour accélérer le développement de votre technologie.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Lecteur Multimédia
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection
Understanding the impact of IoT security patterns on CPU usage and energy consumption: a dynamic approach for selecting patterns with deep reinforcement learning
Large Language Models (LLMs) are widely adopted for automated code generation with promising results. Although prior research has assessed L… (voir plus)LM-generated code and identified various quality issues- such as redundancy, poor maintainability, and sub-optimal performance- a systematic understanding and categorization of these inefficiencies remain unexplored. Therefore, we empirically investigate inefficiencies in LLM-generated Python code by state-of-the-art models, i.e., CodeLlama, DeepSeek-Coder, and CodeGemma. To do so, we manually analyze 492 generated Python code snippets in the HumanEval+ dataset. We then construct a taxonomy of inefficiencies in LLM-generated Python code that includes 5 categories (General Logic, Performance, Readability, Maintainability, and Errors) and 19 subcategories of inefficiencies. We validate the obtained taxonomy through an online survey with 58 LLM practitioners and researchers. The surveyed participants affirmed the completeness of the proposed taxonomy, and the relevance and the popularity of the identified code inefficiency patterns. Our qualitative findings indicate that inefficiencies are diverse and interconnected, affecting multiple aspects of code quality, with logic and performance-related inefficiencies being the most frequent and often co-occurring while impacting overall code quality. Our taxonomy provides a structured basis for evaluating the quality of LLM-generated code and guiding future research to improve code generation efficiency.
We introduce NNetscape Navigator (NNetnav), a method for training web agents entirely through synthetic demonstrations. These demonstrations… (voir plus) are collected by first interacting with a browser to generate trajectory rollouts, which are then retroactively labeled into instructions using a language model. Most work on training browser agents has relied on expensive human supervision, and the limited previous work on such interaction-first synthetic data techniques has failed to provide effective search through the exponential space of exploration. In contrast, NNetnav exploits the hierarchical structure of language instructions to make this search more tractable: complex instructions are typically decomposable into simpler subtasks, allowing NNetnav to automatically prune interaction episodes when an intermediate trajectory cannot be annotated with a meaningful sub-task. We use NNetnav demonstrations from a language model for supervised fine-tuning of a smaller language model policy, and find improvements of 6 points on WebArena and over 20 points on MiniWoB++, two popular environments for web-agents. Notably, on WebArena, we observe that language model policies can be further enhanced when fine-tuned with NNetnav demonstrations derived from the same language model. Finally, we collect and release a dataset of over 6k NNetnav demonstrations on WebArena, spanning a diverse and complex set of instructions.
Recent advances in integrating positional and structural encodings (PSEs) into graph neural networks (GNNs) have significantly enhanced thei… (voir plus)r performance across various graph learning tasks. However, the general applicability of these encodings and their potential to serve as foundational representations for graphs remain uncertain. This paper investigates the fine-tuning efficiency, scalability with sample size, and generalization capability of learnable PSEs across diverse graph datasets. Specifically, we evaluate their potential as universal pre-trained models that can be easily adapted to new tasks with minimal fine-tuning and limited data. Furthermore, we assess the expressivity of the learned representations, particularly, when used to augment downstream GNNs. We demonstrate through extensive benchmarking and empirical analysis that PSEs generally enhance downstream models. However, some datasets may require specific PSE-augmentations to achieve optimal performance. Nevertheless, our findings highlight their significant potential to become integral components of future graph foundation models. We provide new insights into the strengths and limitations of PSEs, contributing to the broader discourse on foundation models in graph learning.
Earthquake monitoring is a fundamental task to unravel the underlying physics of earthquakes and mitigate associated hazards for public safe… (voir plus)ty. Distributed acoustic sensing, or DAS, which transforms pre-existing telecommunication cables into ultra-dense seismic networks, offers a cost-effective and scalable solution for next-generation earthquake monitoring. However, current approaches for earthquake monitoring like PhaseNet and PhaseNet-2 primarily rely on supervised learning, while manually labeled DAS data is quite limited and it is difficult to obtain more annotated datasets. In this paper, we present DASFormer, a novel self-supervised pretraining technique on DAS data with a coarse-to-fine framework that models spatial-temporal signal correlation. We treat earthquake monitoring as an anomaly detection task and demonstrate DASFormer can be directly utilized as a seismic phase detector. Experimental results demonstrate that DASFormer is effective in terms of several evaluation metrics and outperforms state-of-the-art time-series forecasting, anomaly detection, and foundation models on the unsupervised seismic detection task. We also demonstrate the potential of fine-tuning DASFormer to downstream tasks through case studies.
GradTune: Last-layer Fine-tuning for Group Robustness Without Group Annotation
Patrik Joslin Kenfack
Ulrich Matchi Aïvodji
S Ebrahimi Kahou
This work addresses the limitations of deep neural networks (DNNs) in generalizing beyond training data due to spurious correlations. Recent… (voir plus) research has demonstrated that models trained with empirical risk minimization learn both core and spurious features, often upweighting spurious ones in the final classification, which can frequently lead to poor performance on minority groups. Deep Feature Reweighting alleviates this issue by retraining the model's last classification layer using a group-balanced held-out validation set. However, relying on spurious feature labels during training or validation limits practical application, as spurious features are not always known or costly to annotate. Our preliminary experiments reveal that ERM-trained models exhibit higher gradient norms on minority group samples in the hold-out dataset. Leveraging these insights, we propose an alternative approach called GradTune, which fine-tunes the last classification layer using high-gradient norm samples. Our results on four well-established benchmarks demonstrate that the proposed method can achieve competitive performance compared to existing methods without requiring group labels during training or validation.