Découvrez le dernier rapport d'impact de Mila, qui met en lumière les réalisations exceptionnelles des membres de notre communauté au cours de la dernière année.
Rapport et guide politique GPAI: Vers une réelle égalité en IA
Rejoignez-nous à Mila le 26 novembre pour le lancement du rapport et du guide politique qui présente des recommandations concrètes pour construire des écosystèmes d'IA inclusifs.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
Empirical Analysis of Model Selection for Heterogenous Causal Effect Estimation
We study the problem of model selection in causal inference, specifically for the case of conditional average treatment effect (CATE) estima… (voir plus)tion under binary treatments. Unlike model selection in machine learning, there is no perfect analogue of cross-validation as we do not observe the counterfactual potential outcome for any data point. Towards this, there have been a variety of proxy metrics proposed in the literature, that depend on auxiliary nuisance models estimated from the observed data (propensity score model, outcome regression model). However, the effectiveness of these metrics has only been studied on synthetic datasets as we can access the counterfactual data for them. We conduct an extensive empirical analysis to judge the performance of these metrics introduced in the literature, and novel ones introduced in this work, where we utilize the latest advances in generative modeling to incorporate multiple realistic datasets. Our analysis suggests novel model selection strategies based on careful hyperparameter tuning of CATE estimators and causal ensembling.
The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Com… (voir plus)mon dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding extension problem. Furthermore, by leveraging proximity-based prototypes, we achieve a 40% reduction in training time without compromising extension quality. Our method does not require label information for out-of-sample points, thus serving as a semi-supervised method, and can achieve consistent quality using only 10% of the training data.
2024-01-01
2024 IEEE 34th International Workshop on Machine Learning for Signal Processing (MLSP) (publié)
Large Language Models (LLMs) have shown significant promise in various tasks, including identifying the political beliefs of English-speakin… (voir plus)g social media users from their posts. However, assessing LLMs for this task in non-English languages remains unexplored. In this work, we ask to what extent LLMs can predict the political ideologies of users in Persian social media. To answer this question, we first acknowledge that political parties are not well-defined among Persian users, and therefore, we simplify the task to a much simpler task of hyperpartisan ideology detection. We create a new benchmark and show the potential and limitations of both open-source and commercial LLMs in classifying the hyper-partisan ideologies of users. We compare these models with smaller fine-tuned models, both on the Persian language (ParsBERT) and translated data (RoBERTa), showing that they considerably outperform generative LLMs in this task. We further demonstrate that the performance of the generative LLMs degrades when classifying users based on their tweets instead of their bios and even when tweets are added as additional information, whereas the smaller fine-tuned models are robust and achieve similar performance for all classes. This study is a first step toward political ideology detection in Persian Twitter, with implications for future research to understand the dynamics of ideologies in Persian social media.
Large Language Models (LLMs) have shown significant promise in various tasks, including identifying the political beliefs of English-speakin… (voir plus)g social media users from their posts. However, assessing LLMs for this task in non-English languages remains unexplored. In this work, we ask to what extent LLMs can predict the political ideologies of users in Persian social media. To answer this question, we first acknowledge that political parties are not well-defined among Persian users, and therefore, we simplify the task to a much simpler task of hyperpartisan ideology detection. We create a new benchmark and show the potential and limitations of both open-source and commercial LLMs in classifying the hyper-partisan ideologies of users. We compare these models with smaller fine-tuned models, both on the Persian language (ParsBERT) and translated data (RoBERTa), showing that they considerably outperform generative LLMs in this task. We further demonstrate that the performance of the generative LLMs degrades when classifying users based on their tweets instead of their bios and even when tweets are added as additional information, whereas the smaller fine-tuned models are robust and achieve similar performance for all classes. This study is a first step toward political ideology detection in Persian Twitter, with implications for future research to understand the dynamics of ideologies in Persian social media.
High-throughput satellite (HTS), with its digital payload technology, is expected to play a key role as an enabler of the upcoming sixth-gen… (voir plus)eration (6G) networks. HTS is mainly designed to provide higher data rates and capacities. Fueled by technological advancements, including beamforming, advanced modulation techniques, reconfigurable phased array technologies, and electronically steerable antennas, HTS has emerged as a fundamental component for future network generations. This paper offers a comprehensive state-of-the-art on HTS systems, focusing on standardization, patents, channel multiple access techniques, routing, load balancing, and the role of software-defined networking (SDN). In addition, we provide a vision for next-generation satellite systems that we have named Extremely-HTS (EHTS) toward autonomous satellites supported by the main requirements and key technologies expected for these systems. The EHTS system will be designed to maximize spectrum reuse and data rates and to flexibly steer the capacity to satisfy user demand. We introduce a novel architecture for future programmable regenerative payloads as well.
The increasing scale of Transformer models has led to an increase in their pre-training computational requirements. While quantization has p… (voir plus)roven to be effective after pre-training and during fine-tuning, applying quantization in Transformers during pre-training has remained largely unexplored at scale for language modeling. This study aims to explore the impact of quantization for efficient pre-training of Transformers, with a focus on linear layer components. By systematically applying straightforward linear quantization to weights, activations, gradients, and optimizer states, we assess its effects on model efficiency, stability, and performance during training. By offering a comprehensive recipe of effective quantization strategies to be applied during the pre-training of Transformers, we promote high training efficiency from scratch while retaining language modeling ability. Code is available at https://github.com/chandar-lab/EfficientLLMs.
Introduction Mobile health apps risk widening health disparities if they overlook digital inclusion. The digital divide, encompassing access… (voir plus), familiarity, and readiness, poses a significant barrier to medical interventions. Existing literature lacks exploration of the digital divide's contributing factors. Hence, data are needed to comprehend the challenges in developing inclusive health apps. Methods We created a survey to gauge internet and smartphone access, smartphone familiarity, and readiness for using mobile health apps among caregivers of pediatric patients in tertiary care. Open-ended questions solicited feedback and suggestions on mobile health applications. Responses were categorized by similarity and compared. Developed with patient partners, the survey underwent cognitive testing and piloting for accuracy. Results Data from 209 respondents showed that 23% were affected by the digital divide, mainly due to unfamiliarity with digital skills. Among 49 short text responses about health app concerns, 31 mentioned security and confidentiality, with 7 mentioning the impersonal nature of such apps. Desired features included messaging healthcare providers, scheduling, task reminders, and simplicity. Conclusions This study underscores a digital divide among caregivers of pediatric patients, with nearly a quarter affected primarily due to a lack of digital comfort. Respondents emphasized user-friendliness and online security for health apps. Future apps should prioritize digital inclusion by addressing the significant barriers and carefully considering patient and family concerns.
In offline model-based optimisation (MBO) we are interested in using machine learning to de-sign candidates that maximise some measure of d… (voir plus)esirability through an expensive but real-world scoring process. Offline MBO tries to approximate this expensive scoring function and use that to evaluate generated designs, however evaluation is non-exact because one approximation is being evaluated with another. Instead, we ask ourselves: if we did have the real world scoring function at hand, what cheap-to-compute validation metrics would correlate best with this? Since the real-world scoring function is available for simulated MBO datasets, insights obtained from this can be transferred over to real-world offline MBO tasks where the real-world scoring function is expensive to compute. To address this, we propose a conceptual evaluation framework that is amenable to measuring extrapolation, and apply this to conditional denoising diffusion models. Empirically, we find that two validation metrics – agreement and Frechet distance – correlate quite well with the ground truth. When there is high variability in conditional generation, feedback is required in the form of an approximated version of the real-world scoring function. Furthermore, we find that generating high-scoring samples may require heavily weighting the generative model in favour of sample quality, potentially at the cost of sample diversity.
As online music platforms grow, music recommender systems play a vital role in helping users navigate and discover content within their vast… (voir plus) musical databases. At odds with this larger goal, is the presence of popularity bias, which causes algorithmic systems to favor mainstream content over, potentially more relevant, but niche items. In this work we explore the intrinsic relationship between music discovery and popularity bias. To mitigate this issue we propose a domain-aware, individual fairness-based approach which addresses popularity bias in graph neural network (GNNs) based recommender systems. Our approach uses individual fairness to reflect a ground truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations. In doing so, we facilitate meaningful music discovery that is robust to popularity bias and grounded in the music domain. We apply our BOOST methodology to two discovery based tasks, performing recommendations at both the playlist level and user level. Then, we ground our evaluation in the cold start setting, showing that our approach outperforms existing fairness benchmarks in both performance and recommendation of lesser-known content. Finally, our analysis explains why our proposed methodology is a novel and promising approach to mitigating popularity bias and improving the discovery of new and niche content in music recommender systems.