Le traitement du langage naturel à l'ère de l'IA générative
Rejoignez-nous à Mila en octobre pour un atelier de trois jour visant à explorer le potentiel de transformation des technologies langagières et leurs implications pour la société.
Ce programme est conçu pour fournir aux professionnel·le·s travaillant dans le domaine de la politique une compréhension fondamentale de la technologie de l'IA.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
Uncertain Evidence in Probabilistic Models and Stochastic Simulators
We consider the problem of performing Bayesian inference in probabilistic models where observations are accompanied by uncertainty, referred… (voir plus) to as "uncertain evidence.'' We explore how to interpret uncertain evidence, and by extension the importance of proper interpretation as it pertains to inference about latent variables. We consider a recently-proposed method "distributional evidence'' as well as revisit two older methods: Jeffrey's rule and virtual evidence. We devise guidelines on how to account for uncertain evidence and we provide new insights, particularly regarding consistency. To showcase the impact of different interpretations of the same uncertain evidence, we carry out experiments in which one interpretation is defined as "correct.'' We then compare inference results from each different interpretation illustrating the importance of careful consideration of uncertain evidence.
2023-07-03
Proceedings of the 40th International Conference on Machine Learning (publié)
Slot attention is a powerful method for object-centric modeling in images and videos. However, its set-equivariance limits its ability to ha… (voir plus)ndle videos with a dynamic number of objects because it cannot break ties. To overcome this limitation, we first establish a connection between slot attention and optimal transport. Based on this new perspective we propose **MESH** (Minimize Entropy of Sinkhorn): a cross-attention module that combines the tiebreaking properties of unregularized optimal transport with the speed of regularized optimal transport. We evaluate slot attention using MESH on multiple object-centric learning benchmarks and find significant improvements over slot attention in every setting.
2023-07-03
Proceedings of the 40th International Conference on Machine Learning (publié)
Stochastic min-max optimization has gained interest in the machine learning community with the advancements in GANs and adversarial training… (voir plus). Although game optimization is fairly well understood in the deterministic setting, some issues persist in the stochastic regime. Recent work has shown that stochastic gradient descent-ascent methods such as the optimistic gradient are highly sensitive to noise or can fail to converge. Although alternative strategies exist, they can be prohibitively expensive. We introduce Omega, a method with optimistic-like updates that mitigates the impact of noise by incorporating an EMA of historic gradients in its update rule. We also explore a variation of this algorithm that incorporates momentum. Although we do not provide convergence guarantees, our experiments on stochastic games show that Omega outperforms the optimistic gradient method when applied to linear players.
The potential of automatic code generation through Model-Driven Engineering (MDE) frameworks has yet to be realized. Beyond their ability to… (voir plus) help software professionals write more accurate, reusable code, MDE frameworks could make programming accessible for a new class of domain experts. However, domain experts have been slow to embrace these tools, as they still need to learn how to specify their applications' requirements using the concrete syntax (i.e., textual or graphical) of the new and unified domain-specific language. Conversational interfaces (chatbots) could smooth the learning process and offer a more interactive way for domain experts to specify their application requirements and generate the desired code. If integrated with MDE frameworks, chatbots may offer domain experts with richer domain vocabulary without sacrificing the power of agnosticism that unified modelling frameworks provide. In this paper, we discuss the challenges of integrating chatbots within MDE frameworks and then examine a specific application: the auto-generation of smart contract code based on conversational syntax. We demonstrate how this can be done and evaluate our approach by conducting a user experience survey to assess the usability and functionality of the chatbot framework. The paper concludes by drawing attention to the potential benefits of leveraging Language Models (LLMs) in this context.
2023-07-01
2023 IEEE International Conference on Software Services Engineering (SSE) (publié)
Curriculum frameworks and educational programs in artificial intelligence for medical students, residents, and practicing physicians: a scoping review protocol.
OBJECTIVE
The aim of this scoping review is to synthesize knowledge from the literature on curriculum frameworks and current educational pro… (voir plus)grams that focus on the teaching and learning of artificial intelligence (AI) for medical students, residents, and practicing physicians.
INTRODUCTION
To advance the implementation of AI in clinical practice, physicians need to have a better understanding of AI and how to use it within clinical practice. Consequently, medical education must introduce AI topics and concepts into the curriculum. Curriculum frameworks are educational road maps to teaching and learning. Therefore, any existing AI curriculum frameworks must be reviewed and, if none exist, such a framework must be developed.
INCLUSION CRITERIA
This review will include articles that describe curriculum frameworks for teaching and learning AI in medicine, irrespective of country. All types of articles and study designs will be included, except conference abstracts and protocols.
METHODS
This review will follow the JBI methodology for scoping reviews. Keywords will first be identified from relevant articles. Another search will then be conducted using the identified keywords and index terms. The following databases will be searched: MEDLINE (Ovid), Embase (Ovid), Cochrane Central Register of Controlled Trials (CENTRAL), CINAHL (EBSCOhost), and Scopus. Gray literature will also be searched. Articles will be limited to the English and French languages, commencing from the year 2000. The reference lists of all included articles will be screened for additional articles. Data will then be extracted from included articles and the results will be presented in a table.
It is critical to measure and mitigate fairness-related harms caused by AI text generation systems, including stereotyping and demeaning har… (voir plus)ms. To that end, we introduce FairPrism, a dataset of 5,000 examples of AI-generated English text with detailed human annotations covering a diverse set of harms relating to gender and sexuality. FairPrism aims to address several limitations of existing datasets for measuring and mitigating fairness-related harms, including improved transparency, clearer specification of dataset coverage, and accounting for annotator disagreement and harms that are context-dependent. FairPrism’s annotations include the extent of stereotyping and demeaning harms, the demographic groups targeted, and appropriateness for different applications. The annotations also include specific harms that occur in interactive contexts and harms that raise normative concerns when the “speaker” is an AI system. Due to its precision and granularity, FairPrism can be used to diagnose (1) the types of fairness-related harms that AI text generation systems cause, and (2) the potential limitations of mitigation methods, both of which we illustrate through case studies. Finally, the process we followed to develop FairPrism offers a recipe for building improved datasets for measuring and mitigating harms caused by AI systems.
2023-07-01
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (publié)
Intrusion detection systems (IDSs) are crucial in the security monitoring for the smart grid with increasing machine-to-machine communicatio… (voir plus)ns and cyber threats thereafter. However, the multi-sourced, correlated, and heterogeneous smart grid data pose significant challenges to the accurate attack detection by IDSs. To improve the attack detection, this paper proposes Reinforcement Learning-based Adaptive Feature Boosting, which aims to leverage a series of AutoEncoders (AEs) to capture critical features from the multi-sourced smart grid data for the classification of normal, fault, and attack events. Multiple AEs are utilized to extract representative features from different feature sets that are automatically generated through a weighted feature sampling process; each AE-extracted feature set is then applied to build a Random Forest (RF) base classifier. In the feature sampling process, Deep Deterministic Policy Gradient (DDPG) is introduced to dynamically determine the feature sampling probability based on the classification accuracy. The critical features that improve the classification accuracy are assigned larger sampling probabilities and increasingly participate in the training of next AE. The presence of critical features is increased in the event classification over the multi-sourced smart grid data. Considering potential different alarms among base classifiers, an ensemble classifier is further built to distinguish normal, fault, and attack events. Our proposed approach is evaluated on the two realistic datasets collected from Hardware-In-the-Loop (HIL) and WUSTIL-IIOT-2021 security testbeds, respectively. The evaluation on the HIL security dataset shows that our proposed approach achieves the classification accuracy with 97.28%, an effective 5.5% increase over the vanilla Adaptive Feature Boosting. Moreover, the proposed approach not only accurately and stably selects critical features on the WUSTIL-IIOT-2021 dataset based on the significant difference of feature sampling probabilities between critical and uncritical features, i.e., the probabilities greater than 0.08 and less than 0.01, but also outperforms the other best-performing approaches with the increasing Matthew Correlation Coefficient (MCC) of 8.03%.