Natural language processing in the era of generative AI
Join us at Mila in October for a three-day workshop exploring the transformative potential of language technologies and their implications for society.
This program is designed to give professionals working in the policy field a foundational understanding of AI technology.
Publications
DoMoBOT: An AI-Empowered Bot for Automated and Interactive Domain Modelling
Domain modelling transforms informal requirements written in natural language in the form of problem descriptions into concise and analyzable domain models. As the manual construction of these domain models is often time-consuming, error-prone, and labor-intensive, several approaches already exist to automate domain modelling. However, the current approaches suffer from lower accuracy of extracted domain models and the lack of support for system-modeller interactions. To better assist modellers, we introduce DoMoBOT, a web-based Domain Modelling BOT. Our proposed bot combines artificial intelligence techniques such as natural language processing and machine learning to extract domain models with higher accuracy. More importantly, our bot incorporates a set of features to bring synergy between automated model extraction and bot-modeller interactions. During these interactions, the bot presents multiple possible solutions to a modeller for modelling scenarios present in a given problem description. The bot further enables modellers to switch to a particular solution and updates the other parts of the domain model proactively. In this tool demo paper, we demonstrate how the implementation and architecture of DoMoBOT support the paradigm of automated and interactive domain modelling for assisting modellers.
2021-10-10
2021 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C) (published)
We investigate the impact of aliasing on generalization in Deep Convolutional Networks and show that data augmentation schemes alone are unable to prevent it due to structural limitations in widely used architectures. Drawing insights from frequency analysis theory, we take a closer look at ResNet and EfficientNet architectures and review the trade-off between aliasing and information loss in each of their major components. We show how to mitigate aliasing by inserting non-trainable low-pass filters at key locations, particularly where networks lack the capacity to learn them. These simple architectural changes lead to substantial improvements in generalization on i.i.d. and even more on out-of-distribution conditions, such as image classification under natural corruptions on ImageNet-C [11] and few-shot learning on Meta-Dataset [26]. State-of-the-art results are achieved on both datasets without introducing additional trainable parameters and using the default hyper-parameters of open source codebases.
2021-10-10
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (published)
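The aliasing effect described in the abstract above can be illustrated with a minimal 1-D sketch. This is not the paper's architecture: it only compares stride-2 subsampling with and without a fixed (non-trainable) low-pass filter. The binomial kernel [1, 2, 1]/4, the edge handling, and the function names are assumptions chosen for illustration.

```python
def blur(signal):
    """Apply the fixed binomial low-pass filter [1, 2, 1] / 4 (edges replicated)."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + 2 * signal[i] + right) / 4)
    return out


def downsample(signal, stride=2):
    """Keep every `stride`-th sample."""
    return signal[::stride]


# Highest-frequency signal representable at this sampling rate.
signal = [1.0, -1.0] * 8

# Without filtering, the oscillation is misread as a constant signal (aliasing).
naive = downsample(signal)

# With the low-pass filter applied first, the unrepresentable frequency is
# suppressed instead of being folded into a false constant (apart from the
# replicated left edge).
filtered = downsample(blur(signal))

print(naive)
print(filtered)
```

In this toy case the unfiltered output is indistinguishable from a constant input, which is the kind of ambiguity a fixed low-pass filter placed before subsampling is meant to remove.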
Model‐based development is a popular development approach in which software is implemented and verified based on a model of the required system. Finite state machines (FSMs) are widely used as models for systems in several domains. Validating that a model accurately represents the required behaviour involves the generation and execution of a large number of input sequences, which is often an expensive and time‐consuming process. In this paper, we speed up the execution of input sequences for FSM validation, by leveraging the high degree of parallelism of modern graphics processing units (GPUs) for the automatic execution of FSM input sequences in parallel on the GPU threads. We expand our existing work by providing techniques that improve the performance and scalability of this approach. We conduct extensive empirical evaluation using 15 large FSMs from the networking domain and measure GPU speed‐up over a 16‐core CPU, taking into account total GPU time, which includes both data transfer and kernel execution time. We found that GPUs execute FSM input sequences up to 9.28× faster than a 16‐core CPU, with an average speed‐up of 4.53× across all subjects. Our optimizations achieve an average improvement over existing work of 58.95% for speed‐up and scalability to large FSMs with over 2K states and 500K transitions. We also found that techniques aimed at reducing the number of required input sequences for large FSMs with high density were ineffective when applied to all‐transition pair coverage, thus emphasizing the need for approaches like ours that speed up input execution.
2021-10-08
Software Testing, Verification and Reliability (published)
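The core idea in the abstract above, that each input sequence can be executed independently of the others, can be sketched as follows. This is a CPU-only stand-in, not the paper's GPU implementation: a thread pool plays the role of the GPU threads, and the toy FSM, its state names, and the function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy FSM: (state, input symbol) -> (next state, output).
TRANSITIONS = {
    ("idle", "a"): ("busy", 0),
    ("idle", "b"): ("idle", 1),
    ("busy", "a"): ("idle", 1),
    ("busy", "b"): ("busy", 0),
}


def run_sequence(inputs, start="idle"):
    """Execute one input sequence and return (final state, list of outputs)."""
    state, outputs = start, []
    for symbol in inputs:
        state, out = TRANSITIONS[(state, symbol)]
        outputs.append(out)
    return state, outputs


def run_all(sequences, workers=4):
    """Run every sequence independently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_sequence, sequences))


results = run_all(["ab", "aab", "bbb"])
for sequence, (state, outputs) in zip(["ab", "aab", "bbb"], results):
    print(sequence, state, outputs)
```

Because no sequence reads or writes state shared with another, the same per-sequence loop is the kind of work that maps naturally onto one thread per sequence.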
We introduce the problem of jointly increasing school capacities and finding a student-optimal assignment in the expanded market. Due to the impossibility of efficiently solving the problem with classical methods, we generalize existing mathematical programming formulations of stability constraints to our setting, most of which result in integer quadratically-constrained programs. In addition, we propose a novel mixed-integer linear programming formulation that is exponentially large in the problem size. We show that its stability constraints can be separated by exploiting the objective function, leading to an effective cutting-plane algorithm. We conclude the theoretical analysis of the problem by discussing some mechanism properties. On the computational side, we evaluate the performance of our approaches in a detailed study, and we find that our cutting-plane method outperforms our generalization of existing mixed-integer approaches. We also propose two heuristics that are effective for large instances of the problem. Finally, we use the Chilean school choice system data to demonstrate the impact of capacity planning under stability conditions. Our results show that each additional seat can benefit multiple students and that we can effectively target the assignment of previously unassigned students or improve the assignment of several students through improvement chains. These insights empower the decision-maker in tuning the matching algorithm to provide a fair application-oriented solution.
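The claim that one additional seat can benefit several students can be illustrated on a toy instance. The sketch below does not implement the paper's mixed-integer formulations or cutting-plane method; it uses classical student-proposing deferred acceptance for fixed capacities, plus a check for blocking pairs, and simply compares two capacity vectors. The students, schools, preferences, and priorities are invented for illustration.

```python
def deferred_acceptance(prefs, priorities, capacity):
    """Student-proposing deferred acceptance; returns {student: school or None}."""
    rank = {c: {s: i for i, s in enumerate(order)} for c, order in priorities.items()}
    next_choice = {s: 0 for s in prefs}
    held = {c: [] for c in capacity}
    free = list(prefs)
    while free:
        s = free.pop()
        if next_choice[s] >= len(prefs[s]):
            continue  # s has exhausted the list and stays unassigned
        c = prefs[s][next_choice[s]]
        next_choice[s] += 1
        held[c].append(s)
        held[c].sort(key=lambda x: rank[c][x])
        if len(held[c]) > capacity[c]:
            free.append(held[c].pop())  # reject the lowest-priority student
    match = {s: None for s in prefs}
    for c, students in held.items():
        for s in students:
            match[s] = c
    return match


def blocking_pairs(match, prefs, priorities, capacity):
    """List (student, school) pairs that violate stability."""
    rank = {c: {s: i for i, s in enumerate(order)} for c, order in priorities.items()}
    assigned = {c: [s for s, m in match.items() if m == c] for c in capacity}
    pairs = []
    for s, plist in prefs.items():
        current = plist.index(match[s]) if match[s] is not None else len(plist)
        for c in plist[:current]:  # schools s strictly prefers to its match
            has_seat = len(assigned[c]) < capacity[c]
            displaces = any(rank[c][s] < rank[c][t] for t in assigned[c])
            if has_seat or displaces:
                pairs.append((s, c))
    return pairs


# Hypothetical instance: three students, two schools.
prefs = {"s1": ["A"], "s2": ["A", "B"], "s3": ["B"]}
priorities = {"A": ["s1", "s2", "s3"], "B": ["s2", "s3", "s1"]}

base = deferred_acceptance(prefs, priorities, {"A": 1, "B": 1})
expanded = deferred_acceptance(prefs, priorities, {"A": 2, "B": 1})
print(base)
print(expanded)
```

In this instance the extra seat at school A lets s2 move up to A, which frees the seat at B for the previously unassigned s3: a two-student improvement chain from a single added seat.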
Many population exposures in time-series analysis, including food marketing, exhibit a time-lagged association with population health outcomes such as food purchasing. A common approach to measuring patterns of associations over different time lags relies on a finite-lag model, which requires correct specification of the maximum duration over which the lagged association extends. However, the maximum lag is frequently unknown due to the lack of substantive knowledge or the geographic variation of lag length. We describe a time-series analytical approach based on an infinite lag specification under a transfer function model that avoids the specification of an arbitrary maximum lag length. We demonstrate its application to estimate the lagged exposure-outcome association in food environmental research: display promotion of sugary beverages with lagged sales.
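One simple way to see how an infinite lag specification avoids choosing a maximum lag is the first-order (geometric) transfer function, in which the lag weights decay as omega * delta**k and the whole infinite sum collapses into a one-step recursion. The abstract does not state the exact specification used, so the form below, the parameter names `omega` and `delta`, and the assumption that the exposure is zero before the series starts are illustrative only.

```python
def geometric_lag_response(x, omega, delta):
    """Return z_t = omega * sum_{k>=0} delta**k * x[t-k], computed recursively.

    Assumes the exposure x is zero before the first observation.
    """
    z, prev = [], 0.0
    for value in x:
        prev = delta * prev + omega * value  # z_t = delta * z_{t-1} + omega * x_t
        z.append(prev)
    return z


# Unit impulse: the exposure (e.g. a promotion) occurs only at t = 0.
impulse = [1.0, 0.0, 0.0, 0.0]
weights = geometric_lag_response(impulse, omega=2.0, delta=0.5)
print(weights)  # lag weights at lags 0, 1, 2, 3
```

With 0 < delta < 1 the weights never hit exactly zero, so no cut-off lag has to be specified, and the cumulative long-run effect is omega / (1 - delta).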
Natural language processing (NLP) and understanding aim to read from unformatted text to accomplish different tasks. While word embeddings learned by deep neural networks are widely used, the underlying linguistic and semantic structures of text pieces cannot be fully exploited in these representations. A graph is a natural way to capture the connections between different text pieces, such as entities, sentences, and documents. To overcome the limits of vector space models, researchers combine deep learning models with graph-structured representations for various tasks in NLP and text mining. Such combinations help to make full use of both the structural information in text and the representation learning ability of deep neural networks. In this chapter, we introduce the various graph representations that are extensively used in NLP, and show how different NLP tasks can be tackled from a graph perspective. We summarize recent research on graph-based NLP, and discuss two case studies related to graph-based text clustering, matching, and multihop machine reading comprehension in detail. Finally, we provide a synthesis of the important open problems of this subfield.