Publications
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach
It is commonly believed that knowledge of syntactic structure should improve language modeling. However, effectively and computationally efficiently incorporating syntactic structure into neural language models has been a challenging topic. In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called "syntactic distances", where information between these two separate objectives shares the same intermediate representation. Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.
2020-06-30
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (published)
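To make the "syntactic distances" mentioned in the abstract above more concrete, here is a minimal sketch. It assumes one common definition: the distance between two adjacent words is the height of their lowest common ancestor in a binary parse tree. The nested-tuple tree encoding and the function names are illustrative assumptions, not taken from the paper.

```python
def tree_to_distances(tree):
    """Convert a binary tree (nested 2-tuples of word strings) into
    (words, distances, height). distances[i] is the height of the lowest
    common ancestor of words[i] and words[i + 1] (assumed definition)."""
    if isinstance(tree, str):          # a leaf is a single word
        return [tree], [], 0
    left, right = tree
    left_words, left_dist, left_height = tree_to_distances(left)
    right_words, right_dist, right_height = tree_to_distances(right)
    height = max(left_height, right_height) + 1
    # The split point between the two children gets this node's height.
    return left_words + right_words, left_dist + [height] + right_dist, height


def distances_to_tree(words, distances):
    """Rebuild the binary tree by splitting at the largest distance."""
    if len(words) == 1:
        return words[0]
    split = max(range(len(distances)), key=distances.__getitem__)
    return (
        distances_to_tree(words[: split + 1], distances[:split]),
        distances_to_tree(words[split + 1 :], distances[split + 1 :]),
    )


tree = (("the", "cat"), ("sat", ("on", "mats")))
words, distances, _ = tree_to_distances(tree)
print(words, distances)
```

Because only the ranking of the distances determines where the tree splits, a model can be trained to predict these numbers alongside the next word, which is one plausible reading of the multi-task setup described in the abstract.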
The recent development of sequencing technologies revolutionized our understanding of the inner workings of the cell as well as the way disease is treated. A single RNA sequencing (RNA-Seq) experiment, however, measures tens of thousands of parameters simultaneously. While the results are information rich, data analysis provides a challenge. Dimensionality reduction methods help with this task by extracting patterns from the data by compressing it into compact vector representations.
We present the factorized embeddings (FE) model, a self-supervised deep learning algorithm that learns simultaneously, by tensor factorization, gene and sample representation spaces. We ran the model on RNA-Seq data from two large-scale cohorts and observed that the sample representation captures information on single gene and global gene expression patterns. Moreover, we found that the gene representation space was organized such that tissue-specific genes, highly correlated genes as well as genes participating in the same GO terms were grouped. Finally, we compared the vector representation of samples learned by the FE model to other similar models on 49 regression tasks. We report that the representations trained with FE rank first or second in all of the tasks, surpassing, sometimes by a considerable margin, other representations.
A toy example in the form of a Jupyter Notebook as well as the code and trained embeddings for this project can be found at: https://github.com/TrofimovAssya/FactorizedEmbeddings.
Supplementary data are available at Bioinformatics online.
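As a rough illustration of learning gene and sample representations jointly, the sketch below fits a plain bilinear factorization in which each expression value is approximated by the dot product of a sample vector and a gene vector. This is a simplified stand-in, not the FE model itself: the abstract does not specify the architecture, and the published model may combine the two embeddings differently. The function name, hyperparameters, and synthetic data are all assumptions.

```python
import numpy as np


def fit_bilinear_embeddings(expr, dim=2, lr=0.05, steps=500, seed=0):
    """Learn sample and gene embeddings so that expr[s, g] is approximated
    by the dot product sample_emb[s] . gene_emb[g] (simplified assumption)."""
    rng = np.random.default_rng(seed)
    n_samples, n_genes = expr.shape
    sample_emb = 0.1 * rng.standard_normal((n_samples, dim))
    gene_emb = 0.1 * rng.standard_normal((n_genes, dim))
    losses = []
    for _ in range(steps):
        err = sample_emb @ gene_emb.T - expr      # reconstruction error
        losses.append(float(np.mean(err ** 2)))   # mean squared error
        # Gradients of the squared error, up to positive constant factors.
        grad_samples = err @ gene_emb / n_genes
        grad_genes = err.T @ sample_emb / n_samples
        sample_emb -= lr * grad_samples
        gene_emb -= lr * grad_genes
    return sample_emb, gene_emb, losses


# Synthetic low-rank "expression" matrix: 20 samples by 30 genes.
rng = np.random.default_rng(1)
expr = rng.standard_normal((20, 2)) @ rng.standard_normal((30, 2)).T
sample_emb, gene_emb, losses = fit_bilinear_embeddings(expr)
print(losses[0], losses[-1])
```

Both embedding tables are updated in the same loop from the same reconstruction error, which is the sense in which the two representation spaces are learned simultaneously in this sketch.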
By virtue of their expressive power, neural networks (NNs) are well suited to fitting large, complex datasets, yet they are also known to produce similar predictions for points outside the training distribution.
As such, they are, like humans, under the influence of the Black Swan theory: models tend to be extremely "surprised" by rare events, leading to potentially disastrous consequences, while justifying these same events in hindsight.
To avoid this pitfall, we introduce DENN, an ensemble approach building a set of Diversely Extrapolated Neural Networks that fits the training data and is able to generalize more diversely when extrapolating to novel data points.
This leads DENN to output highly uncertain predictions for unexpected inputs.
We achieve this by adding a diversity term in the loss function used to train the model, computed at specific inputs.
We first illustrate the usefulness of the method on a low-dimensional regression problem.
Then, we show how the loss can be adapted to tackle anomaly detection during classification, as well as safe imitation learning problems.
2020-06-30
International Joint Conference on Artificial Intelligence (published)
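The idea of adding a diversity term to an ensemble's training loss can be sketched as follows. This is only one possible form, assumed for illustration: a mean-squared-error fit term on the training data minus a weighted variance of the ensemble's predictions at chosen probe inputs. The abstract does not give the exact DENN loss, so the function name, the variance-based term, and the `weight` parameter are hypothetical.

```python
import numpy as np


def diversity_regularized_loss(train_preds, train_targets, probe_preds, weight=0.1):
    """Hypothetical ensemble loss: fit the training data while rewarding
    disagreement between ensemble members at probe inputs.

    train_preds: array (n_members, n_train), predictions on training inputs
    train_targets: array (n_train,), regression targets
    probe_preds: array (n_members, n_probe), predictions at probe inputs
    """
    fit = np.mean((train_preds - train_targets[None, :]) ** 2)
    # Variance across members, averaged over probe inputs.
    diversity = np.mean(np.var(probe_preds, axis=0))
    return fit - weight * diversity


train_preds = np.array([[1.0, 2.0], [1.0, 2.0]])
targets = np.array([1.0, 2.0])
agreeing = np.array([[0.0, 0.0], [0.0, 0.0]])
disagreeing = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(diversity_regularized_loss(train_preds, targets, agreeing))
print(diversity_regularized_loss(train_preds, targets, disagreeing))
```

With identical training fit, the ensemble whose members disagree at the probe inputs receives the lower loss, so minimizing it pushes members toward diverse extrapolation away from the data.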
Existing machine reading comprehension (MRC) models do not scale effectively to real-world applications like web-level information retrieval and question answering (QA). We argue that this stems from the nature of MRC datasets: most of these are static environments wherein the supporting documents and all necessary information are fully observed. In this paper, we propose a simple method that reframes existing MRC datasets as interactive, partially observable environments. Specifically, we “occlude” the majority of a document’s text and add context-sensitive commands that reveal “glimpses” of the hidden text to a model. We repurpose SQuAD and NewsQA as an initial case study, and then show how the interactive corpora can be used to train a model that seeks relevant information through sequential decision making. We believe that this setting can contribute in scaling models to web-level QA scenarios.
2020-06-30
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (published)
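The occlusion idea in the abstract above can be pictured as a small environment in which an agent sees one sentence at a time and issues commands to reveal others. The sketch below is an assumption-laden toy: the class name, the sentence-level granularity, and the command names `next`, `previous`, and `find <word>` are hypothetical and are not taken from the abstract.

```python
class OccludedDocument:
    """Toy partially observable document: only one sentence is visible."""

    def __init__(self, sentences):
        self.sentences = list(sentences)
        self.position = 0  # index of the currently visible sentence

    def observe(self):
        """Return the single sentence the agent is allowed to see."""
        return self.sentences[self.position]

    def step(self, command):
        """Apply a command and return the newly visible sentence."""
        verb, _, arg = command.partition(" ")
        last = len(self.sentences) - 1
        if verb == "next":
            self.position = min(self.position + 1, last)
        elif verb == "previous":
            self.position = max(self.position - 1, 0)
        elif verb == "find" and arg:
            # Jump to the next sentence containing the word, wrapping around.
            count = len(self.sentences)
            for offset in range(1, count + 1):
                index = (self.position + offset) % count
                if arg.lower() in self.sentences[index].lower():
                    self.position = index
                    break
        else:
            raise ValueError(f"unknown command: {command!r}")
        return self.observe()


doc = OccludedDocument(["The cat sat.", "The dog barked loudly.", "Then the cat slept."])
print(doc.observe())
print(doc.step("next"))
print(doc.step("find cat"))
```

An agent trained in such an environment has to decide which command to issue at each step, which is what turns a static reading-comprehension dataset into a sequential decision-making problem.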
When an agent has limited information on its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.
2020-06-30
International Joint Conference on Artificial Intelligence (published)
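The two-term decomposition described in the abstract above can be written as a telescoping identity. The notation here is assumed for illustration and may differ from the paper's: $\pi^{*}$ is an optimal policy with full information, $\pi_{\infty}$ is the policy the algorithm would return with unlimited data under the chosen state representation, $\pi_{D}$ is the policy returned from a finite dataset $D$, and $V^{\pi}$ is the expected return of policy $\pi$.

```latex
\[
\underbrace{V^{\pi^{*}} - V^{\pi_{D}}}_{\text{suboptimality}}
= \underbrace{V^{\pi^{*}} - V^{\pi_{\infty}}}_{\text{asymptotic bias}}
+ \underbrace{V^{\pi_{\infty}} - V^{\pi_{D}}}_{\text{overfitting}}
\]
```

Read this way, a coarser state representation can only widen the first gap, since $\pi_{\infty}$ has less information to act on, while it tends to shrink the second, since fewer distinctions must be estimated from the same data.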
Visual referring expression recognition is a challenging task that requires natural language understanding in the context of an image. We critically examine RefCOCOg, a standard benchmark for this task, using a human study and show that 83.7% of test instances do not require reasoning on linguistic structure, i.e., words are enough to identify the target object, the word order doesn’t matter. To measure the true progress of existing models, we split the test set into two sets, one which requires reasoning on linguistic structure and the other which doesn’t. Additionally, we create an out-of-distribution dataset Ref-Adv by asking crowdworkers to perturb in-domain examples such that the target object changes. Using these datasets, we empirically show that existing methods fail to exploit linguistic structure and are 12% to 23% lower in performance than the established progress for this task. We also propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT, the current state-of-the-art model for this task. Our datasets are publicly available at https://github.com/aws/aws-refcocog-adv.
2020-06-30
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (published)
The manner through which individual differences in brain network organization track population-level behavioral variability is a fundamental question in systems neuroscience. Recent work suggests that resting-state and task-state functional connectivity can predict specific traits at the individual level. However, the focus of most studies on single behavioral traits has come at the expense of capturing broader relationships across behaviors. Here, we utilized a large-scale dataset of 1858 typically developing children to estimate whole-brain functional network organization that is predictive of individual differences in cognition, impulsivity-related personality, and mental health during rest and task states. Predictive network features were distinct across the broad behavioral domains: cognition, personality and mental health. On the other hand, traits within each behavioral domain were predicted by highly similar network features. This is surprising given decades of research emphasizing that distinct brain networks support different mental processes. Although tasks are known to modulate the functional connectome, we found that predictive network features were similar between resting and task states. Overall, our findings reveal shared brain network features that account for individual variation within broad domains of behavior in childhood, yet are unique to different behavioral domains.
Domain adaptation (DA) is a technique that transfers predictive models trained on a labeled source domain to an unlabeled target domain, with the core difficulty of resolving distributional shift between domains. Currently, most popular DA algorithms are based on distributional matching (DM). However in practice, realistic domain shifts (RDS) may violate their basic assumptions and as a result these methods will fail. In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM based methods, and then build new benchmarks with more realistic domain shifts to evaluate the well-accepted DM methods. We further propose InstaPBM, a novel Instance-based Predictive Behavior Matching method for robust DA. Extensive experiments on both conventional and RDS benchmarks demonstrate both the limitations of DM methods and the efficacy of InstaPBM: Compared with the best baselines, InstaPBM improves the classification accuracy respectively by