NLP in the era of generative AI, cognitive sciences, and societal transformation
Join us at Mila in October for a three-day workshop to explore the transformative potential of language technologies and their implications for society.
This program is designed to provide decision-makers, policymakers and professionals working in policy with a foundational understanding of AI technology.
Publications
MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining
Multi-document summarization is a challenging task for which there exist few large-scale datasets. We propose Multi-XScience, a large-scale multi-document summarization dataset created from scientific articles. Multi-XScience introduces a challenging multi-document summarization task: writing the related-work section of a paper based on its abstract and the articles it references. Our work is inspired by extreme summarization, a dataset construction protocol that favours abstractive modeling approaches. Descriptive statistics and empirical results, using several state-of-the-art models trained on the Multi-XScience dataset, reveal that Multi-XScience is well suited for abstractive models.
2020-11-01
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)
Inferring missing facts in temporal knowledge graphs (TKGs) is a fundamental and challenging task. Previous works have approached this problem by augmenting methods for static knowledge graphs to leverage time-dependent representations. However, these methods do not explicitly leverage multi-hop structural information and temporal facts from recent time steps to enhance their predictions. Additionally, prior work does not explicitly address the temporal sparsity and variability of entity distributions in TKGs. We propose the Temporal Message Passing (TeMP) framework to address these challenges by combining graph neural networks, temporal dynamics models, data imputation and frequency-based gating techniques. Experiments on standard TKG tasks show that our approach provides substantial gains compared to the previous state of the art, achieving a 10.7% average relative improvement in Hits@10 across three standard benchmarks. Our analysis also reveals important sources of variability both within and across TKG datasets, and we introduce several simple but strong baselines that outperform the prior state of the art in certain settings.
2020-11-01
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (published)
Human-written texts contain frequent generalizations and semantic aggregation of content. In a document, they may refer to a pair of named entities such as ‘London’ and ‘Paris’ with different expressions: “the major cities”, “the capital cities” and “two European cities”. Yet generation systems, especially abstractive summarization systems, have so far focused heavily on paraphrasing and simplifying the source content, to the exclusion of such semantic abstraction capabilities. In this paper, we present a new dataset and task aimed at the semantic aggregation of entities. TESA contains a dataset of 5.3K crowd-sourced entity aggregations of Person, Organization, and Location named entities. The aggregations are document-appropriate, meaning that they are produced by annotators to match the situational context of a given news article from the New York Times. We then build baseline models for generating aggregations given a tuple of entities and document context. We finetune on TESA an encoder-decoder language model and compare it with simpler classification methods based on linguistically informed features. Our quantitative and qualitative evaluations show reasonable performance in making a choice from a given list of expressions, but free-form expressions are understandably harder to generate and evaluate.
2020-11-01
Conference on Empirical Methods in Natural Language Processing (published)
Domain modelling transforms domain problem descriptions written in natural language (NL) into analyzable and concise domain models (class diagrams) during requirements analysis or the early stages of design in software development. Since the practice of domain modelling requires time in addition to modelling skills and experience, several approaches have been proposed to automate or semi-automate the construction of domain models from problem descriptions expressed in NL. Despite the existing work on domain model extraction, some significant challenges remain unaddressed: (i) the extracted domain models are not accurate enough to be used directly or with minor modifications in software development, (ii) existing approaches do not facilitate the tracing of the rationale behind the modelling decisions taken by the model extractor, and (iii) existing approaches do not provide interactive interfaces to update the extracted domain models. Therefore, in this paper, we introduce a domain modelling bot called DoMoBOT, explain its architecture, and implement it in the form of a web-based prototype tool. The bot automatically extracts a domain model from a problem description written in NL with an accuracy higher than existing approaches. Furthermore, the bot enables modellers to update a part of the extracted domain model and in response the bot re-configures the other parts of the domain model pro-actively. To improve the accuracy of extracted domain models, we combine the techniques of Natural Language Processing and Machine Learning. Finally, we evaluate the accuracy of the extracted domain models.
2020-10-26
Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings (published)
Background: The Canadian epidemics of COVID-19 exhibit distinct early trajectories, with Quebec bearing a very high initial burden. The semaine de relâche, or March break, took place two weeks earlier in Quebec as compared to the rest of Canada. This event may have played a role in the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We aimed to examine the role of case importation in the early transmission dynamics of SARS-CoV-2 in Quebec. Methods: Using detailed surveillance data, we developed and calibrated a deterministic SEIR-type compartmental model of SARS-CoV-2 transmission. We explored the impact of altering the number of imported cases on hospitalizations. Specifically, we investigated scenarios without case importation after March break, as well as scenarios where cases were imported with the same frequency/timing as neighboring Ontario. Results: A total of 1,544 and 1,150 returning travelers were laboratory-confirmed in Quebec and Ontario, respectively (with symptom onset before 2020-03-25). The cumulative number of hospitalizations could have been reduced by 55% (95% credible interval [95%CrI]: 51-59%) had no cases been imported after Quebec's March break. However, had Quebec experienced Ontario's number of imported cases, cumulative hospitalizations would have only been reduced by 12% (95%CrI: 8-16%). Interpretation: Our results suggest that case importation played an important role in the early spread of COVID-19 in Quebec. Yet, heavy importation of SARS-CoV-2 in early March could be insufficient to resolve interprovincial heterogeneities in cumulative hospitalizations. The importance of other factors (public health preparedness, responses, and capacity) should be investigated.
2020-10-25
International Journal of Infectious Diseases (published)
The role of case importation in explaining differences in early SARS-CoV-2 transmission dynamics in Canada—A mathematical modeling study of surveillance data
Syntax is fundamental to our thinking about language. Although neural networks are very successful in many tasks, they do not explicitly model syntactic structure. Failing to capture the structure of inputs could lead to generalization problems and over-parametrization. In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM). The model explicitly models the structure with a one-step look-ahead parser and maintains the conditional probability setting of the standard language model. Experiments show that SOM can achieve strong results in language modeling and syntactic generalization tests, while using fewer parameters than other models.
Modeling joint probability distributions over sequences has been studied from many perspectives. The physics community developed matrix product states, a tensor-train decomposition for probabilistic modeling, motivated by the need to tractably model many-body systems. But similar models have also been studied in the stochastic processes and weighted automata literature, with little work on how these bodies of work relate to each other. We address this gap by showing how stationary or uniform versions of popular quantum tensor network models have equivalent representations in the stochastic processes and weighted automata literature, in the limit of infinitely long sequences. We demonstrate several equivalence results between models used in these three communities: (i) uniform variants of matrix product states, Born machines and locally purified states from the quantum tensor networks literature, (ii) predictive state representations, hidden Markov models, norm-observable operator models and hidden quantum Markov models from the stochastic process literature, and (iii) stochastic weighted automata, probabilistic automata and quadratic automata from the formal languages literature. Such connections may open the door for results and methods developed in one area to be applied in another.