Publications

Horizontal gene transfer and recombination analysis of SARS-CoV-2 genes helps discover its close relatives and shed light on its origin

Pierre Legendre

The SARS-CoV-2 pandemic is one of the greatest global medical and social challenges that have emerged in recent history. Human coronavirus s… (see more)trains discovered during previous SARS outbreaks have been hypothesized to pass from bats to humans using intermediate hosts, e.g. civets for SARS-CoV and camels for MERS-CoV. The discovery of an intermediate host of SARS-CoV-2 and the identification of specific mechanism of its emergence in humans are topics of primary evolutionary importance. In this study we investigate the evolutionary patterns of 11 main genes of SARS-CoV-2. Previous studies suggested that the genome of SARS-CoV-2 is highly similar to the horseshoe bat coronavirus RaTG13 for most of the genes and to some Malayan pangolin coronavirus (CoV) strains for the receptor binding (RB) domain of the spike protein. We provide a detailed list of statistically significant horizontal gene transfer and recombination events (both intergenic and intragenic) inferred for each of 11 main genes of the SARS-CoV-2 genome. Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs. Statistically significant gene transfer-recombination events between RaTG13 and GD Pangolin CoV have been identified in region [1215–1425] of gene S and region [534–727] of gene N. Moreover, some statistically significant recombination events between the ancestors of SARS-CoV-2, RaTG13, GD Pangolin CoV and bat CoV ZC45-ZXC21 coronaviruses have been identified in genes ORF1ab, S, ORF3a, ORF7a, ORF8 and N. Furthermore, topology-based clustering of gene trees inferred for 25 CoV organisms revealed a three-way evolution of coronavirus genes, with gene phylogenies of ORF1ab, S and N forming the first cluster, gene phylogenies of ORF3a, E, M, ORF6, ORF7a, ORF7b and ORF8 forming the second cluster, and phylogeny of gene ORF10 forming the third cluster. The results of our horizontal gene transfer and recombination analysis suggest that SARS-CoV-2 could not only be a chimera virus resulting from recombination of the bat RaTG13 and Guangdong pangolin coronaviruses but also a close relative of the bat CoV ZC45 and ZXC21 strains. They also indicate that a GD pangolin may be an intermediate host of this dangerous virus.

2020-12-03

BMC Ecology and Evolution (published)

doi.org

An Analysis of Dataset Overlap on Winograd-Style Tasks

Ali Emami

Adam Trischler

Kaheer Suleman

Jackie Cheung

The Winograd Schema Challenge (WSC) and variants inspired by it have become important benchmarks for common-sense reasoning (CSR). Model per… (see more)formance on the WSC has quickly progressed from chance-level to near-human using neural language models trained on massive corpora. In this paper, we analyze the effects of varying degrees of overlaps that occur between these corpora and the test instances in WSC-style tasks. We find that a large number of test instances overlap considerably with the pretraining corpora on which state-of-the-art models are trained, and that a significant drop in classification accuracy occurs when models are evaluated on instances with minimal overlap. Based on these results, we provide the WSC-Web dataset, consisting of over 60k pronoun disambiguation problems scraped from web data, being both the largest corpus to date, and having a significantly lower proportion of overlaps with current pretraining corpora.

2020-12-01

Proceedings of the 28th International Conference on Computational Linguistics (published)

doi.org

arxiv.org

Autonomous navigation of stratospheric balloons using reinforcement learning

Marc Gendron-Bellemare

S. Candido

Pablo Samuel Castro

Jun Gong

Marlos C. Machado

Subhodeep Moitra

Sameera S. Ponda

Ziyun Wang

2020-12-01

Nature (published)

doi.org

Learning Efficient Task-Specific Meta-Embeddings with Word Prisms

Jingyi He

Kc Tsiolis

Kian Kenyon-Dean

Jackie Cheung

Word embeddings are trained to predict word cooccurrence statistics, which leads them to possess different lexical properties (syntactic, se… (see more)mantic, etc.) depending on the notion of context defined at training time. These properties manifest when querying the embedding space for the most similar vectors, and when used at the input layer of deep neural networks trained to solve downstream NLP problems. Meta-embeddings combine multiple sets of differently trained word embeddings, and have been shown to successfully improve intrinsic and extrinsic performance over equivalent models which use just one set of source embeddings. We introduce word prisms: a simple and efficient meta-embedding method that learns to combine source embeddings according to the task at hand. Word prisms learn orthogonal transformations to linearly combine the input source embeddings, which allows them to be very efficient at inference time. We evaluate word prisms in comparison to other meta-embedding methods on six extrinsic evaluations and observe that word prisms offer improvements in performance on all tasks.

2020-12-01

Proceedings of the 28th International Conference on Computational Linguistics (published)

doi.org

arxiv.org

Learning Lexical Subspaces in a Distributional Vector Space

Kushal Arora

Aishik Chakraborty

Jackie Cheung

Abstract In this paper, we propose LexSub, a novel approach towards unifying lexical and distributional semantics. We inject knowledge about… (see more) lexical-semantic relations into distributional word embeddings by defining subspaces of the distributional vector space in which a lexical relation should hold. Our framework can handle symmetric attract and repel relations (e.g., synonymy and antonymy, respectively), as well as asymmetric relations (e.g., hypernymy and meronomy). In a suite of intrinsic benchmarks, we show that our model outperforms previous approaches on relatedness tasks and on hypernymy classification and detection, while being competitive on word similarity tasks. It also outperforms previous systems on extrinsic classification tasks that benefit from exploiting lexical relational cues. We perform a series of analyses to understand the behaviors of our model.1 Code available at https://github.com/aishikchakraborty/LexSub.

2020-12-01

Transactions of the Association for Computational Linguistics (published)

doi.org

Recommandations pratiques pour une utilisation responsable de l’intelligence artificielle en santé mentale en contexte de pandémie

Carl-Maria Mörch

Pascale Lehoux

Marc-Antoine Dilhac

Catherine Régis

Xavier Dionne

La pandémie actuelle a provoqué une onde de choc dont les conséquences se font sentir dans tous les aspects de notre vie. Alors que la sa… (see more)nté physique a été généralement au cœur de l’attention scientifique et politique, il est devenu clair que la pandémie de COVID-19 a influé significativement sur la santé mentale de nombreux individus. Plus encore, elle aurait accentué les fragilités déjà existantes dans nos systèmes de santé mentale. Souvent moins financé ou soutenu que la santé physique, le domaine de la santé mentale pourrait-il bénéficier d’innovations en intelligence artificielle en période de pandémie ? Et si oui comment ? Que vous soyez développeur.e.s en IA, chercheur.e.s ou entrepreneur.e.s, ce document vise à vous fournir une synthèse des pistes d’actions et des ressources pour prévenir les principaux risques éthiques liés au développement d’applications d’IA dans le champ de la santé mentale. Pour illustrer ces principes, ce document propose de découvrir quatre cas fictif, à visée réaliste, à partir desquels il vous sera proposé de porter attention aux enjeux éthiques potentiels dans cette situation, aux enjeux d’innovation responsable à envisager, aux pistes d’action possibles inspirées de la liste de contrôle (Protocole Canadien conçu pour favoriser une utilisation responsable de l’IA en santé mentale et prévention du suicide, Mörch et al., 2020), aux ressources pratiques et à certains enjeux juridiques pertinents. Ce document a été élaboré par Carl-Maria Mörch, PhD, Algora Lab, Université de Montréal, Observatoire International sur les impacts sociétaux de l’Intelligence Artificielle et du Numérique (OBVIA), Mila – Institut Québécois d’Intelligence Artificielle, avec les contributions de Pascale Lehoux, Marc-Antoine Dilhac, Catherine Régis et Xavier Dionne.

2020-12-01

(published)

doi.org

The default network of the human brain is associated with perceived social isolation

R. Nathan Spreng

Emile Dimas

Laetitia Mwilambwe-Tshilobo

Alain Dagher

Philipp Koellinger

Gideon Nave

Anthony Ong

Julius M Kernbach

Thomas V. Wiecki

Tian Ge

Yue Li

Avram J. Holmes

B.T. Thomas Yeo

Gary R. Turner

Robin I. M. Dunbar

Danilo Bzdok

2020-12-01

Nature Communications (published)

doi.org

Inductive biases for deep learning of higher-level cognition

Anirudh Goyal

Yoshua Bengio

A fascinating hypothesis is that human and animal intelligence could be explained by a few principles (rather than an encyclopaedic list of … (see more)heuristics). If that hypothesis was correct, we could more easily both understand our own intelligence and build intelligent machines. Just like in physics, the principles themselves would not be sufficient to predict the behaviour of complex systems like brains, and substantial computation might be needed to simulate human-like intelligence. This hypothesis would suggest that studying the kind of inductive biases that humans and animals exploit could help both clarify these principles and provide inspiration for AI research and neuroscience theories. Deep learning already exploits several key inductive biases, and this work considers a larger list, focusing on those which concern mostly higher-level and sequential conscious processing. The objective of clarifying these particular principles is that they could potentially help us build AI systems benefiting from humans’ abilities in terms of flexible out-of-distribution and systematic generalization, which is currently an area where a large gap exists between state-of-the-art machine learning and human intelligence.

2020-11-30

ArXiv (preprint)

doi.org

arxiv.org

#EEGManyLabs: Investigating the replicability of influential EEG experiments

Yuri G Pavlov

N. Adamian

Stefan Appelhoff

Mahnaz Arvaneh

C. Benwell

Christian Beste

A. Bland

Daniel E. Bradford

Florian Bublatzky

Niko A. Busch

Peter E. Clayson

Damian Cruse

Artur Czeszumski

Anna Dreber

Guillaume Dumas

Benedikt V. Ehinger

Giorgio Ganis

Xun He

J. Hinojosa

Christoph Huber-Huber … (see 39 more)

Michael Inzlicht

B. Jack

Magnus Johannesson

Rhiannon Jones

Evgenii Kalenkovich

Laura Kaltwasser

Hamid Karimi-rouzbahani

And Andreas Keil

P. König

Layla Kouara

Louisa V. Kulke

C. Ladouceur

Nicolas Langer

Heinrich R. Liesefeld

David Luque

Annmarie MacNamara

Liad Mudrik

Muthuraman Muthuraman

Lauren Browning Neal

Gustav Nilsonne

Guiomar Niso

Sebastian Ocklenburg

Robert Oostenveld

Cyril R. Pernet

G. Pourtois

Manuela Ruzzoli

S. Sass

Alexandre Schaefer

Magdalena Senderecka

Joel S. Snyder

Christian Krog Tamnes

E Tognoli

M. V. Vugt

Edelyn Verona

Robin Vloeberghs

Dominik Welke

J. Wessel

Ilya V Zakharov

Faisal Mushtaq

2020-11-27

Cortex (published)

doi.org

Human attachments shape interbrain synchrony toward efficient performance of social goals

Amir Djalovski

Guillaume Dumas

Sivan Kinreich

Ruth Pinkenson Feldman

2020-11-26

NeuroImage (published)

doi.org

Interactive Psychometrics for Autism With the Human Dynamic Clamp: Interpersonal Synchrony From Sensorimotor to Sociocognitive Domains

Florence Baillin

Aline Lefebvre

Amandine Pedoux

Yann Beauxis

Denis-Alexander Engemann

Anna Maruani

Frederique Amsellem

J. A. Scott Kelso

Thomas Bourgeron

Richard Delorme

Guillaume Dumas

The human dynamic clamp (HDC) is a human–machine interface designed on the basis of coordination dynamics for studying realistic social in… (see more)teraction under controlled and reproducible conditions. Here, we propose to probe the validity of the HDC as a psychometric instrument for quantifying social abilities in children with autism spectrum disorder (ASD) and neurotypical development. To study interpersonal synchrony with the HDC, we derived five standardized scores following a gradient from sensorimotor and motor to higher sociocognitive skills in a sample of 155 individuals (113 participants with ASD, 42 typically developing participants; aged 5 to 25 years; IQ > 70). Regression analyses were performed using normative modeling on global scores according to four subconditions (HDC behavior “cooperative/competitive,” human task “in-phase/anti-phase,” diagnosis, and age at inclusion). Children with ASD had lower scores than controls for motor skills. HDC motor coordination scores were the best candidates for stratification and diagnostic biomarkers according to exploratory analyses of hierarchical clustering and multivariate classification. Independently of phenotype, sociocognitive skills increased with developmental age while being affected by the ongoing task and HDC behavior. Weaker performance in ASD for motor skills suggests the convergent validity of the HDC for evaluating social interaction. Results provided additional evidence of a relationship between sensorimotor and sociocognitive skills. HDC may also be used as a marker of maturation of sociocognitive skills during real-time social interaction. Through its standardized and objective evaluation, the HDC not only represents a valid paradigm for the study of interpersonal synchrony but also offers a promising, clinically relevant psychometric instrument for the evaluation and stratification of sociomotor dysfunctions.

2020-11-26

Frontiers in Psychiatry (published)

doi.org

RetroGNN: Approximating Retrosynthesis by Graph Neural Networks for De Novo Drug Design

Cheng-Hao Liu

Maksym Korablyov

Stanisław Jastrzębski

Paweł Włodarczyk-Pruszyński

Yoshua Bengio

Marwin Segler

De novo molecule generation often results in chemically unfeasible molecules. A natural idea to mitigate this problem is to bias the search … (see more)process towards more easily synthesizable molecules using a proxy for synthetic accessibility. However, using currently available proxies still results in highly unrealistic compounds. We investigate the feasibility of training deep graph neural networks to approximate the outputs of a retrosynthesis planning software, and their use to bias the search process. We evaluate our method on a benchmark involving searching for drug-like molecules with antibiotic properties. Compared to enumerating over five million existing molecules from the ZINC database, our approach finds molecules predicted to be more likely to be antibiotics while maintaining good drug-like properties and being easily synthesizable. Importantly, our deep neural network can successfully filter out hard to synthesize molecules while achieving a

2020-11-25

ArXiv (preprint)

arxiv.org

AI Advantage

Mila AI Policy Fellowship

Strategic Priorities

AI Advantage

Mila AI Policy Fellowship

Publications

AI Advantage

Mila AI Policy Fellowship

Strategic Priorities

AI Advantage

Mila AI Policy Fellowship

Popular keywords:

Publications