Publications

Genetic Interplay Between White Matter Hyperintensities and Alzheimer's Disease: A Brain-Body Perspective
Manpreet Singh
Kimia Shafighi
Flavie E. Detcheverry
Fanta Dabo
Ikrame Housni
Sridar Narayanan
Sarah A. Gagliano Taliun
AmanPreet Badhwar
MRI-detected white matter hyperintensities (WMH) are often recognized as markers of cerebrovascular abnormalities and an index of vascular brain injury, and are frequently present in individuals with Alzheimer’s disease (AD). Given the emerging bidirectional communication along the brain-body axis in both WMHs and AD, it is important to understand their genetic underpinnings across the whole body. However, literature on this is scarce. We investigated the brain-body axis by breaking down heritability estimates of these phenotypes across the whole body, i.e., partitioning heritability. Our aims were to identify genetic underpinnings specific to WMHs, and common between WMHs and AD, by assessing (a) the partitioned heritability of WMHs and AD across the brain-body axis with tissue-specific annotations, (b) the partitioned heritability of WMHs and AD across the brain-body axis with cell-specific annotations, and (c) the genes associated with WMHs and AD, and verifying their expression levels across the whole body. Our tissue-specific analysis revealed that WMH-associated SNPs were significantly enriched in tissues beyond the brain, namely liver, cardiovascular, and kidney, with liver being a common tissue enriched for both WMHs and AD. Our cell-specific analysis showed enrichment of vascular endothelial cells across the tissue types enriched for WMHs, highlighting their central role in the development of WMHs. Additionally, our gene-level analysis highlighted overlapping patterns of tissue enrichment for both WMHs and AD, and showed interactions between WMH- and AD-associated genes. Our findings provide new insights into the systemic influences potentially contributing to WMH pathology, in particular, multi-system endothelial disorder. We hope that our multisystemic genetic findings will stimulate future WMH research into specific pathways across the brain-body axis.
Refining SARS-CoV-2 intra-host variation by leveraging large-scale sequencing data
Jean-Christophe Grenier
Raphaël Poujol
Understanding viral genome evolution during host infection is crucial for grasping viral diversity and evolution. Analyzing intra-host single nucleotide variants (iSNVs) offers insights into new lineage emergence, which is important for predicting and mitigating future viral threats. Despite next-generation sequencing’s potential, challenges persist, notably sequencing artifacts leading to false iSNVs. We developed a workflow to enhance iSNV detection in large NGS libraries, using over 130 000 SARS-CoV-2 libraries to distinguish mutations from errors. Our approach integrates bioinformatics protocols, stringent quality control, and dimensionality reduction to tackle batch effects and improve mutation detection reliability. Additionally, we pioneer the application of the PHATE visualization approach to genomic data and introduce a methodology that quantifies how related groups of data points are represented within a two-dimensional space, enhancing clustering structure explanation based on genetic similarities. This workflow advances accurate intra-host mutation detection, facilitating a deeper understanding of viral diversity and evolution.
Longitudinal bi-criteria framework for assessing national healthcare responses to pandemic outbreaks
Adel Guitouni
Nabil Belacel
Belaid Moa
Munire Erman
Halim Abdul
Replication of a GWAS signal near HLA-DQA2 with acute myeloid leukemia using a disease-only cohort and external population-based controls
Rose Laflamme
Véronique Lisi
Josée Hébert
Guy Sauvageau
Vincent-Philippe Lavallee
Guillaume Lettre
Acute myeloid leukemia (AML) is the most common type of acute leukemia in adults. Its risk factors include rare and highly penetrant somatic mutations. Genome-wide association studies (GWAS) have also identified four common inherited variants associated with AML risk, but these findings have not yet been confirmed in many independent datasets. Here, we performed a replication study with 567 AML cases from the Leucegene cohort and 1,865 controls from the population-based cohort CARTaGENE (CaG). Because genotypes were generated using different technologies in the two datasets (e.g. low- vs. high-coverage whole-genome sequencing), we applied stringent quality-control filters to minimize type I errors. We showed using data reduction methods (e.g. principal component analysis [PCA] and uniform manifold approximation and projection [UMAP]) that our approach successfully integrated the Leucegene and CaG genetic data. We replicated the association between cytogenetically normal (CN)-AML and rs3916765, a variant located near HLA-DQA2 (odds ratio [95% confidence interval] = 1.88 [1.21-2.93], P-value = 0.005). The effect size of this association was stronger when we restricted the analyses to AML patients with NPM1 mutations (odds ratios >2.35). We found HLA-DOB to be the most significantly upregulated gene in Leucegene participants with the CN-AML protective A-allele at rs3916765. We further found that several HLA class II genes are also differentially expressed, albeit at lower statistical significance. Our results confirm that a common genetic variant at the HLA locus associates with AML risk, providing new opportunities to improve disease prognosis and treatment.
CALE: Continuous Arcade Learning Environment
We introduce the Continuous Arcade Learning Environment (CALE), an extension of the well-known Arcade Learning Environment (ALE) [Bellemare et al., 2013]. The CALE uses the same underlying emulator of the Atari 2600 gaming system (Stella), but adds support for continuous actions. This enables the benchmarking and evaluation of continuous-control agents (such as PPO [Schulman et al., 2017] and SAC [Haarnoja et al., 2018]) and value-based agents (such as DQN [Mnih et al., 2015] and Rainbow [Hessel et al., 2018]) on the same environment suite. We provide a series of open questions and research directions that CALE enables, as well as initial baseline results using Soft Actor-Critic. CALE is available as part of the ALE at https://github.com/Farama-Foundation/Arcade-Learning-Environment.
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
David LE MEUR
David Orlando Romero Mogrovejo
Chenyang Lyu
Haryo Akbarianto Wibowo
Teresa Lynn
Injy Hamed
Aditya Nanda Kishore Khandavally
Aishik Mandal
Alina Dragonetti
Artem Abzaliev
Atnafu Lambebo Tonja
Bontu Fufa Balcha
Chenxi Whitehouse
Christian Salamea-Palacios
Dan John Velasco
D. Meur
Emilio Villa Cueva
Fajri Koto
Fauzan Farooqui
Frederico Belcavello
Ganzorig Batnasan
Gisela Vallejo
Gráinne Caulfield
Guido Ivetta
Haiyue Song
Henok Biadglign Ademtew
Hernán Maina
Holy Lovenia
Israel Abebe Azime
Jan Christian Blaise Cruz
Jiahui Geng
Jesus-German Ortiz-Barajas
Jinheon Baek
Jocelyn Dunstan
Laura Alonso Alemany
Teresa Clifford
Kumaranage Ravindu Yasas Nagasinghe
Luciana Benotti
Luis Fernando D'Haro
Marcelo Viridiano
Marcos Estecha-Garitagoitia
Maria Camila Buitrago Cabrera
Mario Rodríguez-Cantelar
Mélanie Jouitteau
Mihail Minkov Mihaylov
Mohamed Fazli Mohamed Imam
Muhammad Farid Adilazuarda
Munkhjargal Gochoo
Munkh-Erdene Otgonbold
Naome Etori
Olivier NIYOMUGISHA
Paula Mónica Silva
Pranjal A Chitale
Raj Dabre
Rendi Chevi
Ruochen Zhang
Ryandito Diandaru
Samuel Cahyawijaya
Santiago Góngora
Soyeong Jeong
Sukannya Purkayastha
Tatsuki Kuribayashi
Thanmay Jayakumar
Tiago Timponi Torrent
Toqeer Ehsan
Vladimir Araujo
Yova Kementchedjhieva
Zara Burzo
Zheng Wei Lim
Zheng Xin Yong
Oana Ignat
Joan Nwatu
Rada Mihalcea
Thamar Solorio
Alham Fikri Aji
Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection
Pierre-Andre Noel
Joao Monteiro
Improving the reliability of deployed machine learning systems often involves developing methods to detect out-of-distribution (OOD) inputs. However, existing research often narrowly focuses on samples from classes that are absent from the training set, neglecting other types of plausible distribution shifts. This limitation reduces the applicability of these methods in real-world scenarios, where systems encounter a wide variety of anomalous inputs. In this study, we categorize five distinct types of distribution shifts and critically evaluate the performance of recent OOD detection methods on each of them. We publicly release our benchmark under the name BROAD (Benchmarking Resilience Over Anomaly Diversity). Our findings reveal that while these methods excel in detecting unknown classes, their performance is inconsistent when encountering other types of distribution shifts. In other words, they only reliably detect unexpected inputs that they have been specifically designed to expect. As a first step toward broad OOD detection, we learn a generative model of existing detection scores with a Gaussian mixture. By doing so, we present an ensemble approach that offers a more consistent and comprehensive solution for broad OOD detection, demonstrating superior performance compared to existing methods. Our code to download BROAD and reproduce our experiments is publicly available.
Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
Dheeraj Vattikonda
Varun Jampani
Christopher Pal
LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation
Bowen Li
Zhaoyu Li
Qiwei Du
Jinqi Luo
Wenshan Wang
Yaqi Xie
Simon Stepputtis
Chen Wang
Katia P. Sycara
Pradeep Kumar Ravikumar
Alexander G. Gray
Sebastian Scherer
Recent years have witnessed the rapid development of Neuro-Symbolic (NeSy) AI systems, which integrate symbolic reasoning into deep neural networks. However, most of the existing benchmarks for NeSy AI fail to provide long-horizon reasoning tasks with complex multi-agent interactions. Furthermore, they are usually constrained by fixed and simplistic logical rules over limited entities, making them far from real-world complexities. To address these crucial gaps, we introduce LogiCity, the first simulator based on customizable first-order logic (FOL) for an urban-like environment with multiple dynamic agents. LogiCity models diverse urban elements using semantic and spatial concepts, such as
ReactZyme: A Benchmark for Enzyme-Reaction Prediction
Bozitao Zhong
Liang Hong
Shuangjia Zheng
Enzymes, with their specific catalyzed reactions, are necessary for all aspects of life, enabling diverse biological processes and adaptations. Predicting enzyme functions is essential for understanding biological pathways, guiding drug development, enhancing bioproduct yields, and facilitating evolutionary studies. Addressing the inherent complexities, we introduce a new approach to annotating enzymes based on their catalyzed reactions. This method provides detailed insights into specific reactions and is adaptable to newly discovered reactions, diverging from traditional classifications by protein family or expert-derived reaction classes. We employ machine learning algorithms to analyze enzyme reaction datasets, delivering a much more refined view on the functionality of enzymes. Our evaluation leverages the largest enzyme-reaction dataset to date, derived from the SwissProt and Rhea databases with entries up to January 8, 2024. We frame the enzyme-reaction prediction as a retrieval problem, aiming to rank enzymes by their catalytic ability for specific reactions. With our model, we can recruit proteins for novel reactions and predict reactions in novel proteins, facilitating enzyme discovery and function annotation (https://github.com/WillHua127/ReactZyme).
Reconstructing Spatio-Temporal Trajectories of Visual Object Memories in the Human Brain
Julia Lifanov
Benjamin J. Griffiths
Juan Linde-Domingo
Catarina S. Ferreira
Martin Wilson
Stephen D. Mayhew
Maria Wimber
RedPajama: an Open Dataset for Training Large Language Models
Maurice Weber
Daniel Y Fu
Quentin Gregory Anthony
Yonatan Oren
Shane Adams
Anton Alexandrov
Xiaozhong Lyu
Huu Nguyen
Xiaozhe Yao
Virginia Adams
Ben Athiwaratkun
Rahul Chalamala
Kezhen Chen
Max Ryabinin
Tri Dao
Percy Liang
Christopher Re
Ce Zhang