Portrait de Amin Emad

Amin Emad

Membre académique associé
Professeur agrégé, McGill University, Département de génie électrique et informatique
Sujets de recherche
Apprentissage de représentations
Apprentissage multimodal
Apprentissage profond
Apprentissage sur graphes
Biologie computationnelle
Causalité
Modèles génératifs
Modèles probabilistes
Modélisation moléculaire
Réseaux de neurones en graphes

Biographie

Amin Emad est professeur adjoint au Département de génie électrique et informatique de l'Université McGill. Il est également affilié à l'Institut du cancer Rosalind et Morris Goodman, à l'Initiative de McGill en médecine computationnelle (MiCM), au programme des Sciences quantitatives de la vie et aux Laboratoires Meakins-Christie de l'Université McGill. Il est aussi membre associé de Mila – Institut québécois d'intelligence artificielle. Avant de se joindre à McGill, il a été associé de recherche postdoctorale au centre d'excellence KnowEnG des National Institutes of Health (NIH) en informatique des mégadonnées, un centre lié au Département d'informatique et à l'Institut de biologie génomique (IGB) de l'Université de l'Illinois à Urbana-Champaign (UIUC). Il a obtenu un doctorat de l'UIUC en 2015, une maîtrise de l'Université de l'Alberta en 2009 et une licence de l'Université technologique Sharif en 2007. Ses recherches se situent à l'intersection de l'IA et de la biologie informatique.

Étudiants actuels

Doctorat - McGill
Doctorat - McGill
Maîtrise recherche - McGill
Maîtrise recherche - McGill
Doctorat - McGill
Doctorat - McGill
Doctorat - McGill
Doctorat - McGill
Doctorat - McGill

Publications

Circulating IL-17F, but not IL-17A, is elevated in severe COVID-19 and leads to an ERK1/2 and p38 MAPK-dependent increase in ICAM-1 cell surface expression and neutrophil adhesion on endothelial cells
Jérôme Bédard-Matteau
Antoine Soulé
Katelyn Yixiu Liu
Lyvia Fourcade
Douglas D. Fraser
Simon Rousseau
Severe COVID-19 is associated with neutrophilic inflammation and immunothrombosis. Several members of the IL-17 cytokine family have been as… (voir plus)sociated with neutrophilic inflammation and activation of the endothelium. Therefore, we investigated whether these cytokines were associated with COVID-19.We investigated the association between COVID-19 and circulating plasma levels of IL-17 cytokine family members in participants to the Biobanque québécoise de la COVID-19 (BQC19), a prospective observational cohort and an independent cohort from Western University (London, Ontario). We measured the in vitro impact of IL-17F on intercellular adhesion molecule 1 (ICAM-1) cell surface expression and neutrophil adhesion on endothelial cells in culture. The contribution of two Mitogen Activated Protein Kinase (MAPK) pathways was determined using small molecule inhibitors PD184352 (a MKK1/MKK2 inhibitor) and BIRB0796 (a p38 MAPK inhibitor).We found increased IL-17D and IL-17F plasma levels when comparing SARS-CoV-2-positive vs negative hospitalized participants. Moreover, increased plasma levels of IL-17D, IL-17E and IL-17F were noted when comparing severe versus mild COVID-19. IL-17F, but not IL-17A, was significantly elevated in people with COVID-19 compared to healthy controls and with more severe disease. In vitro work on endothelial cells treated with IL-17F for 24h showed an increase cell surface expression of ICAM-1 accompanied by neutrophil adhesion. The introduction of two MAPK inhibitors significantly reduced the binding of neutrophils while also reducing ICAM-1 expression at the surface level of endothelial cells, but not its intracellular expression.Overall, these results have identified an association between two cytokines of the IL-17 family (IL-17D and IL-17F) with COVID-19 and disease severity. Considering that IL-17F stimulation promotes neutrophil adhesion to the endothelium in a MAPK-dependent manner, it is attractive to speculate that this pathway may contribute to pathogenic immunothrombosis in concert with other molecular effectors.
Deciphering lineage-relevant gene regulatory networks during endoderm formation by InPheRNo-ChIP.
Chen Su
William A Pastor
Deciphering the underlying gene regulatory networks (GRNs) that govern early human embryogenesis is critical for understanding developmental… (voir plus) mechanisms yet remains challenging due to limited sample availability and the inherent complexity of the biological processes involved. To address this, we developed InPheRNo-ChIP, a computational framework that integrates multimodal data, including RNA-seq, transcription factor (TF)-specific ChIP-seq, and phenotypic labels, to reconstruct phenotype-relevant GRNs associated with endoderm development. The core of this method is a probabilistic graphical model that models the simultaneous effect of TFs on their putative target genes to influence a particular phenotypic outcome. Unlike the majority of existing GRN inference methods that are agnostic to the phenotypic outcomes, InPheRNo-ChIP directly incorporates phenotypic information during GRN inference, enabling the distinction between lineage-specific and general regulatory interactions. We integrated data from three experimental studies and applied InPheRNo-ChIP to infer the GRN governing the differentiation of human embryonic stem cells into definitive endoderm. Benchmarking against a scRNA-seq CRISPRi study demonstrated InPheRNo-ChIP's ability to identify regulatory interactions involving endoderm markers FOXA2, SMAD2, and SOX17, outperforming other methods. This highlights the importance of incorporating the phenotypic context during network inference. Furthermore, an ablation study confirms the synergistic contribution of ChIP-seq, RNA-seq, and phenotypic data, highlighting the value of multimodal integration for accurate phenotype-relevant GRN reconstruction.
A long-context RNA foundation model for predicting transcriptome architecture
Ali Saberi
Benedict Choi
Sean Wang
Aldo Hernández-Corchado
Mohsen Naghipourfar
Arsham Mikaeili Namini
Vijay Ramani
Hamed S. Najafabadi
Hani Goodarzi
Linking DNA sequence to genomic function remains one of the grand challenges in genetics and genomics. Here, we combine large-scale single-m… (voir plus)olecule transcriptome sequencing of diverse cancer cell lines with cutting-edge machine learning to build LoRNASH, an RNA foundation model that learns how the nucleotide sequence of unspliced pre-mRNA dictates transcriptome architecture—the relative abundances and molecular structures of mRNA isoforms. Owing to its use of the StripedHyena architecture, LoRNASH handles extremely long sequence inputs (∼65 kilobase pairs), allowing for quantitative, zero-shot prediction of all aspects of transcriptome architecture, including isoform abundance, isoform structure, and the impact of DNA sequence variants on transcript structure and abundance. We anticipate that our public data release and proof-of-concept model will accelerate varying aspects of RNA biotechnology. More broadly, we envision the use of LoRNASH as a foundation for fine-tuning of any transcriptome-related downstream prediction task, including cell-type specific gene expression, splicing, and general RNA processing.
INTREPPPID - An Orthologue-Informed Quintuplet Network for Cross-Species Prediction of Protein-Protein Interaction
Joseph Szymborski
An overwhelming majority of protein-protein interaction (PPI) studies are conducted in a select few model organisms largely due to constrain… (voir plus)ts in time and cost of the associated “wet lab” experiments. In silico PPI inference methods are ideal tools to overcome these limitations, but often struggle with cross-species predictions. We present INTREPPPID, a method which incorporates orthology data using a new “quintuplet” neural network, which is constructed with five parallel encoders with shared parameters. INTREPPPID incorporates both a PPI classification task and an orthologous locality task. The latter learns embeddings of orthologues that have small Euclidean distances between them and large distances between embeddings of all other proteins. INTREPPPID outperforms all other leading PPI inference methods tested on both the intra-species and cross-species tasks using strict evaluation datasets. We show that INTREPPPID’s orthologous locality loss increases performance because of the biological relevance of the orthologue data, and not due to some other specious aspect of the architecture. Finally, we introduce PPI.bio and PPI Origami, a web server interface for INTREPPPID and a software tool for creating strict evaluation datasets, respectively. Together, these two initiatives aim to make both the use and development of PPI inference tools more accessible to the community. GRAPHICAL ABSTRACT
GRouNdGAN: GRN-guided simulation of single-cell RNA-seq data using causal generative adversarial networks
Yazdan Zinati
Abdulrahman Takiddeen
We introduce GRouNdGAN, a gene regulatory network (GRN)-guided causal implicit generative model for simulating single-cell RNA-seq data, in-… (voir plus)silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on three experimental datasets, we show that our model captures non-linear TF-gene dependences and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. Despite imposing rigid causality constraints, it outperforms state-of-the-art simulators in generating realistic cells. GRouNdGAN learns meaningful causal regulatory dynamics, allowing sampling from both observational and interventional distributions. This enables it to synthesize cells under conditions that do not occur in the dataset at inference time, allowing to perform in-silico TF knockout experiments. Our results show that in-silico knockout of cell type-specific TFs significantly reduces cells of that type being generated. Interactions imposed through the GRN are emphasized in the simulated datasets, resulting in GRN inference algorithms assigning them much higher scores than interactions not imposed but of equal importance in the experimental training dataset. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest. Our results show that GRouNdGAN is a stable, realistic, and effective simulator with various applications in single-cell RNA-seq analysis.
Validation of ANG-1 and P-SEL as biomarkers of post-COVID-19 conditions using data from the Biobanque québécoise de la COVID-19 (BQC-19)
Eric Yamga
Antoine Soulé
Alain Piché
Madeleine Durand
Simon Rousseau
Interpretable deep learning architectures for improving drug response prediction performance: myth or reality?
Yihui Li
David Earl Hostallero
Motivation: Recent advances in deep learning model development have enabled more accurate prediction of drug response in cancer. However, th… (voir plus)e black-box nature of these models still remains a hurdle in their adoption for precision cancer medicine. Recent efforts have focused on making these models interpretable by incorporating signaling pathway information in model architecture. While these models improve interpretability, it is unclear whether this higher interpretability comes at the cost of less accurate predictions, or a prediction improvement can also be obtained. Results: In this study, we comprehensively and systematically assessed four state-of-the-art interpretable models developed for drug response prediction to answer this question using three pathway collections. Our results showed that models that explicitly incorporate pathway information in the form of a latent layer perform worse compared to models that incorporate this information implicitly. Moreover, in most evaluation setups the best performance is achieved using a simple black-box model. In addition, replacing the signaling pathways with randomly generated pathways shows a comparable performance for the majority of these interpretable models. Our results suggest that new interpretable models are necessary to improve the drug response prediction performance. In addition, the current study provides different baseline models and evaluation setups necessary for such new models to demonstrate their superior prediction performance. Availability and Implementation: Implementation of all methods are provided in https://github.com/Emad-COMBINE-lab/InterpretableAI_for_DRP. Generated uniform datasets are in https://zenodo.org/record/7101665#.YzS79HbMKUk. Contact: amin.emad@mcgill.ca Supplementary Information: Online-only supplementary data is available at the journal’s website.
MARSY: a multitask deep-learning framework for prediction of drug combination synergy scores
Mohamed Reda El Khili
Safyan Aman Memon
Motivation Combination therapies have emerged as a treatment strategy for cancers to reduce the probability of drug resistance and to improv… (voir plus)e outcome. Large databases curating the results of many drug screening studies on preclinical cancer cell lines have been developed, capturing the synergistic and antagonistic effects of combination of drugs in different cell lines. However, due to the high cost of drug screening experiments and the sheer size of possible drug combinations, these databases are quite sparse. This necessitates the development of transductive computational models to accurately impute these missing values. Results Here, we developed MARSY, a deep learning multi-task model that incorporates information on gene expression profile of cancer cell lines, as well as the differential expression signature induced by each drug to predict drug-pair synergy scores. By utilizing two encoders to capture the interplay between the drug-pairs, as well as the drug-pairs and cell lines, and by adding auxiliary tasks in the predictor, MARSY learns latent embeddings that improve the prediction performance compared to state-of-the-art and traditional machine learning models. Using MARSY, we then predicted the synergy scores of 133,722 new drug-pair cell line combinations, which we have made available to the community as part of this study. Moreover, we validated various insights obtained from these novel predictions using independent studies, confirming the ability of MARSY in making accurate novel predictions. Availability and Implementation An implementation of the algorithms in Python and cleaned input datasets are provided in https://github.com/Emad-COMBINE-lab/MARSY. Contact amin.emad@mcgill.ca Supplementary Information Online-only supplementary data is available at the journal’s website.
Analysis of gene expression and use of connectivity mapping to identify drugs for treatment of human glomerulopathies
Chen-Fang Chung
Joan Papillon
José R. Navarro-Betancourt
Julie Guillemette
Ameya Bhope
Andrey V. Cybulsky
Preclinical-to-clinical Anti-cancer Drug Response Prediction and Biomarker Identification Using TINDL
David Earl Hostallero
Lixuan Wei
Liewei Wang
Junmei Cairns
A circulating proteome-informed prognostic model of COVID-19 disease activity that relies on 1 routinely available clinical laboratories 2
William Ma
Antoine Soulé
Karine Tremblay
Simon Rousseau
Abstract
Poisson Group Testing: A Probabilistic Model for Boolean Compressed Sensing
Olgica Milenkovic
We introduce a novel probabilistic group testing framework, termed Poisson group testing, in which the number of defectives follows a right-… (voir plus)truncated Poisson distribution. The Poisson model has a number of new applications, including dynamic testing with diminishing relative rates of defectives. We consider both nonadaptive and semi-adaptive identification methods. For nonadaptive methods, we derive a lower bound on the number of tests required to identify the defectives with a probability of error that asymptotically converges to zero; in addition, we propose test matrix constructions for which the number of tests closely matches the lower bound. For semiadaptive methods, we describe a lower bound on the expected number of tests required to identify the defectives with zero error probability. In addition, we propose a stage-wise reconstruction algorithm for which the expected number of tests is only a constant factor away from the lower bound. The methods rely only on an estimate of the average number of defectives, rather than on the individual probabilities of subjects being defective.