Assya Trofimov

Alumni

Publications

Transposable elements regulate thymus development and function
Jean-David Larouche
Céline M Laumont
Krystel Vincent
Leslie Hesnard
Sylvie Brochu
Caroline Côté
Juliette F Humeau
Éric Bonneil
Joël Lanoix
Chantal Durette
Patrick Gendron
Jean-Philippe Laverdure
Ellen R Richie
Pierre Thibault
Claude Perreault
Transposable elements (TE) are repetitive sequences representing ∼45% of the human and mouse genomes and are highly expressed by medullary thymic epithelial cells (mTEC). In this study, we investigated the role of transposable elements (TE), which are highly expressed by medullary thymic epithelial cells (mTEC), on T-cell development in the thymus. We performed multi-omic analyses of TEs in human and mouse thymic cells to elucidate their role in T-cell development. We report that TE expression in the human thymus is high and shows extensive age- and cell lineage-related variations. TEs interact with multiple transcription factors in all cell types of the human thymus. Two cell types express particularly broad TE repertoires: mTECs and plasmacytoid dendritic cells (pDC). In mTECs, TEs interact with transcription factors essential for mTEC development and function (e.g., PAX1 and RELB) and generate MHC-I-associated peptides implicated in thymocyte education. Notably, AIRE, FEZF2, and CHD4 regulate non-redundant sets of TEs in murine mTECs. Human thymic pDCs homogeneously express large numbers of TEs that lead to the formation of dsRNA, triggering RIG-I and MDA5 signaling and explaining why thymic pDCs constitutively secrete IFN α/β. This study illustrates the diversity of interactions between TEs and the adaptive immune system. TEs are genetic parasites, and the two thymic cell types most affected by TEs (mTECs and pDCs) are essential to establishing central T-cell tolerance. Therefore, we propose that the orchestration of TE expression in thymic cells is critical to prevent autoimmunity in vertebrates.
Transposable elements regulate thymus development and function
Jean-David Larouche
Céline M. Laumont
Krystel Vincent
Leslie Hesnard
Sylvie Brochu
Caroline Côté
Juliette Humeau
Éric Bonneil
Joël Lanoix
Chantal Durette
Patrick Gendron
Jean-Philippe Laverdure
Ellen Rothman Richie
S. Lemieux
Pierre Thibault
Claude Perreault
Transposable elements (TE) are repetitive sequences representing ~45% of the human and mouse genomes and are highly expressed by medullary thymic epithelial cells (mTEC). In this study, we investigated the role of transposable elements (TE), which are highly expressed by medullary thymic epithelial cells (mTEC), on T-cell development in the thymus. We performed multi-omic analyses of TEs in human and mouse thymic cells to elucidate their role in T-cell development. We report that TE expression in the human thymus is high and shows extensive age- and cell lineage-related variations. TEs interact with multiple transcription factors in all cell types of the human thymus. Two cell types express particularly broad TE repertoires: mTECs and plasmacytoid dendritic cells (pDC). In mTECs, TEs interact with transcription factors essential for mTEC development and function (e.g., PAX1 and RELB) and generate MHC-I-associated peptides implicated in thymocyte education. Notably, AIRE, FEZF2, and CHD4 regulate non-redundant sets of TEs in murine mTECs. Human thymic pDCs homogeneously express large numbers of TEs that lead to the formation of dsRNA, triggering RIG-I and MDA5 signaling and explaining why thymic pDCs constitutively secrete IFN α/β. This study illustrates the diversity of interactions between TEs and the adaptive immune system. TEs are genetic parasites, and the two thymic cell types most affected by TEs (mTECs and pDCs) are essential to establishing central T-cell tolerance. Therefore, we propose that the orchestration of TE expression in thymic cells is critical to prevent autoimmunity in vertebrates.
Two types of human TCR differentially regulate reactivity to self and non-self antigens
Jean-David Larouche
Jonathan Séguin
Jean-Philippe Laverdure
Ann Brasey
Gregory Ehx
Denis-Claude Roy
Lambert Busque
Silvy Lachance
Claude Perreault
Based on analyses of TCR sequences from over 1,000 individuals, we report that the TCR repertoire is composed of two ontogenically and functionally distinct types of TCRs. Their production is regulated by variations in thymic output and terminal deoxynucleotidyl transferase (TDT) activity. Neonatal TCRs derived from TDT-negative progenitors persist throughout life, are highly shared among subjects, and are reported as disease-associated. Thus, 10%–30% of the most frequent cord blood TCRs are associated with common pathogens and autoantigens. TDT-dependent TCRs present distinct structural features and are less shared among subjects. TDT-dependent TCRs are produced in maximal numbers during infancy when thymic output and TDT activity reach a peak, are more abundant in subjects with AIRE mutations, and seem to play a dominant role in graft-versus-host disease. Factors decreasing thymic output (age, male sex) negatively impact TCR diversity. Males compensate for their lower repertoire diversity via hyperexpansion of selected TCR clonotypes.
Accounting for Variance in Machine Learning Benchmarks
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization, and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result: adding more sources of variation to an imperfect estimator brings it closer to the ideal estimator at a 51-fold reduction in compute cost. Building on these results, we study the error rate of detecting improvements on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.
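As a rough illustration of the idea in this abstract, the following Python sketch compares two algorithms over repeated trials in which every source of variation is re-drawn each time. It is not the authors' benchmarking code: `run_pipeline`, the baseline score of 0.80, and the noise magnitudes are hypothetical stand-ins for a real training run, and the win rate is just one possible summary statistic.

```python
import random
import statistics


def run_pipeline(algo_bonus, seed):
    """Hypothetical stand-in for one full training-and-evaluation run.

    Each source of variation gets its own random draw derived from `seed`.
    The noise scales below are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    data_noise = rng.gauss(0.0, 0.02)  # assumed effect of data sampling
    init_noise = rng.gauss(0.0, 0.01)  # assumed effect of parameter initialization
    hopt_noise = rng.gauss(0.0, 0.01)  # assumed effect of hyperparameter choice
    return 0.80 + algo_bonus + data_noise + init_noise + hopt_noise


def compare(bonus_a, bonus_b, n_trials=50, seed=0):
    """Run both algorithms n_trials times with independently drawn seeds.

    Returns (mean score of A, mean score of B, fraction of trials won by A).
    """
    seed_source = random.Random(seed)
    scores_a, scores_b = [], []
    for _ in range(n_trials):
        scores_a.append(run_pipeline(bonus_a, seed_source.randrange(2**32)))
        scores_b.append(run_pipeline(bonus_b, seed_source.randrange(2**32)))
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    return statistics.mean(scores_a), statistics.mean(scores_b), wins / n_trials


mean_a, mean_b, win_rate = compare(bonus_a=0.05, bonus_b=0.0)
print(f"A: {mean_a:.3f}  B: {mean_b:.3f}  fraction of trials won by A: {win_rate:.2f}")
```

Reporting the spread across trials, rather than a single run per algorithm, is what makes the comparison sensitive to the variance the abstract describes.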
Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition
The recent development of sequencing technologies revolutionized our understanding of the inner workings of the cell as well as the way disease is treated. A single RNA sequencing (RNA-Seq) experiment, however, measures tens of thousands of parameters simultaneously. While the results are information rich, data analysis provides a challenge. Dimensionality reduction methods help with this task by extracting patterns from the data by compressing it into compact vector representations. We present the factorized embeddings (FE) model, a self-supervised deep learning algorithm that learns simultaneously, by tensor factorization, gene and sample representation spaces. We ran the model on RNA-Seq data from two large-scale cohorts and observed that the sample representation captures information on single gene and global gene expression patterns. Moreover, we found that the gene representation space was organized such that tissue-specific genes, highly correlated genes as well as genes participating in the same GO terms were grouped. Finally, we compared the vector representation of samples learned by the FE model to other similar models on 49 regression tasks. We report that the representations trained with FE rank first or second in all of the tasks, surpassing, sometimes by a considerable margin, other representations. A toy example in the form of a Jupyter Notebook as well as the code and trained embeddings for this project can be found at: https://github.com/TrofimovAssya/FactorizedEmbeddings. Supplementary data are available at Bioinformatics online.
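To make the notion of jointly learned gene and sample representation spaces concrete, here is a minimal Python sketch. It swaps in a plain dot-product matrix factorization trained by stochastic gradient descent, which is a simplification and not the FE model from the paper or the code in the linked repository. The function names, the toy expression matrix, and all hyperparameters are assumptions made for illustration.

```python
import random


def predict(S, G, i, j):
    """Predicted expression of gene j in sample i: dot product of their embeddings."""
    return sum(s * g for s, g in zip(S[i], G[j]))


def mse(expr, S, G):
    """Mean squared reconstruction error over the whole matrix."""
    errors = [
        (predict(S, G, i, j) - expr[i][j]) ** 2
        for i in range(len(expr))
        for j in range(len(expr[0]))
    ]
    return sum(errors) / len(errors)


def fit_embeddings(expr, dim=2, lr=0.05, epochs=2000, seed=0):
    """Learn one embedding per sample (rows) and per gene (columns) by SGD."""
    rng = random.Random(seed)
    S = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in expr]
    G = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in expr[0]]
    for _ in range(epochs):
        for i in range(len(expr)):
            for j in range(len(expr[0])):
                err = predict(S, G, i, j) - expr[i][j]
                for k in range(dim):
                    s, g = S[i][k], G[j][k]
                    S[i][k] -= lr * err * g  # gradient step on the sample embedding
                    G[j][k] -= lr * err * s  # gradient step on the gene embedding
    return S, G


# Toy expression matrix (4 samples x 3 genes), made up for this example.
toy_expr = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
sample_emb, gene_emb = fit_embeddings(toy_expr)
print("reconstruction MSE:", mse(toy_expr, sample_emb, gene_emb))
```

In this toy setting, samples with identical expression profiles receive near-identical embeddings, which hints at why such spaces can group related samples or genes.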
Towards the Latent Transcriptome
In this work, we propose a method to compute continuous embeddings for k-mers from raw RNA-Seq data, in a reference-free fashion. We report that our model captures information on both DNA sequence similarity and DNA sequence abundance in the embedding latent space. We confirm the quality of these vectors by comparing them to known gene sub-structures and report that the latent space recovers exon information from raw RNA-Seq data from acute myeloid leukemia patients. Furthermore, we show that this latent space allows the detection of genomic abnormalities such as translocations as well as patient-specific mutations, making this representation space useful for both visualization and analysis.