Portrait of Vladimir Makarenkov

Vladimir Makarenkov

Affiliate Member
Full Professor, UQAM, Department of Computer Science
Research Topics
Clustering
Computational Biology
Deep Learning
Medical Machine Learning

Biography

Vladimir Makarenkov is a full professor and director of the graduate program in bioinformatics at Université du Québec à Montréal (UQAM). He holds a master's degree in applied mathematics from Lomonosov Moscow State University and a PhD in computer science and mathematics from the École des hautes études en sciences sociales (EHESS) in Paris. Before joining the computer science department at UQAM, he completed a three-year postdoctoral fellowship at the Digital Ecology Lab at Université de Montréal.

He is the author of 80 journal articles and 67 conference papers, and the recipient of the prestigious Simon Régnier Prize and the Chikio Hayashi Prize awarded by the International Society for Mathematical Classification. His research focuses on AI, bioinformatics and data mining. This encompasses the design and development of novel unsupervised and supervised machine learning methods, as well as the use of machine learning techniques, including clustering and deep learning, for the analysis of biological and biomedical data.

Makarenkov’s current research also involves the development of an automated recommendation system based on deep learning to recommend the best clustering algorithm for a given input dataset. Additionally, he is working on creating a generic machine learning model to define the concept of cluster, and on comparing various auto-encoding approaches and clustering algorithms to achieve better clustering results.

Publications

BayTTA: Uncertainty-aware medical image classification with optimized test-time augmentation using Bayesian model averaging
Zeinab Sherkatghanad
Moloud Abdar
Mohammadreza Bakhtyari
Test-time augmentation (TTA) is a well-known technique employed during the testing phase of computer vision tasks. It involves aggregating m… (see more)ultiple augmented versions of input data. Combining predictions using a simple average formulation is a common and straightforward approach after performing TTA. This paper introduces a novel framework for optimizing TTA, called BayTTA (Bayesian-based TTA), which is based on Bayesian Model Averaging (BMA). First, we generate a model list associated with different variations of the input data created through TTA. Then, we use BMA to combine model predictions weighted by their respective posterior probabilities. Such an approach allows one to take into account model uncertainty, and thus to enhance the predictive performance of the related machine learning or deep learning model. We evaluate the performance of BayTTA on various public data, including three medical image datasets comprising skin cancer, breast cancer, and chest X-ray images and two well-known gene editing datasets, CRISPOR and GUIDE-seq. Our experimental results indicate that BayTTA can be effectively integrated into state-of-the-art deep learning models used in medical image analysis as well as into some popular pre-trained CNN models such as VGG-16, MobileNetV2, DenseNet201, ResNet152V2, and InceptionRes-NetV2, leading to the enhancement in their accuracy and robustness performance.
A self-attention-based CNN-Bi-LSTM model for accurate state-of-charge estimation of lithium-ion batteries
Zeinab Sherkatghanad
Amin Ghazanfari
Assessing the emergence time of SARS-CoV-2 zoonotic spillover
Stéphane Samson
Étienne Lord
Inertia-Based Indices to Determine the Number of Clusters in K-Means: An Experimental Evaluation
Andrei Rykov
Renato Cordeiro De Amorim
Boris Mirkin
This paper gives an experimentally supported review and comparison of several indices based on the conventional K-means inertia criterion fo… (see more)r determining the number of clusters,
Cache-Efficient Dynamic Programming MDP Solver
Jaël Champagne Gareau
Guillaume Gosset
Éric Beaudry
Inferring multiple consensus trees and supertrees using clustering: a review
Gayane S. Barseghyan
Nadia Tahiri
Low-Rank Representation of Reinforcement Learning Policies
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional… (see more) embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.
Horizontal gene transfer and recombination analysis of SARS-CoV-2 genes helps discover its close relatives and shed light on its origin
Bogdan Mazoure
Pierre Legendre
The SARS-CoV-2 pandemic is one of the greatest global medical and social challenges that have emerged in recent history. Human coronavirus s… (see more)trains discovered during previous SARS outbreaks have been hypothesized to pass from bats to humans using intermediate hosts, e.g. civets for SARS-CoV and camels for MERS-CoV. The discovery of an intermediate host of SARS-CoV-2 and the identification of specific mechanism of its emergence in humans are topics of primary evolutionary importance. In this study we investigate the evolutionary patterns of 11 main genes of SARS-CoV-2. Previous studies suggested that the genome of SARS-CoV-2 is highly similar to the horseshoe bat coronavirus RaTG13 for most of the genes and to some Malayan pangolin coronavirus (CoV) strains for the receptor binding (RB) domain of the spike protein. We provide a detailed list of statistically significant horizontal gene transfer and recombination events (both intergenic and intragenic) inferred for each of 11 main genes of the SARS-CoV-2 genome. Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs. Statistically significant gene transfer-recombination events between RaTG13 and GD Pangolin CoV have been identified in region [1215–1425] of gene S and region [534–727] of gene N. Moreover, some statistically significant recombination events between the ancestors of SARS-CoV-2, RaTG13, GD Pangolin CoV and bat CoV ZC45-ZXC21 coronaviruses have been identified in genes ORF1ab, S, ORF3a, ORF7a, ORF8 and N. Furthermore, topology-based clustering of gene trees inferred for 25 CoV organisms revealed a three-way evolution of coronavirus genes, with gene phylogenies of ORF1ab, S and N forming the first cluster, gene phylogenies of ORF3a, E, M, ORF6, ORF7a, ORF7b and ORF8 forming the second cluster, and phylogeny of gene ORF10 forming the third cluster. The results of our horizontal gene transfer and recombination analysis suggest that SARS-CoV-2 could not only be a chimera virus resulting from recombination of the bat RaTG13 and Guangdong pangolin coronaviruses but also a close relative of the bat CoV ZC45 and ZXC21 strains. They also indicate that a GD pangolin may be an intermediate host of this dangerous virus.
Provably efficient reconstruction of policy networks
Recent research has shown that learning poli-cies parametrized by large neural networks can achieve significant success on challenging reinf… (see more)orcement learning problems. However, when memory is limited, it is not always possible to store such models exactly for inference, and com-pressing the policy into a compact representation might be necessary. We propose a general framework for policy representation, which reduces this problem to finding a low-dimensional embedding of a given density function in a separable inner product space. Our framework allows us to de-rive strong theoretical guarantees, controlling the error of the reconstructed policies. Such guaran-tees are typically lacking in black-box models, but are very desirable in risk-sensitive tasks. Our experimental results suggest that the reconstructed policies can use less than 10%of the number of parameters in the original networks, while incurring almost no decrease in rewards.
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces.
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional… (see more) embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly embedded in a low-dimensional space while the embedded policy incurs almost no decrease in return.