Vladimir Makarenkov

Affiliate Member
Full Professor, UQAM, Department of Computer Science
Research Topics
Medical Machine Learning
Deep Learning
Computational Biology
Clustering

Biography

Vladimir Makarenkov is a full professor and director of the graduate diploma (DESS) program in bioinformatics at the Université du Québec à Montréal (UQAM). He holds a master's degree in applied mathematics from Lomonosov Moscow State University and a PhD in computer science and mathematics from the École des hautes études en sciences sociales (EHESS). Before joining UQAM's Department of Computer Science, he completed a three-year postdoctoral fellowship at the Numerical Ecology Laboratory of the Université de Montréal, directed by Pierre Legendre. Vladimir Makarenkov has published 80 journal articles and 67 conference papers. He has also received the prestigious Simon Régnier and Chikio Hayashi awards from the International Federation of Classification Societies (IFCS).

His research focuses on artificial intelligence, bioinformatics, and data mining. It includes the design and development of new unsupervised and supervised machine learning methods, as well as the application of machine learning methods, including data clustering and deep learning, to the analysis of biological and biomedical data.

His current research also focuses on developing an automated, deep-learning-based recommender system that suggests the best clustering algorithm for a given input dataset; on developing a generic machine learning model that defines the notion of a cluster; and on comparing different autoencoding approaches and algorithms in order to obtain better data clustering.

Publications

BayTTA: Uncertainty-aware medical image classification with optimized test-time augmentation using Bayesian model averaging
Zeinab Sherkatghanad
Moloud Abdar
Mohammadreza Bakhtyari
Test-time augmentation (TTA) is a well-known technique employed during the testing phase of computer vision tasks. It involves aggregating multiple augmented versions of input data. Combining predictions using a simple average formulation is a common and straightforward approach after performing TTA. This paper introduces a novel framework for optimizing TTA, called BayTTA (Bayesian-based TTA), which is based on Bayesian Model Averaging (BMA). First, we generate a model list associated with different variations of the input data created through TTA. Then, we use BMA to combine model predictions weighted by their respective posterior probabilities. Such an approach allows one to take into account model uncertainty, and thus to enhance the predictive performance of the related machine learning or deep learning model. We evaluate the performance of BayTTA on various public data, including three medical image datasets comprising skin cancer, breast cancer, and chest X-ray images and two well-known gene editing datasets, CRISPOR and GUIDE-seq. Our experimental results indicate that BayTTA can be effectively integrated into state-of-the-art deep learning models used in medical image analysis as well as into some popular pre-trained CNN models such as VGG-16, MobileNetV2, DenseNet201, ResNet152V2, and InceptionResNetV2, leading to the enhancement in their accuracy and robustness performance.
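The contrast between simple averaging and posterior-weighted averaging described in this abstract can be illustrated with a minimal sketch. It is not the BayTTA implementation: the function names `bma_weights` and `bma_combine` are hypothetical, and the weights assume a uniform prior over the augmented views, so each posterior weight is simply proportional to the likelihood of held-out labels under that view's predictions.

```python
import numpy as np

def bma_weights(val_probs, val_labels, eps=1e-12):
    """Approximate posterior weights for M augmented views (hypothetical helper).

    val_probs: array (M, N, C) of class probabilities per view on held-out data.
    val_labels: array (N,) of integer class labels.
    Assumption: uniform prior over views, so weights are proportional to likelihood.
    """
    val_probs = np.asarray(val_probs, dtype=float)
    val_labels = np.asarray(val_labels)
    n = val_labels.shape[0]
    # Probability each view assigns to the true label of each sample: shape (M, N).
    picked = val_probs[:, np.arange(n), val_labels]
    # Log-likelihood of the held-out labels under each view.
    log_lik = np.log(picked + eps).sum(axis=1)
    # Normalize with a max-shift for numerical stability.
    w = np.exp(log_lik - log_lik.max())
    return w / w.sum()

def bma_combine(test_probs, weights):
    """Weighted average of per-view predictions: (M, N, C) -> (N, C)."""
    return np.tensordot(weights, np.asarray(test_probs, dtype=float), axes=1)
```

Setting all weights to 1/M recovers the plain TTA average, which makes the two schemes easy to compare on the same predictions.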
A self-attention-based CNN-Bi-LSTM model for accurate state-of-charge estimation of lithium-ion batteries
Zeinab Sherkatghanad
Amin Ghazanfari
Assessing the emergence time of SARS-CoV-2 zoonotic spillover
Stéphane Samson
Étienne Lord
Inertia-Based Indices to Determine the Number of Clusters in K-Means: An Experimental Evaluation
Andrei Rykov
Renato Cordeiro De Amorim
Boris Mirkin
This paper gives an experimentally supported review and comparison of several indices based on the conventional K-means inertia criterion for determining the number of clusters,
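As an illustration of what an inertia-based index looks like, the sketch below computes the K-means inertia W_k and uses it in the Calinski-Harabasz index, a well-known index of this kind. The source does not say which indices the paper compares, so this choice is an assumption, as are the function names and the deterministic farthest-first seeding.

```python
import numpy as np

def farthest_first_init(X, k):
    """Deterministic seeding (illustrative choice): start at X[0], then
    repeatedly add the point farthest from the centers chosen so far."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers, dtype=float)

def kmeans_inertia(X, k, n_iter=100):
    """Run Lloyd's algorithm and return the inertia W_k, the sum of squared
    distances from each point to its nearest center."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Keep the old center if a cluster becomes empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).sum()

def calinski_harabasz(X, k):
    """CH(k) = [(T - W_k) / (k - 1)] / [W_k / (n - k)], for 2 <= k < n."""
    n = len(X)
    total = ((X - X.mean(axis=0)) ** 2).sum()  # T: inertia with a single cluster
    w = kmeans_inertia(X, k)
    return ((total - w) / (k - 1)) / (w / (n - k))

def choose_k(X, k_max):
    """Return the k in 2..k_max with the largest CH value."""
    scores = {k: calinski_harabasz(X, k) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get)
```

Because inertia alone never increases as k grows, indices of this kind normalize or compare successive inertia values rather than minimizing W_k directly.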
Cache-Efficient Dynamic Programming MDP Solver
Jaël Champagne Gareau
Guillaume Gosset
Éric Beaudry
Inferring multiple consensus trees and supertrees using clustering: a review
Gayane S. Barseghyan
Nadia Tahiri
Low-Rank Representation of Reinforcement Learning Policies
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.
Horizontal gene transfer and recombination analysis of SARS-CoV-2 genes helps discover its close relatives and shed light on its origin
Bogdan Mazoure
Pierre Legendre
The SARS-CoV-2 pandemic is one of the greatest global medical and social challenges that have emerged in recent history. Human coronavirus strains discovered during previous SARS outbreaks have been hypothesized to pass from bats to humans using intermediate hosts, e.g. civets for SARS-CoV and camels for MERS-CoV. The discovery of an intermediate host of SARS-CoV-2 and the identification of specific mechanism of its emergence in humans are topics of primary evolutionary importance. In this study we investigate the evolutionary patterns of 11 main genes of SARS-CoV-2. Previous studies suggested that the genome of SARS-CoV-2 is highly similar to the horseshoe bat coronavirus RaTG13 for most of the genes and to some Malayan pangolin coronavirus (CoV) strains for the receptor binding (RB) domain of the spike protein. We provide a detailed list of statistically significant horizontal gene transfer and recombination events (both intergenic and intragenic) inferred for each of 11 main genes of the SARS-CoV-2 genome. Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs. Statistically significant gene transfer-recombination events between RaTG13 and GD Pangolin CoV have been identified in region [1215–1425] of gene S and region [534–727] of gene N. Moreover, some statistically significant recombination events between the ancestors of SARS-CoV-2, RaTG13, GD Pangolin CoV and bat CoV ZC45-ZXC21 coronaviruses have been identified in genes ORF1ab, S, ORF3a, ORF7a, ORF8 and N. Furthermore, topology-based clustering of gene trees inferred for 25 CoV organisms revealed a three-way evolution of coronavirus genes, with gene phylogenies of ORF1ab, S and N forming the first cluster, gene phylogenies of ORF3a, E, M, ORF6, ORF7a, ORF7b and ORF8 forming the second cluster, and phylogeny of gene ORF10 forming the third cluster.
The results of our horizontal gene transfer and recombination analysis suggest that SARS-CoV-2 could not only be a chimera virus resulting from recombination of the bat RaTG13 and Guangdong pangolin coronaviruses but also a close relative of the bat CoV ZC45 and ZXC21 strains. They also indicate that a GD pangolin may be an intermediate host of this dangerous virus.
Provably efficient reconstruction of policy networks
Recent research has shown that learning policies parametrized by large neural networks can achieve significant success on challenging reinforcement learning problems. However, when memory is limited, it is not always possible to store such models exactly for inference, and compressing the policy into a compact representation might be necessary. We propose a general framework for policy representation, which reduces this problem to finding a low-dimensional embedding of a given density function in a separable inner product space. Our framework allows us to derive strong theoretical guarantees, controlling the error of the reconstructed policies. Such guarantees are typically lacking in black-box models, but are very desirable in risk-sensitive tasks. Our experimental results suggest that the reconstructed policies can use less than 10% of the number of parameters in the original networks, while incurring almost no decrease in rewards.
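To give a concrete feel for compressing a policy into a low-dimensional representation, here is a minimal sketch that uses a truncated SVD of a tabular policy as a finite-dimensional stand-in. This is a swapped-in technique chosen for illustration, not the embedding studied in the paper, and the function name `low_rank_policy` is hypothetical.

```python
import numpy as np

def low_rank_policy(policy, rank):
    """Project a tabular policy (states x actions, rows sum to 1) onto its
    top-`rank` singular directions, then repair it into valid distributions.

    Assumption: the policy is small enough to store as a dense matrix.
    """
    policy = np.asarray(policy, dtype=float)
    u, s, vt = np.linalg.svd(policy, full_matrices=False)
    # Keep only the leading `rank` components of the decomposition.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Truncation can produce small negative entries; clip, then renormalize rows.
    approx = np.clip(approx, 1e-12, None)
    return approx / approx.sum(axis=1, keepdims=True)
```

Storing the truncated factors takes rank * (states + actions) numbers instead of states * actions, which is the kind of parameter saving the abstract refers to, although the paper's guarantees concern its own construction rather than this sketch.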
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces.
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly embedded in a low-dimensional space while the embedded policy incurs almost no decrease in return.