Alexandre Drouin

Associate Industry Member

Adjunct professor, Université Laval, Department of Electrical Engineering and Computer Engineering

Research Scientist, ServiceNow

Research Topics

Causality

Computational Biology

Deep Learning

LLM Agent

Time Series Forecasting

Website

Google Scholar

Biography

Alexandre Drouin is a research scientist at ServiceNow Research in Montréal, and an adjunct professor of computer science at Université Laval. He also leads ServiceNow’s Human Decision Support research program, which explores the use of machine learning for decision-making in complex dynamic environments.

Drouin’s main research interest is causal decision-making under uncertainty, where the goal is to answer questions of a causal nature (interventions, counterfactuals) while accounting for sources of uncertainty, such as ambiguity in causal structures and unmeasured variables. He is also interested in probabilistic time series forecasting and its use in foreseeing the long-term effects of actions. His PhD in computer science from Université Laval focused on machine learning algorithms for biomarker discovery in large genomic datasets and their application to the global problem of antibiotic resistance.

Current Students

Arjun Ashok

PhD - Université de Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Co-supervisor :

Quentin Cappart

Philippe Brouillard

PhD - Université de Montréal

Principal supervisor :

Publications

Invariant Causal Set Covering Machines

Thibaud Godon

Baptiste Bauvin

Pascal Germain

J. Corbeil

Alexandre Drouin

2023-06-06

ArXiv (preprint)

doi.org

arxiv.org

RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data

Thibaud Godon

Pier-Luc Plante

Baptiste Bauvin

Élina Francovic-Fontaine

Alexandre Drouin

J. Corbeil

Background: Understanding the relationship between omics and the phenotype is a central problem in precision medicine. The high dimensionality of metabolomics data challenges learning algorithms in terms of scalability and generalization. Most learning algorithms do not produce interpretable models. Method: We propose an ensemble learning algorithm based on conjunctions or disjunctions of decision rules. Results: Applications on metabolomics data show that it produces models that achieve high predictive performance. The interpretability of the models makes them useful for biomarker discovery and pattern discovery in high-dimensional data.

2022-08-10

ArXiv (preprint)

doi.org

arxiv.org

TACTiS: Transformer-Attentional Copulas for Time Series

Alexandre Drouin

Étienne Marcotte

Nicolas Chapados

The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. However, the practical utility of such estimates is limited by how accurately they quantify predictive uncertainty. In this work, we address the problem of estimating the joint predictive distribution of high-dimensional multivariate time series. We propose a versatile method, based on the transformer architecture, that estimates joint distributions using an attention-based decoder that provably learns to mimic the properties of non-parametric copulas. The resulting model has several desirable properties: it can scale to hundreds of time series, supports both forecasting and interpolation, can handle unaligned and non-uniformly sampled data, and can seamlessly adapt to missing data during training. We demonstrate these properties empirically and show that our model produces state-of-the-art predictions on multiple real-world datasets.

2021-12-31

ICML (published)

proceedings.mlr.press

Phylogenetic Manifold Regularization: A semi-supervised approach to predict transcription factor binding sites

Francois Laviolette

The computational prediction of transcription factor binding sites remains a challenging problem in bioinformatics, despite significant methodological developments from the field of machine learning. Such computational models are essential to help interpret the non-coding portion of human genomes, and to learn more about the regulatory mechanisms controlling gene expression. In parallel, massive genome sequencing efforts have produced assembled genomes for hundreds of vertebrate species, but this data is underused. We present PhyloReg, a new semi-supervised learning approach that can be used for a wide variety of sequence-to-function prediction problems, and that takes advantage of hundreds of millions of years of evolution to regularize predictors and improve accuracy. We demonstrate that PhyloReg can be used to better train a previously proposed deep learning model of transcription factor binding. Simulation studies further help delineate the benefits of the approach. Gains in prediction accuracy are obtained over a broad set of transcription factors and cell types.

2020-12-15

2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (published)

doi.org

Differentiable Causal Discovery from Interventional Data

Alexandre Lacoste

Learning a causal directed acyclic graph from data is a challenging task that involves solving a combinatorial problem for which the solution is not always identifiable. A new line of work reformulates this problem as a continuous constrained optimization one, which is solved via the augmented Lagrangian method. However, most methods based on this idea do not make use of interventional data, which can significantly alleviate identifiability issues. This work constitutes a new step in this direction by proposing a theoretically-grounded method based on neural networks that can leverage interventional data. We illustrate the flexibility of the continuous-constrained framework by taking advantage of expressive neural architectures such as normalizing flows. We show that our approach compares favorably to the state of the art in a variety of settings, including perfect and imperfect interventions for which the targeted nodes may even be unknown.

2019-12-31

NeurIPS (published)

doi.org

arxiv.org

Gradient-Based Neural DAG Learning with Interventions

Alexandre Lacoste

Decision making based on statistical association alone can be a dangerous endeavor due to non-causal associations. Ideally, one would rely on causal relationships that enable reasoning about the effect of interventions. Several methods have been proposed to discover such relationships from observational and interventional data. Among them, GraN-DAG, a method that relies on the constrained optimization of neural networks, was shown to produce state-of-the-art results among algorithms relying purely on observational data. However, it is limited to observational data and cannot make use of interventions. In this work, we extend GraN-DAG to support interventional data and show that this improves its ability to infer causal structures.

2019-12-31

(published)

www.semanticscholar.org

In Search of Robust Measures of Generalization

Gintare Karolina Dziugaite

Brady Neal

Linbo Wang

Daniel M. Roy

One of the principal scientific challenges in deep learning is explaining generalization, i.e., why the particular way the community now trains networks to achieve small training error also leads to small error on held-out data from the same population. It is widely appreciated that some worst-case theories -- such as those based on the VC dimension of the class of predictors induced by modern neural network architectures -- are unable to explain empirical performance. A large volume of work aims to close this gap, primarily by developing bounds on generalization error, optimization error, and excess risk. When evaluated empirically, however, most of these bounds are numerically vacuous. Focusing on generalization bounds, this work addresses the question of how to evaluate such bounds empirically. Jiang et al. (2020) recently described a large-scale empirical study aimed at uncovering potential causal relationships between bounds/measures and generalization. Building on their study, we highlight where their proposed methods can obscure failures and successes of generalization measures in explaining generalization. We argue that generalization measures should instead be evaluated within the framework of distributional robustness.

2019-12-31

NeurIPS (published)

doi.org

arxiv.org

Synbols: Probing Learning Algorithms with Synthetic Datasets

Alexandre Lacoste

Pau Rodríguez

Frédéric Branchaud-Charron

Parmida Atighehchian

Massimo Caccia

Matt Craddock

Progress in the field of machine learning has been fueled by the introduction of benchmark datasets pushing the limits of existing algorithms. Enabling the design of datasets to test specific properties and failure modes of learning algorithms is thus a problem of high interest, as it has a direct impact on innovation in the field. In this sense, we introduce Synbols -- Synthetic Symbols -- a tool for rapidly generating new datasets with a rich composition of latent features rendered in low-resolution images. Synbols leverages the large number of symbols available in the Unicode standard and the wide range of artistic fonts provided by the open font community. Our tool's high-level interface provides a language for rapidly generating new distributions on the latent features, including various types of textures and occlusions. To showcase the versatility of Synbols, we use it to dissect the limitations and flaws in standard learning algorithms in various learning setups including supervised learning, active learning, out-of-distribution generalization, unsupervised representation learning, and object counting.

2019-12-31

Advances in Neural Information Processing Systems 33 (NeurIPS 2020) (published)

doi.org

arxiv.org
