
Alexandre Drouin

Associate Industry Member
Adjunct Professor, Université Laval, Department of Electrical and Computer Engineering
Research Scientist, ServiceNow
Research Topics
LLM-based Agents
Deep Learning
Computational Biology
Causality
Time Series Forecasting

Biography

Alexandre Drouin is an artificial intelligence researcher at ServiceNow Research in Montréal and an adjunct professor in the Department of Computer Science and Software Engineering at Université Laval. He leads a research team that explores the use of machine learning for decision-making in complex dynamic environments. His main research interest is causal decision-making, which aims to answer interventional and counterfactual questions while accounting for potential sources of uncertainty, such as ambiguity in the causal relationships underlying a system and the effect of latent variables. He is also interested in probabilistic forecasting models for time series and in their use for predicting the long-term effect of actions.

He holds a PhD in computer science from Université Laval, awarded for his work on machine learning algorithms for biomarker discovery in genomics and their application to the problem of antibiotic resistance.

Current Students

PhD - UdeM
Principal supervisor:
PhD - Polytechnique
Co-supervisor:
PhD - UdeM
Principal supervisor:

Publications

The Unsolved Challenges of LLMs as Generalist Web Agents: A Case Study
Massimo Caccia
Issam Hadj Laradji
Sai Rajeswar
Hector Palacios
Maxime Gasse
Alexandre Lacoste
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
Kashif Rasul
Andrew Robert Williams
Marin Biloš
Hena Ghonia
Anderson Schneider
Sahil Garg
Yuriy Nevmyvaka
Over the past years, foundation models have caused a paradigm shift in machine learning due to their unprecedented capabilities for zero-shot and few-shot generalization. However, despite the success of foundation models in modalities such as natural language processing and computer vision, the development of foundation models for time series forecasting has lagged behind. We present Lag-Llama, a general-purpose foundation model for univariate probabilistic time series forecasting based on a decoder-only transformer architecture that uses lags as covariates. Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities compared to a wide range of forecasting models on downstream datasets across domains. Moreover, when fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance, outperforming prior deep learning approaches, emerging as the best general-purpose model on average. Lag-Llama serves as a strong contender to the current state of the art in time series forecasting and paves the way for future advancements in foundation models tailored to time series data.
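The abstract describes a decoder-only transformer that consumes lagged values of the series as covariates and outputs a predictive distribution. The Python sketch below illustrates that idea only; the lag set, model size, and Student-t output head are assumptions for illustration, not the actual Lag-Llama implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

LAGS = [1, 2, 3, 7, 14, 28]  # assumed lag indices (daily/weekly/monthly-style)

def lag_features(series: torch.Tensor) -> torch.Tensor:
    """Stack past values at fixed lags as covariates for each time step.
    series: (batch, time) -> (batch, time - max(LAGS), len(LAGS))."""
    max_lag = max(LAGS)
    cols = [series[:, max_lag - lag : series.shape[1] - lag] for lag in LAGS]
    return torch.stack(cols, dim=-1)

class TinyLagForecaster(nn.Module):
    """Minimal causal transformer mapping lag covariates to a Student-t head."""
    def __init__(self, d_model: int = 32):
        super().__init__()
        self.proj = nn.Linear(len(LAGS), d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 3)  # df, loc, scale of a Student-t

    def forward(self, series: torch.Tensor) -> torch.distributions.StudentT:
        x = self.proj(lag_features(series))
        mask = nn.Transformer.generate_square_subsequent_mask(x.shape[1])  # causal mask
        h = self.backbone(x, mask=mask)
        df, loc, scale = self.head(h).unbind(-1)
        # Positivity constraints on df and scale; offsets are illustrative choices.
        return torch.distributions.StudentT(F.softplus(df) + 2.0, loc,
                                            F.softplus(scale) + 1e-3)
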
GEO-Bench: Toward Foundation Models for Earth Monitoring
Alexandre Lacoste
Nils Lehmann
Pau Rodríguez
Evan David Sherwin
Hannah Kerner
Björn Lütjens
Jeremy Irvin
David Dao
Hamed Alemohammad
Mehmet Gunturkun
Dava Newman
Stefano Ermon
Xiao Xiang Zhu
Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.
Regions of Reliability in the Evaluation of Multivariate Probabilistic Forecasts
Étienne Marcotte
Valentina Zantedeschi
Multivariate probabilistic time series forecasts are commonly evaluated via proper scoring rules, i.e., functions that are minimal in expectation for the ground-truth distribution. However, this property is not sufficient to guarantee good discrimination in the non-asymptotic regime. In this paper, we provide the first systematic finite-sample study of proper scoring rules for time-series forecasting evaluation. Through a power analysis, we identify the "region of reliability" of a scoring rule, i.e., the set of practical conditions where it can be relied on to identify forecasting errors. We carry out our analysis on a comprehensive synthetic benchmark, specifically designed to test several key discrepancies between ground-truth and forecast distributions, and we gauge the generalizability of our findings to real-world tasks with an application to an electricity production problem. Our results reveal critical shortcomings in the evaluation of multivariate probabilistic forecasts as commonly performed in the literature.
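As a concrete example of a proper scoring rule for multivariate forecasts, the sketch below computes the energy score from forecast samples; it is minimized in expectation by the true distribution, but, as the paper argues for scoring rules in general, its finite-sample power to discriminate forecasters can be limited. This is a generic illustration, not the paper's benchmark code.

import numpy as np

def energy_score(forecast_samples: np.ndarray, observation: np.ndarray) -> float:
    """Energy score (lower is better).
    forecast_samples: (n_samples, dim), observation: (dim,)."""
    x = forecast_samples
    term1 = np.mean(np.linalg.norm(x - observation, axis=-1))
    term2 = np.mean(np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1))
    return term1 - 0.5 * term2

rng = np.random.default_rng(0)
obs = rng.normal(size=5)                     # one multivariate observation
good = rng.normal(size=(100, 5))             # samples from a well-calibrated forecaster
biased = rng.normal(loc=1.0, size=(100, 5))  # samples from a biased forecaster
print(energy_score(good, obs), energy_score(biased, obs))
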
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation
Causal Discovery with Language Models as Imperfect Experts
Stephanie Long
Valentina Zantedeschi
Tibor Schuster
Understanding the causal relationships that underlie a system is a fundamental prerequisite to accurate decision-making. In this work, we explore how expert knowledge can be used to improve the data-driven identification of causal graphs, beyond Markov equivalence classes. In doing so, we consider a setting where we can query an expert about the orientation of causal relationships between variables, but where the expert may provide erroneous information. We propose strategies for amending such expert knowledge based on consistency properties, e.g., acyclicity and conditional independencies in the equivalence class. We then report a case study, on real data, where a large language model is used as an imperfect expert.
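One of the consistency properties mentioned above is acyclicity. The sketch below shows, on a toy graph, how an expert-proposed orientation can be rejected when it would create a cycle; the graph, the expert oracle, and the fallback of reversing the edge are illustrative assumptions, not the paper's procedure.

import networkx as nx

def orient_with_expert(directed_edges, undirected_edges, expert):
    """expert(u, v) -> True if the expert claims u causes v, else False."""
    g = nx.DiGraph(directed_edges)
    for u, v in undirected_edges:
        a, b = (u, v) if expert(u, v) else (v, u)
        g.add_edge(a, b)
        if not nx.is_directed_acyclic_graph(g):  # expert answer conflicts with acyclicity
            g.remove_edge(a, b)
            g.add_edge(b, a)                      # simplified fallback: reverse the edge
    return g

# Toy example: X -> Y -> Z already oriented; the expert (wrongly) claims Z causes X.
g = orient_with_expert([("X", "Y"), ("Y", "Z")], [("X", "Z")],
                       expert=lambda u, v: (u, v) == ("Z", "X"))
print(sorted(g.edges()))  # the cyclic orientation Z -> X is rejected
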
Invariant Causal Set Covering Machines
Baptiste Bauvin
Pascal Germain
J. Corbeil
RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data
Pier-Luc Plante
Baptiste Bauvin
Élina Francovic-Fontaine
J. Corbeil
Background: Understanding the relationship between the Omics and the phenotype is a central problem in precision medicine. The high dimensionality of metabolomics data challenges learning algorithms in terms of scalability and generalization. Most learning algorithms do not produce interpretable models. -- Method: We propose an ensemble learning algorithm based on conjunctions or disjunctions of decision rules. -- Results: Applications to metabolomics data show that it produces models that achieve high predictive performance. The interpretability of the models makes them useful for biomarker discovery and pattern discovery in high-dimensional data.
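To make the model family concrete, the sketch below implements a simple conjunction of threshold rules, the kind of interpretable base classifier such an ensemble aggregates. The greedy rule selection is an illustrative simplification, not the RandomSCM algorithm itself.

import numpy as np

class ConjunctionClassifier:
    """Predict 1 only if every learned rule 'feature >= threshold' holds (binary labels)."""
    def __init__(self, max_rules: int = 3):
        self.max_rules = max_rules
        self.rules = []  # list of (feature_index, threshold)

    def fit(self, X: np.ndarray, y: np.ndarray):
        covered = np.ones(len(y), dtype=bool)  # examples still predicted positive
        for _ in range(self.max_rules):
            best = None
            for j in range(X.shape[1]):
                for t in np.unique(X[:, j]):
                    drop = covered & (X[:, j] < t)
                    # score: negatives excluded minus positives lost by adding the rule
                    score = np.sum(drop & (y == 0)) - np.sum(drop & (y == 1))
                    if best is None or score > best[0]:
                        best = (score, j, t)
            if best is None or best[0] <= 0:
                break
            _, j, t = best
            self.rules.append((j, t))
            covered &= X[:, j] >= t
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        out = np.ones(len(X), dtype=int)
        for j, t in self.rules:
            out &= (X[:, j] >= t).astype(int)
        return out
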
TACTiS: Transformer-Attentional Copulas for Time Series
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. However, the practical utility of such estimates is limited by how accurately they quantify predictive uncertainty. In this work, we address the problem of estimating the joint predictive distribution of high-dimensional multivariate time series. We propose a versatile method, based on the transformer architecture, that estimates joint distributions using an attention-based decoder that provably learns to mimic the properties of non-parametric copulas. The resulting model has several desirable properties: it can scale to hundreds of time series, supports both forecasting and interpolation, can handle unaligned and non-uniformly sampled data, and can seamlessly adapt to missing data during training. We demonstrate these properties empirically and show that our model produces state-of-the-art predictions on multiple real-world datasets.
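The method rests on the copula decomposition of a joint distribution into marginals plus a dependence structure. The sketch below illustrates that decomposition with a Gaussian copula and arbitrary marginals; TACTiS itself learns a non-parametric, attention-based copula, so this is background intuition rather than the model.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
corr = np.array([[1.0, 0.8], [0.8, 1.0]])        # dependence structure only
z = rng.multivariate_normal(np.zeros(2), corr, size=10_000)
u = stats.norm.cdf(z)                            # uniform marginals: the copula samples
x0 = stats.expon.ppf(u[:, 0], scale=2.0)         # arbitrary marginal for variable 1
x1 = stats.t.ppf(u[:, 1], df=3)                  # arbitrary marginal for variable 2
samples = np.column_stack([x0, x1])
print(np.corrcoef(samples.T)[0, 1])              # dependence survives the change of marginals
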
Phylogenetic Manifold Regularization: A semi-supervised approach to predict transcription factor binding sites
The computational prediction of transcription factor binding sites remains a challenging problem in bioinformatics, despite significant methodological developments from the field of machine learning. Such computational models are essential to help interpret the non-coding portion of human genomes, and to learn more about the regulatory mechanisms controlling gene expression. In parallel, massive genome sequencing efforts have produced assembled genomes for hundreds of vertebrate species, but this data is underused. We present PhyloReg, a new semi-supervised learning approach that can be used for a wide variety of sequence-to-function prediction problems, and that takes advantage of hundreds of millions of years of evolution to regularize predictors and improve accuracy. We demonstrate that PhyloReg can be used to better train a previously proposed deep learning model of transcription factor binding. Simulation studies further help delineate the benefits of the approach. Gains in prediction accuracy are obtained over a broad set of transcription factors and cell types.
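The core idea is a semi-supervised regularizer that pushes predictions on homologous sequences from related species to agree. A minimal sketch of such a penalty is given below; the quadratic form and the similarity weighting are assumptions for illustration and may differ from the actual PhyloReg formulation.

import torch

def phylo_regularizer(preds: torch.Tensor, similarity: torch.Tensor) -> torch.Tensor:
    """preds: (n_species,) model outputs for orthologous sequences.
    similarity: (n_species, n_species) phylogenetic similarity weights."""
    diff = preds[:, None] - preds[None, :]
    return torch.sum(similarity * diff ** 2) / 2

# Assumed usage: total_loss = supervised_loss + lambda_phylo * phylo_regularizer(preds, S)
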
Differentiable Causal Discovery from Interventional Data
Learning a causal directed acyclic graph from data is a challenging task that involves solving a combinatorial problem for which the solution is not always identifiable. A new line of work reformulates this problem as a continuous constrained optimization one, which is solved via the augmented Lagrangian method. However, most methods based on this idea do not make use of interventional data, which can significantly alleviate identifiability issues. This work constitutes a new step in this direction by proposing a theoretically-grounded method based on neural networks that can leverage interventional data. We illustrate the flexibility of the continuous-constrained framework by taking advantage of expressive neural architectures such as normalizing flows. We show that our approach compares favorably to the state of the art in a variety of settings, including perfect and imperfect interventions for which the targeted nodes may even be unknown.
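This line of work casts DAG learning as continuous constrained optimization, typically with the NOTEARS-style acyclicity constraint h(W) = tr(exp(W ∘ W)) − d, which equals zero exactly when the weighted adjacency matrix W encodes a DAG, and is driven to zero with an augmented Lagrangian. The sketch below shows that constraint; the likelihood term and hyperparameters are schematic, and the full method additionally models interventions.

import torch

def acyclicity(W: torch.Tensor) -> torch.Tensor:
    """h(W) = tr(exp(W * W)) - d; zero iff W is the adjacency of a DAG."""
    d = W.shape[0]
    return torch.trace(torch.matrix_exp(W * W)) - d

# Schematic augmented Lagrangian outer loop (assumed hyperparameters):
#   minimize  neg_log_likelihood(W) + lam * h(W) + 0.5 * mu * h(W) ** 2
#   then update  lam <- lam + mu * h(W)  and increase mu until h(W) is ~0.
W = torch.zeros(4, 4, requires_grad=True)
print(acyclicity(W))  # 0 for the empty graph, which is trivially acyclic
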
Gradient-Based Neural DAG Learning with Interventions
Decision making based on statistical association alone can be a dangerous endeavor due to non-causal associations. Ideally, one would rely on causal relationships that enable reasoning about the effect of interventions. Several methods have been proposed to discover such relationships from observational and interventional data. Among them, GraN-DAG, a method that relies on the constrained optimization of neural networks, was shown to produce state-of-the-art results among algorithms relying purely on observational data. However, it is limited to observational data and cannot make use of interventions. In this work, we extend GraN-DAG to support interventional data and show that this improves its ability to infer causal structures.
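Under a perfect intervention, the intervened node no longer follows its learned conditional, so its term can be dropped from the likelihood for those samples. The sketch below shows that masking generically; it is an assumed illustration of the idea, not the GraN-DAG extension's code.

import torch

def interventional_log_likelihood(log_probs: torch.Tensor,
                                  intervened_mask: torch.Tensor) -> torch.Tensor:
    """log_probs: (n_samples, n_nodes) per-node conditional log-densities.
    intervened_mask: (n_samples, n_nodes) boolean, True where a node was
    intervened on in that sample; those terms are excluded from the objective."""
    return (log_probs * (~intervened_mask)).sum(dim=1).mean()
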