Publications

Over the past few years, foundation models have caused a paradigm shift in machine learning due to their unprecedented capabilities for zero-shot and few-shot generalization. However, despite the success of foundation models in modalities such as natural language processing and computer vision, the development of foundation models for time series forecasting has lagged behind. We present Lag-Llama, a general-purpose foundation model for univariate probabilistic time series forecasting based on a decoder-only transformer architecture that uses lags as covariates. Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities compared to a wide range of forecasting models on downstream datasets across domains. Moreover, when fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance, outperforming prior deep learning approaches and emerging as the best general-purpose model on average. Lag-Llama serves as a strong contender to the current state of the art in time series forecasting and paves the way for future advancements in foundation models tailored to time series data.

2023-10-12

ArXiv (preprint)

arxiv.org

Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

Kashif Rasul

Arjun Ashok

Andrew Robert Williams

Arian Khorasani

George Adamopoulos

Rishika Bhagwatkar

Marin Biloš

Hena Ghonia

Nadhir Hassen

Anderson Schneider

Sahil Garg

Alexandre Drouin

Nicolas Chapados

Yuriy Nevmyvaka

Irina Rish

2023-10-12

ArXiv (preprint)

arxiv.org

PhyloGFN: Phylogenetic inference with generative flow networks

Ming Yang Zhou

Moksh J. Jain

Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities. Its long history and numerous applications notwithstanding, inference of phylogenetic trees from sequence data remains challenging: the high complexity of tree space poses a significant obstacle for the current combinatorial and probabilistic techniques. In this paper, we adopt the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference. Because GFlowNets are well-suited for sampling complex combinatorial structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies and evolutionary distances. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets. PhyloGFN is competitive with prior works in marginal likelihood estimation and achieves a closer fit to the target distribution than state-of-the-art variational inference methods. Our code is available at https://github.com/zmy1116/phylogfn.

2023-10-12

ArXiv (preprint)

doi.org

arxiv.org

AAPM Medical Physics Practice Guideline 14.a: Yttrium‐90 microsphere radioembolization

Nathan C. Busse

Muthana S. A. L. Al‐Ghazi

Nadine Abi‐Jaoudeh

Diane Alvarez

Ahmet S. Ayan

Erli Chen

Michael D. Chuong

William A. Dezarn

Shirin A. Enger

Stephen A. Graves

Robert F. Hobbs

Mary Ellen Jafari

S. Peter Kim

Nichole M. Maughan

Andrew M. Polemi

Jennifer R. Stickel

2023-10-11

Journal of Applied Clinical Medical Physics (published)

doi.org

Explainable Attention for Few-shot Learning and Beyond

Bahareh Nikpour

Narges Armanfard

2023-10-11

ArXiv (preprint)

doi.org

arxiv.org

A general framework for the practical disintegration of PAC-Bayesian bounds

Paul Viallard

Pascal Germain

Amaury Habrard

Emilie Morvant

2023-10-11

Machine-mediated learning (published)

doi.org

arxiv.org

Language-Guided Reinforcement Learning for Hard Attention in Few-Shot Learning

Bahareh Nikpour

Narges Armanfard

Attention mechanisms have demonstrated significant potential in enhancing learning models by identifying key portions of input data, particularly in scenarios with limited training samples. Inspired by human perception, we propose that focusing on essential data segments, rather than the entire dataset, can improve the accuracy and reliability of the learning models. However, identifying these critical data segments, or "hard attention finding," is challenging, especially in few-shot learning, due to the scarcity of training data and the complexity of model parameters. To address this, we introduce LaHA, a novel framework that leverages language-guided deep reinforcement learning to identify and utilize informative data regions, thereby improving both interpretability and performance. Extensive experiments on benchmark datasets validate the effectiveness of LaHA.

2023-10-11

ArXiv (preprint)

arxiv.org

Deep Learning Benchmark for First Break Detection from Hardrock Seismic Reflection Data

Pierre-Luc St-Charles

Bruno Rousseau

Joumana Ghosn

Gilles Bellefleur

E. Schetselaar

Deep learning techniques are used to tackle a variety of tasks related to seismic data processing and interpretation. While many works have shown the benefits of deep learning, assessing the generalization capabilities of proposed methods to data acquired in different conditions and geological environments remains challenging. This is especially true for applications in hardrock environments where seismic surveys are still relatively rare. The primary factors that impede the adoption of machine learning in geosciences include the lack of publicly available and labeled datasets, and the use of inadequate evaluation methodologies. Since machine learning models are prone to overfit and underperform when the data used to train them is site-specific, the applicability of these models on new survey data that could be considered “out-of-distribution” is rarely addressed. This is unfortunate, as evaluating predictive models in out-of-distribution settings can provide a good insight into their usefulness in real-world use cases. To tackle these issues, we propose a simple benchmarking methodology for first break picking to evaluate the transferability of deep learning models that are trained across different environments and acquisition conditions. For this, we consider a reflection seismic survey dataset acquired at five distinct hardrock mining sites combined with annotations for first break picking. We train and evaluate a baseline deep learning solution based on a U-Net for future comparisons, and discuss potential improvements to this approach.

2023-10-10

Geophysics (published)

doi.org
