
Stefano Massaroli

Alumni

Publications

Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context
Solar power harbors immense potential in mitigating climate change by substantially reducing CO₂ emissions…
Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions
Michael Poli
Daniel Y. Fu
Hermann Kumbong
Rom Nishijima Parnichkun
Aman Timalsina
David W. Romero
Quinn McIntyre
Beidi Chen
Atri Rudra
Ce Zhang
Christopher Ré
Stefano Ermon
Recent advances in attention-free sequence models rely on convolutions as alternatives to the attention operator at the core of Transformers. In particular, long convolution sequence models have achieved state-of-the-art performance in many domains, but incur a significant cost during auto-regressive inference workloads: naively, they require a full pass over the input sequence (or caching of activations) for each generated token, similarly to attention-based models. In this paper, we seek to enable…
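The bottleneck described above has a compact illustration: a long convolution whose filter happens to be a sum of decaying exponentials is exactly reproducible by a linear recurrence with a small fixed-size state, so each generated token costs O(d) instead of O(L). The numpy sketch below demonstrates that equivalence under this assumed filter form; it illustrates the idea, not the paper's distillation procedure, and all names and values in it are hypothetical.

```python
import numpy as np

# Minimal sketch (not the paper's method): a long convolution filter of the
# form h[t] = sum_i c_i * lam_i**t admits an equivalent linear recurrence with
# d-dimensional state, so autoregressive generation costs O(d) per token
# instead of an O(L) pass over the whole input sequence.

rng = np.random.default_rng(0)
L, d = 64, 4                      # sequence length, number of modes (illustrative)
lam = rng.uniform(0.5, 0.95, d)   # hypothetical decay rates
c = rng.normal(size=d)            # hypothetical mode weights
u = rng.normal(size=L)            # input sequence

# Explicit causal convolution: y[t] = sum_{s<=t} h[t-s] * u[s], O(L) per step.
h = (c[None, :] * lam[None, :] ** np.arange(L)[:, None]).sum(axis=1)
y_conv = np.array([np.dot(h[: t + 1][::-1], u[: t + 1]) for t in range(L)])

# Equivalent recurrence: x_i[t] = lam_i * x_i[t-1] + u[t], y[t] = c @ x[t].
x = np.zeros(d)
y_rec = np.empty(L)
for t in range(L):
    x = lam * x + u[t]
    y_rec[t] = c @ x

assert np.allclose(y_conv, y_rec)  # same outputs, constant memory per token
```

The recurrence materializes only a d-dimensional state rather than the full input history, which is what makes per-token inference cost independent of sequence length.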
What if We Enrich day-ahead Solar Irradiance Time Series Forecasting with Spatio-Temporal Context?
The global integration of solar power into the electrical grid could have a crucial impact on climate change mitigation, yet poses a challenge due to solar irradiance variability. We present a deep learning architecture which uses spatio-temporal context from satellite data for highly accurate day-ahead time series forecasting, in particular of Global Horizontal Irradiance (GHI). We provide a multi-quantile variant which outputs a prediction interval for each time step, serving as a measure of forecasting uncertainty. In addition, we suggest a testing scheme that separates easy and difficult scenarios, which appears useful for evaluating model performance in varying cloud conditions. Our approach exhibits robust performance in solar irradiance forecasting, including zero-shot generalization tests at unobserved solar stations, and holds great promise in promoting the effective use of solar power and the resulting reduction of CO₂ emissions.
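Since the abstract describes a multi-quantile head whose outputs form per-step prediction intervals, a brief sketch of the standard pinball (quantile) loss may help. This is one common way to train such a head, not necessarily the loss used in the paper, and the values below are made up for illustration.

```python
import numpy as np

# Standard pinball (quantile) loss: a common training objective for a
# multi-quantile head like the one described above. The paper's exact loss
# and architecture may differ.
def pinball_loss(y_true, y_pred, quantiles):
    """y_pred has one column per quantile; returns the mean pinball loss."""
    q = np.asarray(quantiles)[None, :]   # shape (1, n_quantiles)
    err = y_true[:, None] - y_pred       # shape (n_samples, n_quantiles)
    return np.mean(np.maximum(q * err, (q - 1) * err))

# Illustrative usage: the q=0.1 and q=0.9 columns bracket each target value,
# forming a per-step prediction interval.
y_true = np.array([100.0, 250.0, 400.0])   # hypothetical GHI values (W/m^2)
y_pred = np.array([[80.0, 120.0],          # columns: [q=0.1, q=0.9]
                   [230.0, 280.0],
                   [350.0, 460.0]])
print(pinball_loss(y_true, y_pred, [0.1, 0.9]))
```

Minimizing this loss pushes each output column toward the corresponding conditional quantile, so the spread between columns is a direct, data-driven uncertainty estimate.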
HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution
Eric Nguyen
Michael Poli
Marjan Faizi
Armin W. Thomas
Callum Birch-Sykes
Michael Wornow
Aman Patel
Clayton M. Rabideau
Stefano Ermon
Stephen Baccus
Christopher Ré
Genomic (DNA) sequences encode an enormous amount of information for gene regulation and protein synthesis. Similar to natural language models, researchers have proposed foundation models in genomics to learn generalizable features from unlabeled genome data that can then be fine-tuned for downstream tasks such as identifying regulatory elements. Due to the quadratic scaling of attention, previous Transformer-based genomic models have used 512 to 4k tokens as context (0.001% of the human genome), significantly limiting the modeling of long-range interactions in DNA. In addition, these methods rely on tokenizers…
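The single-nucleotide resolution emphasized in the title can be made concrete with a short sketch: tokenize DNA at the character level, so a single-base variant changes exactly one token, whereas a fixed k-mer tokenizer would remap every overlapping k-mer around the change. The vocabulary and function below are hypothetical illustrations, not HyenaDNA's actual tokenizer.

```python
# Illustrative character-level DNA tokenization (hypothetical vocabulary; not
# HyenaDNA's actual tokenizer): each nucleotide maps to its own token, so a
# single-base change (e.g. a SNP) alters exactly one token in the sequence.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}  # N = unknown base

def tokenize(seq: str) -> list[int]:
    return [VOCAB.get(base, VOCAB["N"]) for base in seq.upper()]

ref = "ACGTACGT"
snp = "ACGTACAT"  # single-nucleotide change at position 6
print(tokenize(ref))  # [0, 1, 2, 3, 0, 1, 2, 3]
print(tokenize(snp))  # [0, 1, 2, 3, 0, 1, 0, 3]  (differs in exactly one token)
```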
Hyena Hierarchy: Towards Larger Convolutional Language Models
Michael Poli
Eric Nguyen
Daniel Y. Fu
Tri Dao
Stephen Baccus
Stefano Ermon
Christopher Ré