Publications

Preventing Dimensional Collapse in Contrastive Local Learning with Subsampling

Louis Fournier

Adeetya Patel

Michael Eickenberg

Edouard Oyallon

Eugene Belilovsky

2023-06-16

ICML.cc/2023/Workshop/LLW (published)

openreview.net

Block-State Transformers

Mahan Fathi

Jonathan Pilault

Orhan Firat

State space models (SSMs) have shown impressive results on tasks that require modeling long-range dependencies and efficiently scale to long sequences owing to their subquadratic runtime complexity. Originally designed for continuous signals, SSMs have shown superior performance on a plethora of tasks, in vision and audio; however, SSMs still lag Transformer performance in Language Modeling tasks. In this work, we propose a hybrid layer named Block-State Transformer (BST), that internally combines an SSM sublayer for long-range contextualization, and a Block Transformer sublayer for short-term representation of sequences. We study three different, and completely parallelizable, variants that integrate SSMs and block-wise attention. We show that our model outperforms similar Transformer-based architectures on language modeling perplexity and generalizes to longer sequences. In addition, the Block-State Transformer demonstrates more than tenfold increase in speed at the layer level compared to the Block-Recurrent Transformer when model parallelization is employed.

2023-06-15

ArXiv (preprint)

GEANT4-DNA simulation of temperature-dependent and pH-dependent yields of chemical radiolytic species

Jingyi Bian

Juan Duran

Wook-Geun Shin

Jose Ramos-Méndez

Jack C Sankey

Lilian Childress

Jan Seuntjens

Shirin A. Enger

2023-06-15

Physics in Medicine & Biology (published)

A solution algorithm for chance-constrained problems with integer second-stage recourse decisions

Andrea Lodi

Enrico Malaguti

Michele Monaci

Giacomo Nannicini

Paolo Paronuzzi

2023-06-15

Mathematical Programming (published)

A2CiD2: Accelerating Asynchronous Communication in Decentralized Deep Learning

Adel Nabli

Eugene Belilovsky

Edouard Oyallon

2023-06-14

ArXiv (preprint)

Best-Case Retrieval Evaluation: Improving the Sensitivity of Reciprocal Rank with Lexicographic Precision

Fernando Diaz

Across a variety of ranking tasks, researchers use reciprocal rank to measure the effectiveness for users interested in exactly one relevant item. Despite its widespread use, evidence suggests that reciprocal rank is brittle when discriminating between systems. This brittleness, in turn, is compounded in modern evaluation settings where current, high-precision systems may be difficult to distinguish. We address the lack of sensitivity of reciprocal rank by introducing and connecting it to the concept of best-case retrieval, an evaluation method focusing on assessing the quality of a ranking for the most satisfied possible user across possible recall requirements. This perspective allows us to generalize reciprocal rank and define a new preference-based evaluation we call lexicographic precision or lexiprecision. By mathematical construction, we ensure that lexiprecision preserves differences detected by reciprocal rank, while empirically improving sensitivity and robustness across a broad set of retrieval and recommendation tasks.

2023-06-13

ArXiv (preprint)