Gabriele Prato

Effect of Document Packing on the Latent Multi-Hop Reasoning Capabilities of Large Language Models

A. Chandar

The standard practice for training large language models involves packing multiple documents together to optimize computational efficiency. However, the impact of this process on the models' capabilities remains largely unexplored. To address this gap, we investigate how different document-packing strategies influence the latent multi-hop reasoning abilities of LLMs. Our findings indicate that packing can improve model performance compared to training on individual documents, at the expense of more compute. To further understand the underlying mechanisms, we conduct an ablation study, identifying key factors that explain the advantages of packing. Ultimately, our research deepens the understanding of LLM training dynamics and provides practical insights for optimizing model development.

2025-12-15

ArXiv (preprint)

doi.org

arxiv.org

Small Encoders Can Rival Large Decoders in Detecting Groundedness

Istabrak Abbes

Gabriele Prato

Quentin Fournier

Fernando Rodriguez

Alaa Boukhary

Adam Elwood

A. Chandar

Augmenting large language models (LLMs) with external context significantly improves their performance in natural language processing (NLP) tasks. However, LLMs struggle to answer queries reliably when the provided context lacks information, often resorting to ungrounded speculation or internal knowledge. Groundedness - generating responses strictly supported by the context - is essential for ensuring factual consistency and trustworthiness. This study focuses on detecting whether a given query is grounded in a document provided in context before the costly answer generation by LLMs. Such a detection mechanism can significantly reduce both inference time and resource consumption. We show that lightweight, task-specific encoder models such as RoBERTa and NomicBERT, fine-tuned on curated datasets, can achieve accuracy comparable to state-of-the-art LLMs, such as Llama3 8B and GPT4o, in groundedness detection while reducing inference latency by orders of magnitude. The code is available at https://github.com/chandarlab/Hallucinate-less

2025-06-25

ArXiv (preprint)

doi.org

arxiv.org

Do Large Language Models Know How Much They Know?

Gabriele Prato

Jerry Huang

Prasanna Parthasarathi

Shagun Sodhani

A. Chandar

2023-12-31

EMNLP (published)

doi.org

arxiv.org

EpiK-Eval: Evaluation for Language Models as Epistemic Models

Gabriele Prato

Jerry Huang

Prasanna Parthasarathi

Shagun Sodhani

A. Chandar

In the age of artificial intelligence, the role of large language models (LLMs) is becoming increasingly central. Despite their growing prevalence, their capacity to consolidate knowledge from different training documents, a crucial ability in numerous applications, remains unexplored. This paper presents the first study examining the capability of LLMs to effectively combine such information within their parameter space. We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives. Evaluations across various LLMs reveal significant weaknesses in this domain. We contend that these shortcomings stem from the intrinsic nature of prevailing training objectives. Consequently, we advocate for refining the approach towards knowledge consolidation, as it harbors the potential to dramatically improve their overall effectiveness and performance. The findings from this study offer insights for developing more robust and reliable LLMs. Our code and benchmark are available at https://github.com/chandar-lab/EpiK-Eval

2023-10-06

EMNLP/2023/Conference (accepted)

doi.org

openreview.net

PatchBlender: A Motion Prior for Video Transformers

Gabriele Prato

Yale Song

Janarthanan Rajendran

R Devon Hjelm

Neel Joshi

A. Chandar

2022-11-10

ArXiv (preprint)

doi.org

openreview.net

Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

A. Chandar

Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in the light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting the neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for focus on the best-scaling, and thus most promising in the future, approaches. In this work, we consider a challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase is different from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies to both cases of target data coming from either the same or from a different domain (i.e., new classes) as the training data, and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.

2021-10-12

ArXiv (preprint)

openreview.net
