Istabrak Abbes

PhD - UdeM

Principal supervisor

Sarath Chandar

Research topics

Continual learning

Deep learning

Natural language processing

Publications

Emergent Reasoning via Recursive Latent Reinforcement Pretraining

Large language models (LLMs) often rely on explicit chain-of-thought (CoT) traces to solve multi-step reasoning problems, but these traces increase inference cost, expose brittle prompt dependence, and complicate training objectives. We study an alternative: latent deliberation implemented as a small recurrent refinement module that performs multiple internal "thinking" steps while keeping the external sequence length fixed. We introduce Recursive Latent Reinforcement Pretraining (RLRP), a training recipe that augments a base causal LLM with a shared latent head executed for

2026-03-04

LLM_Reasoning @ International Conference on Learning Representations (published)

openreview.net

Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models

Matthew D Riemer

Tsuguchika Tabaru

Hiroaki Kingetsu

A. Chandar

Irina Rish

2025-09-21

NeurIPS.cc/2025/Workshop/WiML (published)

doi.org

openreview.net

Small Encoders Can Rival Large Decoders in Detecting Groundedness

Istabrak Abbes

Gabriele Prato

Quentin Fournier

Fernando Rodriguez

Alaa Boukhary

Adam Elwood

A. Chandar

Augmenting large language models (LLMs) with external context significantly improves their performance in natural language processing (NLP) tasks. However, LLMs struggle to answer queries reliably when the provided context lacks information, often resorting to ungrounded speculation or internal knowledge. Groundedness - generating responses strictly supported by the context - is essential for ensuring factual consistency and trustworthiness. This study focuses on detecting whether a given query is grounded in a document provided in context before the costly answer generation by LLMs. Such a detection mechanism can significantly reduce both inference time and resource consumption. We show that lightweight, task-specific encoder models such as RoBERTa and NomicBERT, fine-tuned on curated datasets, can achieve accuracy comparable to state-of-the-art LLMs, such as Llama3 8B and GPT4o, in groundedness detection while reducing inference latency by orders of magnitude. The code is available at: https://github.com/chandarlab/Hallucinate-less

2025-06-25

arXiv (preprint)

doi.org

arxiv.org
