Parishad BehnamGhader

Maîtrise recherche - McGill

Superviseur⋅e principal⋅e

Siva Reddy

Sujets de recherche

Traitement du langage naturel

Site web

Google Scholar

GitHub

Publications

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

Parishad BehnamGhader

2024-04-09

ArXiv (prépublication)

doi.org

arxiv.org

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

Parishad BehnamGhader

Large decoder-only language models (LLMs) are the state-of-the-art models on most of today's NLP tasks and benchmarks. Yet, the community is… (voir plus) only slowly adopting these models for text embedding tasks, which require rich contextualized representations. In this work, we introduce LLM2Vec, a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder. LLM2Vec consists of three simple steps: 1) enabling bidirectional attention, 2) masked next token prediction, and 3) unsupervised contrastive learning. We demonstrate the effectiveness of LLM2Vec by applying it to 4 popular LLMs ranging from 1.3B to 8B parameters and evaluate the transformed models on English word- and sequence-level tasks. We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB). Moreover, when combining LLM2Vec with supervised contrastive learning, we achieve state-of-the-art performance on MTEB among models that train only on publicly available data (as of May 24, 2024). Our strong empirical results and extensive analysis demonstrate that LLMs can be effectively transformed into universal text encoders in a parameter-efficient manner without the need for expensive adaptation or synthetic GPT-4 generated data.

2024-04-09

ArXiv (prépublication)

doi.org

arxiv.org

Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model

Parishad BehnamGhader

Santiago Miret

Siva Reddy

Augmenting pretrained language models with retrievers to select the supporting documents has shown promise in effectively solving common NLP… (voir plus) problems, including language modeling and question answering, in an interpretable way. In this paper, we first study the strengths and weaknesses of different retriever-augmented language models (REALM,

2023-12-01

Findings of the Association for Computational Linguistics: EMNLP 2023 (publié)

doi.org

openreview.net

Hackathon | Créer une IA plus sécuritaire pour la santé mentale des jeunes

Éclaireurs autochtones en IA

Avantage IA

Parishad BehnamGhader

Publications

Hackathon | Créer une IA plus sécuritaire pour la santé mentale des jeunes

Éclaireurs autochtones en IA

Avantage IA

Mots-clés populaires:

Parishad BehnamGhader

Publications