Kellin Pelrine

Uncertainty Resolution in Misinformation Detection

Yury Orlovskiy

Camille Thibault

Anne Imouza

2024-01-02

ArXiv (preprint)

An Evaluation of Language Models for Hyperpartisan Ideology Detection in Persian Twitter

Sahar Omidi Shayegan

Isar Nejadgholi

Hao Yu

Sacha Lévy

Zachary Yang

Large Language Models (LLMs) have shown significant promise in various tasks, including identifying the political beliefs of English-speakin… (see more)g social media users from their posts. However, assessing LLMs for this task in non-English languages remains unexplored. In this work, we ask to what extent LLMs can predict the political ideologies of users in Persian social media. To answer this question, we first acknowledge that political parties are not well-defined among Persian users, and therefore, we simplify the task to a much simpler task of hyperpartisan ideology detection. We create a new benchmark and show the potential and limitations of both open-source and commercial LLMs in classifying the hyper-partisan ideologies of users. We compare these models with smaller fine-tuned models, both on the Persian language (ParsBERT) and translated data (RoBERTa), showing that they considerably outperform generative LLMs in this task. We further demonstrate that the performance of the generative LLMs degrades when classifying users based on their tweets instead of their bios and even when tweets are added as additional information, whereas the smaller fine-tuned models are robust and achieve similar performance for all classes. This study is a first step toward political ideology detection in Persian Twitter, with implications for future research to understand the dynamics of ideologies in Persian social media.

2024-01-01

EURALI (published)

www.semanticscholar.org

An Evaluation of Language Models for Hyperpartisan Ideology Detection in Persian Twitter

Sahar Omidi Shayegan

Isar Nejadgholi

Hao Yu

Sacha Lévy

Zachary Yang

Large Language Models (LLMs) have shown significant promise in various tasks, including identifying the political beliefs of English-speakin… (see more)g social media users from their posts. However, assessing LLMs for this task in non-English languages remains unexplored. In this work, we ask to what extent LLMs can predict the political ideologies of users in Persian social media. To answer this question, we first acknowledge that political parties are not well-defined among Persian users, and therefore, we simplify the task to a much simpler task of hyperpartisan ideology detection. We create a new benchmark and show the potential and limitations of both open-source and commercial LLMs in classifying the hyper-partisan ideologies of users. We compare these models with smaller fine-tuned models, both on the Persian language (ParsBERT) and translated data (RoBERTa), showing that they considerably outperform generative LLMs in this task. We further demonstrate that the performance of the generative LLMs degrades when classifying users based on their tweets instead of their bios and even when tweets are added as additional information, whereas the smaller fine-tuned models are robust and achieve similar performance for all classes. This study is a first step toward political ideology detection in Persian Twitter, with implications for future research to understand the dynamics of ideologies in Persian social media.

Quantifying learning-style adaptation in effectiveness of LLM teaching

Ruben Weijers

Gabrielle Fidelis de Castilho

This preliminary study aims to investigate whether AI, when prompted based on individual learning styles, can effectively improve comprehens… (see more)ion and learning experiences in educational settings. It involves tailoring LLMs baseline prompts and comparing the results of a control group receiving standard content and an experimental group receiving learning style-tailored content. Preliminary results suggest that GPT-4 can generate responses aligned with various learning styles, indicating the potential for enhanced engagement and comprehension. However, these results also reveal challenges, including the model’s tendency for sycophantic behavior and variability in responses. Our findings suggest that a more sophisticated prompt engineering approach is required for integrating AI into education (AIEd) to improve educational outcomes.

SWEET - Weakly Supervised Person Name Extraction for Fighting Human Trafficking

Javin Liu

Hao Yu

In this work, we propose a weak supervision pipeline SWEET: Supervise Weakly for Entity Extraction to fight Trafficking for extracting perso… (see more)n names from noisy escort advertisements. Our method combines the simplicity of rule-matching (through antirules, i.e., negated rules) and the generalizability of large language models fine-tuned on benchmark, domain-specific and synthetic datasets, treating them as weak labels. One of the major challenges in this domain is limited labeled data. SWEET addresses this by obtaining multiple weak labels through labeling functions and effectively aggregating them. SWEET outperforms the previous supervised SOTA method for this task by 9% F1 score on domain data and better generalizes to common benchmark datasets. Furthermore, we also release HTGEN, a synthetically generated dataset of escort advertisements (built using ChatGPT) to facilitate further research within the community.

2023-12-01

Findings of the Association for Computational Linguistics: EMNLP 2023 (published)

openreview.net

Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4

Anne Imouza

Meilina Reksoprodjo

Camille Thibault

Caleb Gupta

Joel Christoph

Misinformation poses a critical societal challenge, and current approaches have yet to produce an effective solution. We propose focusing on… (see more) generalization, uncertainty, and how to leverage recent large language models, in order to create more practical tools to evaluate information veracity in contexts where perfect classification is impossible. We first demonstrate that GPT-4 can outperform prior methods in multiple settings and languages. Next, we explore generalization, revealing that GPT-4 and RoBERTa-large exhibit differences in failure modes. Third, we propose techniques to handle uncertainty that can detect impossible examples and strongly improve outcomes. We also discuss results on other language models, temperature, prompting, versioning, explainability, and web retrieval, each one providing practical insights and directions for future research. Finally, we publish the LIAR-New dataset with novel paired English and French misinformation data and Possibility labels that indicate if there is sufficient context for veracity evaluation. Overall, this research lays the groundwork for future tools that can drive real-world progress to combat misinformation.

2023-10-07

EMNLP/2023/Conference (accepted)

openreview.net

Party Prediction for Twitter

Sacha Lévy

Gabrielle Desrosiers-Brisebois

Aarash Feizi

C'ecile Amadoro

André Blais

2023-08-25

ArXiv (preprint)

Open, Closed, or Small Language Models for Text Classification?

Hao Yu

Zachary Yang

Recent advancements in large language models have demonstrated remarkable capabilities across various NLP tasks. But many questions remain, … (see more)including whether open-source models match closed ones, why these models excel or struggle with certain tasks, and what types of practical procedures can improve performance. We address these questions in the context of classification by evaluating three classes of models using eight datasets across three distinct tasks: named entity recognition, political party prediction, and misinformation detection. While larger LLMs often lead to improved performance, open-source models can rival their closed-source counterparts by fine-tuning. Moreover, supervised smaller models, like RoBERTa, can achieve similar or even greater performance in many datasets compared to generative LLMs. On the other hand, closed models maintain an advantage in hard tasks that demand the most generalizability. This study underscores the importance of model selection based on task requirements

2023-08-19

ArXiv (preprint)

Active Keyword Selection to Track Evolving Topics on Twitter

Sacha Lévy

Farimah Poursafaei

How can we study social interactions on evolving topics at a mass scale? Over the past decade, researchers from diverse fields such as econo… (see more)mics, political science, and public health have often done this by querying Twitter's public API endpoints with hand-picked topical keywords to search or stream discussions. However, despite the API's accessibility, it remains difficult to select and update keywords to collect high-quality data relevant to topics of interest. In this paper, we propose an active learning method for rapidly refining query keywords to increase both the yielded topic relevance and dataset size. We leverage a large open-source COVID-19 Twitter dataset to illustrate the applicability of our method in tracking Tweets around the key sub-topics of Vaccine, Mask, and Lockdown. Our experiments show that our method achieves an average topic-related keyword recall 2x higher than baselines. We open-source our code along with a web interface for keyword selection to make data collection from Twitter more systematic for researchers.

2022-12-01

2022 IEEE International Conference on Data Mining Workshops (ICDMW) (published)

Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking

2022-01-01

Findings (published)

Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking

Online escort advertisement websites are widely used for advertising victims of human trafficking. Domain experts agree that advertising mul… (see more)tiple people in the same ad is a strong indicator of trafficking. Thus, extracting person names from the text of these ads can provide valuable clues for further analysis. However, Named-Entity Recognition (NER) on escort ads is challenging because the text can be noisy, colloquial and often lacking proper grammar and punctuation. Most existing state-of-the-art NER models fail to demonstrate satisfactory performance in this task. In this paper, we propose NEAT (Name Extraction Against Trafficking) for extracting person names. It effectively combines classic rule-based and dictionary extractors with a contextualized language model to capture ambiguous names (e.g penny, hazel) and adapts to adversarial changes in the text by expanding its dictionary. NEAT shows 19% improvement on average in the F1 classification score for name extraction compared to previous state-of-the-art in two domain-specific datasets.

2022-01-01

Findings (published)