
Zachary Yang

PhD - McGill University
Principal supervisor

Publications

Party Prediction for Twitter
Kellin Pelrine
Anne Imouza
Zachary Yang
Jacob-Junqi Tian
Sacha Lévy
Gabrielle Desrosiers-Brisebois
Aarash Feizi
Cécile Amadoro
André Blais
Jean-François Godbout
Open, Closed, or Small Language Models for Text Classification?
Hao Yu
Zachary Yang
Kellin Pelrine
Jean-François Godbout
Recent advancements in large language models have demonstrated remarkable capabilities across various NLP tasks. But many questions remain, including whether open-source models match closed ones, why these models excel or struggle with certain tasks, and what types of practical procedures can improve performance. We address these questions in the context of classification by evaluating three classes of models using eight datasets across three distinct tasks: named entity recognition, political party prediction, and misinformation detection. While larger LLMs often lead to improved performance, open-source models can rival their closed-source counterparts through fine-tuning. Moreover, supervised smaller models, like RoBERTa, can achieve similar or even greater performance on many datasets compared to generative LLMs. On the other hand, closed models maintain an advantage in hard tasks that demand the most generalizability. This study underscores the importance of model selection based on task requirements.
ToxBuster: In-game Chat Toxicity Buster with BERT
Zachary Yang
Yasmine Maricar
M. Davari
Nicolas Grenon-Godbout
Detecting toxicity in online spaces is challenging and an ever more pressing problem given the increase in social media and gaming consumption. We introduce ToxBuster, a simple and scalable model trained on a relatively large dataset of 194k lines of game chat from Rainbow Six Siege and For Honor, carefully annotated for different kinds of toxicity. Compared to the existing state-of-the-art, ToxBuster achieves 82.95% (+7) in precision and 83.56% (+57) in recall. This improvement is obtained by leveraging past chat history and metadata. We also study the implications for real-time and post-game moderation as well as the model's transferability from one game to another.
Towards Detecting Contextual Real-Time Toxicity for In-Game Chat
Zachary Yang
Nicolas Grenon-Godbout
Real-time toxicity detection in online environments poses a significant challenge, due to the increasing prevalence of social media and gaming platforms. We introduce ToxBuster, a simple and scalable model that reliably detects toxic content in real-time for a line of chat by including chat history and metadata. ToxBuster consistently outperforms conventional toxicity models across popular multiplayer games, including Rainbow Six Siege, For Honor, and DOTA 2. We conduct an ablation study to assess the importance of each model component and explore ToxBuster's transferability across the datasets. Furthermore, we showcase ToxBuster's efficacy in post-game moderation, successfully flagging 82.1% of chat-reported players at a precision level of 90.0%. Additionally, we show how an additional 6% of unreported toxic players can be proactively moderated.
Online Partisan Polarization of COVID-19
Zachary Yang
Anne Imouza
Kellin Pelrine
Sacha Lévy
Jiewen Liu
Gabrielle Desrosiers-Brisebois
Jean-François Godbout
André Blais
In today’s age of (mis)information, many people utilize various social media platforms in an attempt to shape public opinion on several important issues, including elections and the COVID-19 pandemic. These two topics have recently become intertwined given the importance of complying with public health measures related to COVID-19 and politicians’ management of the pandemic. Motivated by this, we study the partisan polarization of COVID-19 discussions on social media. We propose and utilize a novel measure of partisan polarization to analyze more than 380 million posts from Twitter and Parler around the 2020 US presidential election. We find a strong correlation between peaks in polarization and polarizing events, such as the January 6th Capitol Hill riot. We further classify each post into key COVID-19 issues of lockdown, masks, vaccines, as well as miscellaneous, to investigate both the volume and polarization on these topics and how they vary through time. Parler includes more negative discussions around lockdown and masks, as expected, but not much around vaccines. We also observe more balanced discussions on Twitter and a general disconnect between the discussions on Parler and Twitter.