Kellin Pelrine

Towards Better Evaluation for Dynamic Link Prediction

Andy Huang

Despite the prevalence of recent success in learning from static graphs, learning from time-evolving graphs remains an open challenge. In this work, we design new, more stringent evaluation procedures for link prediction specific to dynamic graphs, which reflect real-world considerations, to better compare the strengths and weaknesses of methods. First, we create two visualization techniques to understand the reoccurring patterns of edges over time and show that many edges reoccur at later time steps. Based on this observation, we propose a pure memorization-based baseline called EdgeBank. EdgeBank achieves surprisingly strong performance across multiple settings, which highlights that the negative edges used in the current evaluation are easy. To sample more challenging negative edges, we introduce two novel negative sampling strategies that improve robustness and better match real-world applications. Lastly, we introduce six new dynamic graph datasets from a diverse set of domains missing from current benchmarks, providing new challenges and opportunities for future research. Our code repository is accessible at https://github.com/fpour/DGB.git.
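To make the idea of a pure memorization baseline concrete, here is a minimal sketch. It is not the authors' EdgeBank code: the class name `EdgeMemory`, its methods, and the choice to treat edges as directed `(source, destination)` pairs are assumptions made for illustration only.

```python
from typing import Iterable, Set, Tuple

# Assumption: an edge is a directed (source, destination) pair of node ids.
Edge = Tuple[int, int]


class EdgeMemory:
    """Hypothetical memorization-only link predictor.

    It stores every edge observed so far and predicts that a queried
    edge exists if and only if it has been observed before.
    """

    def __init__(self) -> None:
        self._seen: Set[Edge] = set()

    def update(self, edges: Iterable[Edge]) -> None:
        """Remember a batch of observed (positive) edges."""
        self._seen.update(edges)

    def predict(self, edge: Edge) -> bool:
        """Return True if the edge was observed at an earlier step."""
        return edge in self._seen


# Toy usage: observe two edges, then query one seen and one unseen edge.
memory = EdgeMemory()
memory.update([(1, 2), (2, 3)])
predictions = [memory.predict(e) for e in [(1, 2), (3, 1)]]
```

A predictor like this has no learned parameters, so if it scores well, the negative edges in the evaluation are probably easy to tell apart from recurring positive ones.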

openreview.net

Online Partisan Polarization of COVID-19

Sacha Lévy

Gabrielle Desrosiers-Brisebois

Jean-François Godbout

André Blais

Reihaneh Rabbany

In today’s age of (mis)information, many people utilize various social media platforms in an attempt to shape public opinion on several important issues, including elections and the COVID-19 pandemic. These two topics have recently become intertwined given the importance of complying with public health measures related to COVID-19 and politicians’ management of the pandemic. Motivated by this, we study the partisan polarization of COVID-19 discussions on social media. We propose and utilize a novel measure of partisan polarization to analyze more than 380 million posts from Twitter and Parler around the 2020 US presidential election. We find strong correlation between peaks in polarization and polarizing events, such as the January 6th Capitol Hill riot. We further classify each post into key COVID-19 issues of lockdown, masks, vaccines, as well as miscellaneous, to investigate both the volume and polarization on these topics and how they vary through time. Parler includes more negative discussions around lockdown and masks, as expected, but not much around vaccines. We also observe more balanced discussions on Twitter and a general disconnect between the discussions on Parler and Twitter.

2021-12-01

2021 International Conference on Data Mining Workshops (ICDMW) (published)

doi.org

The Surprising Performance of Simple Baselines for Misinformation Detection

Kellin Pelrine

Jacob Danovitch

Reihaneh Rabbany

As social media becomes increasingly prominent in our day-to-day lives, it is increasingly important to detect informative content and prevent the spread of disinformation and unverified rumours. While many sophisticated and successful models have been proposed in the literature, they are often compared with older NLP baselines such as SVMs, CNNs, and LSTMs. In this paper, we examine the performance of a broad set of modern transformer-based language models and show that with basic fine-tuning, these models are competitive with and can even significantly outperform recently proposed state-of-the-art methods. We present our framework as a baseline for creating and evaluating new methods for misinformation detection. We further study a comprehensive set of benchmark datasets, and discuss potential data leakage and the need for careful design of the experiments and understanding of datasets to account for confounding variables. As an extreme case example, we show that classifying only based on the first three digits of tweet ids, which contain information on the date, gives state-of-the-art performance on a commonly used benchmark dataset for fake news detection, Twitter16. We provide a simple tool to detect this problem and suggest steps to mitigate it in future datasets.
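The leakage check described above can be illustrated with a small sketch. It is not the tool from the paper: the function names, the majority-vote rule per prefix, and the toy ids and labels below are all hypothetical, chosen only to show how a label could be predicted from the leading digits of an id alone.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def fit_prefix_classifier(
    examples: Iterable[Tuple[str, str]], prefix_len: int = 3
) -> Dict[str, str]:
    """Map each id prefix to the most frequent label seen with it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for tweet_id, label in examples:
        counts[tweet_id[:prefix_len]][label] += 1
    return {prefix: c.most_common(1)[0][0] for prefix, c in counts.items()}


def predict(
    model: Dict[str, str], tweet_id: str, default: str, prefix_len: int = 3
) -> str:
    """Predict from the id prefix only; fall back to `default` if unseen."""
    return model.get(tweet_id[:prefix_len], default)


# Toy, made-up data: ids and labels are illustrative, not from Twitter16.
toy_train = [
    ("123000001", "rumour"),
    ("123000002", "rumour"),
    ("123000003", "non-rumour"),
    ("456000001", "non-rumour"),
]
model = fit_prefix_classifier(toy_train)
seen_prefix_label = predict(model, "123999999", default="unknown")
unseen_prefix_label = predict(model, "999000000", default="unknown")
```

If a classifier this simple scores well on held-out data, the ids are likely leaking collection dates that correlate with the labels, which points to a confound in how the dataset was built.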

2021-06-03

Proceedings of the Web Conference 2021 (published)

doi.org

arxiv.org

ComplexDataLab at W-NUT 2020 Task 2: Detecting Informative COVID-19 Tweets by Attending over Linked Documents

Kellin Pelrine

Jacob Danovitch

Albert Orozco Camacho

Reihaneh Rabbany

Given the global scale of COVID-19 and the flood of social media content related to it, how can we find informative discussions? We present Gapformer, which effectively classifies content as informative or not. It reformulates the problem as graph classification, drawing on not only the tweet but also connected webpages and entities. We leverage a pre-trained language model as well as the connections between nodes to learn a pooled representation for each document network. We show it outperforms several competitive baselines and present ablation studies supporting the benefit of the linked information. Code is available on GitHub.

2020-11-01

WNUT (published)

doi.org

