Reihaneh Rabbany

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Research Topics
Representation Learning
Learning on Graphs
Data Mining
Graph Neural Networks
Natural Language Processing

Biography

Reihaneh Rabbany is an assistant professor at the School of Computer Science at McGill University. She is a faculty member at Mila, the Quebec Artificial Intelligence Institute, and holds a Canada CIFAR AI Chair. She is also a faculty member of McGill's Centre for the Study of Democratic Citizenship. Before joining McGill University, she was a postdoctoral fellow at the School of Computer Science at Carnegie Mellon University. She received her PhD from the Department of Computing Science at the University of Alberta. She leads the Complex Data Lab, whose research lies at the intersection of network science, data mining, and machine learning, with a focus on analyzing real-world interconnected data and on social applications.

Current Students

Postdoctorate - McGill
Research Intern - McGill
Research Master's - McGill
Research Master's - McGill
Principal supervisor:
Research Intern - UdeM
Principal supervisor:
Independent visiting researcher - University of Sherbrooke
PhD - McGill
Co-supervisor:
Alumni collaborator - McGill
Co-supervisor:
PhD - McGill
Principal supervisor:
Research Intern - McGill University
PhD - McGill
Co-supervisor:
Postdoctorate - McGill
Principal supervisor:
Research Master's - McGill
Co-supervisor:
Independent visiting researcher - McGill
Alumni collaborator - McGill
Alumni collaborator - McGill
Co-supervisor:
Alumni collaborator - McGill
Research Intern - McGill
Research Master's - McGill University
Research Intern - McGill
Research Master's - McGill
Research Master's - McGill
Research Master's - UdeM
Principal supervisor:
Research collaborator - McGill
Research collaborator - UdeM
Principal supervisor:
PhD - McGill
Research Master's - UdeM
Principal supervisor:

Publications

Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation
Recent large language models (LLMs) have been shown to be effective for misinformation detection. However, the choice of LLMs for experiments varies widely, leading to uncertain conclusions. In particular, GPT-4 is known to be strong in this domain, but it is closed source, potentially expensive, and can show instability between different versions. Meanwhile, alternative LLMs have given mixed results. In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5. This provides the research community with a solid open-source option and shows open-source models are gradually catching up on this task. We then highlight how GPT-3.5 exhibits unstable performance, such that this very widely used model could provide misleading results in misinformation detection. Finally, we validate new tools including approaches to structured output and the latest version of GPT-4 (Turbo), showing they do not compromise performance, thus unlocking them for future research and potentially enabling more complex pipelines for misinformation mitigation.
GPS-SSL: Guided Positive Sampling to Inject Prior Into Self-Supervised Learning
Randall Balestriero
Arantxa Casanova
Adriana Romero
We propose Guided Positive Sampling Self-Supervised Learning (GPS-SSL), a general method to inject a priori knowledge into Self-Supervised Learning (SSL) positive sample selection. Current SSL methods leverage Data-Augmentations (DA) both to generate positive samples and to incorporate prior knowledge; an incorrect or too-weak DA will drastically reduce the quality of the learned representation. GPS-SSL proposes instead to design a metric space where Euclidean distances become a meaningful proxy for semantic relationship. In that space, it is now possible to generate positive samples from nearest neighbor sampling. Any prior knowledge can now be embedded into that metric space independently from the employed DA. Owing to its simplicity, GPS-SSL is applicable to any SSL method, e.g. SimCLR or BYOL. A key benefit of GPS-SSL is in reducing the pressure in tailoring strong DAs. For example, GPS-SSL reaches 85.58% on Cifar10 with weak DA while the baseline only reaches 37.51%. We therefore move a step forward towards the goal of making SSL less reliant on DA. We also show that even when using strong DAs, GPS-SSL outperforms the baselines on under-studied domains. We evaluate GPS-SSL along with multiple baseline SSL methods on numerous downstream datasets from different domains when the models use strong or minimal data augmentations. We hope that GPS-SSL will open new avenues in studying how to inject a priori knowledge into SSL in a principled manner.
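The nearest-neighbor positive sampling described in this abstract can be illustrated with a minimal sketch. It assumes the embeddings in the prior-informed metric space are already computed and stored as rows of a NumPy array; the function name `nearest_neighbor_positives` and its parameters are hypothetical and are not taken from the GPS-SSL code.

```python
import numpy as np

def nearest_neighbor_positives(embeddings, k=1, rng=None):
    """Pick, for every sample, one positive among its k nearest neighbors.

    Hypothetical helper: `embeddings` is an (n, d) array of points in a
    metric space where Euclidean distance is assumed to reflect semantics.
    Requires k <= n - 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(embeddings)
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = (embeddings ** 2).sum(axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embeddings @ embeddings.T
    np.fill_diagonal(dists, np.inf)  # a sample is never its own positive
    # Indices of the k closest other samples for each row
    knn = np.argsort(dists, axis=1)[:, :k]
    # Sample one of the k neighbors uniformly at random per row
    choice = rng.integers(0, k, size=n)
    return knn[np.arange(n), choice]

# Two tight clusters: each point's positive is its cluster mate.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
positives = nearest_neighbor_positives(points, k=1)
```

In an SSL pipeline, the returned index would select the positive partner for each anchor in place of (or in addition to) an augmented view; how the two are combined is left open here.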
Uncertainty Resolution in Misinformation Detection
An Evaluation of Language Models for Hyperpartisan Ideology Detection in Persian Twitter
Large Language Models (LLMs) have shown significant promise in various tasks, including identifying the political beliefs of English-speaking social media users from their posts. However, assessing LLMs for this task in non-English languages remains unexplored. In this work, we ask to what extent LLMs can predict the political ideologies of users in Persian social media. To answer this question, we first acknowledge that political parties are not well-defined among Persian users, and therefore we reduce the problem to the simpler task of hyperpartisan ideology detection. We create a new benchmark and show the potential and limitations of both open-source and commercial LLMs in classifying the hyperpartisan ideologies of users. We compare these models with smaller fine-tuned models, both on the Persian language (ParsBERT) and translated data (RoBERTa), showing that they considerably outperform generative LLMs in this task. We further demonstrate that the performance of the generative LLMs degrades when classifying users based on their tweets instead of their bios and even when tweets are added as additional information, whereas the smaller fine-tuned models are robust and achieve similar performance for all classes. This study is a first step toward political ideology detection in Persian Twitter, with implications for future research to understand the dynamics of ideologies in Persian social media.
Game On, Hate Off: A Study of Toxicity in Online Multiplayer Environments.
Nicolas Grenon-Godbout
The advent of online spaces, particularly social media platforms and video games, has brought forth a significant challenge: the detection and mitigation of toxic and harmful speech. This issue is not only pervasive but also detrimental to the overall user experience. In this study, we leverage small language models to reliably detect toxicity, achieving an average precision of 0.95. Analyzing eight months of chat data from two Ubisoft games, we uncover patterns and trends in toxic behavior. The insights derived from our research will contribute to the development of healthier online communities and inform preventive measures against toxicity.
Quantifying learning-style adaptation in effectiveness of LLM teaching
Ruben Weijers
Gabrielle Fidelis de Castilho
This preliminary study aims to investigate whether AI, when prompted based on individual learning styles, can effectively improve comprehension and learning experiences in educational settings. It involves tailoring LLM baseline prompts and comparing the results of a control group receiving standard content and an experimental group receiving learning style-tailored content. Preliminary results suggest that GPT-4 can generate responses aligned with various learning styles, indicating the potential for enhanced engagement and comprehension. However, these results also reveal challenges, including the model’s tendency for sycophantic behavior and variability in responses. Our findings suggest that a more sophisticated prompt engineering approach is required for integrating AI into education (AIEd) to improve educational outcomes.
Temporal Graph Analysis with TGX
Real-world networks, with their evolving relations, are best captured as temporal graphs. However, existing software libraries are largely designed for static graphs where the dynamic nature of temporal graphs is ignored. Bridging this gap, we introduce TGX, a Python package specially designed for analysis of temporal networks that encompasses an automated pipeline for data loading, data processing, and analysis of evolving graphs. TGX provides access to eleven built-in datasets and eight external Temporal Graph Benchmark (TGB) datasets as well as any novel datasets in the .csv format. Beyond data loading, TGX facilitates data processing functionalities such as discretization of temporal graphs and node subsampling to accelerate working with larger datasets. For comprehensive investigation, TGX offers network analysis by providing a diverse set of measures, including average node degree and the evolving number of nodes and edges per timestamp. Additionally, the package consolidates meaningful visualization plots indicating the evolution of temporal patterns, such as Temporal Edge Appearance (TEA) and Temporal Edge Traffic (TET) plots. The TGX package is a robust tool for examining the features of temporal graphs and can be used in various areas like studying social networks, citation networks, and tracking user interactions. We plan to continuously support and update TGX based on community feedback. TGX is publicly available on: https://github.com/ComplexData-MILA/TGX.
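To make the idea of discretization and per-timestamp statistics concrete, here is a small standard-library sketch. It does not use the TGX API; the function names `discretize`, `edges_per_snapshot`, and `nodes_per_snapshot` are hypothetical, and the input is assumed to be a list of `(source, destination, timestamp)` events.

```python
from collections import defaultdict

def discretize(edges, bin_size):
    """Group (src, dst, timestamp) events into snapshots of width `bin_size`.

    Hypothetical helper, not the TGX API. Repeated edges inside one bin
    collapse into a single edge of that snapshot.
    """
    snapshots = defaultdict(set)
    for src, dst, t in edges:
        snapshots[int(t // bin_size)].add((src, dst))
    return dict(sorted(snapshots.items()))

def edges_per_snapshot(snapshots):
    """Number of distinct edges in each snapshot."""
    return {b: len(es) for b, es in snapshots.items()}

def nodes_per_snapshot(snapshots):
    """Number of distinct nodes that appear in each snapshot."""
    return {b: len({n for e in es for n in e}) for b, es in snapshots.items()}

events = [(0, 1, 0), (1, 2, 5), (0, 1, 7), (2, 3, 12)]
snaps = discretize(events, bin_size=10)
edge_counts = edges_per_snapshot(snaps)
node_counts = nodes_per_snapshot(snaps)
```

Choosing `bin_size` trades temporal resolution for smaller, denser snapshots, which is the usual motivation for discretizing an event stream before analysis.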
UTG: Towards a Unified View of Snapshot and Event Based Models for Temporal Graphs
Many real world graphs are inherently dynamic, constantly evolving with node and edge additions. These graphs can be represented by temporal graphs, either through a stream of edge events or a sequence of graph snapshots. Until now, the development of machine learning methods for both types has occurred largely in isolation, resulting in limited experimental comparison and theoretical cross-pollination between the two. In this paper, we introduce Unified Temporal Graph (UTG), a framework that unifies snapshot-based and event-based machine learning models under a single umbrella, enabling models developed for one representation to be applied effectively to datasets of the other. We also propose a novel UTG training procedure to boost the performance of snapshot-based models in the streaming setting. We comprehensively evaluate both snapshot and event-based models across both types of temporal graphs on the temporal link prediction task. Our main findings are threefold: first, when combined with UTG training, snapshot-based models can perform competitively with event-based models such as TGN and GraphMixer even on event datasets. Second, snapshot-based models are at least an order of magnitude faster than most event-based models during inference. Third, while event-based methods such as NAT and DyGFormer outperform snapshot-based methods on both types of temporal graphs, this is because they leverage joint neighborhood structural features, thus emphasizing the potential to incorporate these features into snapshot-based models as well. These findings highlight the importance of comparing model architectures independent of the data format and suggest the potential of combining the efficiency of snapshot-based models with the performance of event-based models in the future.
Exhaustive Evaluation of Dynamic Link Prediction
Dynamic link prediction is a crucial task in the study of evolving graphs, which serve as abstract models for various real-world applications. Recent dynamic graph representation learning models have claimed near-perfect performance in this task. However, we argue that the standard evaluation strategy for dynamic link prediction overlooks the sparsity and recurrence patterns inherent in dynamic networks. Specifically, the current strategy suffers from issues such as evaluating models on a balanced set of positive and negative edges, neglecting the reassessment of frequently recurring positive edges, and lacking a comprehensive evaluation of both recurring and new edges. To address these limitations, we propose a novel evaluation strategy called EXHAUSTIVE, which takes into account all relevant negative edges and separately assesses the performance on recurring and new edges. Using our proposed evaluation strategy, we compare the performance of five state-of-the-art dynamic graph learning models on seven benchmark datasets. Compared to the previous common evaluation strategy, we observe an average drop of 62% in Average Precision for dynamic link prediction. Additionally, the ranking of the models also changes under the new evaluation setting. Furthermore, we demonstrate that while all models perform considerably worse when predicting new edges compared to recurring ones, the best performing models differ between the two scenarios. This highlights the importance of employing the proposed evaluation strategy for both the assessment and design of dynamic link prediction models. By adopting our novel evaluation strategy, researchers can obtain a more accurate understanding of model performance in dynamic link prediction, leading to improved evaluation and design of such models.
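The two ingredients named in this abstract, separating recurring from new edges and scoring against all negatives rather than a balanced sample, can be sketched as follows. This is an illustration only: the helper names are hypothetical, and reading "all relevant negative edges" as every directed non-edge among the given nodes is an assumption, not the paper's definition.

```python
def split_recurring_new(train_edges, test_edges):
    """Split test edges into those already seen in training and unseen ones."""
    seen = set(train_edges)
    recurring = [e for e in test_edges if e in seen]
    new = [e for e in test_edges if e not in seen]
    return recurring, new

def all_negative_edges(nodes, positive_edges):
    """Every directed node pair (no self-loops) that is not a positive edge.

    Assumption: "all relevant negatives" is taken to mean all non-edges
    among `nodes` at the evaluation step.
    """
    positives = set(positive_edges)
    return [(u, v) for u in nodes for v in nodes
            if u != v and (u, v) not in positives]

train = [(0, 1), (1, 2)]
test = [(0, 1), (2, 0)]
recurring, new = split_recurring_new(train, test)
negatives = all_negative_edges([0, 1, 2], test)
```

A model's scores on `recurring` and `new` would then each be compared against the scores on `negatives`, giving two separate metric values instead of one pooled number.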
SWEET: Weakly Supervised Person Name Extraction for Fighting Human Trafficking
Laplacian Change Point Detection for Single and Multi-view Dynamic Graphs
Dynamic graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real world applications such as intrusion identification in network systems, detection of ecosystem disturbances and detection of epidemic outbreaks. In this paper, we focus on change point detection in dynamic graphs and address three main challenges associated with this problem: i) how to compare graph snapshots across time, ii) how to capture temporal dependencies, and iii) how to combine different views of a temporal graph. To solve the above challenges, we first propose Laplacian Anomaly Detection (LAD) which uses the spectrum of graph Laplacian as the low dimensional embedding of the graph structure at each snapshot. LAD explicitly models short term and long term dependencies by applying two sliding windows. Next, we propose MultiLAD, a simple and effective generalization of LAD to multi-view graphs. MultiLAD provides the first change point detection method for multi-view dynamic graphs. It aggregates the singular values of the normalized graph Laplacian from different views through the scalar power mean operation. Through extensive synthetic experiments, we show that i) LAD and MultiLAD are accurate and outperform state-of-the-art baselines and their multi-view extensions by a large margin, ii) MultiLAD's advantage over contenders significantly increases when additional views are available, and iii) MultiLAD is highly robust to noise from individual views. In five real world dynamic graphs, we demonstrate that LAD and MultiLAD identify significant events as top anomalies such as the implementation of government COVID-19 interventions which impacted the population mobility in multi-view traffic networks.
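A simplified sketch of the spectral idea in this abstract follows. It is not the authors' implementation: it assumes undirected snapshots given as dense adjacency matrices with a fixed node set, uses the unnormalized Laplacian, summarizes each window by the mean spectrum, and combines the short and long windows with a maximum. All of those choices, and the function names, are assumptions made for illustration.

```python
import numpy as np

def laplacian_spectrum(adj):
    """L2-normalized singular values of the unnormalized Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    sv = np.linalg.svd(lap, compute_uv=False)
    norm = np.linalg.norm(sv)
    return sv / norm if norm > 0 else sv

def window_scores(spectra, window):
    """1 - cosine similarity between each spectrum and the mean of the
    `window` spectra before it (assumed summary of 'typical' behavior)."""
    scores = np.zeros(len(spectra))
    for t in range(window, len(spectra)):
        context = spectra[t - window:t].mean(axis=0)
        denom = np.linalg.norm(context) * np.linalg.norm(spectra[t])
        cos = spectra[t] @ context / denom if denom > 0 else 1.0
        scores[t] = 1.0 - cos
    return scores

def anomaly_scores(snapshots, short=2, long=4):
    """Combine a short and a long sliding window (max is an assumption)."""
    spectra = np.stack([laplacian_spectrum(a) for a in snapshots])
    return np.maximum(window_scores(spectra, short), window_scores(spectra, long))

# Five path graphs on 4 nodes followed by a complete graph: the last
# snapshot is the structural change.
path = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
complete = np.ones((4, 4)) - np.eye(4)
scores = anomaly_scores([path] * 5 + [complete])
```

The highest score lands on the final snapshot, where the spectrum departs from that of the preceding windows; snapshots that match their context score approximately zero.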
Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4
Meilina Reksoprodjo
Caleb Gupta
Joel Christoph
Published online: 24 May 2023