Reihaneh Rabbany

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Research Topics
Representation Learning
Learning on Graphs
Data Mining
Graph Neural Networks
Natural Language Processing

Biography

Reihaneh Rabbany is an assistant professor at the School of Computer Science at McGill University. She is a faculty member of Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. She is also a faculty member of McGill's Centre for the Study of Democratic Citizenship. Before joining McGill University, she was a postdoctoral fellow at the School of Computer Science at Carnegie Mellon University. She received her PhD from the Department of Computing Science at the University of Alberta. She leads the Complex Data Lab, whose research lies at the intersection of network science, data mining, and machine learning, with a focus on analyzing real-world interconnected data and on social applications.

Current Students

Research Master's - McGill
Principal supervisor:
PhD - McGill
Co-supervisor:
Research Collaborator - McGill
Research Collaborator - University of Mannheim
Principal supervisor:
PhD - McGill
Co-supervisor:
Research Master's - McGill
Research Intern - UdeM
Research Master's - McGill
Co-supervisor:
Research Master's - McGill
Research Master's - McGill
Co-supervisor:
Research Collaborator
Research Collaborator
Principal supervisor:
Research Intern - McGill
Research Master's - McGill
Research Intern - Université de Montréal
PhD - McGill
Research Intern - UdeM

Publications

T-NET: Weakly Supervised Graph Learning for Combatting Human Trafficking
Pratheeksha Nair
Javin Liu
Catalina Vajiac
Andreas Olligschlaeger
Duen Horng Chau
Mirela T. Cazzolato
Cara Jones
Christos Faloutsos
Human trafficking (HT) for forced sexual exploitation, often described as modern-day slavery, is a pervasive problem that affects millions of people worldwide. Perpetrators of this crime post advertisements (ads) on behalf of their victims on adult service websites (ASW). These websites typically contain hundreds of thousands of ads, including those posted by independent escorts, massage parlor agencies and spammers (fake ads). Detecting suspicious activity in these ads is difficult, and developing data-driven methods is challenging due to the hard-to-label, complex and sensitive nature of the data. In this paper, we propose T-Net, which, unlike previous solutions, formulates this problem as weakly supervised classification. Since it takes several months to years to investigate a case and obtain a single definitive label, we design domain-specific signals or indicators that provide weak labels. T-Net also looks into connections between ads and models the problem as a graph learning task instead of classifying ads independently. We show that T-Net outperforms all baselines on a real-world dataset of ads by 7% average weighted F1 score. Given that this data contains personally identifiable information, we also present a realistic data generator and provide the first publicly available dataset in this domain, which may be leveraged by the wider research community.
Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets
Shenyang Huang
Joao Alex Cunha
Zhiyi Li
Gabriela Moisescu-Pareja
Oleksandr Dymov
Samuel Maddrell-Mander
Callum McLean
Frederik Wenkel
Luis Müller
Jama Hussein Mohamud
Ali Parviz
Michael Craig
Michał Koziarski
Jiarui Lu
Zhaocheng Zhu
Cristian Gabellini
Kerstin Klaser
Josef Dean
Cas Wognum
Maciej Sypetkowski
Christopher Morris
Ioannis Koutis
Prudencio Tossou
Hadrien Mary
Therence Bois
Andrew William Fitzgibbon
Blazej Banaszewski
Chad Martin
Dominic Masters
Recently, pre-trained foundation models have enabled significant advancements in multiple fields. In molecular machine learning, however, where datasets are often hand-curated, and hence typically small, the lack of datasets with labeled features, and codebases to manage those datasets, has hindered the development of foundation models. In this work, we present seven novel datasets categorized by size into three distinct categories: ToyMix, LargeMix and UltraLarge. These datasets push the boundaries in both the scale and the diversity of supervised labels for molecular learning. They cover nearly 100 million molecules and over 3000 sparsely defined tasks, totaling more than 13 billion individual labels of both quantum and biological nature. In comparison, our datasets contain 300 times more data points than the widely used OGB-LSC PCQM4Mv2 dataset, and 13 times more than the quantum-only QM1B dataset. In addition, to support the development of foundational models based on our proposed datasets, we present the Graphium graph machine learning library, which simplifies the process of building and training molecular machine learning models for multi-task and multi-level molecular datasets. Finally, we present a range of baseline results as a starting point of multi-task and multi-level training on these datasets. Empirically, we observe that performance on low-resource biological datasets shows improvement by also training on large amounts of quantum data. This indicates that there may be potential in multi-task and multi-level training of a foundation model and fine-tuning it to resource-constrained downstream tasks. The Graphium library is publicly available on GitHub and the dataset links are available in Part 1 and Part 2.
Combining Confidence Elicitation and Sample-based Methods for Uncertainty Quantification in Misinformation Mitigation
Mauricio Rivera
Jean-François Godbout
Kellin Pelrine
Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation
Tyler Vergho
Jean-François Godbout
Kellin Pelrine
Recent large language models (LLMs) have been shown to be effective for misinformation detection. However, the choice of LLMs for experiments varies widely, leading to uncertain conclusions. In particular, GPT-4 is known to be strong in this domain, but it is closed source, potentially expensive, and can show instability between different versions. Meanwhile, alternative LLMs have given mixed results. In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5. This provides the research community with a solid open-source option and shows open-source models are gradually catching up on this task. We then highlight how GPT-3.5 exhibits unstable performance, such that this very widely used model could provide misleading results in misinformation detection. Finally, we validate new tools including approaches to structured output and the latest version of GPT-4 (Turbo), showing they do not compromise performance, thus unlocking them for future research and potentially enabling more complex pipelines for misinformation mitigation.
Laplacian Change Point Detection for Single and Multi-view Dynamic Graphs
Shenyang Huang
Samy Coulombe
Yasmeen Hitti
Dynamic graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real-world applications such as intrusion identification in network systems, detection of ecosystem disturbances, and detection of epidemic outbreaks. In this article, we focus on change point detection in dynamic graphs and address three main challenges associated with this problem: (i) how to compare graph snapshots across time, (ii) how to capture temporal dependencies, and (iii) how to combine different views of a temporal graph. To solve the above challenges, we first propose Laplacian Anomaly Detection (LAD), which uses the spectrum of the graph Laplacian as the low dimensional embedding of the graph structure at each snapshot. LAD explicitly models short-term and long-term dependencies by applying two sliding windows. Next, we propose MultiLAD, a simple and effective generalization of LAD to multi-view graphs. MultiLAD provides the first change point detection method for multi-view dynamic graphs. It aggregates the singular values of the normalized graph Laplacian from different views through the scalar power mean operation. Through extensive synthetic experiments, we show that (i) LAD and MultiLAD are accurate and outperform state-of-the-art baselines and their multi-view extensions by a large margin, (ii) MultiLAD’s advantage over contenders significantly increases when additional views are available, and (iii) MultiLAD is highly robust to noise from individual views. In five real-world dynamic graphs, we demonstrate that LAD and MultiLAD identify significant events as top anomalies, such as the implementation of government COVID-19 interventions which impacted the population mobility in multi-view traffic networks.
GPS-SSL: Guided Positive Sampling to Inject Prior Into Self-Supervised Learning
Aarash Feizi
Randall Balestriero
Arantxa Casanova
We propose Guided Positive Sampling Self-Supervised Learning (GPS-SSL), a general method to inject a priori knowledge into Self-Supervised Learning (SSL) positive sample selection. Current SSL methods leverage Data-Augmentations (DA) for generating positive samples and incorporating prior knowledge; an incorrect or too weak DA will drastically reduce the quality of the learned representation. GPS-SSL proposes instead to design a metric space where Euclidean distances become a meaningful proxy for semantic relationship. In that space, it is now possible to generate positive samples from nearest neighbor sampling. Any prior knowledge can now be embedded into that metric space independently from the employed DA. Owing to its simplicity, GPS-SSL is applicable to any SSL method, e.g. SimCLR or BYOL. A key benefit of GPS-SSL is in reducing the pressure in tailoring strong DAs. For example, GPS-SSL reaches 85.58% on Cifar10 with weak DA while the baseline only reaches 37.51%. We therefore move a step forward towards the goal of making SSL less reliant on DA. We also show that even when using strong DAs, GPS-SSL outperforms the baselines on under-studied domains. We evaluate GPS-SSL along with multiple baseline SSL methods on numerous downstream datasets from different domains when the models use strong or minimal data augmentations. We hope that GPS-SSL will open new avenues in studying how to inject a priori knowledge into SSL in a principled manner.
Uncertainty Resolution in Misinformation Detection
Yury Orlovskiy
Camille Thibault
Anne Imouza
Jean-François Godbout
Kellin Pelrine
An Evaluation of Language Models for Hyperpartisan Ideology Detection in Persian Twitter
Sahar Omidi Shayegan
Isar Nejadgholi
Kellin Pelrine
Hao Yu
Sacha Lévy
Zachary Yang
Jean-François Godbout
Large Language Models (LLMs) have shown significant promise in various tasks, including identifying the political beliefs of English-speaking social media users from their posts. However, assessing LLMs for this task in non-English languages remains unexplored. In this work, we ask to what extent LLMs can predict the political ideologies of users in Persian social media. To answer this question, we first acknowledge that political parties are not well-defined among Persian users, and therefore, we reduce the task to the much simpler one of hyperpartisan ideology detection. We create a new benchmark and show the potential and limitations of both open-source and commercial LLMs in classifying the hyperpartisan ideologies of users. We compare these models with smaller fine-tuned models, both on the Persian language (ParsBERT) and translated data (RoBERTa), showing that they considerably outperform generative LLMs in this task. We further demonstrate that the performance of the generative LLMs degrades when classifying users based on their tweets instead of their bios and even when tweets are added as additional information, whereas the smaller fine-tuned models are robust and achieve similar performance for all classes. This study is a first step toward political ideology detection in Persian Twitter, with implications for future research to understand the dynamics of ideologies in Persian social media.
Quantifying learning-style adaptation in effectiveness of LLM teaching
Ruben Weijers
Gabrielle Fidelis de Castilho
Jean-François Godbout
Kellin Pelrine
This preliminary study aims to investigate whether AI, when prompted based on individual learning styles, can effectively improve comprehension and learning experiences in educational settings. It involves tailoring LLM baseline prompts and comparing the results of a control group receiving standard content and an experimental group receiving learning style-tailored content. Preliminary results suggest that GPT-4 can generate responses aligned with various learning styles, indicating the potential for enhanced engagement and comprehension. However, these results also reveal challenges, including the model’s tendency for sycophantic behavior and variability in responses. Our findings suggest that a more sophisticated prompt engineering approach is required for integrating AI into education (AIEd) to improve educational outcomes.
Temporal Graph Analysis with TGX
Razieh Shirzadkhani
Shenyang Huang
Elahe Kooshafar
Farimah Poursafaei
Real-world networks, with their evolving relations, are best captured as temporal graphs. However, existing software libraries are largely designed for static graphs where the dynamic nature of temporal graphs is ignored. Bridging this gap, we introduce TGX, a Python package specially designed for analysis of temporal networks that encompasses an automated pipeline for data loading, data processing, and analysis of evolving graphs. TGX provides access to eleven built-in datasets and eight external Temporal Graph Benchmark (TGB) datasets as well as any novel datasets in the .csv format. Beyond data loading, TGX facilitates data processing functionalities such as discretization of temporal graphs and node subsampling to accelerate working with larger datasets. For comprehensive investigation, TGX offers network analysis by providing a diverse set of measures, including average node degree and the evolving number of nodes and edges per timestamp. Additionally, the package consolidates meaningful visualization plots indicating the evolution of temporal patterns, such as Temporal Edge Appearance (TEA) and Temporal Edge Traffic (TET) plots. The TGX package is a robust tool for examining the features of temporal graphs and can be used in various areas like studying social networks, citation networks, and tracking user interactions. We plan to continuously support and update TGX based on community feedback. TGX is publicly available on: https://github.com/ComplexData-MILA/TGX.
Exhaustive Evaluation of Dynamic Link Prediction
Farimah Poursafaei
Dynamic link prediction is a crucial task in the study of evolving graphs, which serve as abstract models for various real-world applications. Recent dynamic graph representation learning models have claimed near-perfect performance in this task. However, we argue that the standard evaluation strategy for dynamic link prediction overlooks the sparsity and recurrence patterns inherent in dynamic networks. Specifically, the current strategy suffers from issues such as evaluating models on a balanced set of positive and negative edges, neglecting the reassessment of frequently recurring positive edges, and lacking a comprehensive evaluation of both recurring and new edges. To address these limitations, we propose a novel evaluation strategy called EXHAUSTIVE, which takes into account all relevant negative edges and separately assesses the performance on recurring and new edges. Using our proposed evaluation strategy, we compare the performance of five state-of-the-art dynamic graph learning models on seven benchmark datasets. Compared to the previous common evaluation strategy, we observe an average drop of 62% in Average Precision for dynamic link prediction. Additionally, the ranking of the models also changes under the new evaluation setting. Furthermore, we demonstrate that while all models perform considerably worse when predicting new edges compared to recurring ones, the best performing models differ between the two scenarios. This highlights the importance of employing the proposed evaluation strategy for both the assessment and design of dynamic link prediction models. By adopting our novel evaluation strategy, researchers can obtain a more accurate understanding of model performance in dynamic link prediction, leading to improved evaluation and design of such models.