Reihaneh Rabbany

Biography

Reihaneh Rabbany is an assistant professor at the School of Computer Science, McGill University, and a core academic member of Mila – Quebec Artificial Intelligence Institute. She is also a Canada CIFAR AI Chair and on the faculty of McGill’s Centre for the Study of Democratic Citizenship.

Before joining McGill, Rabbany was a postdoctoral fellow at the School of Computer Science, Carnegie Mellon University. She completed her PhD in the Department of Computing Science at the University of Alberta.

Rabbany heads McGill’s Complex Data Lab, where she conducts research at the intersection of network science, data mining and machine learning, with a focus on analyzing real-world interconnected data and social good applications.

Current Students

Hussein Abdallah

Postdoctorate - McGill University

Taimaa Bachi

Research Intern - McGill University

Aurélien Bück-Kaeffer

Master's Research - McGill University

Jacob Chmura

Master's Research - McGill University

Principal supervisor :

Research Intern - Université de Montréal

Principal supervisor :

Islam Eldifrawi

Independent visiting researcher - University of Sherbrooke

Aarash Feizi

PhD - McGill University

Co-supervisor :

Adriana Romero Soriano

Collaborating researcher - McGill University

Shenyang Huang

Collaborating Alumni - McGill University

Co-supervisor :

Anne Imouza

PhD - McGill University

Principal supervisor :

Emma Kondrup

PhD - McGill University

Co-supervisor :

Andrew Lin

Research Intern - McGill University

Victor Livernoche

PhD - McGill University

Sitao Luan

Postdoctorate - McGill University

Principal supervisor :

Guillaume Rabusseau

Shahrad Mohammadzadeh

Master's Research - McGill University

Co-supervisor :

Independent visiting researcher - McGill University

Collaborating Alumni - McGill University

Soroush Omranpour

Collaborating Alumni - McGill University

Co-supervisor :

Guillaume Rabusseau

Farimah Poursafaei

Collaborating Alumni - McGill University

Master's Research - McGill University

Dorsaf Sallami

Collaborating researcher - McGill University

Sneheel Sarangi

Master's Research - McGill University

Collaborating researcher - McGill University

Jacob-Junqi Tian

Master's Research - McGill University

Collaborating researcher - Université de Montréal

Principal supervisor :

PhD - McGill University

Shirley Zhao

McGill University

Sveta Zhuk

Master's Research - Université de Montréal

Principal supervisor :

Unmasking deepfakes with AI

Blog Posts

A diverse group of eight young adults stands close together on a rooftop, smiling and laughing with the city skyline in the background. Two circular insets highlight a vintage camera held by one of the group members, underscoring the deepfake element.

December 16, 2025

Victor Livernoche

Reihaneh Rabbany

August 3, 2021

Flight-SEIR: Incorporating Flight Data to Improve Epidemiological Modelling and Disease Outbreak Prevention

Shenyang Huang

Reihaneh Rabbany

Publications

Revisiting Hotels-50K and Hotel-ID

Aarash Feizi

Arantxa Casanova

Adriana Romero

In this paper, we propose revisited versions for two recent hotel recognition datasets: Hotels-50K and Hotel-ID. The revisited versions provide evaluation setups with different levels of difficulty to better align with the intended real-world application, i.e. countering human trafficking. Real-world scenarios involve hotels and locations that are not captured in the current datasets, therefore it is important to consider evaluation settings where classes are truly unseen. We test this setup using multiple state-of-the-art image retrieval models and show that as expected, the models’ performances decrease as the evaluation gets closer to the real-world unseen settings. The rankings of the best performing models also change across the different evaluation settings, which further motivates using the proposed revisited datasets.

2022-07-19

ArXiv (preprint)

VisPaD: Visualization and Pattern Discovery for Fighting Human Trafficking

Pratheeksha Nair

Yifei Li

Catalina Vajiac

Andreas Olligschlaeger

Meng-Chieh Lee

Namyong Park

Duen Horng Chau

Christos Faloutsos

Chieh Lee

2022-04-24

The Web Conference (published)

A Strong Node Classification Baseline for Temporal Graphs

Farimah Poursafaei

Željko Žilić

2022-04-19

Proceedings of the 2022 SIAM International Conference on Data Mining (SDM) (published)

Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking

2021-12-31

Findings (published)

Towards Better Evaluation for Dynamic Link Prediction

Andy Huang

Despite the prevalence of recent success in learning from static graphs, learning from time-evolving graphs remains an open challenge. In this work, we design new, more stringent evaluation procedures for link prediction specific to dynamic graphs, which reflect real-world considerations, to better compare the strengths and weaknesses of methods. First, we create two visualization techniques to understand the reoccurring patterns of edges over time and show that many edges reoccur at later time steps. Based on this observation, we propose a pure memorization-based baseline called EdgeBank. EdgeBank achieves surprisingly strong performance across multiple settings which highlights that the negative edges used in the current evaluation are easy. To sample more challenging negative edges, we introduce two novel negative sampling strategies that improve robustness and better match real-world applications. Lastly, we introduce six new dynamic graph datasets from a diverse set of domains missing from current benchmarks, providing new challenges and opportunities for future research. Our code repository is accessible at https://github.com/fpour/DGB.git.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

openreview.net
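
The memorization idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the repository linked in the abstract); the class name `EdgeMemoryBaseline`, its methods, and the choice to remember every directed edge ever seen are assumptions made for illustration only.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


class EdgeMemoryBaseline:
    """Hypothetical memorization baseline: an edge is predicted as
    positive only if the same (source, destination) pair was observed before."""

    def __init__(self) -> None:
        self._seen: Set[Edge] = set()

    def update(self, edges: Iterable[Edge]) -> None:
        # Store every observed directed edge; repeats are collapsed by the set.
        self._seen.update(edges)

    def predict(self, source: Hashable, destination: Hashable) -> float:
        # Score 1.0 for a memorized edge, 0.0 for anything else.
        return 1.0 if (source, destination) in self._seen else 0.0


# Toy history of directed edges (assumed data, not from the paper).
history = [("a", "b"), ("b", "c"), ("a", "b")]
model = EdgeMemoryBaseline()
model.update(history)
scores = [model.predict("a", "b"), model.predict("c", "a")]
print(scores)
```

A baseline like this has no learned parameters, which is why strong scores from it suggest that the negative edges in an evaluation are easy to separate from recurring positive ones.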

Curating the Twitter Election Integrity Datasets for Better Online Troll Characterization

Albert M. Orozco Camacho

In modern days, social media platforms provide accessible channels for the interaction and immediate reflection of the most important events happening around the world. In this paper, we, firstly, present a curated set of datasets whose origin stem from the Twitter’s Information Operations efforts. More notably, these accounts, which have been already suspended, provide a notion of how state-backed human trolls operate. Secondly, we present detailed analyses of how these behaviours vary over time, and motivate its use and abstraction in the context of deep representation learning: for instance, to learn and, potentially track, troll behaviour. We present baselines for such tasks and highlight the differences there may exist within the literature. Finally, we utilize the representations learned for behaviour prediction to classify trolls from "real" users, using a sample of non-suspended active accounts.

2021-12-06

LatinX in AI at Neural Information Processing Systems Conference 2021 (published)

openreview.net

Online Partisan Polarization of COVID-19

Sacha Lévy

Gabrielle Desrosiers-Brisebois

Andre Blais

In today’s age of (mis)information, many people utilize various social media platforms in an attempt to shape public opinion on several important issues, including elections and the COVID-19 pandemic. These two topics have recently become intertwined given the importance of complying with public health measures related to COVID-19 and politicians’ management of the pandemic. Motivated by this, we study the partisan polarization of COVID-19 discussions on social media. We propose and utilize a novel measure of partisan polarization to analyze more than 380 million posts from Twitter and Parler around the 2020 US presidential election. We find strong correlation between peaks in polarization and polarizing events, such as the January 6th Capitol Hill riot. We further classify each post into key COVID-19 issues of lockdown, masks, vaccines, as well as miscellaneous, to investigate both the volume and polarization on these topics and how they vary through time. Parler includes more negative discussions around lockdown and masks, as expected, but not much around vaccines. We also observe more balanced discussions on Twitter and a general disconnect between the discussions on Parler and Twitter.

2021-11-30

2021 International Conference on Data Mining Workshops (ICDMW) (unknown)

Graph Attention Networks with Positional Embeddings

Liheng Ma

Adriana Romero

2021-05-08

Advances in Knowledge Discovery and Data Mining (published)

SigTran: Signature Vectors for Detecting Illicit Activities in Blockchain Transaction Networks

Farimah Poursafaei

Željko Žilić

2021-05-08

Advances in Knowledge Discovery and Data Mining (published)

INFOSHIELD: Generalizable Information-Theoretic Human-Trafficking Detection

Meng-Chieh Lee

Catalina Vajiac

Aayushi Kulshrestha

Sacha Lévy

Namyong Park

Cara Jones

Christos Faloutsos

Given a million escort advertisements, how can we spot near-duplicates? Such micro-clusters of ads are usually signals of human trafficking. How can we summarize them, visually, to convince law enforcement to act? Can we build a general tool that works for different languages? Spotting micro-clusters of near-duplicate documents is useful in multiple, additional settings, including spam-bot detection in Twitter ads, plagiarism, and more. We present INFOSHIELD, which makes the following contributions: (a) Practical, being scalable and effective on real data, (b) Parameter-free and Principled, requiring no user-defined parameters, (c) Interpretable, finding a document to be the cluster representative, highlighting all the common phrases, and automatically detecting "slots", i.e. phrases that differ in every document; and (d) Generalizable, beating or matching domain-specific methods in Twitter bot detection and human trafficking detection respectively, as well as being language-independent finding clusters in Spanish, Italian, and Japanese. Interpretability is particularly important for the anti human-trafficking domain, where law enforcement must visually inspect ads. Our experiments on real data show that INFOSHIELD correctly identifies Twitter bots with an F1 score over 90% and detects human-trafficking ads with 84% precision. Moreover, it is scalable, requiring about 8 hours for 4 million documents on a stock laptop.

2021-04-18

2021 IEEE 37th International Conference on Data Engineering (ICDE) (published)

The Surprising Performance of Simple Baselines for Misinformation Detection

Kellin Pelrine

Jacob Danovitch

As social media becomes increasingly prominent in our day to day lives, it is increasingly important to detect informative content and prevent the spread of disinformation and unverified rumours. While many sophisticated and successful models have been proposed in the literature, they are often compared with older NLP baselines such as SVMs, CNNs, and LSTMs. In this paper, we examine the performance of a broad set of modern transformer-based language models and show that with basic fine-tuning, these models are competitive with and can even significantly outperform recently proposed state-of-the-art methods. We present our framework as a baseline for creating and evaluating new methods for misinformation detection. We further study a comprehensive set of benchmark datasets, and discuss potential data leakage and the need for careful design of the experiments and understanding of datasets to account for confounding variables. As an extreme case example, we show that classifying only based on the first three digits of tweet ids, which contain information on the date, gives state-of-the-art performance on a commonly used benchmark dataset for fake news detection, Twitter16. We provide a simple tool to detect this problem and suggest steps to mitigate it in future datasets.

2021-04-18

Proceedings of the Web Conference 2021 (published)

Incorporating dynamic flight network in SEIR to model mobility between populations

Xiaoye Ding

Shenyang Huang

Abby Leung

Current efforts of modelling COVID-19 are often based on the standard compartmental models such as SEIR and their variations. As pre-symptomatic and asymptomatic cases can spread the disease between populations through travel, it is important to incorporate mobility between populations into the epidemiological modelling. In this work, we propose to modify the commonly-used SEIR model to account for the dynamic flight network, by estimating the imported cases based on the air traffic volume as well as the test positive rate at the source. This modification, called Flight-SEIR, can potentially enable 1) early detection of outbreaks due to imported pre-symptomatic and asymptomatic cases, 2) more accurate estimation of the reproduction number and 3) evaluation of the impact of travel restrictions and the implications of lifting these measures. The proposed Flight-SEIR is essential in navigating through this pandemic and the next ones, given how interconnected our world has become.

2020-12-31

Applied Network Science (published)
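
The modification described in the abstract above, adding imported cases to a standard SEIR model, might look roughly like the sketch below. It is not the published Flight-SEIR model. It assumes a discrete daily time step, estimates imported cases as arrivals multiplied by the source's test positive rate, and places them in the exposed compartment. The function names, parameter values, and toy inputs are all hypothetical.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, float, float, float]  # (S, E, I, R)


def seir_step_with_imports(state: State, beta: float, sigma: float, gamma: float,
                           arrivals: float, source_positive_rate: float) -> State:
    """One daily SEIR update plus imported cases (illustrative assumption)."""
    s, e, i, r = state
    n = s + e + i + r
    # Assumed import estimate: air traffic volume times source test positive rate.
    imported = arrivals * source_positive_rate
    new_exposed = beta * s * i / n   # local transmission
    new_infectious = sigma * e       # exposed becoming infectious
    new_recovered = gamma * i        # infectious recovering
    return (
        s - new_exposed,
        e + new_exposed - new_infectious + imported,  # imports enter as exposed
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def simulate(initial: State, daily_arrivals: Sequence[float],
             source_positive_rate: float, beta: float = 0.3,
             sigma: float = 0.2, gamma: float = 0.1) -> List[State]:
    """Run one step per entry in daily_arrivals and return all states."""
    trajectory = [initial]
    for arrivals in daily_arrivals:
        trajectory.append(seir_step_with_imports(
            trajectory[-1], beta, sigma, gamma, arrivals, source_positive_rate))
    return trajectory


# Toy scenario: a population with no local cases, seeded only through travel.
trajectory = simulate((1000.0, 0.0, 0.0, 0.0),
                      daily_arrivals=[500.0, 500.0],
                      source_positive_rate=0.02)
print(trajectory[-1])
```

In this toy scenario the population starts with no infections, so any later cases come only from the import term, which is the early-detection situation the abstract points to.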