Reihaneh Rabbany

Biography

Reihaneh Rabbany is an assistant professor at the School of Computer Science, McGill University, and a core academic member of Mila – Quebec Artificial Intelligence Institute. She is also a Canada CIFAR AI Chair and on the faculty of McGill’s Centre for the Study of Democratic Citizenship.

Before joining McGill, Rabbany was a postdoctoral fellow at the School of Computer Science, Carnegie Mellon University. She completed her PhD in the Department of Computing Science at the University of Alberta.

Rabbany heads McGill’s Complex Data Lab, where she conducts research at the intersection of network science, data mining and machine learning, with a focus on analyzing real-world interconnected data and social good applications.

Current Students

Hussein Abdallah

Postdoctorate - McGill University

Taimaa Bachi

Research Intern - McGill University

Aurélien Bück-Kaeffer

Master's Research - McGill University

Jacob Chmura

Master's Research - McGill University

Principal supervisor :

Research Intern - Université de Montréal

Principal supervisor :

Jean-François Godbout

Islam Eldifrawi

Independent visiting researcher - University of Sherbrooke

Aarash Feizi

PhD - McGill University

Co-supervisor :

Adriana Romero Soriano

Collaborating researcher - McGill University

Shenyang Huang

Collaborating Alumni - McGill University

Co-supervisor :

Anne Imouza

PhD - McGill University

Principal supervisor :

Jean-François Godbout

Emma Kondrup

PhD - McGill University

Co-supervisor :

Andrew Lin

Research Intern - McGill University

Victor Livernoche

PhD - McGill University

Sitao Luan

Postdoctorate - McGill University

Principal supervisor :

Guillaume Rabusseau

Shahrad Mohammadzadeh

Master's Research - McGill University

Co-supervisor :

Independent visiting researcher - McGill University

Collaborating Alumni - McGill University

Soroush Omranpour

Collaborating Alumni - McGill University

Co-supervisor :

Guillaume Rabusseau

Farimah Poursafaei

Collaborating Alumni - McGill University

Master's Research - McGill University

Dorsaf Sallami

Collaborating researcher - McGill University

Sneheel Sarangi

Master's Research - McGill University

Collaborating researcher - McGill University

Jacob-Junqi Tian

Master's Research - McGill University

Collaborating researcher - Université de Montréal

Principal supervisor :

PhD - McGill University

Shirley Zhao

McGill University

Sveta Zhuk

Master's Research - Université de Montréal

Principal supervisor :

Jean-François Godbout

Blog Posts

Image: A diverse group of eight young adults stands close together on a rooftop, smiling and laughing with the city skyline in the background. Two circular insets highlight a vintage camera held by one member of the group, emphasizing the deepfake element.

December 16, 2025

Unmasking deepfakes with AI

Victor Livernoche

Reihaneh Rabbany

August 3, 2021

Flight-SEIR: Incorporating Flight Data to Improve Epidemiological Modelling and Disease Outbreak Prevention

Shenyang Huang

Reihaneh Rabbany

Publications

TRAFFICVIS: Fighting Human Trafficking through Visualization

Catalina Vajiac

Andreas Olligschlaeger

Yifei Li

Pratheeksha Nair

Meng-Chieh Lee

Namyong Park

Duen Horng Chau

Christos Faloutsos

Law enforcement can detect human trafficking (HT) in online escort websites by analyzing suspicious clusters of connected ads. Given such clusters, how can we interactively visualize potential evidence for law enforcement and domain experts? We present TRAFFICVIS, which, to our knowledge, is the first interface for cluster-level HT detection and labeling. It builds on state-of-the-art HT clustering algorithms by incorporating metadata as a signal of organized and potentially suspicious activity. Also, domain experts can label clusters as HT, spam, and more, efficiently creating labeled datasets to enable further HT research. TRAFFICVIS has been built in close collaboration with domain experts, who estimate that TRAFFICVIS provides a median 36x speedup over manual labeling.

2020-12-31

(published)

www.semanticscholar.org

Scalable Change Point Detection for Dynamic Graphs

Shenyang Huang

Guillaume Rabusseau

Real-world networks often evolve in complex ways over time. Understanding anomalies in dynamic networks is crucial for applications such as traffic accident detection, intrusion identification and detection of ecosystem disturbances. In this work, we focus on the problem of change point detection in dynamic graphs. The goal is to identify time steps where the graph structure deviates significantly from the norm. Despite the empirical success of recent methods, building a change point detection method for real-world dynamic graphs, which often scale to millions of nodes, remains an open question. To fill this gap, we propose LADdos, a scalable method for change point detection in dynamic graphs. LADdos brings together ideas from two recent works: an accurate change point detection method for graphs called LAD [10], which detects the changes in the full Laplacian spectrum of the graph in each timestamp, and the general framework of network density of states (DOS) [5], which models the distribution of the singular values through efficient approximation methods. In experiments with two common graph models, the Stochastic Block Model (SBM) and the Barabási-Albert (BA) model, we show that LADdos has equal performance to LAD, which is the current state-of-the-art, while being orders of magnitude faster. For instance, on a dynamic graph with a total of 21 million edges over 150 timestamps, LADdos achieves a 100x speedup when compared to LAD.

2020-12-31

(published)

www.semanticscholar.org

Graph Neural Networks Learn Twitter Bot Behaviour

Albert M. Orozco Camacho

Sacha Lévy

Social media trends are increasingly playing a significant role in the understanding of modern social dynamics. In this work, we take a look at how the Twitter landscape gets constantly shaped by automatically generated content. Twitter bot activity can be traced via network abstractions which, we hypothesize, can be learned through state-of-the-art graph neural network techniques. We employ a large bot database, continuously updated by Twitter, to learn how likely it is that a user is mentioned by a bot, and likewise for a hashtag. Thus, we model this likelihood as a link prediction task between the set of users and hashtags. Moreover, we contrast our results by performing similar experiments on a crawled data set of real users.

2020-12-11

LatinX in AI at Neural Information Processing Systems Conference 2020 (published)

ComplexDataLab at WNUT-2020 Task 2: Detecting Informative COVID-19 Tweets by Attending over Linked Documents

Kellin Pelrine

Jacob Danovitch

Albert Orozco Camacho

Given the global scale of COVID-19 and the flood of social media content related to it, how can we find informative discussions? We present Gapformer, which effectively classifies content as informative or not. It reformulates the problem as graph classification, drawing not only on the tweet but also on connected webpages and entities. We leverage a pre-trained language model as well as the connections between nodes to learn a pooled representation for each document network. We show it outperforms several competitive baselines and present ablation studies supporting the benefit of the linked information. Code is available on GitHub.

2020-10-31

WNUT (published)

Contact Graph Epidemic Modelling of COVID-19 for Transmission and Intervention Strategies

Abby Leung

Xiaoye Ding

Shenyang Huang

The coronavirus disease 2019 (COVID-19) pandemic has quickly become a global public health crisis unseen in recent years. It is known that the structure of the human contact network plays an important role in the spread of transmissible diseases. In this work, we study a structure-aware model of COVID-19, CGEM. This model becomes similar to the classical compartment-based models in epidemiology if we assume the contact network is an Erdős-Rényi (ER) graph, i.e. everyone comes into contact with everyone else with the same probability. In contrast, CGEM is more expressive and allows for plugging in the actual contact networks, or more realistic proxies for them. Moreover, CGEM enables more precise modelling of enforcing and releasing different non-pharmaceutical intervention (NPI) strategies. Through a set of extensive experiments, we demonstrate significant differences between the epidemic curves when assuming different underlying structures. More specifically, we demonstrate that the compartment-based models are overestimating the spread of the infection by a factor of 3, and under some realistic assumptions on the compliance factor, underestimating the effectiveness of some NPIs, mischaracterizing others (e.g. predicting a later peak), and underestimating the scale of the second peak after reopening.

2020-10-05

ArXiv (preprint)

Laplacian Change Point Detection for Dynamic Graphs

2020-08-19

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (published)

Machine learning analysis of exome trios to contrast the genomic architecture of autism and schizophrenia

Sameer Sardaar

Bill Qi

Alexandre Dionne-Laporte

Guy A. Rouleau

Yannis J. Trakadis

Machine learning (ML) algorithms and methods offer great tools to analyze large complex genomic datasets. Our goal was to compare the genomic architecture of schizophrenia (SCZ) and autism spectrum disorder (ASD) using ML. In this paper, we used regularized gradient boosted machines to analyze whole-exome sequencing (WES) data from individuals with SCZ and ASD in order to identify important distinguishing genetic features. We further demonstrated a method of gene clustering to highlight which subsets of genes identified by the ML algorithm are mutated concurrently in affected individuals and are central to each disease (i.e., ASD vs. SCZ “hub” genes). In summary, after correcting for population structure, we found that SCZ and ASD cases could be successfully separated based on genetic information, with 86–88% accuracy on the testing dataset. Through bioinformatic analysis, we explored whether combinations of genes concurrently mutated in patients with the same condition (“hub” genes) belong to specific pathways. Several themes were found to be associated with ASD, including calcium ion transmembrane transport, immune system/inflammation, synapse organization, and retinoid metabolic process. Moreover, ion transmembrane transport, neurotransmitter transport, and microtubule/cytoskeleton processes were highlighted for SCZ. Our manuscript introduces a novel comparative approach for studying the genetic architecture of genetically related diseases with complex inheritance and highlights genetic similarities and differences between ASD and SCZ.

2020-02-27

BMC Psychiatry (published)

SGP: Spotting Groups Polluting the Online Political Discourse

Junhao Wang

Sacha Lévy

Ren Wang

Aayushi Kulshrestha

Social media sites are becoming a key factor in politics. These platforms are easy to manipulate for the purpose of distorting the information space to confuse and distract voters. It is of paramount importance for social media platforms, users engaged with online political discussions, as well as government agencies to understand the dynamics on social media, and identify malicious groups engaging in misinformation campaigns and thus polluting the general discourse around a topic of interest. Past works to identify such disruptive patterns are mostly focused on analyzing user-generated content such as tweets. In this study, we take a holistic approach and propose SGP to provide an informative bird's-eye view of all the activities in these social media sites around a broad topic and detect coordinated groups suspected of engaging in misinformation campaigns. To show the effectiveness of SGP, we deploy it to provide a concise overview of polluting activity on Twitter around the upcoming 2019 Canadian Federal Elections, by analyzing over 60 thousand user accounts connected through 3.4 million connections and 1.3 million hashtags. Users in the polluting groups detected by SGP-flag are over 4x more likely to become suspended, while the majority of these highly suspicious users detected by SGP-flag escaped Twitter's suspending algorithm. Moreover, while a few of the polluting hashtags detected are linked to misinformation campaigns, SGP-sig also flags others that have not been picked up on. More importantly, we also show that a large coordinated set of right-wing conservative groups based in the US are heavily engaged in Canadian politics.

2019-10-15

ArXiv (preprint)

Anomaly Detection with Joint Representation Learning of Content and Connection

Junhao Wang

Renhao Wang

Aayushi Kulshrestha

Social media sites are becoming a key factor in politics. These platforms are easy to manipulate for the purpose of distorting the information space to confuse and distract voters. Past works to identify disruptive patterns are mostly focused on analyzing the content of tweets. In this study, we jointly embed the information from both user-posted content and a user's follower network, to detect groups of densely connected users in an unsupervised fashion. We then investigate these dense sub-blocks of users to flag anomalous behavior. In our experiments, we study the tweets related to the upcoming 2019 Canadian Elections, and observe a set of densely connected users engaging in local politics in different provinces, and exhibiting troll-like behavior.

2019-06-15

ArXiv (preprint)

Social-Affiliation Networks: Patterns and the SOAR Model

Dhivya Eswaran

Artur Dubrawski

Christos Faloutsos

2018-09-09

ECML/PKDD (published)

Active Search of Connections for Case Building and Combating Human Trafficking

David Bayani

Artur Dubrawski

How can we help an investigator to efficiently connect the dots and uncover the network of individuals involved in a criminal activity based on the evidence of their connections, such as visiting the same address, or transacting with the same bank account? We formulate this problem as Active Search of Connections, which finds target entities that share evidence of different types with a given lead, where their relevance to the case is queried interactively from the investigator. We present RedThread, an efficient solution for inferring related and relevant nodes while incorporating the user's feedback to guide the inference. Our experiments focus on case building for combating human trafficking, where the investigator follows leads to expose organized activities, i.e. different escort advertisements that are connected and possibly orchestrated. RedThread is a local algorithm and enables online case building when mining millions of ads posted in one of the largest classified advertising websites. The results of RedThread are interpretable, as they explain how the results are connected to the initial lead. We experimentally show that RedThread learns the importance of the different types and different pieces of evidence, while the former could be transferred between cases.

2018-07-18

Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (published)

Modular Networks for Validating Community Detection Algorithms

Justin J Fagnan

Afra Abnar

Osmar R Zaiane

How can we accurately compare different community detection algorithms? These algorithms cluster nodes in a given network, and their performance is often validated on benchmark networks with explicit ground-truth communities. Given the lack of cluster labels in real-world networks, a model that generates realistic networks is required for accurate evaluation of these algorithms. In this paper, we present a simple, intuitive, and flexible benchmark generator to generate intrinsically modular networks for community validation. We show how the generated networks closely comply with the characteristics observed for real networks, whereas their characteristics could be directly controlled to match a wide range of real-world networks. We further show how common community detection algorithms rank differently when being evaluated on these benchmarks compared to currently available alternatives.

2018-01-03

ArXiv (preprint)