Reihaneh Rabbany

Biography

Reihaneh Rabbany is an Assistant Professor at the School of Computer Science at McGill University. She is a faculty member at Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. She is also a faculty member of McGill's Centre for the Study of Democratic Citizenship. Before joining McGill University, she was a postdoctoral fellow at the School of Computer Science at Carnegie Mellon University. She completed her PhD in the Department of Computing Science at the University of Alberta. She heads the Complex Data Lab, whose research lies at the intersection of network science, data mining, and machine learning, with a focus on analyzing real-world interconnected data and on societal applications.

Current Students

Hussein Abdallah

Postdoctorate - McGill

Taimaa Bachi

Research Intern - McGill

Aurélien Bück-Kaeffer

Research Master's - McGill

Jacob Chmura

Research Master's - McGill

Principal supervisor:

Research Intern - UdeM

Principal supervisor:

Islam Eldifrawi

Independent visiting researcher - University of Sherbrooke

Aarash Feizi

PhD - McGill

Co-supervisor:

Adriana Romero Soriano

Research collaborator - McGill

Shenyang Huang

Alumni collaborator - McGill

Co-supervisor:

Anne Imouza

PhD - McGill

Principal supervisor:

Emma Kondrup

PhD - McGill

Co-supervisor:

Andrew Lin

Research Intern - McGill University

PhD - McGill

Sitao Luan

Postdoctorate - McGill

Principal supervisor:

Guillaume Rabusseau

Shahrad Mohammadzadeh

Research Master's - McGill

Co-supervisor:

Independent visiting researcher - McGill

Alumni collaborator - McGill

Website

Soroush Omranpour

Alumni collaborator - McGill

Co-supervisor:

Guillaume Rabusseau

Farimah Poursafaei

Alumni collaborator - McGill

Research Master's - McGill University

Dorsaf Sallami

Research collaborator - McGill

Website

Sneheel Sarangi

Research Master's - McGill

Research collaborator - McGill University

Jacob-Junqi Tian

Research Master's - McGill

Research collaborator - UdeM

Principal supervisor:

PhD - McGill

McGill University

Sveta Zhuk

Research Master's - UdeM

Principal supervisor:

Unmasking Deepfakes with AI

Blog posts

A diverse group of eight young adults stands close together on a rooftop, smiling and laughing with the city skyline in the background. Two circular insets highlight a vintage camera held by one member of the group, emphasizing the deepfake element.

December 16, 2025

by

Victor Livernoche

Reihaneh Rabbany

Read the article

Flight-SEIR: Incorporating Flight Data to Improve Epidemiological Modelling and Disease Outbreak Prevention

August 3, 2021

by

Shenyang Huang

Reihaneh Rabbany

Read the article

Publications

Simulation System Towards Solving Societal-Scale Manipulation

Maximilian Puelma Touzel

Sneheel Sarangi

Austin Welch

Gayatri K

Dan Zhao

Hao Yu

Tom Gibbs

Ethan Kosak-Hine

Andreea Musulan

Camille Thibault

Busra Tugce Gurbuz

The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet, studying these effects in real-world settings at scale is ethically and logistically impractical, highlighting a need for simulation tools that can model these dynamics in controlled settings to enable experimentation with possible defenses. We present a simulation environment designed to address this. We elaborate upon the Concordia framework that simulates offline, 'real life' activity by adding online interactions to the simulation through social media with the integration of a Mastodon server. Through a variety of means we then improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys of the agents' political positions. We demonstrate the simulator with a tailored example of how partisan manipulation of agents can affect election results.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

The Structural Safety Generalization Problem

Tom Gibbs

Julius Broomfield

George Ingebretsen

Ethan Kosak-Hine

Tia Nasir

Jason Zhang

Reihaneh Iranmanesh

Sara Pieri

It is widely known that AI is vulnerable to adversarial examples, from pixel perturbations to jailbreaks. We propose that there is a key, easier class of problems that is also still unsolved: failures of safety to generalize over structure, despite semantic equivalence. We demonstrate this vulnerability by showing how recent AI systems are differently vulnerable both to multi-turn and multi-image attacks, compared to their single-turn and single-image counterparts with equivalent meaning. We suggest this is the same class of vulnerability as that found in yet unconnected threads of the literature: vulnerabilities to low-resource languages and indefensibility of strongly superhuman Go AIs to cyclic attacks. When viewed together, these reveal a common picture: models that are not only vulnerable to attacks, but vulnerable to attacks with near identical meaning in their benign and harmful components both, and only different in structure. In contrast to attacks with identical benign input (e.g., pictures that look like cats) but unknown semanticity of the harmful component (e.g., diverse noise that is all unintelligible to humans), these represent a class of attacks where semantic understanding and defense against one version should guarantee defense against others, yet current AI safety measures do not. This vulnerability represents a necessary but not sufficient condition towards defending against attacks whose harmful component has arbitrary semanticity. Consequently, by building on the data and approaches we highlight, we frame an intermediate problem for AI safety to solve, that represents a critical checkpoint towards safe AI while being far more tractable than trying to solve it directly and universally.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

Decompose, Recompose, and Conquer: Multi-modal LLMs are Vulnerable to Compositional Adversarial Attacks in Multi-Image Queries

Julius Broomfield

George Ingebretsen

Reihaneh Iranmanesh

Sara Pieri

Ethan Kosak-Hine

Tom Gibbs

Large Language Models have been extensively studied for their vulnerabilities, particularly in the context of adversarial attacks. However, the emergence of Vision Language Models introduces new modalities of risk that have not yet been thoroughly explored, especially when processing multiple images simultaneously. In this paper, we introduce two black-box jailbreak methods that leverage multi-image inputs to uncover vulnerabilities in these models. We present a new safety evaluation dataset for multimodal LLMs called MultiBench, which is composed of these jailbreak methods. These methods can easily be applied and evaluated using our toolkit. We test these methods against six safety aligned frontier models from Google, OpenAI, and Anthropic, revealing significant safety vulnerabilities. Our findings suggest that even the most powerful language models remain vulnerable against compositional adversarial attacks, specifically those composed of multiple images.

2024-10-08

NeurIPS.cc/2024/Workshop/Red_Teaming_GenAI (poster)

TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs

Erfan Loghmani

Emanuele Rossi

Ioannis Koutis

Heiner Stuckenschmidt

Guillaume Rabusseau

Multi-relational temporal graphs are powerful tools for modeling real-world data, capturing the evolving and interconnected nature of entities over time. Recently, many novel models have been proposed for ML on such graphs, intensifying the need for robust evaluation and standardized benchmark datasets. However, the availability of such resources remains scarce and evaluation faces added complexity due to reproducibility issues in experimental protocols. To address these challenges, we introduce Temporal Graph Benchmark 2.0 (TGB 2.0), a novel benchmarking framework tailored for evaluating methods for predicting future links on Temporal Knowledge Graphs and Temporal Heterogeneous Graphs with a focus on large-scale datasets, extending the Temporal Graph Benchmark. TGB 2.0 facilitates comprehensive evaluations by presenting eight novel datasets spanning five domains with up to 53 million edges. TGB 2.0 datasets are significantly larger than existing datasets in terms of number of nodes, edges, or timestamps. In addition, TGB 2.0 provides a reproducible and realistic evaluation pipeline for multi-relational temporal graphs. Through extensive experimentation, we observe that 1) leveraging edge-type information is crucial to obtain high performance, 2) simple heuristic baselines are often competitive with more complex methods, 3) most methods fail to run on our largest datasets, highlighting the need for research on more scalable methods.

2024-09-25

Datasets and Benchmarks Track @ Neural Information Processing Systems (poster)

ToxiSight: Insights Towards Detected Chat Toxicity

Domenico Tullo

We present a comprehensive explainability dashboard designed for in-game chat toxicity. This dashboard integrates various existing explainable AI (XAI) techniques, including token importance analysis, model output visualization, and attribution to the training dataset. It also provides insights through the closest positive and negative examples, facilitating a deeper understanding and potential correction of the training data. Additionally, the dashboard includes word sense analysis (particularly useful for new moderators) and offers free-text explanations for both positive and negative predictions. This multi-faceted approach enhances the interpretability and transparency of toxicity detection models.

2024-09-20

EMNLP/2024/Workshop/BlackBoxNLP (accepted)

Web Retrieval Agents for Evidence-Based Misinformation Detection

Jacob-Junqi Tian

Hao Yu

Yury Orlovskiy

Tyler Vergho

Mauricio Rivera

Mayank Goel

2024-07-09

colmweb.org/COLM/2024/Conference (accepted)

Regional and Temporal Patterns of Partisan Polarization during the COVID-19 Pandemic in the United States and Canada

Anne Imouza

Maximilian Puelma Touzel

Cécile Amadoro

Gabrielle Desrosiers-Brisebois

Sacha Lévy

Public health measures were among the most polarizing topics debated online during the COVID-19 pandemic. Much of the discussion surrounded specific events, such as when and which particular interventions came into practice. In this work, we develop and apply an approach to measure subnational and event-driven variation of partisan polarization and explore how these dynamics varied both across and within countries. We apply our measure to a dataset of over 50 million tweets posted during late 2020, a salient period of polarizing discourse in the early phase of the pandemic. In particular, we examine regional variations in both the United States and Canada, focusing on three specific health interventions: lockdowns, masks, and vaccines. We find that more politically conservative regions had higher levels of partisan polarization in both countries, especially in the US where a strong negative correlation exists between regional vaccination rates and degree of polarization in vaccine-related discussions. We then analyze the timing, context, and profile of spikes in polarization, linking them to specific events discussed on social media across different regions in both countries. These typically last only a few days, suggesting that online discussions reflect and could even drive changes in public opinion, which in the context of pandemic response impacts public health outcomes across different regions and over time.

2024-07-02

ArXiv (preprint)

arxiv.org

MiNT: Multi-Network Training for Transfer Learning on Temporal Graphs

Razieh Shirzadkhani

Tran Gia Bao Ngo

Kiarash Shamsi

Poupak Azad

Baris Coskunuzer

Cuneyt Gurcan Akcora

2024-06-13

ArXiv (preprint)

Static graph approximations of dynamic contact networks for epidemic forecasting

Razieh Shirzadkhani

Shenyang Huang

Abby Leung

Epidemic modeling is essential in understanding the spread of infectious diseases like COVID-19 and devising effective intervention strategies to control them. Recently, network-based disease models have integrated traditional compartment-based modeling with real-world contact graphs and shown promising results. However, in an ongoing epidemic, future contact network patterns are not observed yet. To address this, we use aggregated static networks to approximate future contacts for disease modeling. The standard method in the literature concatenates all edges from a dynamic graph into one collapsed graph, called the full static graph. However, the full static graph often leads to severe overestimation of key epidemic characteristics. Therefore, we propose two novel static network approximation methods, DegMST and EdgeMST, designed to preserve the sparsity of real-world contact networks while remaining connected. DegMST and EdgeMST use the frequency of temporal edges and the node degrees respectively to preserve sparsity. Our analysis shows that our models more closely resemble the network characteristics of the dynamic graph compared to the full static ones. Moreover, our analysis on seven real-world contact networks suggests EdgeMST yields more accurate estimations of disease dynamics for epidemic forecasting when compared to the standard full static method.

2024-05-21

Scientific Reports (published)

T-NET: Weakly Supervised Graph Learning for Combatting Human Trafficking

Pratheeksha Nair

Javin Liu

Catalina Vajiac

Andreas Olligschlaeger

Duen Horng Chau

Mirela Cazzolato

Cara Jones

Christos Faloutsos

Human trafficking (HT) for forced sexual exploitation, often described as modern-day slavery, is a pervasive problem that affects millions of people worldwide. Perpetrators of this crime post advertisements (ads) on behalf of their victims on adult service websites (ASW). These websites typically contain hundreds of thousands of ads including those posted by independent escorts, massage parlor agencies and spammers (fake ads). Detecting suspicious activity in these ads is difficult and developing data-driven methods is challenging due to the hard-to-label, complex and sensitive nature of the data. In this paper, we propose T-Net, which unlike previous solutions, formulates this problem as weakly supervised classification. Since it takes several months to years to investigate a case and obtain a single definitive label, we design domain-specific signals or indicators that provide weak labels. T-Net also looks into connections between ads and models the problem as a graph learning task instead of classifying ads independently. We show that T-Net outperforms all baselines on a real-world dataset of ads by 7% average weighted F1 score. Given that this data contains personally identifiable information, we also present a realistic data generator and provide the first publicly available dataset in this domain which may be leveraged by the wider research community.

2024-03-23

Proceedings of the AAAI Conference on Artificial Intelligence (published)

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets

Dominique Beaini

Shenyang Huang

Joao Alex Cunha

Zhiyi Li

Gabriela Moisescu-Pareja

Oleksandr Dymov

Samuel Maddrell-Mander

Callum McLean

Jama Hussein Mohamud

Michael Craig

Cristian Gabellini

Kerstin Klasers

Josef Dean

Cas Wognum … (see 15 more)

Maciej Sypetkowski

Ioannis Koutis

Hadrien Mary

Therence Bois

Andrew Fitzgibbon

Błażej Banaszewski

Chad Martin

Dominic Masters

Recently, pre-trained foundation models have shown significant advancements in multiple fields. However, the lack of datasets with labeled features and codebases has hindered the development of a supervised foundation model for molecular tasks. Here, we have carefully curated seven datasets specifically tailored for node- and graph-level prediction tasks to facilitate supervised learning on molecules. Moreover, to support the development of multi-task learning on our proposed datasets, we created the Graphium graph machine learning library. Our dataset collection encompasses two distinct categories. Firstly, the TOYMIX category modifies three small existing datasets with additional data for multi-task learning. Secondly, the LARGEMIX category includes four large-scale datasets with 344M graph-level data points and 409M node-level data points from ∼5M unique molecules. Finally, the ultra-large dataset contains 2,210M graph-level data points and 2,031M node-level data points coming from 86M molecules. Hence our datasets represent an order of magnitude increase in data volume compared to other 2D-GNN datasets. In addition, recognizing that molecule-related tasks often span multiple levels, we have designed our library to explicitly support multi-tasking, offering a diverse range of multi-level representations, i.e., representations at the graph, node, edge, and node-pair level. We equipped the library with an extensive collection of models and features to cover different levels of molecule analysis. By combining our curated datasets with this versatile library, we aim to accelerate the development of molecule foundation models. Datasets and code are available at https://github.com/datamol-io/graphium.

2024-01-15

ICLR.cc/2024/Conference (poster)

Combining Confidence Elicitation and Sample-based Methods for Uncertainty Quantification in Misinformation Mitigation

Mauricio Rivera

2024-01-12

ArXiv (preprint)