Zhaocheng Zhu

Maria Ulmer

Matthias Arnold

Maria Colomé-Tatché

Annalisa Marsico

2026-01-19

Nature Biomedical Engineering (published)

Enhancing link prediction in biomedical knowledge graphs with BioPathNet

Emy Yue Hu

Svitlana Oleshko

Samuele Firmani

Hui Cheng

Maria Ulmer

Matthias Arnold

Maria Colomé-Tatché

Annalisa Marsico

Understanding complex interactions in biomedical networks is crucial for advancements in biomedicine, but traditional link prediction (LP) m… (see more)ethods are limited in capturing this complexity. We present BioPathNet, a graph neural network framework based on the neural Bellman–Ford network (NBFNet), addressing limitations of traditional representation-based learning methods through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability, and allowing visualization of influential paths and biological validation. BioPathNet leverages a background regulatory graph for enhanced message passing and uses stringent negative sampling to improve precision and scalability. BioPathNet outperforms or matches existing methods across diverse tasks including gene function annotation, drug–disease indication, synthetic lethality and lncRNA–target interaction prediction. Our study identifies promising additional drug indications for diseases such as acute lymphoblastic leukaemia and Alzheimer’s disease, validated by medical experts and clinical trials. In addition, we prioritize putative synthetic lethal gene pairs and regulatory lncRNA–target interactions. BioPathNet’s interpretability will enable researchers to trace prediction paths and gain molecular insights.

2026-01-19

Nature Biomedical Engineering (published)

Overcoming Long-Context Limitations of State-Space Models via Context-Dependent Sparse Attention

Efficient long-context modeling remains a critical challenge for natural language processing (NLP), as the time complexity of the predominan… (see more)t Transformer architecture scales quadratically with the sequence length. While state-space models (SSMs) offer alternative sub-quadratic solutions, they struggle to capture long-range dependencies effectively. In this work, we focus on analyzing and improving the long-context modeling capabilities of SSMs. We show that the widely used synthetic task, associative recall, which requires a model to recall a value associated with a single key without context, insufficiently represents the complexities of real-world long-context modeling. To address this limitation, we extend the associative recall to a novel synthetic task, \emph{joint recall}, which requires a model to recall the value associated with a key given in a specified context. Theoretically, we prove that SSMs do not have the expressiveness to solve multi-query joint recall in sub-quadratic time complexity. To resolve this issue, we propose a solution based on integrating SSMs with Context-Dependent Sparse Attention (CDSA), which has the expressiveness to solve multi-query joint recall with sub-quadratic computation. To bridge the gap between theoretical analysis and real-world applications, we propose locality-sensitive Hashing Attention with sparse Key Selection (HAX), which instantiates the theoretical solution and is further tailored to natural language domains. Extensive experiments on both synthetic and real-world long-context benchmarks show that HAX consistently outperforms SSM baselines and SSMs integrated with context-independent sparse attention (CISA).

2025-09-17

NeurIPS.cc/2025/Conference (poster)

Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Zhanke Zhou

Xuan Li

Mikhail Galkin

Xiao Feng

Sanmi Koyejo

Bo Han

2025-06-29

ICML.cc/2025/Workshop/R2-FM (poster)

Fully-Inductive Node Classification on Arbitrary Graphs

Hesham Mostafa

Michael Bronstein

One fundamental challenge in graph machine learning is generalizing to new graphs. Many existing methods following the inductive setup can g… (see more)eneralize to test graphs with new structures, but assuming the feature and label spaces remain the same as the training ones. This paper introduces a fully-inductive setup, where models should perform inference on arbitrary test graphs with new structures, feature and label spaces. We propose GraphAny as the first attempt at this challenging setup. GraphAny models inference on a new graph as an analytical solution to a LinearGNN, which can be naturally applied to graphs with any feature and label spaces. To further build a stronger model with learning capacity, we fuse multiple LinearGNN predictions with learned inductive attention scores. Specifically, the attention module is carefully parameterized as a function of the entropy-normalized distance features between pairs of LinearGNN predictions to ensure generalization to new graphs. Empirically, GraphAny trained on a single Wisconsin dataset with only 120 labeled nodes can generalize to 30 new graphs with an average accuracy of 67.26%, surpassing not only all inductive baselines, but also strong transductive methods trained separately on each of the 30 test graphs.

2025-01-21

ICLR.cc/2025/Conference (poster)

GraphText: Graph Reasoning in Text Space

Jianan Zhao

Le Zhuo

Yikang Shen

Meng Qu

Kai Liu

Michael M. Bronstein

2024-10-09

NeurIPS.cc/2024/Workshop/AFM (poster)

BioPathNet: Enhancing Link Prediction in Biomedical Knowledge Graphs through Path Representation Learning

Annalisa Marsico

Yue Hu

Svitlana Oleshko

Samuele Firmani

Hui Cheng

Maria Ulmer

Matthias Arnold

Maria Colomé-Tatché

Abstract

Understanding complex interactions in biomedical networks is crucial for advancements in biomedic… (see more)ine, but traditional link prediction (LP) methods are limited in capturing this complexity. Representation-based learning techniques improve prediction accuracy by mapping nodes to low-dimensional embeddings, yet they often struggle with interpretability and scalability. We present BioPathNet, a novel graph neural network framework based on the Neural Bellman-Ford Network (NBFNet), addressing these limitations through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability. This allows visualization of influential paths and facilitates biological validation. BioPathNet leverages a background regulatory graph (BRG) for enhanced message passing and uses stringent negative sampling to improve precision. In evaluations across various LP tasks, such as gene function annotation, drug-disease indication, synthetic lethality, and lncRNA-mRNA interaction prediction, BioPathNet consistently outperformed shallow node embedding methods, relational graph neural networks and task-specific state-of-the-art methods, demonstrating robust performance and versatility. Our study predicts novel drug indications for diseases like acute lymphoblastic leukemia (ALL) and Alzheimer’s, validated by medical experts and clinical trials. We also identified new synthetic lethality gene pairs and regulatory interactions involving lncRNAs and target genes, confirmed through literature reviews. BioPathNet's interpretability will enable researchers to trace prediction paths and gain molecular insights, making it a valuable tool for drug discovery, personalized medicine and biology in general.

2024-09-16

Research Square (published)

Path-based reasoning for biomedical knowledge graphs with BioPathNet

Yue Hu

Svitlana Oleshko

Samuele Firmani

Hui Cheng

Maria Ulmer

Matthias Arnold

Maria Colomé-Tatché

Annalisa Marsico

Understanding complex interactions in biomedical networks is crucial for advancements in biomedicine, but traditional link prediction (LP) m… (see more)ethods are limited in capturing this complexity. Representation-based learning techniques improve prediction accuracy by mapping nodes to low-dimensional embeddings, yet they often struggle with interpretability and scalability. We present BioPathNet, a novel graph neural network framework based on the Neural Bellman-Ford Network (NBFNet), addressing these limitations through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability. This allows visualization of influential paths and facilitates biological validation. BioPathNet leverages a background regulatory graph (BRG) for enhanced message passing and uses stringent negative sampling to improve precision. In evaluations across various LP tasks, such as gene function annotation, drug-disease indication, synthetic lethality, and lncRNA-mRNA interaction prediction, BioPathNet consistently outperformed shallow node embedding methods, relational graph neural networks and task-specific state-of-the-art methods, demonstrating robust performance and versatility. Our study predicts novel drug indications for diseases like acute lymphoblastic leukemia (ALL) and Alzheimer’s, validated by medical experts and clinical trials. We also identified new synthetic lethality gene pairs and regulatory interactions involving lncRNAs and target genes, confirmed through literature reviews. BioPathNet’s interpretability will enable researchers to trace prediction paths and gain molecular insights, making it a valuable tool for drug discovery, personalized medicine and biology in general.

2024-06-17

bioRxiv (published)

The 1st International Workshop on Graph Foundation Models (GFM).

Haitao Mao

Jianan Zhao

Xiaoxin He

Zhikai Chen

Qian Huang

Micheal Bronstein

Xavier Bresson

Bryan Hooi

Haiyang Zhang

Xianfeng Tang

Luo Chen

Jiliang Tang

Foundation models such as GPT-4 for natural language processing (NLP), Flamingo for computer vision (CV), have set new benchmarks in AI by d… (see more)elivering state-of-the-art results across various tasks with minimal task-specific data. Despite their success, the application of these models to the graph domain is challenging due to the relational nature of graph-structured data. To address this gap, we propose the Graph Foundation Model (GFM) Workshop, the first workshop for GFMs, dedicated to exploring the adaptation and development of foundation models specifically designed for graph data. The GFM workshop focuses on two critical questions: (1) How can the underlying capabilities of existing foundation models be effectively applied to graph data? (2) What foundational principles should guide the creation of models tailored to the graph domain? Through a curated set of panel sections, keynote talks, and paper presentations, our workshop intends to catalyze innovative approaches and theoretical frameworks for Graph Foundation Models (GFMs). We target a broad audience, encompassing researchers, practitioners, and students, and aim to lay the groundwork for the next wave of breakthroughs in integrating graph data with foundation models.

2024-05-12

Companion Proceedings of the ACM on Web Conference 2024 (published)

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets

Dominique Beaini

Shenyang Huang

Joao Alex Cunha

Zhiyi Li

Gabriela Moisescu-Pareja

Oleksandr Dymov

Samuel Maddrell-Mander

Callum McLean

Jama Hussein Mohamud

Michael Craig

Cristian Gabellini

Kerstin Klaser

Josef Dean

Cas Wognum … (see 14 more)

Maciej Sypetkowski

Hadrien Mary

Therence Bois

Andrew Fitzgibbon

Błażej Banaszewski

Chad Martin

Dominic Masters

Recently, pre-trained foundation models have shown significant advancements in multiple fields. However, the lack of datasets with labeled f… (see more)eatures and codebases has hindered the development of a supervised foundation model for molecular tasks. Here, we have carefully curated seven datasets specifically tailored for node- and graph-level prediction tasks to facilitate supervised learning on molecules. Moreover, to support the development of multi-task learning on our proposed datasets, we created the Graphium graph machine learning library. Our dataset collection encompasses two distinct categories. Firstly, the TOYMIX category modifies three small existing datasets with additional data for multi-task learning. Secondly, the LARGEMIX category includes four large-scale datasets with 344M graph-level data points and 409M node-level data points from ∼5M unique molecules. Finally, the ultra-large dataset contains 2,210M graph-level data points and 2,031M node-level data points coming from 86M molecules. Hence our datasets represent an order of magnitude increase in data volume compared to other 2D-GNN datasets. In addition, recognizing that molecule-related tasks often span multiple levels, we have designed our library to explicitly support multi-tasking, offering a diverse range of multi-level representations, i.e., representations at the graph, node, edge, and node-pair level. We equipped the library with an extensive collection of models and features to cover different levels of molecule analysis. By combining our curated datasets with this versatile library, we aim to accelerate the development of molecule foundation models. Datasets and code are available at https://github.com/datamol-io/graphium.

2024-05-06

International Conference on Learning Representations (Accept (poster))

Towards Foundation Models for Knowledge Graph Reasoning

Hesham Mostafa

Foundation models in language and vision have the ability to run inference on any textual and visual inputs thanks to the transferable repre… (see more)sentations such as a vocabulary of tokens in language. Knowledge graphs (KGs) have different entity and relation vocabularies that generally do not overlap. The key challenge of designing foundation models on KGs is to learn such transferable representations that enable inference on any graph with arbitrary entity and relation vocabularies. In this work, we make a step towards such foundation models and present ULTRA, an approach for learning universal and transferable graph representations. ULTRA builds relational representations as a function conditioned on their interactions. Such a conditioning strategy allows a pre-trained ULTRA model to inductively generalize to any unseen KG with any relation vocabulary and to be fine-tuned on any graph. Conducting link prediction experiments on 57 different KGs, we find that the zero-shot inductive inference performance of a single pre-trained ULTRA model on unseen graphs of various sizes is often on par or better than strong baselines trained on specific graphs. Fine-tuning further boosts the performance.

2024-01-15

ICLR.cc/2024/Conference (poster)

A Foundation Model for Zero-shot Logical Query Reasoning

Mikhail Galkin

Jincheng Zhou

Bruno Ribeiro

Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional querie… (see more)s comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on which requires substantial training time before being deployed on a new graph. Here we present UltraQuery, the first foundation model for inductive reasoning that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG reasoning model, UltraQuery can solve CLQA on any KG after finetuning on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than best available baselines and sets a new state of the art on 15 of them.

2023-12-31

NeurIPS (published)