
Bang Liu

Associate Academic Member
Canada CIFAR AI Chair
Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
Data Mining
Deep Learning
Generative Models
Learning on Graphs
Natural Language Processing

Biography

Bang Liu is an assistant professor in the Department of Computer Science and Operations Research (DIRO), and a core member of the Applied Research in Computational Linguistics Lab (RALI) at Université de Montréal. He is also an associate academic member of Mila – Quebec Artificial Intelligence Institute and a Canada CIFAR AI Chair.

Liu received his BEng from the University of Science and Technology of China in 2013, and his MSc and PhD from the University of Alberta in 2015 and 2020, respectively. His research interests lie primarily in natural language processing, multimodal and embodied learning, theory and techniques for AGI (e.g., understanding and improving large language models), and AI for science (e.g., health, materials science, XR).

Current Students

PhD (8) - Université de Montréal
Postdoctorate (1) - Université de Montréal
Master's Research (5) - Université de Montréal

Publications

T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval
Yili Li
Jing Yu
Keke Gai
Gang Xiong
Qi Wu
Current text-video retrieval methods mainly rely on cross-modal matching between queries and videos to calculate similarity scores, which are then sorted to obtain retrieval results. This approach considers the match between each candidate video and the query, but it incurs a significant time cost that grows notably with the number of candidates. Generative models are common in natural language processing and computer vision, and have been successfully applied to document retrieval, but their application to multimodal retrieval remains unexplored. To improve retrieval efficiency, in this paper we introduce a model-based video indexer named T2VIndexer, a sequence-to-sequence generative model that directly generates video identifiers and retrieves candidate videos in constant time. T2VIndexer aims to reduce retrieval time while maintaining high accuracy. To achieve this goal, we propose video identifier encoding and query-identifier augmentation approaches to represent videos as short sequences while preserving their semantic information. Our method consistently improves the retrieval efficiency of current state-of-the-art models on four standard datasets. Using only 30%-50% of the original retrieval time, it enables baselines to achieve better retrieval performance on MSR-VTT (+1.0%), MSVD (+1.8%), ActivityNet (+1.5%), and DiDeMo (+0.2%). The code is available at https://anonymous.4open.science/r/T2VIndexer-40BE.
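The constant-time claim rests on decoding one short identifier per query instead of scoring every candidate video. The sketch below illustrates the general idea of trie-constrained identifier decoding used in generative retrieval; the corpus, the identifiers, and the toy_score function are hypothetical stand-ins for a trained sequence-to-sequence decoder, not the paper's implementation.

# Hedged sketch of trie-constrained identifier decoding, the mechanism
# behind generative retrieval indexers. toy_score is a hypothetical
# stand-in for a trained seq2seq decoder's next-token log-probability.
from math import log

# Hypothetical corpus: each video is indexed by a short semantic identifier.
VIDEO_IDS = {
    "v1": ("sport", "soccer"),
    "v2": ("sport", "tennis"),
    "v3": ("food", "pasta"),
}

def build_trie(id_sequences):
    """Map every identifier prefix to the set of valid next tokens."""
    trie = {}
    for seq in id_sequences:
        for i in range(len(seq)):
            trie.setdefault(seq[:i], set()).add(seq[i])
    return trie

def toy_score(query, prefix, token):
    """Stand-in for a decoder's next-token log-probability."""
    overlap = sum(tok in query for tok in prefix + (token,))
    return log(1 + overlap)

def generate_identifier(query, trie, max_len=2):
    """Greedy decoding constrained to identifier prefixes seen at index time."""
    prefix = ()
    while len(prefix) < max_len:
        candidates = trie.get(prefix)
        if not candidates:
            break
        prefix += (max(sorted(candidates),
                       key=lambda t: toy_score(query, prefix, t)),)
    return prefix

trie = build_trie(VIDEO_IDS.values())
decoded = generate_identifier("a sport clip about a soccer goal", trie)
inverse = {seq: vid for vid, seq in VIDEO_IDS.items()}  # identifier -> video
print(decoded, "->", inverse.get(decoded))  # ('sport', 'soccer') -> v1

Because decoding length is fixed and the inverse lookup is a hash-table access, the cost per query does not grow with the number of candidate videos, which is the efficiency property the abstract describes.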
EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time
Shengyao Lu
Keith G Mills
Jiao He
Di Niu
VCR: Visual Caption Restoration
Tianyu Zhang
Suyuchen Wang
Lu Li
Ge Zhang
Perouz Taslakian
Sai Rajeswar
Jie Fu
We introduce Visual Caption Restoration (VCR), a novel vision-language task that challenges models to accurately restore partially obscured texts using pixel-level hints within images. This task stems from the observation that text embedded in images is intrinsically different from common visual elements and natural language, due to the need to align the modalities of vision, text, and text embedded in images. While numerous works have integrated text embedded in images into visual question-answering tasks, approaches to these tasks generally rely on optical character recognition or masked language modeling, thus reducing the task to mainly text-based processing. However, text-based processing becomes ineffective in VCR, as accurate text restoration depends on the combined information from the provided images, the context, and subtle cues from the tiny exposed areas of masked texts. We develop a pipeline to generate synthetic images for the VCR task using image-caption pairs, with adjustable caption visibility to control task difficulty. With this pipeline, we construct a dataset for VCR called VCR-Wiki using images with captions from Wikipedia, comprising 2.11M English and 346K Chinese entities in both easy and hard split variants. Our results reveal that current vision-language models significantly lag behind human performance on the VCR task, and merely fine-tuning the models on our dataset does not lead to notable improvements. We release VCR-Wiki and the data construction code to facilitate future research.
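To make the adjustable-visibility idea concrete, here is a minimal Pillow sketch of a VCR-style example generator: it renders a caption above an image, then covers most of the text band so only a thin sliver of the glyphs remains as a pixel-level hint. The function name, band height, and masking scheme are illustrative assumptions, not the released data-construction code.

# Minimal sketch of a VCR-style synthetic example: caption rendered onto
# a canvas, then partially masked. Parameters are illustrative.
from PIL import Image, ImageDraw

def make_vcr_example(image, caption, visible_ratio=0.35):
    """Stack the caption above the image and cover most of the text band.

    visible_ratio controls difficulty: the fraction of the text band
    left unmasked (smaller = harder).
    """
    band_h = 40
    canvas = Image.new("RGB", (image.width, image.height + band_h), "white")
    canvas.paste(image, (0, band_h))
    draw = ImageDraw.Draw(canvas)
    draw.text((5, 10), caption, fill="black")
    # Mask everything below the visible strip of the text band, leaving
    # only the tops of the letters exposed.
    cut = int(band_h * visible_ratio)
    draw.rectangle([0, cut, image.width, band_h], fill="white")
    return canvas

# Toy usage with a blank "photo"; the real pipeline pairs Wikipedia
# images with their captions.
example = make_vcr_example(Image.new("RGB", (320, 200), "gray"),
                           "A cat sleeping on a windowsill")
example.save("vcr_example.png")

Lowering visible_ratio shrinks the exposed sliver of each glyph, which is how a pipeline of this shape can produce the easy and hard split variants the abstract mentions.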
GOAt: Explaining Graph Neural Networks via Graph Output Attribution
Shengyao Lu
Keith G Mills
Jiao He
Di Niu
Understanding the decision-making process of Graph Neural Networks (GNNs) is crucial to their interpretability. Most existing methods for explaining GNNs rely on training auxiliary models, so the explanations themselves remain black boxes. This paper introduces Graph Output Attribution (GOAt), a novel method to attribute graph outputs to input graph features, producing GNN explanations that are faithful, discriminative, and stable across similar samples. By expanding the GNN as a sum of scalar products involving node features, edge features, and activation patterns, we propose an efficient analytical method to compute the contribution of each node or edge feature to each scalar product, and we aggregate the contributions from all scalar products in the expansion to derive the importance of each node and edge. Through extensive experiments on synthetic and real-world data, we show that our method not only outperforms various state-of-the-art GNN explainers on the commonly used fidelity metric, but also exhibits markedly stronger discriminability and stability.
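The expansion-into-scalar-products idea is easiest to see in the fully linear case, where a one-layer message-passing output decomposes exactly into per-edge terms. The NumPy sketch below shows only this toy linear case under assumed shapes; the actual method additionally handles activation patterns and deeper networks.

# Toy additive attribution for a linear one-layer GNN: out = A @ X @ W
# splits exactly into per-edge contributions A[i, j] * X[j] @ W.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 4, 3, 2                    # nodes, input features, output dims
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], float)  # adjacency (self-loops omitted for brevity)
X = rng.normal(size=(n, d))          # node features
W = rng.normal(size=(d, k))          # layer weights

out = A @ X @ W                      # linear message-passing layer

# Edge (i, j) contributes A[i, j] * X[j] @ W to node i's output,
# and the pieces sum back exactly.
edge_contrib = {(i, j): A[i, j] * X[j] @ W
                for i in range(n) for j in range(n) if A[i, j] != 0}
recon = np.zeros_like(out)
for (i, j), c in edge_contrib.items():
    recon[i] += c
assert np.allclose(out, recon)       # exact additive attribution

# Rank edges by contribution magnitude to node 0's prediction.
ranking = sorted(((np.abs(c).sum(), e) for e, c in edge_contrib.items()
                  if e[0] == 0), reverse=True)
print(ranking)

Because the decomposition is analytical, no auxiliary explainer model needs to be trained for this linear case, which mirrors the abstract's contrast with black-box explanation methods.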
Efficient Classification of Long Documents via State-Space Models
Peng Lu
Suyuchen Wang
Mehdi Rezagholizadeh
Ivan Kobyzev
HoneyBee: Progressive Instruction Finetuning of Large Language Models for Materials Science
Yu Song
Santiago Miret
Huan Zhang
MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization
Yuyan Chen
Zhihao Wen
Ge Fan
Zhengyu Chen
Wei Wu
Dayiheng Liu
Zhixu Li
Yanghua Xiao
SkillQG: Learning to Generate Question for Reading Comprehension Assessment
Xiaoqiang Wang
Siliang Tang
Lingfei Wu
MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling
Yurun Song
Santiago Miret
Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models
Zhong Zhang
Junming Shao