Benjamin Fung

Associate Academic Member
Associate Professor, McGill University, School of Information Studies
McGill University
Research Topics
AI for Software Engineering
Applied Machine Learning
Cybersecurity
Data Mining
Deep Learning
Information Retrieval
Misinformation
Privacy
Representation Learning

Biography

Benjamin Fung is a Canada Research Chair in Data Mining for Cybersecurity, as well as a full professor at the School of Information Studies and associate member of the School of Computer Science, McGill University.

Fung serves as an associate editor of IEEE Transactions on Knowledge and Data Engineering and Sustainable Cities and Society. He received his PhD in computing science from Simon Fraser University in 2007.

Dr. Fung has over 150 refereed publications to his credit and more than 14,000 citations (h-index 57) spanning the fields of data mining, machine learning, privacy, cybersecurity and building engineering. His findings in the fields of data mining for crime investigations and authorship analysis have been reported by the media worldwide.

Publications

Diminished social memory and hippocampal correlates of social interactions in chronic social defeat stress susceptibility
Amanda Larosa
Tian Rui Zhang
Alice S. Wong
Cyrus Y.H. Fung
Xiong Ling Yun (Jenny) Long
Prabhjeet Singh
Tak Pan Wong
AugmenToxic: Leveraging Reinforcement Learning to Optimize LLM Instruction Fine-Tuning for Data Augmentation to Enhance Toxicity Detection
Arezo Bodaghi
Ketra A. Schmitt
Towards a unified XAI-based framework for digital forensic investigations
Zainab Khalid
Farkhund Iqbal
A Literature Review on Detecting, Verifying, and Mitigating Online Misinformation
Arezo Bodaghi
Ketra A. Schmitt
Pierre Watine
Social media use has transformed communication and made social interaction more accessible. Public microblogs allow people to share and access news through existing and social-media-created social connections and access to public news sources. These benefits also create opportunities for the spread of false information. False information online can mislead people, decrease the benefits derived from social media, and reduce trust in genuine news. We divide false information into two categories: unintentional false information, also known as misinformation; and intentionally false information, also known as disinformation and fake news. Given the increasing prevalence of misinformation, it is imperative to address its dissemination on social media platforms. This survey focuses on six key aspects related to misinformation: 1) clarify the definition of misinformation to differentiate it from intentional forms of false information; 2) categorize proposed approaches to manage misinformation into three types: detection, verification, and mitigation; 3) review the platforms and languages for which these techniques have been proposed and tested; 4) describe the specific features that are considered in each category; 5) compare public datasets created to address misinformation and categorize into prelabeled content-only datasets and those including users and their connections; and 6) survey fact-checking websites that can be used to verify the accuracy of information. This survey offers a comprehensive and unprecedented review of misinformation, integrating various methodological approaches, datasets, and content-, user-, and network-based approaches, which will undoubtedly benefit future research in this field.
A Comprehensive Analysis of Explainable AI for Malware Hunting
Mohd Saqib
Samaneh Mahdavifar
Philippe Charland
In the past decade, the number of malware variants has increased rapidly. Many researchers have proposed to detect malware using intelligent techniques, such as Machine Learning (ML) and Deep Learning (DL), which have high accuracy and precision. These methods, however, suffer from being opaque in the decision-making process. Therefore, we need Artificial Intelligence (AI)-based models to be explainable, interpretable, and transparent to be reliable and trustworthy. In this survey, we reviewed articles related to Explainable AI (XAI) and their application to the significant scope of malware detection. The article encompasses a comprehensive examination of various XAI algorithms employed in malware analysis. Moreover, we have addressed the characteristics, challenges, and requirements in malware analysis that cannot be accommodated by standard XAI methods. We discussed that even though Explainable Malware Detection (EMD) models provide explainability, they make an AI-based model more vulnerable to adversarial attacks. We also propose a framework that assigns a level of explainability to each XAI malware analysis model, based on the security features involved in each method. In summary, the proposed project focuses on combining XAI and malware analysis to apply XAI models for scrutinizing the opaque nature of AI systems and their applications to malware analysis.
Survey on Explainable AI: Techniques, challenges and open issues
Adel Abusitta
Miles Q. Li
Tracing the Ransomware Bloodline: Investigation and Detection of Drifting Virlock Variants
Salwa Razaulla
Claude Fachkha
Amjad Gawanmeh
Christine Markarian
Chadi Assi
Malware, especially ransomware, has dramatically increased in volume and sophistication in recent years. The growing complexity and destructive potential of ransomware demand effective countermeasures. Despite tremendous efforts by the security community to document these threats, reliance on manual analysis makes it challenging to discern unique malware variants from polymorphic variants. Moreover, the easy accessibility of source code of prominent ransomware families in public domains has led to the rise of numerous variants, complicating manual detection and hindering the identification of phylogenetic relationships. This paper introduces a novel approach that narrows the focus to analyze one such prominent ransomware family, Virlock. Using binary code similarity, we systematically reconstruct the lineage of Virlock, tracing its relationships, evolution, and variants. Employing this technique on a dataset of over 1000 Virlock samples submitted to VirusTotal and VirusShare, our analysis unveils intricate relationships within the Virlock ransomware family, offering valuable insights into the tangled relationships of this ransomware.
Better entity matching with transformers through ensembles
Jwen Fai Low
Pulei Xiong
ERS0: Enhancing Military Cybersecurity with AI-Driven SBOM for Firmware Vulnerability Detection and Asset Management
Max Beninger
Philippe Charland
Steven H. H. Ding
Firmware vulnerability detection and asset management through a software bill of material (SBOM) approach is integral to defensive military operations. SBOMs provide a comprehensive list of software components, enabling military organizations to identify vulnerabilities within critical systems, including those controlling various functions in military platforms, as well as in operational technologies and Internet of Things devices. This proactive approach is essential for supply chain security, ensuring that software components are sourced from trusted suppliers and have not been tampered with during production, distribution, or through updates. It is a key element of defense strategies, allowing for rapid assessment, response, and mitigation of vulnerabilities, ultimately safeguarding military capabilities and information from cyber threats. In this paper, we propose ERS0, an SBOM system, driven by artificial intelligence (AI), for detecting firmware vulnerabilities and managing firmware assets. We harness the power of pre-trained large-scale language models to effectively address a wide array of string patterns, extending our coverage to thousands of third-party library patterns. Furthermore, we employ AI-powered code clone search models, enabling a more granular and precise search for vulnerabilities at the binary level, reducing our dependence on string analysis only. Additionally, our AI models extract high-level behavioral functionalities in firmware, such as communication and encryption, allowing us to quantitatively define the behavioral scope of firmware. In preliminary comparative assessments against open-source alternatives, our solution has demonstrated better SBOM coverage, accuracy in vulnerability identification, and a wider array of features.