
Benjamin Fung

Associate Academic Member
Associate Professor, McGill University, School of Information Studies
Research Topics
Data Mining

Biography

Benjamin Fung holds a Canada Research Chair in Data Mining for Cybersecurity. He is an associate professor at the School of Information Studies and an associate member of the School of Computer Science at McGill University, as well as an associate editor of IEEE Transactions on Knowledge and Data Engineering and of Elsevier Sustainable Cities and Society (SCS). He received his Ph.D. in computing science from Simon Fraser University in 2007. He has over 150 refereed publications and more than 14,000 citations (h-index 57) spanning data mining, machine learning, privacy protection, cybersecurity, and building engineering. His data mining work in crime investigation and authorship analysis has been covered by media worldwide.

Publications

Diminished social memory and hippocampal correlates of social interactions in chronic social defeat stress susceptibility
Amanda Larosa
Tian Rui Zhang
Alice S. Wong
Cyrus Y. H. Fung
Xiong Ling Yun (Jenny) Long
Tak Pan Wong
A Comprehensive Analysis of Explainable AI for Malware Hunting
Mohd Saqib
Samaneh Mahdavifar
Philippe Charland
In the past decade, the number of malware variants has increased rapidly. Many researchers have proposed to detect malware using intelligent techniques, such as Machine Learning (ML) and Deep Learning (DL), which have high accuracy and precision. These methods, however, suffer from being opaque in the decision-making process. Therefore, we need Artificial Intelligence (AI)-based models to be explainable, interpretable, and transparent to be reliable and trustworthy. In this survey, we reviewed articles related to Explainable AI (XAI) and their application to the significant scope of malware detection. The article encompasses a comprehensive examination of various XAI algorithms employed in malware analysis. Moreover, we have addressed the characteristics, challenges, and requirements in malware analysis that cannot be accommodated by standard XAI methods. We discuss how, even though Explainable Malware Detection (EMD) models provide explainability, they make an AI-based model more vulnerable to adversarial attacks. We also propose a framework that assigns a level of explainability to each XAI malware analysis model, based on the security features involved in each method. In summary, the proposed project focuses on combining XAI and malware analysis to apply XAI models for scrutinizing the opaque nature of AI systems and their applications to malware analysis.
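As a minimal illustration of the kind of post-hoc explanation such EMD models produce, the sketch below performs perturbation-based feature attribution on a toy malware scorer. The classifier, its feature names, and its weights are all invented for this example; they are not taken from the paper.

```python
# Toy perturbation-based explanation: flip each binary input feature
# and measure how much the classifier's score changes. The scorer and
# its weights are hypothetical, for illustration only.

def classify(features):
    # Hypothetical linear malware scorer (weights are made up).
    weights = {"packed": 0.5, "net_io": 0.3, "writes_registry": 0.15, "signed": -0.4}
    return sum(weights[name] * value for name, value in features.items())

def explain(features):
    # Attribution of a feature = score drop when that feature is flipped.
    base = classify(features)
    attributions = {}
    for name in features:
        perturbed = dict(features)
        perturbed[name] = 1 - perturbed[name]  # flip the binary feature
        attributions[name] = base - classify(perturbed)
    return attributions

sample = {"packed": 1, "net_io": 1, "writes_registry": 0, "signed": 0}
print(explain(sample))
```

Real XAI toolkits (e.g., SHAP or LIME) generalize this idea with principled sampling and weighting, but the flip-and-compare loop above is the core intuition.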
Better entity matching with transformers through ensembles
Jwen Fai Low
Pulei Xiong
ERS0: Enhancing Military Cybersecurity with AI-Driven SBOM for Firmware Vulnerability Detection and Asset Management
Max Beninger
Philippe Charland
Steven H. H. Ding
Firmware vulnerability detection and asset management through a software bill of material (SBOM) approach is integral to defensive military operations. SBOMs provide a comprehensive list of software components, enabling military organizations to identify vulnerabilities within critical systems, including those controlling various functions in military platforms, as well as in operational technologies and Internet of Things devices. This proactive approach is essential for supply chain security, ensuring that software components are sourced from trusted suppliers and have not been tampered with during production, distribution, or through updates. It is a key element of defense strategies, allowing for rapid assessment, response, and mitigation of vulnerabilities, ultimately safeguarding military capabilities and information from cyber threats. In this paper, we propose ERS0, an SBOM system, driven by artificial intelligence (AI), for detecting firmware vulnerabilities and managing firmware assets. We harness the power of pre-trained large-scale language models to effectively address a wide array of string patterns, extending our coverage to thousands of third-party library patterns. Furthermore, we employ AI-powered code clone search models, enabling a more granular and precise search for vulnerabilities at the binary level, reducing our dependence on string analysis only. Additionally, our AI models extract high-level behavioral functionalities in firmware, such as communication and encryption, allowing us to quantitatively define the behavioral scope of firmware. In preliminary comparative assessments against open-source alternatives, our solution has demonstrated better SBOM coverage, accuracy in vulnerability identification, and a wider array of features.
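The string-pattern side of such an SBOM pipeline can be sketched as a signature scan over a firmware image. The two library signatures below are invented examples; a real system like the one described would maintain thousands of patterns and complement them with binary-level clone search.

```python
import re

# Hypothetical version-string signatures for third-party libraries.
# A production SBOM tool would maintain a far larger pattern set.
SIGNATURES = {
    "openssl": re.compile(rb"OpenSSL (\d+\.\d+\.\d+[a-z]?)"),
    "zlib": re.compile(rb"zlib version (\d+\.\d+\.\d+)"),
}

def scan_firmware(blob):
    """Return a {library: version} map for every signature found in the blob."""
    found = {}
    for lib, pattern in SIGNATURES.items():
        match = pattern.search(blob)
        if match:
            found[lib] = match.group(1).decode()
    return found

firmware = b"\x00junk\x00OpenSSL 1.0.2k\x00more junk\x00zlib version 1.2.11\x00"
print(scan_firmware(firmware))  # {'openssl': '1.0.2k', 'zlib': '1.2.11'}
```

Once component versions are identified, they can be matched against a vulnerability database (e.g., CVE entries) to flag known-vulnerable components.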
GAGE: Genetic Algorithm-Based Graph Explainer for Malware Analysis
Mohd Saqib
Philippe Charland
Andrew Walenstein
Malware analysts often prefer reverse engineering using Call Graphs, Control Flow Graphs (CFGs), and Data Flow Graphs (DFGs), which involves the utilization of black-box Deep Learning (DL) models. The proposed research introduces a structured pipeline for reverse engineering-based analysis, offering promising results compared to state-of-the-art methods and providing high-level interpretability for malicious code blocks in subgraphs. We propose the Canonical Executable Graph (CEG) as a new representation of Portable Executable (PE) files, uniquely incorporating syntactical and semantic information into its node embeddings. At the same time, edge features capture structural aspects of PE files. This is the first work to present a PE file representation encompassing syntactical, semantic, and structural characteristics, whereas previous efforts typically focused solely on syntactic or structural properties. Furthermore, recognizing the limitations of existing graph explanation methods within Explainable Artificial Intelligence (XAI) for malware analysis, primarily due to the specificity of malicious files, we introduce the Genetic Algorithm-based Graph Explainer (GAGE). GAGE operates on the CEG, striving to identify a precise subgraph relevant to predicted malware families. Through experiments and comparisons, our proposed pipeline exhibits substantial improvements in model robustness scores and discriminative power compared to the previous benchmarks. Furthermore, we have successfully used GAGE in practical applications on real-world data, producing meaningful insights and interpretability. This research offers a robust solution to enhance cybersecurity by delivering a transparent and accurate understanding of malware behaviour. Moreover, the proposed algorithm is specialized in handling graph-based data, effectively dissecting complex content and isolating influential nodes.
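The genetic-algorithm search at the heart of a subgraph explainer can be sketched in miniature. Here each genome is a bit vector selecting nodes of a small graph, and the fitness function (entirely invented for this sketch, standing in for GAGE's actual relevance objective) rewards covering a hypothetical set of "malicious" nodes while penalizing subgraph size.

```python
import random

random.seed(0)

NUM_NODES = 12
MALICIOUS = {2, 5, 7}  # hypothetical ground-truth relevant nodes

def fitness(genome):
    # Invented objective: reward hitting relevant nodes, penalize size.
    selected = {i for i, bit in enumerate(genome) if bit}
    return 2.0 * len(selected & MALICIOUS) - 0.25 * len(selected)

def evolve(pop_size=30, generations=40):
    pop = [[random.randint(0, 1) for _ in range(NUM_NODES)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, NUM_NODES)  # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:             # point mutation
                j = random.randrange(NUM_NODES)
                child[j] = 1 - child[j]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print([i for i, bit in enumerate(best) if bit])
```

The real explainer searches over subgraphs of the CEG with a fitness tied to the model's family prediction, but the select/crossover/mutate loop is the same skeleton.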
Fairness-aware data-driven-based model predictive controller: A study on thermal energy storage in a residential building
Ying Sun
Fariborz Haghighat
Dynamic Neural Control Flow Execution: An Agent-Based Deep Equilibrium Approach for Binary Vulnerability Detection
Litao Li
Steven H. H. Ding
Andrew Walenstein
Philippe Charland
Carthago Delenda Est: Co-opetitive Indirect Information Diffusion Model for Influence Operations on Online Social Media
Jwen Fai Low
Farkhund Iqbal
Claude Fachkha
AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code
Jesia Yuki
Mohammadhossein Amouei
Philippe Charland
Andrew Walenstein
BETAC: Bidirectional Encoder Transformer for Assembly Code Function Name Recovery
Guillaume Breyton
Mohd Saqib
Philippe Charland
Recovering function names from stripped binaries is a crucial and time-consuming task for software reverse engineering, particularly in enhancing network reliability, resilience, and security. This paper tackles the challenge of recovering function names in stripped binaries, a fundamental step in reverse engineering. The absence of syntactic information and the possibility of different code producing identical behavior complicate this task. To overcome these challenges, we introduce a novel model, the Bidirectional Encoder Transformer for Assembly Code (BETAC), leveraging a transformer-based architecture known for effectively processing sequential data. BETAC utilizes self-attention mechanisms and feed-forward networks to discern complex relationships within assembly code for precise function name prediction. We evaluated BETAC against various existing encoder and decoder models in diverse binary datasets, including benign and malicious codes in multiple formats. Our model demonstrated superior performance over previous techniques in certain metrics and showed resilience against code obfuscation.
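The self-attention mechanism underpinning such a transformer encoder can be shown in a few lines. Below is scaled dot-product self-attention over toy 2-d embeddings for a three-instruction assembly snippet; the embeddings are made up, and a real model would add learned Q/K/V projections, multiple heads, and feed-forward layers.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product attention with Q = K = V = the raw embeddings."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted mix of all token embeddings.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

# Invented embeddings for, say, ["push", "mov", "call"].
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(self_attention(embeddings))
```

Because every token attends to every other token in both directions, the encoder is "bidirectional" in the sense that each instruction's representation reflects its full context, which is what lets a name-recovery head predict a function name from the whole body.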
Multidomain Object Detection Framework Using Feature Domain Knowledge Distillation.
Da-Wei Jaw
Shih-Chia Huang
Zhihui Lu
Sy-Yen Kuo
Object detection techniques have been widely studied, utilized in various works, and have exhibited robust performance on images with sufficient luminance. However, these approaches typically struggle to extract valuable features from low-luminance images, which often exhibit blurriness and a dim appearance, leading to detection failures. To overcome this issue, we introduce an innovative unsupervised feature domain knowledge distillation (KD) framework. The proposed framework enhances the generalization capability of neural networks across both low- and high-luminance domains without incurring additional computational costs during testing. This improvement is made possible through the integration of generative adversarial networks and our proposed unsupervised KD process. Furthermore, we introduce a region-based multiscale discriminator designed to discern feature domain discrepancies at the object level rather than from the global context. This bolsters the joint learning process of object detection and feature domain distillation tasks. Both qualitative and quantitative assessments show that the proposed method, empowered by the region-based multiscale discriminator and the unsupervised feature domain distillation process, can effectively extract beneficial features from low-luminance images, outperforming other state-of-the-art approaches in both low- and sufficient-luminance domains.
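For readers unfamiliar with distillation, the standard building block is a loss that pushes a student network to match a teacher's temperature-softened output distribution. The sketch below shows that classic logit-based KD loss with invented logits; note the paper distills in feature space with an adversarial discriminator, not on logits, so this is only the textbook starting point.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing "dark knowledge".
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]   # made-up teacher logits
student = [2.5, 1.2, 0.3]   # made-up student logits
print(kd_loss(teacher, student))
```

The loss is zero when the student exactly matches the teacher and grows as their softened distributions diverge; feature-domain variants apply the same matching idea to intermediate feature maps instead of class probabilities.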
Survey on Explainable AI: Techniques, challenges and open issues
Adel Abusitta
Miles Q. Li