Portrait of Benjamin Fung

Benjamin Fung

Associate Academic Member
Associate Professor, McGill University, School of Information Studies
McGill University University
Research Topics
AI for Software Engineering
Applied Machine Learning
Cybersecurity
Data Mining
Deep Learning
Information Retrieval
Misinformation
Privacy
Representation Learning

Biography

Benjamin Fung is a Canada Research Chair in Data Mining for Cybersecurity, as well as a full professor at the School of Information Studies and associate member of the School of Computer Science, McGill University.

Fung serves as an associate editor of IEEE Transactions of Knowledge and Data Engineering and Sustainable Cities and Society. He received his PhD in computing science from Simon Fraser University in 2007.

Dr. Fung has over 150 refereed publications to his credit and and more than 14,000 citations (h-index 57) spanning the fields of data mining, machine learning, privacy, cybersecurity and building engineering. His findings in the fields of data mining for crime investigations and authorship analysis have been reported by the media worldwide.

Publications

Fairness-aware data-driven-based model predictive controller: A study on thermal energy storage in a residential building
Ying Sun
Fariborz Haghighat
Fairness-aware data-driven-based model predictive controller: A study on thermal energy storage in a residential building
Ying Sun
Fariborz Haghighat
Carthago Delenda Est: Co-opetitive Indirect Information Diffusion Model for Influence Operations on Online Social Media
Jwen Fai Low
Farkhund Iqbal
Claude Fachkha
AsmDocGen: Generating Functional Natural Language Descriptions for Assembly Code
Jesia Yuki
Mohammadhossein Amouei
Philippe Charland
Andrew Walenstein
BETAC: Bidirectional Encoder Transformer for Assembly Code Function Name Recovery
Guillaume Breyton
Mohd Saqib
Philippe Charland
Recovering function names from stripped binaries is a crucial and time-consuming task for software reverse engineering’ particularly in en… (see more)hancing network reliability, resilience, and security. This paper tackles the challenge of recovering function names in stripped binaries, a fundamental step in reverse engineering. The absence of syntactic information and the possibility of different code producing identical behavior complicate this task. To overcome these challenges, we introduce a novel model, the Bidirectional Encoder Transformer for Assembly Code (BETAC), leveraging a transformer-based architecture known for effectively processing sequential data. BETAC utilizes self-attention mechanisms and feed-forward networks to discern complex relationships within assembly code for precise function name prediction. We evaluated BETAC against various existing encoder and decoder models in diverse binary datasets, including benign and malicious codes in multiple formats. Our model demonstrated superior performance over previous techniques in certain metrics and showed resilience against code obfuscation.
Dynamic Neural Control Flow Execution: An Agent-Based Deep Equilibrium Approach for Binary Vulnerability Detection
Litao Li
Steven H. H. Ding
Andrew Walenstein
Philippe Charland
Multidomain Object Detection Framework Using Feature Domain Knowledge Distillation.
Da-Wei Jaw
Shih-Chia Huang
Zhihui Lu
Sy-Yen Kuo
Object detection techniques have been widely studied, utilized in various works, and have exhibited robust performance on images with suffic… (see more)ient luminance. However, these approaches typically struggle to extract valuable features from low-luminance images, which often exhibit blurriness and dim appearence, leading to detection failures. To overcome this issue, we introduce an innovative unsupervised feature domain knowledge distillation (KD) framework. The proposed framework enhances the generalization capability of neural networks across both low-and high-luminance domains without incurring additional computational costs during testing. This improvement is made possible through the integration of generative adversarial networks and our proposed unsupervised KD process. Furthermore, we introduce a region-based multiscale discriminator designed to discern feature domain discrepancies at the object level rather than from the global context. This bolsters the joint learning process of object detection and feature domain distillation tasks. Both qualitative and quantitative assessments shown that the proposed method, empowered by the region-based multiscale discriminator and the unsupervised feature domain distillation process, can effectively extract beneficial features from low-luminance images, outperforming other state-of-the-art approaches in both low-and sufficient-luminance domains.
Survey on Explainable AI: Techniques, challenges and open issues
Adel Abusitta
Miles Q. Li
Towards a unified XAI-based framework for digital forensic investigations
Zainab Khalid
Farkhund Iqbal
VulEXplaineR: XAI for Vulnerability Detection on Assembly Code
Samaneh Mahdavifar
Mohd Saqib
Philippe Charland
Andrew Walenstein
Technological Solutions to Online Toxicity: Potential and Pitfalls
Arezo Bodaghi
Ketra A. Schmitt
Social media platforms present a perplexing duality, acting at once as sites to build community and a sense of belonging, while also giving … (see more)rise to misinformation, facilitating and intensifying disinformation campaigns and perpetuating existing patterns of discrimination from the physical world. The first-step platforms take in mitigating the harmful side of social media involves identifying and managing toxic content. Users produce an enormous volume of posts which must be evaluated very quickly. This is an application context that requires machine-learning (ML) tools, but as we detail in this article, ML approaches rely on human annotators, analysts, and moderators. Our review of existing methods and potential improvements indicates that neither humans nor ML can be removed from this process in the near future. However, we see room for improvement in the working conditions of these human workers.
Adaptive Integration of Categorical and Multi-relational Ontologies with EHR Data for Medical Concept Embedding
Chin Wang Cheong
Kejing Yin
William K. Cheung
Jonathan Poon