Benjamin Fung

Associate Academic Member
Associate Professor, McGill University, School of Information Studies
Research Topics
Applied Machine Learning
Representation Learning
Deep Learning
Cybersecurity
Disinformation
Data Mining
AI for Software Engineering
Information Retrieval
Privacy

Biography

Benjamin Fung is a Canada Research Chair in Data Mining for Cybersecurity, an associate professor at the School of Information Studies and an associate member of the School of Computer Science at McGill University, an associate editor of IEEE Transactions on Knowledge and Data Engineering, and an associate editor of Elsevier's Sustainable Cities and Society (SCS). He received a PhD in computing science from Simon Fraser University in 2007. He has more than 150 peer-reviewed publications and more than 14,000 citations (h-index 57) spanning data mining, machine learning, privacy protection, cybersecurity, and building engineering. His data mining work on crime investigation and authorship analysis has been reported by media worldwide.

Publications

VulANalyzeR: Explainable Binary Vulnerability Detection with Multi-task Learning and Attentional Graph Convolution
Litao Li
Steven H. H. Ding
Yuan Tian
Philippe Charland
Weihan Ou
Leo Song
Congwei Chen
Deep learning-enabled anomaly detection for IoT systems
Adel Abusitta
Glaucio H.S. Carvalho
Omar Abdel Wahab
Talal Halabi
Saja Al-Mamoori
Differentially Private Release of Heterogeneous Network for Managing Healthcare Data
Rashid Hussain Khokhar
Farkhund Iqbal
Khalil Al-Hussaeni
Mohammed Hussain
With the increasing adoption of digital health platforms through mobile apps and online services, people have greater flexibility connecting with medical practitioners, pharmacists, and laboratories and accessing resources to manage their own health-related concerns. Many healthcare institutions are connecting with each other to facilitate the exchange of healthcare data, with the goal of effective healthcare data management. The contents generated over these platforms are often shared with third parties for a variety of purposes. However, sharing healthcare data comes with the potential risk of exposing patients’ sensitive information to privacy threats. In this article, we address the challenge of sharing healthcare data while protecting patients’ privacy. We first model a complex healthcare dataset using a heterogeneous information network that consists of multi-type entities and their relationships. We then propose DiffHetNet, an edge-based differentially private algorithm, to protect the sensitive links of patients from inbound and outbound attacks in the heterogeneous health network. We evaluate the performance of our proposed method in terms of information utility and efficiency on different types of real-life datasets that can be modeled as networks. Experimental results suggest that DiffHetNet generally yields less information loss and is significantly more efficient in terms of runtime in comparison with existing network anonymization methods. Furthermore, DiffHetNet is scalable to large network datasets.
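For readers unfamiliar with edge-level differential privacy, the following minimal sketch shows one generic, textbook mechanism: randomized response applied to every potential edge of a small undirected graph. It is not the DiffHetNet algorithm, whose details the abstract does not give; the function name, the undirected single-type graph, and the parameter names are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def randomized_response_edges(nodes, edges, epsilon, rng=None):
    """Release a noisy edge set using per-edge randomized response.

    Illustrative only: a generic edge-DP mechanism on an undirected,
    single-type graph, NOT the DiffHetNet algorithm from the paper.
    Each potential edge keeps its true state (present or absent) with
    probability e^eps / (1 + e^eps) and is flipped otherwise.
    """
    rng = rng or random.Random()
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    true_edges = {frozenset(edge) for edge in edges}

    noisy = set()
    for u, v in combinations(sorted(nodes), 2):
        present = frozenset((u, v)) in true_edges
        if rng.random() >= keep_prob:
            present = not present  # flip the true state
        if present:
            noisy.add((u, v))
    return noisy


# Hypothetical toy data: a patient linked to a doctor, plus an unlinked lab.
released = randomized_response_edges(
    ["doctor", "lab", "patient"], [("doctor", "patient")], epsilon=1.0
)
```

A smaller `epsilon` flips more edges (stronger privacy, more information loss), while a very large `epsilon` returns the true edge set almost unchanged. Iterating over all node pairs is quadratic in the number of nodes, so this sketch is only suitable for small graphs.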
A Literature Review on Detecting, Verifying, and Mitigating Online Misinformation
Arezo Bodaghi
Ketra A. Schmitt
Pierre Watine
Social media use has transformed communication and made social interaction more accessible. Public microblogs allow people to share and access news through existing and social-media-created social connections and access to public news sources. These benefits also create opportunities for the spread of false information. False information online can mislead people, decrease the benefits derived from social media, and reduce trust in genuine news. We divide false information into two categories: unintentional false information, also known as misinformation; and intentionally false information, also known as disinformation and fake news. Given the increasing prevalence of misinformation, it is imperative to address its dissemination on social media platforms. This survey focuses on six key aspects related to misinformation: 1) clarify the definition of misinformation to differentiate it from intentional forms of false information; 2) categorize proposed approaches to manage misinformation into three types: detection, verification, and mitigation; 3) review the platforms and languages for which these techniques have been proposed and tested; 4) describe the specific features that are considered in each category; 5) compare public datasets created to address misinformation and categorize into prelabeled content-only datasets and those including users and their connections; and 6) survey fact-checking websites that can be used to verify the accuracy of information. This survey offers a comprehensive and unprecedented review of misinformation, integrating various methodological approaches, datasets, and content-, user-, and network-based approaches, which will undoubtedly benefit future research in this field.
A Novel Deep Multi-head Attentive Vulnerable Line Detector
Miles Q. Li
Ashita Diwan
Of Stances, Themes, and Anomalies in COVID-19 Mask-Wearing Tweets
Jwen Fai Low
Farkhund Iqbal
COVID-19 is an opportunity to study public acceptance of a “new” healthcare intervention, universal masking, which unlike vaccination, is mostly alien to the Anglosphere public despite being practiced in ages past. Using a collection of over two million tweets, we studied the ways in which proponents and opponents of masking vied for influence as well as the themes driving the discourse. Pro-mask tweets encouraging others to mask up dominated Twitter early in the pandemic though its continued dominance has been eroded by anti-mask tweets criticizing others for their masking behavior. Engagement, represented by the counts of likes, retweets, and replies, and controversiality and disagreeableness, represented by ratios of the aforementioned counts, favored pro-mask tweets initially but with anti-mask tweets slowly gaining ground. Additional analysis raised the possibility of the platform owners suppressing certain parts of the mask-wearing discussion.
The Age of Ransomware: A Survey on the Evolution, Taxonomy, and Research Directions
Salwa Razaulla
Claude Fachkha
Christine Markarian
Amjad Gawanmeh
Wathiq Mansoor
Chadi Assi
The proliferation of ransomware has become a significant threat to cybersecurity in recent years, causing significant financial, reputational, and operational damage to individuals and organizations. This paper aims to provide a comprehensive overview of the evolution of ransomware, its taxonomy, and its state-of-the-art research contributions. We begin by tracing the origins of ransomware and its evolution over time, highlighting the key milestones and major trends. Next, we propose a taxonomy of ransomware that categorizes different types of ransomware based on their characteristics and behavior. Subsequently, we review the existing research over several years in regard to detection, prevention, mitigation, and prediction techniques. Our extensive analysis, based on more than 150 references, has revealed that significant research, specifically 72.8%, has focused on detecting ransomware. However, a lack of emphasis has been placed on predicting ransomware. Additionally, of the studies focused on ransomware detection, a significant portion, 70%, have utilized Machine Learning methods. This study uncovers a range of shortcomings in research pertaining to real-time protection and identifying zero-day ransomware, and two issues specific to Machine Learning models. Adversarial machine learning exploitation and concept drift have been identified as under-researched areas in the field. This survey is a constructive roadmap for researchers interested in ransomware research matters.
In-Processing Fairness Improvement Methods for Regression Data-Driven Building Models: Achieving Uniform Energy Prediction
Ying Sun
Fariborz Haghighat
A Multifaceted Framework to Evaluate Evasion, Content Preservation, and Misattribution in Authorship Obfuscation Techniques
VDGraph2Vec: Vulnerability Detection in Assembly Code using Message Passing Neural Networks
Ashita Diwan
Miles Q. Li
Software vulnerability detection is one of the most challenging tasks faced by reverse engineers. Recently, vulnerability detection has received a lot of attention due to a drastic increase in the volume and complexity of software. Reverse engineering is a time-consuming and labor-intensive process for detecting malware and software vulnerabilities. However, with the advent of deep learning and machine learning, it has become possible for researchers to automate the process of identifying potential security breaches in software by developing more intelligent technologies. In this research, we propose VDGraph2Vec, an automated deep learning method to generate representations of assembly code for the task of vulnerability detection. Previous approaches failed to attend to topological characteristics of assembly code while discovering the weakness in the software. VDGraph2Vec embeds the control flow and semantic information of assembly code effectively using the expressive capabilities of message passing neural networks and the RoBERTa model. Our model is able to learn the important features that help distinguish between vulnerable and non-vulnerable software. We carry out our experimental analysis for performance benchmark on three of the most common weaknesses and demonstrate that our model can identify vulnerabilities with high accuracy and outperforms the current state-of-the-art binary vulnerability detection models.
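To make the idea of message passing over a control flow graph concrete, here is a minimal sketch of a single, parameter-free message-passing round in plain Python. It is not the VDGraph2Vec model: real message passing neural networks use learned weights and nonlinearities, and the block names, feature vectors, and the 50/50 mixing rule below are hypothetical choices made only for illustration.

```python
def message_passing_round(features, edges):
    """Run one round of mean-aggregation message passing.

    features: dict mapping each basic block to a feature vector (list of floats).
    edges: list of (src, dst) control-flow edges.
    Each block averages the features of its predecessors and mixes that
    average equally with its own features. Illustrative only: no learned
    parameters, unlike an actual message passing neural network.
    """
    dim = len(next(iter(features.values())))
    incoming = {node: [] for node in features}
    for src, dst in edges:
        incoming[dst].append(features[src])

    updated = {}
    for node, own in features.items():
        messages = incoming[node]
        if messages:
            agg = [sum(m[i] for m in messages) / len(messages) for i in range(dim)]
        else:
            agg = [0.0] * dim  # no predecessors: nothing to aggregate
        updated[node] = [0.5 * (own[i] + agg[i]) for i in range(dim)]
    return updated


def graph_embedding(features):
    """Mean-pool block vectors into a single graph-level vector."""
    dim = len(next(iter(features.values())))
    count = len(features)
    return [sum(vec[i] for vec in features.values()) / count for i in range(dim)]


# Hypothetical three-block function: entry -> loop -> exit.
blocks = {"entry": [1.0, 0.0], "loop": [0.0, 1.0], "exit": [0.0, 0.0]}
cfg_edges = [("entry", "loop"), ("loop", "exit")]
blocks_after = message_passing_round(blocks, cfg_edges)
embedding = graph_embedding(blocks_after)
```

After one round, each block's vector reflects its immediate predecessors; repeating the round lets information travel further along the control flow, which is how topology enters the final graph-level representation.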
Towards Adaptive Cybersecurity for Green IoT
Talal Halabi
Martine Bellaiche
The Internet of Things (IoT) paradigm has led to an explosion in the number of IoT devices and an exponential rise in carbon footprint incurred by overburdened IoT networks and pervasive cloud/edge communications. Hence, there is a growing interest in industry and academia to enable the efficient use of computing infrastructures by optimizing the management of data center and IoT resources (hardware, software, network, and data) and reducing operational costs to slash greenhouse gas emissions and create healthy environments. Cybersecurity has also been considered in such efforts as a contributor to these environmental issues. Nonetheless, most green security approaches focus on designing low-overhead encryption schemes and do not emphasize energy-efficient security from architectural and deployment viewpoints. This paper sheds light on the emerging paradigm of adaptive cybersecurity as one of the research directions to support sustainable computing in green IoT. It presents three potential research directions and their associated methods for designing and deploying adaptive security in green computing and resource-constrained IoT environments to save on energy consumption. Such efforts will transform the development of data-driven IoT security solutions to be greener and more environment-friendly.
The generalizability of pre-processing techniques on the accuracy and fairness of data-driven building models: a case study
Ying Sun
Fariborz Haghighat