Publications

A Hybrid CNN-Transformer Approach for Continuous Fine Finger Motion Decoding from sEMG Signals

Zihan Weng

Xiabing Zhang

Yufeng Mou

Chanlin Yi

Fali Li

Peng Xu

This work presents a novel approach that synergistically integrates convolutional neural networks (CNNs) and Transformer models for decoding continuous fine finger motions from surface electromyography (sEMG) signals. This integration capitalizes on CNNs’ proficiency in extracting rich temporal and spatial features from multichannel sEMG data and the Transformer’s superior capability in recognizing complex patterns and long-range dependencies. A significant advancement in this field is the use of a custom-developed Epidermal Electrode Array Sleeve (EEAS) for capturing high-fidelity sEMG signals, enabling more accurate and reliable signal acquisition than traditional methods. The decoded joint angles could be used in seamless and intuitive human-machine interaction in various applications, such as virtual reality, augmented reality, robotic control, and prosthetic control. Evaluations demonstrate the superior performance of the proposed CNN-Transformer hybrid architecture in decoding continuous fine finger motions, outperforming individual CNN and Transformer models. The synergistic integration of CNNs and Transformers presents a powerful framework for sEMG decoding, offering exciting opportunities for naturalistic and intuitive human-machine interaction applications. Its robustness and efficiency make it an ideal choice for real-world applications, promising to enhance the interface between humans and machines significantly. The implications of this research extend to advancing the understanding of human neuromuscular signals and their application in computing interfaces.

2024-06-13

2024 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA) (published)

doi.org

MiNT: Multi-Network Training for Transfer Learning on Temporal Graphs

Razieh Shirzadkhani

Tran Gia Bao Ngo

Kiarash Shamsi

Poupak Azad

Baris Coskunuzer

Cuneyt Gurcan Akcora

2024-06-13

ArXiv (preprint)

doi.org

openreview.net

Phoneme Discretized Saliency Maps for Explainable Detection of AI-Generated Voice

Shubham Gupta

Mirco Ravanelli

Pascal Germain

Cem Subakan

In this paper, we propose Phoneme Discretized Saliency Maps (PDSM), a discretization algorithm for saliency maps that takes advantage of phoneme boundaries for explainable detection of AI-generated voice. We experimentally show with two different Text-to-Speech systems (i.e., Tacotron2 and Fastspeech2) that the proposed algorithm produces saliency maps that result in more faithful explanations compared to standard posthoc explanation methods. Moreover, by associating the saliency maps to the phoneme representations, this methodology generates explanations that tend to be more understandable than standard saliency maps on magnitude spectrograms.

2024-06-13

ArXiv (preprint)

doi.org

arxiv.org

Tracing the Ransomware Bloodline: Investigation and Detection of Drifting Virlock Variants

Salwa Razaulla

Claude Fachkha

Amjad Gawanmeh

Christine Markarian

Benjamin C. M. Fung

Chadi Assi

Malware, especially ransomware, has dramatically increased in volume and sophistication in recent years. The growing complexity and destructive potential of ransomware demand effective countermeasures. Despite tremendous efforts by the security community to document these threats, reliance on manual analysis makes it challenging to discern unique malware variants from polymorphic variants. Moreover, the easy accessibility of source code of prominent ransomware families in public domains has led to the rise of numerous variants, complicating manual detection and hindering the identification of phylogenetic relationships. This paper introduces a novel approach that narrows the focus to analyze one such prominent ransomware family, Virlock. Using binary code similarity, we systematically reconstruct the lineage of Virlock, tracing its relationships, evolution, and variants. Employing this technique on a dataset of over 1000 Virlock samples submitted to VirusTotal and VirusShare, our analysis unveils intricate relationships within the Virlock ransomware family, offering valuable insights into the tangled relationships of this ransomware.

2024-06-13

International Conference on Computational Collective Intelligence (published)

doi.org

Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition

Eleni Triantafillou

Peter Kairouz

Fabian Pedregosa

Jamie Hayes

Meghdad Kurmanji

Kairan Zhao

Vincent Dumoulin

Julio C. S. Jacques Junior

Ioannis Mitliagkas

Jun Wan

Lisheng Sun-Hosoya

Sergio Escalera

Gintare Karolina Dziugaite

Peter Triantafillou

Isabelle Guyon

We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In this paper, we analyze top solutions and delve into discussions on benchmarking unlearning, which itself is a research problem. The evaluation methodology we developed for the competition measures forgetting quality according to a formal notion of unlearning, while incorporating model utility for a holistic evaluation. We analyze the effectiveness of different instantiations of this evaluation framework vis-a-vis the associated compute cost, and discuss implications for standardizing evaluation. We find that the ranking of leading methods remains stable under several variations of this framework, pointing to avenues for reducing the cost of evaluation. Overall, our findings indicate progress in unlearning, with top-performing competition entries surpassing existing algorithms under our evaluation framework. We analyze trade-offs made by different algorithms and strengths or weaknesses in terms of generalizability to new datasets, paving the way for advancing both benchmarking and algorithm development in this important area.

2024-06-12

ArXiv (preprint)

doi.org

arxiv.org

Exploring validation metrics for offline model-based optimisation

Christopher Beckham

Alexandre Piché

David Vázquez

Christopher Pal

In offline model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of desirability through an expensive but real-world scoring process. Offline MBO tries to approximate this expensive scoring function and use that to evaluate generated designs; however, evaluation is non-exact because one approximation is being evaluated with another. Instead, we ask ourselves: if we did have the real-world scoring function at hand, what cheap-to-compute validation metrics would correlate best with this? Since the real-world scoring function is available for simulated MBO datasets, insights obtained from this can be transferred over to real-world offline MBO tasks where the real-world scoring function is expensive to compute. To address this, we propose a conceptual evaluation framework that is amenable to measuring extrapolation, and apply this to conditional denoising diffusion models. Empirically, we find that two validation metrics – agreement and Fréchet distance – correlate quite well with the ground truth. When there is high variability in conditional generation, feedback is required in the form of an approximated version of the real-world scoring function. Furthermore, we find that generating high-scoring samples may require heavily weighting the generative model in favour of sample quality, potentially at the cost of sample diversity.

2024-06-12

TMLR (accepted)

openreview.net

Primary care physicians' perceptions of artificial intelligence systems in the care of adolescents' mental health

Pooria Ghadiri

Mark J. Yaffe

Alayne Mary Adams

Samira Abbasgholizadeh-Rahimi

Given that mental health problems in adolescence may have lifelong impacts, the role of primary care physicians (PCPs) in identifying and managing these issues is important. Artificial Intelligence (AI) may offer solutions to the current challenges involved in mental health care. We therefore explored PCPs’ challenges in addressing adolescents’ mental health, along with their attitudes towards using AI to assist them in their tasks. We used purposeful sampling to recruit PCPs for a virtual Focus Group (FG). The virtual FG lasted 75 minutes and was moderated by two facilitators. A live transcription was produced by online meeting software. Transcribed data was cleaned, followed by a priori and inductive coding and thematic analysis. We reached out to 35 potential participants via email. Seven agreed to participate, and ultimately four took part in the FG. PCPs perceived that AI systems have the potential to be cost-effective, relatively credible, and useful in collecting large amounts of patients’ data. They envisioned AI assisting with tasks such as diagnoses and establishing treatment plans. However, they feared that reliance on AI might result in a loss of clinical competency. PCPs wanted AI systems to be user-friendly, and they were willing to assist in achieving this goal if it was within their scope of practice and they were compensated for their contribution. They stressed a need for regulatory bodies to deal with medicolegal and ethical aspects of AI and clear guidelines to reduce or eliminate the potential of patient harm. This study provides the groundwork for assessing PCPs’ perceptions of AI systems’ features and characteristics, potential applications, possible negative aspects, and requirements for using them. A future study of adolescents’ perspectives on integrating AI into mental healthcare might contribute a fuller understanding of the potential of AI for this population.
The online version contains supplementary material available at 10.1186/s12875-024-02417-1.

2024-06-12

BMC Primary Care (published)

doi.org

Turns Out I'm Not Real: Towards Robust Detection of AI-Generated Videos

Qingyuan Liu

Pengyuan Shi

Yun-Yun Tsai

Chengzhi Mao

Junfeng Yang

2024-06-12

ArXiv (preprint)

doi.org

arxiv.org

Dynamics of scientific research on mirror neurons

H. Peyre

A. Yailian

G. Dumas

C. Gauld

2024-06-11

Neuropsychiatrie de l'Enfance et de l'Adolescence (unknown)

doi.org

PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4

Seif Abukhalaf

Mohammad Hamdaqa

Foutse Khomh

The rapid progress of AI-powered programming assistants, such as GitHub Copilot, has facilitated the development of software applications. These assistants rely on large language models (LLMs), which are foundation models (FMs) that support a wide range of tasks related to understanding and generating language. LLMs have demonstrated their ability to express UML model specifications using formal languages like the Object Constraint Language (OCL). However, the context size of the prompt is limited by the number of tokens an LLM can process. This limitation becomes significant as the size of UML class models increases. In this study, we introduce PathOCL, a novel path-based prompt augmentation technique designed to facilitate OCL generation. PathOCL addresses the limitations of LLMs, specifically their token processing limit and the challenges posed by large UML class models. PathOCL is based on the concept of chunking, which selectively augments the prompts with a subset of UML classes relevant to the English specification. Our findings demonstrate that PathOCL, compared to augmenting the complete UML class model (UML-Augmentation), generates a higher number of valid and correct OCL constraints using the GPT-4 model. Moreover, the average prompt size crafted using PathOCL significantly decreases when scaling the size of the UML class models.

2024-06-11

Proceedings of the 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (published)

doi.org

arxiv.org

Data Cleaning and Machine Learning: A Systematic Literature Review

Pierre-Olivier Côté

Amin Nikanjam

Nafisa Ahmed

Dmytro Humeniuk

Foutse Khomh

Context: Machine Learning (ML) is integrated into a growing number of systems for various applications. Because the performance of an ML model is highly dependent on the quality of the data it has been trained on, there is a growing interest in approaches to detect and repair data errors (i.e., data cleaning). Researchers are also exploring how ML can be used for data cleaning; hence creating a dual relationship between ML and data cleaning. To the best of our knowledge, there is no study that comprehensively reviews this relationship. Objective: This paper's objectives are twofold. First, it aims to summarize the latest approaches for data cleaning for ML and ML for data cleaning. Second, it provides future work recommendations. Method: We conduct a systematic literature review of the papers published between 2016 and 2022 inclusively. We identify different types of data cleaning activities with and for ML: feature cleaning, label cleaning, entity matching, outlier detection, imputation, and holistic data cleaning. Results: We summarize the content of 101 papers covering various data cleaning activities and provide 24 future work recommendations. Our review highlights many promising data cleaning techniques that can be further extended. Conclusion: We believe that our review of the literature will help the community develop better approaches to clean data.

2024-06-10

Automated Software Engineering (published)

doi.org

arxiv.org

GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews

Maxime Darrin

Ines Arous

Pablo Piantanida

Jackie CK Cheung

Scientific peer review is essential for the quality of academic publications. However, the increasing number of paper submissions to conferences has strained the reviewing process. This surge poses a burden on area chairs who have to carefully read an ever-growing volume of reviews and discern each reviewer's main arguments as part of their decision process. In this paper, we introduce GLIMPSE, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews. Unlike traditional consensus-based methods, GLIMPSE extracts both common and unique opinions from the reviews. We introduce novel uniqueness scores based on the Rational Speech Act framework to identify relevant sentences in the reviews. Our method aims to provide a pragmatic glimpse into all reviews, offering a balanced perspective on their opinions. Our experimental results with both automatic metrics and human evaluation show that GLIMPSE generates more discriminative summaries than baseline methods in terms of human evaluation while achieving comparable performance with these methods in terms of automatic metrics.

2024-06-10

ArXiv (preprint)

doi.org

arxiv.org
