Publications

Spatial Action Unit Cues for Interpretable Deep Facial Expression Recognition

Soufiane Belharbi

Alessandro L. Koerich

Simon Bacon

Eric Granger

Although state-of-the-art classifiers for facial expression recognition (FER) can achieve a high level of accuracy, they lack interpretability, an important feature for end-users. Experts typically associate spatial action units (AUs) from a codebook with facial regions for the visual interpretation of expressions. In this paper, the same expert steps are followed. A new learning strategy is proposed to explicitly incorporate AU cues into classifier training, allowing deep interpretable models to be trained. During training, this AU codebook is used, along with the input image expression label and facial landmarks, to construct an AU heatmap that indicates the most discriminative image regions of interest w.r.t. the facial expression. This valuable spatial cue is leveraged to train a deep interpretable classifier for FER. This is achieved by constraining the spatial layer features of a classifier to be correlated with AU heatmaps. Using a composite loss, the classifier is trained to correctly classify an image while yielding interpretable visual layer-wise attention correlated with AU maps, simulating the expert decision process. Our strategy relies only on image class expression for supervision, without additional manual annotations. Our new strategy is generic and can be applied to any deep CNN- or transformer-based classifier without requiring any architectural change or significant additional training time. Our extensive evaluation on two public benchmarks, RAF-DB and AffectNet, shows that our proposed strategy can improve layer-wise interpretability without degrading classification performance. In addition, we explore a common type of interpretable classifier that relies on class activation mapping (CAM) methods, and show that our approach can also improve CAM interpretability.

2024-10-01

ArXiv (preprint)

doi.org

arxiv.org

Spatial Action Unit Cues for Interpretable Deep Facial Expression Recognition

Soufiane Belharbi

Marco Pedersoli

Alessandro Lameiras Koerich

Simon Bacon

Eric Granger

Although state-of-the-art classifiers for facial expression recognition (FER) can achieve a high level of accuracy, they lack interpretability, an important feature for end-users. Experts typically associate spatial action units (AUs) from a codebook with facial regions for the visual interpretation of expressions. In this paper, the same expert steps are followed. A new learning strategy is proposed to explicitly incorporate AU cues into classifier training, allowing deep interpretable models to be trained. During training, this AU codebook is used, along with the input image expression label and facial landmarks, to construct an AU heatmap that indicates the most discriminative image regions of interest w.r.t. the facial expression. This valuable spatial cue is leveraged to train a deep interpretable classifier for FER. This is achieved by constraining the spatial layer features of a classifier to be correlated with AU heatmaps. Using a composite loss, the classifier is trained to correctly classify an image while yielding interpretable visual layer-wise attention correlated with AU maps, simulating the expert decision process. Our strategy relies only on image class expression for supervision, without additional manual annotations. Our new strategy is generic and can be applied to any deep CNN- or transformer-based classifier without requiring any architectural change or significant additional training time. Our extensive evaluation on two public benchmarks, RAF-DB and AffectNet, shows that our proposed strategy can improve layer-wise interpretability without degrading classification performance. In addition, we explore a common type of interpretable classifier that relies on class activation mapping (CAM) methods, and show that our approach can also improve CAM interpretability.

2024-10-01

arXiv (published)

doi.org

arxiv.org

A Survey of Diversification Techniques in Search and Recommendation

Haolun Wu

Yansen Zhang

Chen Ma

Fuyuan Lyu

Bowei He

Fernando Diaz

Bhaskar Mitra

Xue (Steve) Liu

Diversifying search results is an important research topic in retrieval systems in order to satisfy both the various interests of customers and the equal market exposure of providers. There has been growing attention to diversity-aware research in recent years, accompanied by a proliferation of literature on methods to promote diversity in search and recommendation. However, the diversity-aware studies in retrieval systems lack a systematic organization and are rather fragmented. In this survey, we are the first to propose a unified taxonomy for classifying the metrics and approaches of diversification in both search and recommendation, which are two of the most extensively researched fields of retrieval systems. We begin the survey with a brief discussion of why diversity is important in retrieval systems, followed by a summary of the various diversity concerns in search and recommendation, highlighting their relationship and differences. For the survey’s main body, we present a unified taxonomy of diversification metrics and approaches in retrieval systems, from both the search and recommendation perspectives. In the later part of the survey, we discuss the open research questions of diversity-aware research in search and recommendation in an effort to inspire future innovations and encourage the implementation of diversity in real-world systems.

2024-10-01

IEEE Transactions on Knowledge and Data Engineering (published)

doi.org

arxiv.org

The oneirogen hypothesis: modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms

2024-09-30

bioRxiv (preprint)

doi.org

What Information Contributes to Log-based Anomaly Detection? Insights from a Configurable Transformer-Based Approach

Xingfang Wu

Heng Li

Foutse Khomh

Log data are generated from logging statements in the source code, providing insights into the execution processes of software applications and systems. State-of-the-art log-based anomaly detection approaches typically leverage deep learning models to capture the semantic or sequential information in the log data and detect anomalous runtime behaviors. However, the impacts of these different types of information are not clear. In addition, existing approaches have not captured the timestamps in the log data, which can potentially provide more fine-grained temporal information than sequential information. In this work, we propose a configurable transformer-based anomaly detection model that can capture the semantic, sequential, and temporal information in the log data and allows us to configure the different types of information as the model's features. Additionally, we train and evaluate the proposed model using log sequences of different lengths, thus overcoming the constraint of existing methods that rely on fixed-length or time-windowed log sequences as inputs. With the proposed model, we conduct a series of experiments with different combinations of input features to evaluate the roles of different types of information in anomaly detection. When presented with log sequences of varying lengths, the model can attain competitive and consistently stable performance compared to the baselines. The results indicate that the event occurrence information plays a key role in identifying anomalies, while the impact of the sequential and temporal information is not significant for anomaly detection in the studied public datasets. On the other hand, the findings also reveal the simplicity of the studied public datasets and highlight the importance of constructing new datasets that contain different types of anomalies to better evaluate the performance of anomaly detection models.

2024-09-30

ArXiv (preprint)

doi.org

arxiv.org

Automating MedSAM by Learning Prompts with Weak Few-Shot Supervision

Melanie Gaillochet

Christian Desrosiers

Hervé Lombaert

2024-09-28

Lecture Notes in Computer Science (published)

doi.org

arxiv.org

Latent Representation Learning for Multimodal Brain Activity Translation

Arman Afrasiyabi

Dhananjay Bhaskar

Erica L. Busch

Laurent Caplette

Rahul Singh

Guillaume Lajoie

Nicholas B. Turk-Browne

Smita Krishnaswamy

Neuroscience employs diverse neuroimaging techniques, each offering distinct insights into brain activity, from electrophysiological recordings such as EEG, which have high temporal resolution, to hemodynamic modalities such as fMRI, which have increased spatial precision. However, integrating these heterogeneous data sources remains a challenge, which limits a comprehensive understanding of brain function. We present the Spatiotemporal Alignment of Multimodal Brain Activity (SAMBA) framework, which bridges the spatial and temporal resolution gaps across modalities by learning a unified latent space free of modality-specific biases. SAMBA introduces a novel attention-based wavelet decomposition for spectral filtering of electrophysiological recordings, graph attention networks to model functional connectivity between functional brain units, and recurrent layers to capture temporal autocorrelations in brain signals. We show that the training of SAMBA, aside from achieving translation, also learns a rich representation of brain information processing. We showcase this by classifying external stimuli driving brain activity from the representation learned in hidden layers of SAMBA, paving the way for broad downstream applications in neuroscience research and clinical contexts.

2024-09-27

ArXiv (preprint)

doi.org

arxiv.org

Longitudinal bi-criteria framework for assessing national healthcare responses to pandemic outbreaks

Adel Guitouni

Nabil Belacel

Loubna Benabbou

Belaid Moa

Munire Erman

Halim Abdul

Pandemics like COVID-19 have illuminated the significant disparities in the performance of national healthcare systems (NHCSs) during rapidly evolving crises. The challenge of comparing NHCS performance has been a difficult topic in the literature. To address this gap, our study introduces a bi-criteria longitudinal algorithm that merges fuzzy clustering with Data Envelopment Analysis (DEA). This new approach provides a comprehensive and dynamic assessment of NHCS performance and efficiency during the early phase of the pandemic. By categorizing each NHCS as an efficient performer, inefficient performer, efficient underperformer, or inefficient underperformer, our analysis vividly represents performance dynamics, clearly identifying the top and bottom performers within each cluster of countries. Our methodology offers valuable insights for performance evaluation and benchmarking, with significant implications for enhancing pandemic response strategies. The study’s findings are discussed from theoretical and practical perspectives, offering guidance for future health system assessments and policy-making.

2024-09-27

Scientific Reports (published)

doi.org

CALE: Continuous Arcade Learning Environment

Jesse Farebrother

Pablo Samuel Castro

We introduce the Continuous Arcade Learning Environment (CALE), an extension of the well-known Arcade Learning Environment (ALE) [Bellemare et al., 2013]. The CALE uses the same underlying emulator of the Atari 2600 gaming system (Stella), but adds support for continuous actions. This enables the benchmarking and evaluation of continuous-control agents (such as PPO [Schulman et al., 2017] and SAC [Haarnoja et al., 2018]) and value-based agents (such as DQN [Mnih et al., 2015] and Rainbow [Hessel et al., 2018]) on the same environment suite. We provide a series of open questions and research directions that CALE enables, as well as initial baseline results using Soft Actor-Critic. CALE is available as part of the ALE at https://github.com/Farama-Foundation/Arcade-Learning-Environment.

2024-09-26

NeurIPS.cc/2024/Datasets_and_Benchmarks_Track (poster)

doi.org

openreview.net

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

David LE MEUR

David Orlando Romero Mogrovejo

Chenyang Lyu

Haryo Akbarianto Wibowo

Teresa Lynn

Injy Hamed

Aditya Nanda Kishore Khandavally

Aishik Mandal

Alina Dragonetti

Artem Abzaliev

Atnafu Lambebo Tonja

Bontu Fufa Balcha

Chenxi Whitehouse

Christian Salamea-Palacios

Dan John Velasco

David Ifeoluwa Adelani

D. Meur

Emilio Villa Cueva

Fajri Koto

Fauzan Farooqui

Frederico Belcavello

Ganzorig Batnasan

Gisela Vallejo

Gráinne Caulfield

Guido Ivetta

Haiyue Song

Henok Biadglign Ademtew

Hernán Maina

Holy Lovenia

Israel Abebe Azime

Jan Christian Blaise Cruz

Jay Gala

Jiahui Geng

Jesus-German Ortiz-Barajas

Jinheon Baek

Jocelyn Dunstan

Laura Alonso Alemany

Teresa Clifford

Kumaranage Ravindu Yasas Nagasinghe

Luciana Benotti

Luis Fernando D'Haro

Marcelo Viridiano

Marcos Estecha-Garitagoitia

Maria Camila Buitrago Cabrera

Mario Rodríguez-Cantelar

Mélanie Jouitteau

Mihail Minkov Mihaylov

Mohamed Fazli Mohamed Imam

Muhammad Farid Adilazuarda

Munkhjargal Gochoo

Munkh-Erdene Otgonbold

Naome Etori

Olivier NIYOMUGISHA

Paula Mónica Silva

Pranjal A Chitale

Raj Dabre

Rendi Chevi

Ruochen Zhang

Ryandito Diandaru

Samuel Cahyawijaya

Santiago Góngora

Soyeong Jeong

Sukannya Purkayastha

Tatsuki Kuribayashi

Thanmay Jayakumar

Tiago Timponi Torrent

Toqeer Ehsan

Vladimir Araujo

Yova Kementchedjhieva

Zara Burzo

Zheng Wei Lim

Zheng Xin Yong

Oana Ignat

Joan Nwatu

Rada Mihalcea

Thamar Solorio

Alham Fikri Aji

2024-09-26

NeurIPS.cc/2024/Datasets_and_Benchmarks_Track (oral)

doi.org

openreview.net

Development of AI-assisted microscopy frameworks through realistic simulation in pySTED

Anthony Bilodeau

Albert Michaud-Gagnon

Julia Chabbert

Benoit Turcotte

Jörn Heine

Audrey Durand

Flavie Lavoie-Cardinal

The integration of artificial intelligence into microscopy systems significantly enhances performance, optimizing both the image acquisition and analysis phases. Development of artificial intelligence (AI)-assisted super-resolution microscopy is often limited by access to large biological datasets, as well as by the difficulty of benchmarking and comparing approaches on heterogeneous samples. We demonstrate the benefits of a realistic STED simulation platform, pySTED, for the development and deployment of AI strategies for super-resolution microscopy. The simulation environment provided by pySTED allows the augmentation of data for the training of deep neural networks, the development of online optimization strategies, and the training of reinforcement learning models that can be deployed successfully on a real microscope.

2024-09-26

Nature Machine Intelligence (published)

doi.org
