Publications

Decoding face recognition abilities in the human brain

Simon Faghel-Soubeyrand

Meike Ramon

Eva Bamps

Matteo Zoia

Jessica Woodhams

Anne-Raphaelle Richoz

Roberto Caldara

Frédéric Gosselin

Ian Charest

Why are some individuals better at recognising faces? Uncovering the neural mechanisms supporting face recognition ability has proven elusive. To tackle this challenge, we used a multi-modal data-driven approach combining neuroimaging, computational modelling, and behavioural tests. We recorded the high-density electroencephalographic brain activity of individuals with extraordinary face recognition abilities (super-recognisers) and typical recognisers in response to diverse visual stimuli. Using multivariate pattern analyses, we decoded face recognition abilities from 1 second of brain activity with up to 80% accuracy. To better understand the mechanisms subtending this decoding, we compared computations in the brains of our participants with those in artificial neural network models of vision and semantics, as well as with those involved in human judgments of shape and meaning similarity. Compared to typical recognisers, we found stronger associations between early brain computations of super-recognisers and mid-level computations of vision models as well as shape similarity judgments. Moreover, we found stronger associations between late brain representations of super-recognisers and computations of the artificial semantic model as well as meaning similarity judgments. Overall, these results indicate that important individual variations in brain processing, including neural computations extending beyond purely visual processes, support differences in face recognition abilities. They provide the first empirical evidence for an association between semantic computations and face recognition abilities. We believe that such multi-modal data-driven approaches will likely play a critical role in further revealing the complex nature of idiosyncratic face recognition in the human brain.

The ability to robustly recognise faces is crucial to our success as social beings. Yet, we still know little about the brain mechanisms allowing some individuals to excel at face recognition. This study builds on a sizeable neural dataset measuring the brain activity of individuals with extraordinary face recognition abilities (super-recognisers) to tackle this challenge. Using state-of-the-art computational methods, we show robust prediction of face recognition abilities in single individuals from a mere second of brain activity, and reveal specific brain computations supporting individual differences in face recognition ability. In doing so, we provide direct empirical evidence for an association between semantic computations and face recognition abilities in the human brain, a key component of prominent face recognition models.

2024-02-29

PNAS Nexus (published)

doi.org

Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence

Marcel Hussing

Claas Voelcker

Igor Gilitschenski

Amir-massoud Farahmand

Eric R. Eaton

We show that deep reinforcement learning can maintain its ability to learn without resetting network parameters in settings where the number of gradient updates greatly exceeds the number of environment samples. Under such large update-to-data ratios, a recent study by Nikishin et al. (2022) suggested the emergence of a primacy bias, in which agents overfit early interactions and downplay later experience, impairing their ability to learn. In this work, we dissect the phenomena underlying the primacy bias. We inspect the early stages of training that ought to cause the failure to learn and find that a fundamental challenge is a long-standing acquaintance: value overestimation. Overinflated Q-values are found not only on out-of-distribution but also in-distribution data and can be traced to unseen action prediction propelled by optimizer momentum. We employ a simple unit-ball normalization that enables learning under large update ratios, show its efficacy on the widely used dm_control suite, and obtain strong performance on the challenging dog tasks, competitive with model-based approaches. Our results question, in parts, the prior explanation for sub-optimal learning due to overfitting on early data.
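As a rough illustration of why a unit-norm constraint can limit value overestimation, the sketch below rescales a feature vector to unit L2 norm before a linear Q-value head. This is one plausible reading of the "unit-ball normalization" mentioned in the abstract, not the authors' implementation; the function names `unit_normalize` and `q_value`, the example weights, and the placement of the normalization are all assumptions.

```python
import math

def unit_normalize(features, eps=1e-8):
    """Rescale a feature vector to (approximately) unit L2 norm.

    eps guards against division by zero for an all-zero vector.
    """
    norm = math.sqrt(sum(f * f for f in features))
    return [f / (norm + eps) for f in features]

def q_value(features, weights, bias=0.0):
    """Hypothetical linear Q-value head applied to normalized features."""
    normalized = unit_normalize(features)
    return sum(w * f for w, f in zip(weights, normalized)) + bias

# Illustrative weights only; they do not come from the paper.
weights = [0.5, -1.0, 2.0]

# Two feature vectors pointing in the same direction at very different scales.
small = q_value([1.0, 2.0, 3.0], weights)
large = q_value([1000.0, 2000.0, 3000.0], weights)

# With unit-norm features and zero bias, |Q| cannot exceed the weight norm.
weight_norm = math.sqrt(sum(w * w for w in weights))
```

Under these assumptions the feature scale no longer affects the prediction, so `small` and `large` agree, and the magnitude of the output is bounded by `weight_norm`.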

2024-02-29

arXiv (published)

doi.org

FedSwarm: An Adaptive Federated Learning Framework for Scalable AIoT

Haizhou Du

Chengdong Ni

Chaoqian Cheng

Qiao Xiang

X. T. Chen

Xue Liu

Federated learning (FL) is a key solution for the data-driven Artificial Intelligence of Things (AIoT). Although much progress has been made, scalability remains a core challenge for real-world FL deployments. Existing solutions either suffer from accuracy loss or do not fully address the connectivity dynamicity of FL systems. In this article, we tackle the scalability issue with a novel, adaptive FL framework called FedSwarm, which improves system scalability for AIoT by deploying multiple collaborative edge servers. FedSwarm has two novel features: 1) adaptiveness on the number of local updates and 2) dynamicity of the synchronization between edge devices and edge servers. We formulate FedSwarm as a local update adaptation and per-device dynamic server selection problem and prove FedSwarm's convergence bound. We further design a control mechanism consisting of a learning-based algorithm for collaboratively providing local update adaptation on the servers' side and a bonus-based strategy for spurring dynamic per-device server selection on the devices' side. Our extensive evaluation shows that FedSwarm significantly outperforms other studies with better scalability, lower energy consumption, and higher model accuracy.

2024-02-29

IEEE Internet of Things Journal (published)

doi.org

Gaussian-process-based Bayesian optimization for neurostimulation interventions in rats

Leo Choiniere

Rose Guay-Hottin

Rémi Picard

Guillaume Lajoie

Marco Bonizzato

Numa Dancause

2024-02-29

STAR Protocols (published)

doi.org

Generalization of deep learning models for hepatic steatosis grading using B-mode ultrasound images

Pedro Vianna

Yue Qi

Michael Chassé

Guy Wolf

Eugene Belilovsky

An Tang

Guy Cloutier

Grayscale ultrasound remains a key modality for screening of hepatic steatosis due to its non-invasiveness and availability. While neural networks have shown promise in this field, their main drawback lies in their inability to generalize to diverse real-world settings. Variations in equipment, acquisition parameters, or population significantly affect model performance. Test-time adaptation, an unsupervised domain adaptation technique, overcomes these limitations by adjusting trained models during inference. Our retrospective study used two datasets collected in separate populations, with different scanners and protocols. We propose an adaptation method, using test-time batch normalization to selectively adjust BatchNorm layers based on test data for predicting steatosis grades. Comparing the non-adapted and adapted models, the mean absolute error (± standard deviation) in grading four severities of steatosis decreased from 0.92 ± 0.21 to 0.64 ± 0.22. Specifically, for detection of steatosis the area under the curve increased from 0.76 ± 0.05 to 0.95 ± 0.02 when using the adapted model. Adapted models show promising results in improving performance compared to base models when testing data differ significantly from training data. Results suggest that the proposed method effectively addresses domain shift in diagnosing fatty liver using ultrasound images, reducing risks associated with deploying trained models.
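The general idea behind test-time batch normalization can be sketched in a few lines: instead of normalizing with statistics stored during training, the layer recomputes the mean and variance from the test batch itself. The snippet below is a minimal single-feature illustration only; the stored statistics and activation values are made up, and it does not reproduce the selective layer adjustment described in the abstract.

```python
import statistics

def batch_norm(values, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard batch-norm transform for one feature channel."""
    return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in values]

# Hypothetical running statistics stored at training time.
train_mean, train_var = 0.0, 1.0

# Hypothetical activations from a shifted test domain (e.g. another scanner).
test_batch = [4.0, 5.0, 6.0, 7.0]

# Non-adapted model: normalize with the stale training statistics.
frozen = batch_norm(test_batch, train_mean, train_var)

# Test-time adaptation: re-estimate the statistics from the test batch.
test_mean = statistics.fmean(test_batch)
test_var = statistics.pvariance(test_batch)
adapted = batch_norm(test_batch, test_mean, test_var)
```

In this toy case the `frozen` outputs stay far from zero mean, while the `adapted` outputs are re-centred and re-scaled, which is the effect that downstream layers rely on.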

2024-02-29

The Journal of the Acoustical Society of America (published)

doi.org

Heterogeneous ensemble prediction model of CO emission concentration in municipal solid waste incineration process using virtual data and real data hybrid-driven

Runyu Zhang

Jian Tang

Heng Xia

Jiakun Chen

Wen Yu

JunFei Qiao

2024-02-29

Journal of Cleaner Production (published)

doi.org

Implications of conscious AI in primary healthcare

Dorsai Ranjbari

Samira Abbasgholizadeh Rahimi

The conversation about consciousness of artificial intelligence (AI) has been ongoing since the 1950s. Despite the numerous applications of AI identified in healthcare and primary healthcare, little is known about how a conscious AI would reshape its use in this domain. While there is a wide range of ideas as to whether AI can or cannot possess consciousness, a prevailing theme in all arguments is uncertainty. Given this uncertainty and the high stakes associated with the use of AI in primary healthcare, it is imperative to be prepared for all scenarios including conscious AI systems being used for medical diagnosis, shared decision-making and resource management in the future. This commentary serves as an overview of some of the pertinent evidence supporting the use of AI in primary healthcare and proposes ideas as to how consciousness of AI can support or further complicate these applications. Given the scarcity of evidence on the association between consciousness of AI and its current state of use in primary healthcare, our commentary identifies some directions for future research in this area including assessing patients’, healthcare workers’ and policy-makers’ attitudes towards consciousness of AI systems in primary healthcare settings.

2024-02-29

Family Medicine and Community Health (published)

doi.org

Iterative Graph Self-Distillation

Hanlin Zhang

Shuai Lin

Weiyang Liu

Pan Zhou

Jian Tang

Xiaodan Liang

Eric P. Xing

Recently, there has been increasing interest in the challenge of how to discriminatively vectorize graphs. To address this, we propose a method called Iterative Graph Self-Distillation (IGSD) which learns graph-level representation in an unsupervised manner through instance discrimination using a self-supervised contrastive learning approach. IGSD involves a teacher-student distillation process that uses graph diffusion augmentations and constructs the teacher model using an exponential moving average of the student model. The intuition behind IGSD is to predict the teacher network representation of the graph pairs under different augmented views. As a natural extension, we also apply IGSD to semi-supervised scenarios by jointly regularizing the network with both supervised and self-supervised contrastive loss. Finally, we show that fine-tuning the IGSD-trained models with self-training can further improve graph representation learning. Empirically, we achieve significant and consistent performance gain on various graph datasets in both unsupervised and semi-supervised settings, which well validates the superiority of IGSD.
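The exponential-moving-average construction of the teacher mentioned above can be illustrated with a small sketch. Parameters are represented as flat lists of floats for simplicity; the function name `ema_update`, the decay value, and the parameter values are assumptions for illustration, not details taken from the paper.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher parameter a small step toward the student's value.

    teacher <- decay * teacher + (1 - decay) * student
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

# Hypothetical parameters; in practice these would be network weights.
teacher = [0.0, 0.0]
student = [1.0, -2.0]

# The student is held fixed here to make the teacher's trajectory easy to follow.
for _ in range(3):
    teacher = ema_update(teacher, student, decay=0.5)
```

With a fixed student and a decay of 0.5, the teacher closes half of the remaining gap at every step; in training, the student changes between steps, so the teacher tracks a smoothed version of it.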

2024-02-29

IEEE Transactions on Knowledge and Data Engineering (published)

doi.org

openreview.net

Neural network prediction of the effect of thermomechanical controlled processing on mechanical properties

Sushant Sinha

Denzel Guye

Xiaoping Ma

Kashif Rehman

Stephen Yue

Narges Armanfard

2024-02-29

Machine Learning with Applications (published)

doi.org

Novel community data in ecology - properties and prospects

Florian Hartig

Nerea Abrego

Alex Bush

Jonathan M. Chase

G. Guillera‐Arroita

M. Leibold

Otso T. Ovaskainen

Loïc Pellissier

Maximilian Pichler

Giovanni Poggiato

Laura J. Pollock

Sara Si-moussi

Wilfried Thuiller

Duarte S Viana

D. Warton

Damaris Zurell

Douglas W. Yu

2024-02-29

Trends in Ecology & Evolution (published)

doi.org

arxiv.org

Reply to: Model uncertainty obscures major driver of soil carbon

Feng Tao

Benjamin Z. Houlton

Serita D. Frey

Johannes Lehmann

Stefano Manzoni

Yuanyuan Huang

Lifen Jiang

Umakant Mishra

Bruce A. Hungate

Michael W. I. Schmidt

Markus Reichstein

Nuno Carvalhais

Philippe Ciais

Ying-Ping Wang

Bernhard Ahrens

Gustaf Hugelius

Toby Dylan Hocking

Xingjie Lu

Zheng Shi

Kostiantyn Viatkin

K. Viatkin

Ronald Vargas

Yusuf Yigini

Christian Omuto

Ashish A. Malik

Guillermo Peralta

Rosa Cuevas-Corona

Luciano E. Di Paolo

Isabel Luotto

Cuijuan Liao

Yi-Shuang Liang

Yixin Liang

Vinisa S. Saynes

Xiaomeng Huang

Yiqi Luo

2024-02-29

Nature (published)

doi.org

Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models

Amal Rannen-Triki

Jörg Bornschein

Razvan Pascanu

Marcus Hutter

András György

Alexandre Galashov

Yee Whye Teh

Michalis K. Titsias

We consider the problem of online fine tuning the parameters of a language model at test time, also known as dynamic evaluation. While it is generally known that this approach improves the overall predictive performance, especially when considering distributional shift between training and evaluation data, we here emphasize the perspective that online adaptation turns parameters into temporally changing states and provides a form of context-length extension with memory in weights, more in line with the concept of memory in neuroscience. We pay particular attention to the speed of adaptation (in terms of sample efficiency), sensitivity to the overall distributional drift, and the computational overhead for performing gradient computations and parameter updates. Our empirical study provides insights on when online adaptation is particularly interesting. We highlight that with online adaptation the conceptual distinction between in-context learning and fine tuning blurs: both are methods to condition the model on previously observed tokens.
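The predict-then-update loop behind dynamic evaluation can be sketched with a toy model. Here a context-free categorical model over a two-token vocabulary stands in for a language model: each token is scored first, and only then used for one gradient step on its log loss. The model, the learning rate, and the token stream are all illustrative assumptions and are unrelated to the experiments in the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_eval(tokens, vocab_size, lr):
    """Average log loss when the model is updated online after each token.

    lr = 0.0 recovers static evaluation (no test-time adaptation).
    """
    logits = [0.0] * vocab_size
    total_loss = 0.0
    for tok in tokens:
        probs = softmax(logits)
        total_loss += -math.log(probs[tok])  # score before adapting
        # Gradient of -log p[tok] with respect to the logits is probs - onehot.
        for i in range(vocab_size):
            grad = probs[i] - (1.0 if i == tok else 0.0)
            logits[i] -= lr * grad
    return total_loss / len(tokens)

# A stream whose distribution shifts halfway through.
stream = [0] * 20 + [1] * 20

static_loss = dynamic_eval(stream, vocab_size=2, lr=0.0)
adaptive_loss = dynamic_eval(stream, vocab_size=2, lr=0.5)
```

In this toy setting the adapted model pays a temporary penalty right after the shift but still ends with a lower average loss than the static one, which mirrors the trade-off between adaptation speed and distributional drift discussed in the abstract.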

2024-02-29

arXiv (published)

doi.org

arxiv.org
