Publications

The oneirogen hypothesis: modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms

Classical psychedelics induce complex visual hallucinations in humans, generating percepts that are coherent at a low level, but which have surreal, dream-like qualities at a high level. While there are many hypotheses as to how classical psychedelics could induce these effects, there are no concrete mechanistic models that capture the variety of observed effects in humans, while remaining consistent with the known pharmacological effects of classical psychedelics on neural circuits. In this work, we propose the “oneirogen hypothesis”, which posits that the perceptual effects of classical psychedelics are a result of their pharmacological actions inducing neural activity states that truly are more similar to dream-like states. We simulate classical psychedelics’ effects by manipulating neural network models trained on perceptual tasks with the Wake-Sleep algorithm. This established machine learning algorithm leverages two activity phases, a perceptual phase (wake) where sensory inputs are encoded, and a generative phase (dream) where the network internally generates activity consistent with stimulus-evoked responses. We simulate the action of psychedelics by partially shifting the model to the ‘Sleep’ state, which entails a greater influence of top-down connections, in line with the impact of psychedelics on apical dendrites. The effects resulting from this manipulation capture a number of experimentally observed phenomena including the emergence of hallucinations, increases in stimulus-conditioned variability, and large increases in synaptic plasticity. We further provide a number of testable predictions which could be used to validate or invalidate our oneirogen hypothesis.

2025-01-13

bioRxiv (preprint)

AFRIDOC-MT: Document-level MT Corpus for African Languages

Jesujoba Oluwadara Alabi

Israel Abebe Azime

Miaoran Zhang

Cristina España-Bonet

Rachel Bawden

Dawei Zhu

Clement Odoje

Idris Akinade

Iffat Maab

Davis David

Shamsuddeen Hassan Muhammad

Neo Putini

David O. Ademuyiwa

Andrew Caines

Dietrich Klakow

This paper introduces AFRIDOC-MT, a document-level multi-parallel translation dataset covering English and five African languages: Amharic, Hausa, Swahili, Yorùbá, and Zulu. The dataset comprises 334 health and 271 information technology news documents, all human-translated from English to these languages. We conduct document-level translation benchmark experiments by evaluating neural machine translation (NMT) models and large language models (LLMs) for translations between English and these languages, at both the sentence and pseudo-document levels. These outputs are realigned to form complete documents for evaluation. Our results indicate that NLLB-200 achieved the best average performance among the standard NMT models, while GPT-4o outperformed general-purpose LLMs. Fine-tuning selected models led to substantial performance gains, but models trained on sentences struggled to generalize effectively to longer documents. Furthermore, our analysis reveals that some LLMs exhibit issues such as under-generation, repetition of words or phrases, and off-target translations, especially for African languages.

2025-01-10

arXiv (preprint)

EPISeg: Automated segmentation of the spinal cord on echo planar images using open-access multi-center data

Rohan Banerjee

Merve Kaptan

Alexandra Tinnermann

Ali Khatibi

Alice Dabbagh

Christian W. Kündig

Csw Law

Dario Pfyffer

David J. Lythgoe

Dimitra Tsivaka

Dimitri Van De Ville

Falk Eippert

Fauziyya Muhammad

Gary H. Glover

Gergely David

Grace Haynes

Jan Haaker

Jonathan C. W. Brooks

Jürgen Finsterbusch

Katherine T. Martucci

Kimberly J. Hemmerling

Mahdi Mobarak-Abadi

Mark A. Hoggarth

Matthew A. Howard

Molly G. Bright

Nawal Kinany

O. Kowalczyk

Patrick Freund

Robert L. Barry

Sean Mackey

Shahabeddin Vahdat

Simon Schading

Stephen B McMahon

Todd Parish

Véronique Marchand-Pauvert

Yufen Chen

Zachary A. Smith

KA Weber

Benjamin De Leener

Julien Cohen-Adad

Functional magnetic resonance imaging (fMRI) of the spinal cord is relevant for studying sensation, movement, and autonomic function. Preprocessing of spinal cord fMRI data involves segmentation of the spinal cord on gradient-echo echo planar imaging (EPI) images. Current automated segmentation methods do not work well on these data, due to the low spatial resolution, susceptibility artifacts causing distortions and signal drop-out, ghosting, and motion-related artifacts. Consequently, this segmentation task demands a considerable amount of manual effort which takes time and is prone to user bias. In this work, we (i) gathered a multi-center dataset of spinal cord gradient-echo EPI with ground-truth segmentations and shared it on OpenNeuro https://openneuro.org/datasets/ds005143/versions/1.3.0, and (ii) developed a deep learning-based model, EPISeg, for the automatic segmentation of the spinal cord on gradient-echo EPI data. We observe a significant improvement in terms of segmentation quality compared to other available spinal cord segmentation models. Our model is resilient to different acquisition protocols as well as commonly observed artifacts in fMRI data. The training code is available at https://github.com/sct-pipeline/fmri-segmentation/, and the model has been integrated into the Spinal Cord Toolbox as a command-line tool.

2025-01-10

bioRxiv (preprint)

Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding

Fabian David Schmidt

Ivan Vulić

Goran Glavaš

2025-01-10

arXiv (preprint)

Open Problems in Machine Unlearning for AI Safety

Fazl Barez

Tingchen Fu

Ameya Prabhu

Stephen Casper

Amartya Sanyal

Adel Bibi

Aidan O'Gara

Robert Kirk

Benjamin Bucknall

Tim Fist

Luke Ong

Philip H. S. Torr

Kwok-Yan Lam

Robert F. Trager

David Scott Krueger

Sören Mindermann

Jose Hernandez-Orallo

Mor Geva

Yarin Gal

As AI systems become more capable, widely deployed, and increasingly autonomous in critical areas such as cybersecurity, biological research, and healthcare, ensuring their safety and alignment with human values is paramount. Machine unlearning -- the ability to selectively forget or suppress specific types of knowledge -- has shown promise for privacy and data removal tasks, which has been the primary focus of existing research. More recently, its potential application to AI safety has gained attention. In this paper, we identify key limitations that prevent unlearning from serving as a comprehensive solution for AI safety, particularly in managing dual-use knowledge in sensitive domains like cybersecurity and chemical, biological, radiological, and nuclear (CBRN) safety. In these contexts, information can be both beneficial and harmful, and models may combine seemingly harmless information for harmful purposes -- unlearning this information could strongly affect beneficial uses. We provide an overview of inherent constraints and open problems, including the broader side effects of unlearning dangerous knowledge, as well as previously unexplored tensions between unlearning and existing safety mechanisms. Finally, we investigate challenges related to evaluation, robustness, and the preservation of safety features during unlearning. By mapping these limitations and open challenges, we aim to guide future research toward realistic applications of unlearning within a broader AI safety framework, acknowledging its limitations and highlighting areas where alternative approaches may be required.

2025-01-09

arXiv (preprint)

Soup to go: mitigating forgetting during continual learning with model averaging

Anat Kleiman

Gintare Karolina Dziugaite

Jonathan Frankle

Sham M. Kakade

Mansheej Paul

In continual learning, where task data arrives in a sequence, fine-tuning on later tasks will often lead to performance degradation on earlier tasks. This is especially pronounced when these tasks come from diverse domains. In this setting, how can we mitigate catastrophic forgetting of earlier tasks and retain what the model has learned with minimal computational expenses? Inspired by other merging methods, and L2-regression, we propose Sequential Fine-tuning with Averaging (SFA), a method that merges currently training models with earlier checkpoints during the course of training. SOTA approaches typically maintain a data buffer of past tasks or impose a penalty at each gradient step. In contrast, our method achieves comparable results without the need to store past data, or multiple copies of parameters for each gradient step. Furthermore, our method outperforms common merging techniques such as Task Arithmetic, TIES Merging, and WiSE-FT, as well as other penalty methods like L2 and Elastic Weight Consolidation. In turn, our method offers insight into the benefits of merging partially-trained models during training across both image and language domains.

2025-01-09

arXiv (preprint)
