Publications

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Rajiv Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh Bharadwaj Gundavarapu
Alex Lamb
Nan Rosemary Ke
On the benefits of representation regularization in invariance based domain generalization
Changjian Shui
Boyu Wang
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Hugo Laurençon
Lucile Saulnier
Thomas Wang
Christopher Akiki
Albert Villanova del Moral
Teven Le Scao
Leandro Von Werra
Chenghao Mou
Eduardo González Ponferrada
Huu Nguyen
Jörg Frohberg
Mario Šaško
Quentin Lhoest
Angelina McMillan-Major
Gérard Dupont
Stella Biderman
Anna Rogers
Loubna Ben Allal
Francesco De Toni
Giada Pistilli
Olivier Nguyen
Somaieh Nikpoor
Maraim Masoud
Pierre Colombo
Javier de la Rosa
Paulo Villegas
Tristan Thrush
Shayne Longpre
Sebastian Nagel
Leon Weber
Manuel Romero Muñoz
Jian Zhu
Daniel Van Strien
Zaid Alyafeai
Khalid Almubarak
Vu Minh Chien
Itziar Gonzalez-Dios
Aitor Soroa
Kyle Lo
Manan Dey
Pedro Ortiz Suarez
Aaron Gokaslan
Shamik Bose
Long Phan
Hieu Tran
Ian Yu
Suhas Pai
Jenny Chim
Violette Lepercq
Suzana Ilic
Margaret Mitchell
Sasha Luccioni
Yacine Jernite
As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed with the goal of researching and training large language models as a values-driven undertaking, putting issues of ethics, harm, and governance in the foreground. This paper documents the data creation and curation efforts undertaken by BigScience to assemble the Responsible Open-science Open-collaboration Text Sources (ROOTS) corpus, a 1.6TB dataset spanning 59 languages that was used to train the 176-billion-parameter BigScience Large Open-science Open-access Multilingual (BLOOM) language model. We further release a large initial subset of the corpus and analyses thereof, and hope to empower large-scale monolingual and multilingual modeling projects with both the data and the processing tools, as well as stimulate research around this large multilingual corpus.
On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging
Chris Junchi Li
Yaodong Yu
Nicolas Loizou
Yitong Ma
Michael I. Jordan
We study the stochastic bilinear minimax optimization problem, presenting an analysis of the same-sample Stochastic ExtraGradient (SEG) method with constant step size, and presenting variations of the method that yield favorable convergence. In sharp contrast with the basic SEG method, whose last iterate only contracts to a fixed neighborhood of the Nash equilibrium, SEG augmented with iteration averaging provably converges to the Nash equilibrium under the same standard settings, and such a rate is further improved by incorporating a scheduled restarting procedure. In the interpolation setting where noise vanishes at the Nash equilibrium, we achieve an optimal convergence rate up to tight constants. We present numerical experiments that validate our theoretical findings and demonstrate the effectiveness of the SEG method when equipped with iteration averaging and restarting.
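The same-sample SEG update with iteration averaging can be sketched on a toy bilinear game; the matrix A, step size, noise level, and horizon below are illustrative assumptions, not the exact setting analyzed in the paper:

```python
import numpy as np

# Minimal sketch of same-sample Stochastic ExtraGradient (SEG) with
# constant step size and iteration averaging on the bilinear game
#   min_x max_y  x^T A y   (Nash equilibrium at the origin).
rng = np.random.default_rng(0)
d = 5
A = np.diag(np.linspace(1.0, 2.0, d))   # well-conditioned toy game
eta, sigma, T = 0.05, 1.0, 20_000       # step size, gradient noise, iterations

x, y = np.ones(d), np.ones(d)
x_sum, y_sum = np.zeros(d), np.zeros(d)
for _ in range(T):
    # "Same-sample": one noise draw is shared by both half-steps.
    nx = sigma * rng.standard_normal(d)
    ny = sigma * rng.standard_normal(d)
    xh = x - eta * (A @ y + nx)          # extrapolation step
    yh = y + eta * (A.T @ x + ny)
    x = x - eta * (A @ yh + nx)          # update at the extrapolated point
    y = y + eta * (A.T @ xh + ny)
    x_sum += x
    y_sum += y

last = np.linalg.norm(np.concatenate([x, y]))
avg = np.linalg.norm(np.concatenate([x_sum, y_sum])) / T
# The last iterate only reaches a noise-dominated neighborhood of the
# equilibrium, while the averaged iterate gets much closer to it; the
# paper further sharpens this with scheduled restarts of the average.
print(f"last: {last:.3f}  averaged: {avg:.3f}")
```

With noise that does not vanish at the equilibrium, the averaged iterate lands well inside the neighborhood where the last iterate stalls, which is the qualitative behavior the abstract describes.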
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Dieuwke Hupkes
Adina Williams
On the Performance Implications of Deploying IoT Apps as FaaS
M. Aly
Soumaya Yacout
The Secret to Better AI and Better Software (Is Requirements Engineering)
Nelly Bencomo
Rachel Harrison
Hans-Martin Heyn
Tim J Menzies
Recently, practitioners and researchers met to discuss the role of requirements, and AI and SE. We offer here notes on that fascinating discussion. Also, have you considered writing for this column? This “SE for AI” column publishes commentaries on the growing field of SE for AI. Submissions are welcomed and encouraged (1,000–2,400 words, each figure and table counts as 250 words, try to use fewer than 12 references, and keep the discussion practitioner focused). Please submit your ideas to me at timm@ieee.org.—Tim Menzies
There is no fundamental trade-off between prediction accuracy and feature importance reliability
Jianzhong Chen
L.Q.R. Ooi
Jingwei Li
Christopher L. Asplund
Simon B. Eickhoff
Avram J. Holmes
B.T. Thomas Yeo
There is significant interest in using neuroimaging data to predict behavior. The predictive models are often interpreted by the computation of feature importance, which quantifies the predictive relevance of an imaging feature. Tian and Zalesky (2021) suggest that feature importance estimates exhibit low test-retest reliability, pointing to a potential trade-off between prediction accuracy and feature importance reliability. This trade-off is counter-intuitive because both prediction accuracy and test-retest reliability reflect the reliability of brain-behavior relationships across independent samples. Here, we revisit the relationship between prediction accuracy and feature importance reliability in a large well-powered dataset across a wide range of behavioral measures. We demonstrate that, with a sufficient sample size, feature importance (operationalized as Haufe-transformed weights) can achieve fair to excellent test-retest reliability. More specifically, with a sample size of about 2600 participants, Haufe-transformed weights achieve average intra-class correlation coefficients of 0.75, 0.57 and 0.53 for cognitive, personality and mental health measures respectively. Haufe-transformed weights are much more reliable than original regression weights and univariate FC-behavior correlations. Intriguingly, feature importance reliability is strongly positively correlated with prediction accuracy across phenotypes. Within a particular behavioral domain, there was no clear relationship between prediction performance and feature importance reliability across regression algorithms. Finally, we show mathematically that feature importance reliability is necessary, but not sufficient, for low feature importance error. In the case of linear models, lower feature importance error leads to lower prediction error (up to a scaling by the feature covariance matrix). Overall, we find no fundamental trade-off between feature importance reliability and prediction accuracy.
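The Haufe transform used for feature importance above converts backward-model regression weights into a forward activation pattern by multiplying with the feature covariance matrix; a minimal sketch with synthetic, purely illustrative data:

```python
import numpy as np

# Minimal sketch of the Haufe transform: the weights w of a linear
# backward model (y ≈ X w) are multiplied by the feature covariance
# matrix to obtain a forward activation pattern a = Cov(X) w.
rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))  # correlated features
beta = np.zeros(p)
beta[0] = 1.0                                   # only feature 0 drives y
y = X @ beta + 0.1 * rng.standard_normal(n)

Xc = X - X.mean(axis=0)                         # center features
yc = y - y.mean()                               # center target
w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)     # backward-model (OLS) weights
cov_X = Xc.T @ Xc / (n - 1)
a = cov_X @ w                                   # Haufe-transformed weights

# For OLS, a coincides (by the normal equations) with the feature-target
# covariance Cov(X, y), which is why the transformed weights are
# interpretable as feature importance even with correlated features.
print(np.allclose(a, Xc.T @ yc / (n - 1)))      # prints True
```

For regularized or cross-validated models, as used in the study, the pattern is computed the same way but no longer reduces exactly to the feature-target covariance.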
Functional architecture of the aging brain
Roni Setton
Laetitia Mwilambwe-Tshilobo
Manesh Girn
Amber W. Lockrow
Giulia Baracchini
Alexander J. Lowe
Benjamin N. Cassidy
Jian Li
Wen-Ming Luh
Richard M. Leahy
Tian Ge
Daniel S. Margulies
Bratislav Mišić
Boris C Bernhardt
W. Dale Stevens
Felipe De Brigard
Prantik Kundu
Richard S. Gary
Gary R. Turner
R. Nathan Spreng
The intrinsic functional connectome can reveal how a lifetime of learning and lived experience is represented in the functional architecture of the aging brain. We investigated whether network dedifferentiation, a hallmark of brain aging, reflects a global shift in network dynamics, or comprises network-specific changes that reflect the changing landscape of aging cognition. We implemented a novel multi-faceted strategy involving multi-echo fMRI acquisition and de-noising, individualized cortical parcellation, and multivariate (gradient and edge-level) functional connectivity methods. Twenty minutes of resting-state fMRI data and cognitive assessments were collected in younger (n=181) and older (n=120) adults. Dimensionality in the BOLD signal was lower for older adults, consistent with global network dedifferentiation. Functional connectivity gradients were largely age-invariant. In contrast, edge-level connectivity showed widespread changes with age, revealing discrete, network-specific dedifferentiation patterns. Visual and somatosensory regions were more integrated within the functional connectome; default and frontoparietal regions showed greater coupling; and the dorsal attention network was less differentiated from transmodal regions. Associations with cognition suggest that the formation and preservation of integrated, large-scale brain networks supports complex cognitive abilities. However, into older adulthood, the connectome is dominated by large-scale network disintegration, global dedifferentiation and network-specific dedifferentiation associated with age-related cognitive change.
Comparison of Myelin Imaging Techniques in Ex Vivo Spinal Cord
Nikola Stikov
Manh-Tung Vuong
Myelin is a dielectric material that wraps around the axons of nerve fibers to enable fast conduction of signals throughout the nervous system. Loss of myelin can cause anywhere from minor interruption to complete disruption of nerve impulses in a range of neurodegenerative diseases such as multiple sclerosis and Parkinson’s disease. There is an ongoing debate in the myelin imaging community about which biomarker based on Magnetic Resonance Imaging (MRI) is more correlated with myelin. In this work, we implemented and compared several MRI-based myelin imaging techniques (quantitative magnetization transfer imaging, myelin water imaging, and proton density imaging) by evaluating their repeatability and their relation to large-scale histology in the ex vivo spinal cords of a rat, a dog, and a human. While there are studies investigating the relationship between pairs of them as well as with histology, to the best of our knowledge, this is the first study that implemented and compared all those methods at the same time to evaluate their reproducibility and their correlation with myelin. Qualitatively the contrasts were similar, and all techniques had comparable scan-rescan and correlations with histology. Surprisingly, the voxel-wise correlations between the various myelin measures were almost as high as the scan-rescan correlations. The correlations decreased when only white matter was considered, which could be due to the small dynamic range of the measurement, or due to artifacts related to the preparation and panoramic scanning of the tissue. We conclude that the myelin imaging techniques explored in this thesis exhibit similar specificity to myelin, yet the histological correlations suggest that more work is needed to determine the optimal myelin imaging protocol. The study also pointed out some potential miscalibrations during acquisitions as well as data processing that may lead to anywhere from minor to major impact on the accuracy of the results. These include B1 mapping, insufficient spoiling and variation of the predelay time. We have also standardized the data processing routines by upgrading qMTLab to qMRLab, which adds several quantitative MR methods to the toolbox, such as standard T1 mapping and field mapping. In addition, the data of the dog spinal cord in this study will be published together with the analysis scripts to help the interested reader to reproduce the findings from this thesis.
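The scan-rescan repeatability check described above can be sketched as a voxel-wise correlation between two acquisitions of the same quantitative map; the synthetic maps and noise level here are illustrative stand-ins for real quantitative MRI data:

```python
import numpy as np

# Illustrative sketch of a scan-rescan repeatability check: the same
# underlying myelin map is "acquired" twice with independent measurement
# noise, and repeatability is summarized as the voxel-wise Pearson
# correlation between the two acquisitions.
rng = np.random.default_rng(0)
n_voxels = 5_000
true_map = rng.uniform(0.05, 0.25, size=n_voxels)  # synthetic myelin fractions
scan = true_map + 0.01 * rng.standard_normal(n_voxels)
rescan = true_map + 0.01 * rng.standard_normal(n_voxels)

r = np.corrcoef(scan, rescan)[0, 1]                # scan-rescan correlation
print(f"scan-rescan r = {r:.3f}")
```

The same correlation computed between two different myelin measures (rather than two scans of one measure) gives the cross-technique comparison the abstract reports; restricting to white matter shrinks the dynamic range and therefore the correlation, as noted above.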
Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution
Anthony Zador
Bence Ölveczky
Sean Escola
Kwabena Boahen
Matthew Botvinick
Dmitri Chklovskii
Anne Churchland
Claudia Clopath
James DiCarlo
Surya Ganguli
Jeff Hawkins
Konrad Paul Kording
Alexei Koulakov
Yann LeCun
Timothy P. Lillicrap
Adam Marblestone
Bruno Olshausen
Alexandre Pouget
Cristina Savin
Terrence Sejnowski
Eero Simoncelli
Sara Solla
David Sussillo
Andreas S. Tolias
Doris Tsao