Publications

Consultative engagement of stakeholders toward a roadmap for African language technologies

Kathleen Siminyu

Jade Abbott

Kọ́lá Túbọ̀sún

Aremu Anuoluwapo

Blessing Kudzaishe Sibanda

Kofi Yeboah

David Ifeoluwa Adelani

Masabata Mokgesi-Selinga

Frederick R. Apina

Angela Thandizwe Mthembu

Arshath Ramkilowan

Babatunde Oladimeji

2023-08-10

Patterns (published)

Pretrainable geometric graph neural network for antibody affinity maturation

Huiyu Cai

Zuobai Zhang

Mingkai Wang

Bozitao Zhong

Yanling Wu

Tianlei Ying

Jian Tang

Quanxiao Li

Yuxuan Zhong

Increasing the binding affinity of an antibody to its target antigen is a crucial task in antibody therapeutics development. This paper presents a pretrainable geometric graph neural network, GearBind, and explores its potential in in silico affinity maturation. Leveraging multi-relational graph construction, multi-level geometric message passing and contrastive pretraining on mass-scale, unlabeled protein structural data, GearBind outperforms previous state-of-the-art approaches on SKEMPI and an independent test set. A powerful ensemble model based on GearBind is then derived and used to successfully enhance the binding of two antibodies with distinct formats and target antigens. ELISA EC50 values of the designed antibody mutants are decreased by up to 17 fold, and KD values by up to 6.1 fold. These promising results underscore the utility of geometric deep learning and effective pretraining in macromolecule interaction modeling tasks.

2023-08-10

bioRxiv (accepted)

A Systematic Literature Review of Fashion, Sustainability, and Consumption Using a Mixed Methods Approach

Osmud Rahman

Dingtao Hu

Benjamin C. M. Fung

With the growing global awareness of the environmental impact of clothing consumption, there has been a notable surge in the publication of journal articles dedicated to “fashion sustainability” in the past decade, specifically from 2010 to 2020. However, despite this wealth of research, many studies remain disconnected and fragmented due to varying research objectives, focuses, and approaches. Conducting a systematic literature review with a mixed methods research approach can help identify key research themes, trends, and developmental patterns, while also shedding light on the complexity of fashion, sustainability, and consumption. To enhance the literature review and analytical process, the current systematic literature review employed text mining techniques and bibliometric visualization tools, including RAKE, VOSviewer, and CitNetExplorer. The findings revealed an increase in the number of publications focusing on “fashion and sustainability” between 2010 and 2021. Most studies were predominantly conducted in the United States, with a specific focus on female consumers. Moreover, a greater emphasis was placed on non-sustainable cues rather than the sustainable cues. Additionally, a higher number of case studies was undertaken to investigate three fast-fashion companies. To enhance our knowledge and understanding of this subject, this article highlights several valuable contributions and provides recommendations for future research.

2023-08-09

Sustainability (published)

XMQAs: Constructing Complex-Modified Question-Answering Dataset for Robust Question Understanding

Yuyan Chen

Yanghua Xiao

Zhixu Li

Bang Liu

Question understanding is an important issue to the success of a Knowledge-based Question Answering (KBQA) system. However, the existing study does not pay enough attention to this issue given that the questions in the existing KBQA datasets are usually expressed in a simple and straightforward way. This is not in line with the actual linguistic conventions, which often use a lot of modifiers. To facilitate the study on evaluating and enhancing the question understanding ability of the KBQA systems, this paper proposes to construct a complex-modified question-answering (XMQAs) dataset based on existing KBQA datasets. With the help of knowledge bases and dictionaries, three kinds of modifiers are defined and applied to original simple-expressed questions. These modifiers could make the expression of these questions complex without changing their semantics. Based on XMQAs, we then propose a novel question understanding algorithm upon existing KBQA models, which greatly improves the robustness of their question understanding abilities. We conduct extensive experiments on XMQAs and two widely acknowledged KBQA datasets. The empirical results demonstrate that our proposed algorithm can improve the performance of KBQA models on not only the complex-modified questions, but also simple-expressed questions.

2023-08-09

IEEE Transactions on Knowledge and Data Engineering (unknown)

AI4GCC - Track 3: Consumption and the Challenges of Multi-Agent RL

Marco Jiralerspong

Gauthier Gidel

2023-08-08

ArXiv (preprint)

Teacher-Student Architecture for Knowledge Distillation: A Survey

Chengming Hu

Xuan Li

Danyang Liu

Haolun Wu

X. T. Chen

Ju Wang

Xue Liu

Although deep neural networks (DNNs) have shown a strong capacity to solve large-scale problems in many areas, such DNNs are hard to deploy in real-world systems due to their voluminous parameters. To tackle this issue, Teacher-Student architectures were proposed, where simple student networks with a few parameters can achieve comparable performance to deep teacher networks with many parameters. Recently, Teacher-Student architectures have been effectively and widely embraced on various knowledge distillation (KD) objectives, including knowledge compression, knowledge expansion, knowledge adaptation, and knowledge enhancement. With the help of Teacher-Student architectures, current studies are able to achieve multiple distillation objectives through lightweight and generalized student networks. Different from existing KD surveys that primarily focus on knowledge compression, this survey first explores Teacher-Student architectures across multiple distillation objectives. This survey presents an introduction to various knowledge representations and their corresponding optimization objectives. Additionally, we provide a systematic overview of Teacher-Student architectures with representative learning algorithms and effective distillation schemes. This survey also summarizes recent applications of Teacher-Student architectures across multiple purposes, including classification, recognition, generation, ranking, and regression. Lastly, potential research directions in KD are investigated, focusing on architecture design, knowledge quality, and theoretical studies of regression-based learning, respectively. Through this comprehensive survey, industry practitioners and the academic community can gain valuable insights and guidelines for effectively designing, learning, and applying Teacher-Student architectures on various distillation objectives.

2023-08-07

ArXiv (preprint)

Assemblies, synapse clustering, and network topology interact with plasticity to explain structure-function relationships of the cortical connectome

András Ecker

Daniela Egas Santander

Marwan Abdellah

Jorge Blanco Alonso

Sirio Bolaños-Puchet

Giuseppe Chindemi

Dhuruva Priyan Gowri Mariyappan

James B Isbister

James King

Pramod Kumbhar

Ioannis Magkanaris

Eilif B Muller

Michael W Reimann

Synaptic plasticity underlies the brain’s ability to learn and adapt. While experiments in brain slices have revealed mechanisms and protocols for the induction of plasticity between pairs of neurons, how these synaptic changes are coordinated in biological neuronal networks to ensure the emergence of learning remains poorly understood. Simulation and modeling have emerged as important tools to study learning in plastic networks, but have yet to achieve a scale that incorporates realistic network structure, active dendrites, and multi-synapse interactions, key determinants of synaptic plasticity. To rise to this challenge, we endowed an existing large-scale cortical network model, incorporating data-constrained dendritic processing and multi-synaptic connections, with a calcium-based model of functional plasticity that captures the diversity of excitatory connections extrapolated to in vivo-like conditions. This allowed us to study how dendrites and network structure interact with plasticity to shape stimulus representations at the microcircuit level. In our simulations, plasticity acted sparsely and specifically, firing rates and weight distributions remained stable without additional homeostatic mechanisms. At the circuit level, we found plasticity was driven by co-firing stimulus-evoked functional assemblies, spatial clustering of synapses on dendrites, and the topology of the network connectivity. As a result of the plastic changes, the network became more reliable with more stimulus-specific responses. We confirmed our testable predictions in the MICrONS datasets, an openly available electron microscopic reconstruction of a large volume of cortical tissue. Our results quantify at a large scale how the dendritic architecture and higher-order structure of cortical microcircuits play a central role in functional plasticity and provide a foundation for elucidating their role in learning.

2023-08-06

bioRxiv (preprint)

Bayesian modelling disentangles language versus executive control disruption in stroke

Gesa Hartwigsen

Jae-Sung Lim

Hee-Joon Bae

Kyung-Ho Yu

Hugo J. Kuijf

Nick A. Weaver

J. Matthijs Biesbroek

Jakub Kopal

Danilo Bzdok

Stroke is the leading cause of long-term disability worldwide. Incurred brain damage disrupts cognition, often with persisting deficits in language and executive capacities. Despite their clinical relevance, the commonalities and differences of language versus executive control impairments remain under-specified. We tailored a Bayesian hierarchical modeling solution in a largest-of-its-kind cohort (1080 stroke patients) to deconvolve language and executive control in the brain substrates of stroke insults. Four cognitive factors distinguished left- and right-hemispheric contributions to ischemic tissue lesion. One factor delineated language and general cognitive performance and was mainly associated with damage to left-hemispheric brain regions in the frontal and temporal cortex. A factor for executive control summarized control and visual-constructional abilities. This factor was strongly related to right-hemispheric brain damage of posterior regions in the occipital cortex. The interplay of language and executive control was reflected in two factors: executive speech functions and verbal memory. Impairments on both were mainly linked to left-hemispheric lesions. These findings shed light onto the causal implications of hemispheric specialization for cognition and make steps towards subgroup-specific treatment protocols after stroke.

2023-08-06

bioRxiv (preprint)

Exploring Security Practices in Infrastructure as Code: An Empirical Study

Alexandre Verdet

Mohammad Hamdaqa

Leuson Da Silva

Foutse Khomh

Cloud computing has become popular thanks to the widespread use of Infrastructure as Code (IaC) tools, allowing the community to conveniently manage and configure cloud infrastructure using scripts. However, the scripting process itself does not automatically prevent practitioners from introducing misconfigurations, vulnerabilities, or privacy risks. As a result, ensuring security relies on practitioners' understanding and adoption of explicit policies, guidelines, or best practices. In order to understand how practitioners deal with this problem, in this work, we perform an empirical study analyzing the adoption of IaC scripted security best practices. First, we select and categorize widely recognized Terraform security practices promulgated in the industry for popular cloud providers such as AWS, Azure, and Google Cloud. Next, we assess the adoption of these practices by each cloud provider, analyzing a sample of 812 open-source projects hosted on GitHub. For that, we scan each project's configuration files, looking for policy implementation through static analysis (checkov). Additionally, we investigate GitHub measures that might be correlated with adopting these best practices. The category Access policy emerges as the most widely adopted in all providers, while Encryption in rest is the most neglected. Regarding GitHub measures correlated with best practice adoption, we observe a positive, strong correlation between a repository's number of stars and adopting practices in its cloud infrastructure. Based on our findings, we provide guidelines for cloud practitioners to limit infrastructure vulnerability and discuss further aspects associated with policies that have yet to be extensively embraced within the industry.

2023-08-06

ArXiv (preprint)

Multi-variable Hard Physical Constraints for Climate Model Downscaling

Jose González-Abad

Álex Hernández-García

Paula Harder

David Rolnick

José Manuel Gutiérrez

2023-08-01

ArXiv (preprint)