Publications

Sample Boosting Algorithm (SamBA) - An interpretable greedy ensemble classifier based on local expertise for fat data

Baptiste Bauvin

Cécile Capponi

Florence Clerc

Pascal Germain

Sokol Koço

J. Corbeil

2022-12-31

UAI (published)

proceedings.mlr.press

Scalar Invariant Networks with Zero Bias

Chuqin Geng

Xiaojie Xu

Haolin Ye

Xujie Si

Just like weights, bias terms are the learnable parameters of many popular machine learning models, including neural networks. Biases are thought to enhance the representational power of neural networks, enabling them to solve a variety of tasks in computer vision. However, we argue that biases can be disregarded for some image-related tasks such as image classification, by considering the intrinsic distribution of images in the input space and desired model properties from first principles. Our findings suggest that zero-bias neural networks can perform comparably to biased networks for practical image classification tasks. We demonstrate that zero-bias neural networks possess a valuable property called scalar (multiplication) invariance. This means that the prediction of the network remains unchanged when the contrast of the input image is altered. We extend scalar invariance to more general cases, enabling formal verification of certain convex regions of the input space. Additionally, we prove that zero-bias neural networks are fair in predicting the zero image. Unlike state-of-the-art models that may exhibit bias toward certain labels, zero-bias networks have uniform belief in all labels. We believe dropping bias terms can be considered as a geometric prior in designing neural network architecture for image classification, which shares the spirit of adapting convolutions as the translational invariance prior. The robustness and fairness advantages of zero-bias neural networks may also indicate a promising path towards trustworthy and ethical AI.
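The scalar invariance and zero-image properties described in this abstract can be illustrated with a small sketch. This is not the authors' code: it assumes a plain fully connected ReLU network with no bias terms and random weights, and the helper names (zero_bias_logits, rand_matrix, and so on) are hypothetical. Such a network is positively homogeneous, so scaling the input by a positive factor scales the logits by the same factor and leaves the predicted class unchanged, while the all-zero input yields all-zero logits and therefore a uniform softmax.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    # Matrix-vector product with no bias term added.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def zero_bias_logits(weights, x):
    """Forward pass of a bias-free ReLU MLP; the last layer is linear."""
    h = x
    for W in weights[:-1]:
        h = relu(matvec(W, h))
    return matvec(weights[-1], h)


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def argmax(z):
    return max(range(len(z)), key=lambda i: z[i])


# Hypothetical toy setup: 6 inputs, two hidden layers of 8 units, 3 classes.
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


weights = [rand_matrix(8, 6), rand_matrix(8, 8), rand_matrix(3, 8)]
x = [rng.random() for _ in range(6)]

base = zero_bias_logits(weights, x)
# Multiplying the input by 0.3 mimics lowering the image contrast.
scaled = zero_bias_logits(weights, [0.3 * v for v in x])
# The all-zero input gives all-zero logits, hence a uniform softmax.
zero_probs = softmax(zero_bias_logits(weights, [0.0] * 6))

print(argmax(base) == argmax(scaled))
print(zero_probs)
```

The sketch covers only positive scaling of a fully connected network; the more general invariance and verification results mentioned in the abstract are not reproduced here.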

2022-12-31

NeurReps (published)

doi.org

proceedings.mlr.press

Scaling Self-Supervised End-to-End Driving with Multi-View Attention Learning

Yi Xiao

Felipe Codevilla

Diego Porres

Antonio M. López

2022-12-31

arXiv.org (preprint)

doi.org

Screening methods for congenital anomalies in low and lower-middle income countries: A systematic review.

Justina O. Seyi-Olajide

Xiya Ma

Elena Guadagno

Adesoji Ademuyiwa

Dan Poenaru

2022-12-31

Journal of Pediatric Surgery (published)

doi.org

Self-Influence Guided Data Reweighting for Language Model Pre-training

Megh Thakkar

Tolga Bolukbasi

Sriram Ganapathy

Shikhar Vashishth

Sarath Chandar

Partha Talukdar

Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, equal importance to all the data samples may not be the optimal choice. While data reweighting has been explored in the context of task-specific supervised learning and LM fine-tuning, model-driven reweighting for pre-training data has not been explored. We fill this important gap and propose PRESENCE, a method for jointly reweighting samples by leveraging self-influence (SI) scores as an indicator of sample importance and pre-training. PRESENCE promotes novelty and stability for model pre-training. Through extensive analysis spanning multiple model sizes, datasets, and tasks, we present PRESENCE as an important first step in the research direction of sample reweighting for pre-training language models.

2022-12-31

EMNLP (published)

doi.org

openreview.net

SORBETmatcher results for OAEI 2023.

Francis Gosselin

Amal Zouaq

2022-12-31

OM@ISWC (published)

dblp.uni-trier.de

Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads

Salah Zaiem

Youcef Kemiche

Titouan Parcollet

Slim Essid

Mirco Ravanelli

Self-supervised learning (SSL) leverages large datasets of unlabeled speech to reach impressive performance with reduced amounts of annotated data. The high number of proposed approaches fostered the emergence of comprehensive benchmarks that evaluate their performance on a set of downstream tasks exploring various aspects of the speech signal. However, while the number of considered tasks has been growing, most proposals rely upon a single downstream architecture that maps the frozen SSL representations to the task labels. This study examines how benchmarking results are affected by changes in the probing head architecture. Interestingly, we found that altering the downstream architecture structure leads to significant fluctuations in the performance ranking of the evaluated models. Against common practices in speech SSL benchmarking, we evaluate larger-capacity probing heads, showing their impact on performance, inference costs, generalization and multi-level feature exploitation.

2022-12-31

arXiv (preprint)

doi.org

arxiv.org

Of Stances, Themes, and Anomalies in COVID-19 Mask-Wearing Tweets

Jwen Fai Low

Benjamin C. M. Fung

Farkhund Iqbal

COVID-19 is an opportunity to study public acceptance of a “new” healthcare intervention, universal masking, which, unlike vaccination, is mostly alien to the Anglosphere public despite being practiced in ages past. Using a collection of over two million tweets, we studied the ways in which proponents and opponents of masking vied for influence as well as the themes driving the discourse. Pro-mask tweets encouraging others to mask up dominated Twitter early in the pandemic though its continued dominance has been eroded by anti-mask tweets criticizing others for their masking behavior. Engagement, represented by the counts of likes, retweets, and replies, and controversiality and disagreeableness, represented by ratios of the aforementioned counts, favored pro-mask tweets initially but with anti-mask tweets slowly gaining ground. Additional analysis raised the possibility of the platform owners suppressing certain parts of the mask-wearing discussion.

2022-12-31

IEEE Access (published)

doi.org

StarCoder: may the source be with you!

Raymond Li

Loubna Ben Allal

Yangtian Zi

Niklas Muennighoff

Denis Kocetkov

Chenghao Mou

Marc Marone

Christopher Akiki

Jia Li

Jenny Chim

Qian Liu

Evgenii Zheltonozhskii

Terry Yue Zhuo

Thomas Wang

Olivier Dehaene

Mishig Davaadorj

Joel Lamy-Poirier

Joao Monteiro

Oleh Shliazhko

Nicolas Gontier

Nicholas Meade

Armel Zebaze

Ming-Ho Yee

Logesh Kumar Umapathi

Jian Zhu

Ben Lipkin

Muhtasham Oblokulov

Zhiruo Wang

Rudra Murthy

Jason T Stillerman

Siva Sankalp Patel

Dmitry Abulkhanov

Marco Zocca

Manan Dey

Zhihan Zhang

N. Fahmy

Urvashi Bhattacharyya

Wenhao Yu

Swayam Singh

Sasha Luccioni

Paulo Villegas

M. Kunakov

Fedor Zhdanov

Manuel Romero

Tony Lee

Nadav Timor

Jennifer Ding

Claire S Schlesinger

Hailey Schoelkopf

Jan Ebert

Tri Dao

Mayank Mishra

Alex Gu

Jennifer Robinson

Carolyn Jane Anderson

Brendan Dolan-Gavitt

Danish Contractor

Siva Reddy

Daniel Fried

Dzmitry Bahdanau

Yacine Jernite

Carlos Muñoz Ferrandis

Sean Hughes

Thomas Wolf

Arjun Guha

Leandro Von Werra

Harm de Vries

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process. We fine-tuned StarCoderBase on 35B Python tokens, resulting in the creation of StarCoder. We perform the most comprehensive evaluation of Code LLMs to date and show that StarCoderBase outperforms every open Code LLM that supports multiple programming languages and matches or outperforms the OpenAI code-cushman-001 model. Furthermore, StarCoder outperforms every model that is fine-tuned on Python and still retains its performance on other programming languages. We take several important steps towards a safe open-access model release, including an improved PII redaction pipeline and a novel attribution tracing tool, and make the StarCoder models publicly available under a more commercially viable version of the Open Responsible AI Model license.

2022-12-31

Trans. Mach. Learn. Res. (published)

doi.org

openreview.net

Stochastic Generative Flow Networks

Ling Pan

Dinghuai Zhang

Moksh J. Jain

Longbo Huang

Yoshua Bengio

2022-12-31

UAI (published)

doi.org

proceedings.mlr.press

Structure-aware Protein Self-supervised Learning

Can (Sam) Chen

Jingbo Zhou

Fan Wang

Xue Liu

Dejing Dou

Protein representation learning methods have shown great potential to yield useful representation for many downstream tasks, especially on protein classification. Moreover, a few recent studies have shown great promise in addressing insufficient labels of proteins with self-supervised learning methods. However, existing protein language models are usually pretrained on protein sequences without considering the important protein structural information. To this end, we propose a novel structure-aware protein self-supervised learning method to effectively capture structural information of proteins. In particular, a well-designed graph neural network (GNN) model is pretrained to preserve the protein structural information with self-supervised tasks from a pairwise residue distance perspective and a dihedral angle perspective, respectively. Furthermore, we propose to leverage the available protein language model pretrained on protein sequences to enhance the self-supervised learning. Specifically, we identify the relation between the sequential information in the protein language model and the structural information in the specially designed GNN model via a novel pseudo bi-level optimization scheme. Experiments on several supervised downstream tasks verify the effectiveness of our proposed method. The code of the proposed method is available in https://github.com/GGchen1997/STEPS_Bioinformatics.

2022-12-31

Bioinformatics (published)

doi.org

arxiv.org

SUMMIT: Scaffolding OSS Issue Discussion Through Summarization

Saskia Gilmer

Avinash Bhat

Shuvam Shah

Kevin Cherry

Jinghui Cheng

Jin L.C. Guo

2022-12-31

arXiv.org (preprint)

doi.org
