Publications

Of Stances, Themes, and Anomalies in COVID-19 Mask-Wearing Tweets

Jwen Fai Low

Benjamin C. M. Fung

Farkhund Iqbal

COVID-19 is an opportunity to study public acceptance of a “new” healthcare intervention, universal masking, which, unlike vaccination, is mostly alien to the Anglosphere public despite being practiced in ages past. Using a collection of over two million tweets, we studied the ways in which proponents and opponents of masking vied for influence as well as the themes driving the discourse. Pro-mask tweets encouraging others to mask up dominated Twitter early in the pandemic, though their continued dominance has been eroded by anti-mask tweets criticizing others for their masking behavior. Engagement, represented by the counts of likes, retweets, and replies, and controversiality and disagreeableness, represented by ratios of the aforementioned counts, favored pro-mask tweets initially but with anti-mask tweets slowly gaining ground. Additional analysis raised the possibility of the platform owners suppressing certain parts of the mask-wearing discussion.

2022-12-31

IEEE Access (published)

doi.org

StarCoder: may the source be with you!

Raymond Li

Loubna Ben allal

Yangtian Zi

Niklas Muennighoff

Denis Kocetkov

Chenghao Mou

Marc Marone

Christopher Akiki

Jia LI

Jenny Chim

Qian Liu

Evgenii Zheltonozhskii

Terry Yue Zhuo

Thomas Wang

Olivier Dehaene

Mishig Davaadorj

Joel Lamy-Poirier

Joao Monteiro

Oleh Shliazhko

Nicolas Gontier

Nicholas Meade

Armel Zebaze

Ming-Ho Yee

Logesh Kumar Umapathi

Jian Zhu

Ben Lipkin

Muhtasham Oblokulov

Zhiruo Wang

Rudra Murthy

Jason T Stillerman

Siva Sankalp Patel

Dmitry Abulkhanov

Marco Zocca

Manan Dey

Zhihan Zhang

N. Fahmy

Urvashi Bhattacharyya

Wenhao Yu

Swayam Singh

Sasha Luccioni

Paulo Villegas

M. Kunakov

Fedor Zhdanov

Manuel Romero

Tony Lee

Nadav Timor

Jennifer Ding

Claire S Schlesinger

Hailey Schoelkopf

Jan Ebert

Tri Dao

Mayank Mishra

Alex Gu

Jennifer Robinson

Carolyn Jane Anderson

Brendan Dolan-Gavitt

Danish Contractor

Siva Reddy

Daniel Fried

Dzmitry Bahdanau

Yacine Jernite

Carlos Muñoz Ferrandis

Sean Hughes

Thomas Wolf

Arjun Guha

Leandro Von Werra

Harm de Vries

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process. We fine-tuned StarCoderBase on 35B Python tokens, resulting in the creation of StarCoder. We perform the most comprehensive evaluation of Code LLMs to date and show that StarCoderBase outperforms every open Code LLM that supports multiple programming languages and matches or outperforms the OpenAI code-cushman-001 model. Furthermore, StarCoder outperforms every model that is fine-tuned on Python and still retains its performance on other programming languages. We take several important steps towards a safe open-access model release, including an improved PII redaction pipeline and a novel attribution tracing tool, and make the StarCoder models publicly available under a more commercially viable version of the Open Responsible AI Model license.

2022-12-31

Trans. Mach. Learn. Res. (published)

doi.org

openreview.net

Stochastic Generative Flow Networks

Ling Pan

Dinghuai Zhang

Moksh J. Jain

Longbo Huang

Yoshua Bengio

2022-12-31

UAI (published)

doi.org

proceedings.mlr.press

Structure-aware Protein Self-supervised Learning

Can (Sam) Chen

Jingbo Zhou

Fan Wang

Xue Liu

Dejing Dou

Protein representation learning methods have shown great potential to yield useful representation for many downstream tasks, especially on protein classification. Moreover, a few recent studies have shown great promise in addressing insufficient labels of proteins with self-supervised learning methods. However, existing protein language models are usually pretrained on protein sequences without considering the important protein structural information. To this end, we propose a novel structure-aware protein self-supervised learning method to effectively capture structural information of proteins. In particular, a well-designed graph neural network (GNN) model is pretrained to preserve the protein structural information with self-supervised tasks from a pairwise residue distance perspective and a dihedral angle perspective, respectively. Furthermore, we propose to leverage the available protein language model pretrained on protein sequences to enhance the self-supervised learning. Specifically, we identify the relation between the sequential information in the protein language model and the structural information in the specially designed GNN model via a novel pseudo bi-level optimization scheme. Experiments on several supervised downstream tasks verify the effectiveness of our proposed method. The code of the proposed method is available at https://github.com/GGchen1997/STEPS_Bioinformatics.

2022-12-31

Bioinformatics (published)

doi.org

arxiv.org

SUMMIT: Scaffolding OSS Issue Discussion Through Summarization

Saskia Gilmer

Avinash Bhat

Shuvam Shah

Kevin Cherry

Jinghui Cheng

Jin L.C. Guo

2022-12-31

arXiv.org (preprint)

doi.org

SUNMASK: Mask Enhanced Control in Step Unrolled Denoising Autoencoders

2022-12-31

EvoMUSART@EvoStar (published)

doi.org

openreview.net

Supplementary Material for MixupE

Yingtian Zou

Vikas Verma

Sarthak Mittal

Wai Hoh Tang

Hieu Pham

Juho Kannala

Yoshua Bengio

Arno Solin

Kenji Kawaguchi

We denote by z = (x, y) the input and output pair where x ∈ X ⊆ R and y ∈ Y ⊆ R. Let fθ(x) ∈ R be the output of the logits (i.e., the last layer before the softmax or sigmoid) of the model parameterized by θ. We use l(θ, z) = h(fθ(x)) − yfθ(x) to denote the loss function. Let g(·) be the activation function. We use x(i) to index the i-th element of the vector x and xj to represent the j-th variable in a set. The notation list is:

2022-12-31

(published)

www.semanticscholar.org

A Survey of Diversification Metrics and Approaches in Retrieval Systems: From the Perspective of Search and Recommendation

Haolun Wu

Yansen Zhang

Chen Ma

Fuyuan Lyu

Fernando Diaz

Xue Liu

Diversifying search results is an important research topic in retrieval systems in order to satisfy both the various interests of customers and the equal market exposure of providers. There has been a growing attention on diversity-aware research during recent years, accompanied by a proliferation of literature on methods to promote diversity in search and recommendation. However, the diversity-aware studies in retrieval systems lack a systematic organization and are rather fragmented. In this survey, we are the first to propose a unified taxonomy for classifying the metrics and approaches of diversification in both search and recommendation, which are two of the most extensively researched fields of retrieval systems. We begin the survey with a brief discussion of why diversity is important in retrieval systems

2022-12-31

(published)

www.semanticscholar.org

Survey of Scientific Rigor Studied in Machine Learning

D. Sculley

Gary R. Holt

Daniel R. Golovin

Eugene V. Davydov

Todd Phillips

Dietmar Ebner

Michael Young

Jean-francois Crespo

Dan Dennison

Emily Fox

Hugo Larochelle

The concern that Artificial Intelligence (AI) and Machine Learning (ML) are entering a “reproducibility crisis” has spurred significant research in the past few years. Yet with each paper, it is often unclear what someone means by “reproducibility” and where it fits in the larger scope of what we will call the “scientific rigor” literature. Ultimately, the lack of clear rigor standards can affect the manner in which businesses seeking to adopt AI/ML implement such capabilities. In this survey, we will use 66 papers published since 2017 to construct a proposed set of 8 high-level categories of scientific rigor, what they are, and the history of work conducted in each. Our proposal is that these eight rigor types are not mutually exclusive and present a model for how they influence each other. To encourage more to study these questions, we map these rigors to the adoption process in real-world business use cases. In doing so, we can quantify gaps in the literature that suggest an under focus on the issues necessary for scientific rigor research to transition to practice

2022-12-31

(published)

www.semanticscholar.org

Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited. In this work, we provide evidence that disentangled representations coupled with sparse base-predictors improve generalization. In the context of multi-task learning, we prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations. Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem. Finally, we explore a meta-learning version of this algorithm based on group Lasso multiclass SVM base-predictors, for which we derive a tractable dual formulation. It obtains competitive results on standard few-shot classification benchmarks, while each task is using only a fraction of the learned representations.

2022-12-31

ICML (published)

doi.org

proceedings.mlr.press

Syntactic Substitutability as Unsupervised Dependency Syntax

Jasper Jian

Siva Reddy

Syntax is a latent hierarchical structure which underpins the robust and compositional nature of human language. In this work, we explore the hypothesis that syntactic dependencies can be represented in language model attention distributions and propose a new method to induce these structures theory-agnostically. Instead of modeling syntactic relations as defined by annotation schemata, we model a more general property implicit in the definition of dependency relations, syntactic substitutability. This property captures the fact that words at either end of a dependency can be substituted with words from the same category. Substitutions can be used to generate a set of syntactically invariant sentences whose representations are then used for parsing. We show that increasing the number of substitutions used improves parsing accuracy on natural data. On long-distance subject-verb agreement constructions, our method achieves 79.5% recall compared to 8.9% using a previous method. Our method also provides improvements when transferred to a different parsing setup, demonstrating that it generalizes.

2022-12-31

EMNLP (published)

doi.org

openreview.net

Technology-enhanced trauma training in low-resource settings: A scoping review and feasibility analysis of educational technologies.

Minahil Khan

Fabio Botelho

Laura Pinkham

Elena Guadagno

Dan Poenaru

2022-12-31

Journal of Pediatric Surgery (published)

doi.org
