Publications

Iterative Graph Self-Distillation

Hanlin Zhang

Shuai Lin

Weiyang Liu

Pan Zhou

Xiaodan Liang

Eric P. Xing

Recently, there has been increasing interest in the challenge of how to discriminatively vectorize graphs. To address this, we propose a met… (voir plus)hod called Iterative Graph Self-Distillation (IGSD) which learns graph-level representation in an unsupervised manner through instance discrimination using a self-supervised contrastive learning approach. IGSD involves a teacher-student distillation process that uses graph diffusion augmentations and constructs the teacher model using an exponential moving average of the student model. The intuition behind IGSD is to predict the teacher network representation of the graph pairs under different augmented views. As a natural extension, we also apply IGSD to semi-supervised scenarios by jointly regularizing the network with both supervised and self-supervised contrastive loss. Finally, we show that fine-tuning the IGSD-trained models with self-training can further improve graph representation learning. Empirically, we achieve significant and consistent performance gain on various graph datasets in both unsupervised and semi-supervised settings, which well validates the superiority of IGSD.

2024-03-01

IEEE Transactions on Knowledge and Data Engineering (publié)

doi.org

openreview.net

Neural network prediction of the effect of thermomechanical controlled processing on mechanical properties

Sushant Sinha

Denzel Guye

Xiaoping Ma

Kashif Rehman

S. Yue

Narges Armanfard

2024-03-01

Machine Learning with Applications (publié)

doi.org

Novel community data in ecology-properties and prospects.

Florian Hartig

Nerea Abrego

Alex Bush

Jonathan M. Chase

G. Guillera‐Arroita

M. Leibold

Otso T. Ovaskainen

Loïc Pellissier

Maximilian Pichler

Giovanni Poggiato

Laura J. Pollock

Sara Si-moussi

Wilfried Thuiller

Duarte S Viana

D. Warton

Damaris Zurell

Douglas W. Yu

2024-03-01

Trends in Ecology & Evolution (publié)

doi.org

arxiv.org

Socially Assistive Robots for patients with Alzheimer's Disease: A scoping review.

Vania Karami

Mark J. Yaffe

Genevieve Gore

AJung Moon

Samira Abbasgholizadeh-Rahimi

2024-03-01

Archives of gerontology and geriatrics (Print) (publié)

doi.org

Sources of richness and ineffability for phenomenally conscious states

Xu Ji

Eric Elmoznino

George Deane

Axel Constant

Guillaume Dumas

Guillaume Lajoie

Jonathan Simon

Yoshua Bengio

2024-03-01

Neuroscience of Consciousness (publié)

doi.org

arxiv.org

The « jingle-jangle fallacy » of empathy: Delineating affective, cognitive and motor components of empathy from behavioral synchrony using a virtual agent

Julia Ayache

Guillaume Dumas

Alexander Sumich

D. Kuss

Darren Rhodes

Nadja Heym

2024-03-01

Personality and Individual Differences (publié)

doi.org

COSMIC: Mutual Information for Task-Agnostic Summarization Evaluation

Maxime Darrin

Philippe Formont

Jackie Chi Kit Cheung

Pablo Piantanida

Assessing the quality of summarizers poses significant challenges. In response, we propose a novel task-oriented evaluation approach that as… (voir plus)sesses summarizers based on their capacity to produce summaries that are useful for downstream tasks, while preserving task outcomes. We theoretically establish a direct relationship between the resulting error probability of these tasks and the mutual information between source texts and generated summaries. We introduce

2024-02-29

ArXiv (prépublication)

doi.org

arxiv.org

Crowdkeeping in Last-mile Delivery

Xin Wang

Okan Arslan

Érick Delage

2024-02-29

Transportation Science (publié)

doi.org

Disentangling the Causes of Plasticity Loss in Neural Networks

Clare Lyle

Zeyu Zheng

Khimya Khetarpal

Hado van Hasselt

Razvan Pascanu

James Martens

Will Dabney

2024-02-29

ArXiv (prépublication)

doi.org

arxiv.org

StarCoder 2 and The Stack v2: The Next Generation

Anton Lozhkov

Raymond Li

Loubna Ben allal

Federico Cassano

Joel Lamy-Poirier

Nouamane Tazi

Ao Tang

Dmytro Pykhtar

Jiawei Liu

Yuxiang Wei

Tianyang Liu

Max Tian

Denis Kocetkov

Arthur Zucker

Younes Belkada

Zijian Wang

Qian Liu

Dmitry Abulkhanov

Indraneil Paul

Zhuang Li … (voir 46 de plus)

Wen-Ding Li

Megan L. Risdal

Jia LI

Jian Zhu

Terry Yue Zhuo

Evgenii Zheltonozhskii

Nii Osae Osae Dade

Wenhao Yu

Lucas Krauss

Naman Jain

Yixuan Su

Xuanli He

Manan Dey

Edoardo Abati

Yekun Chai

Niklas Muennighoff

Xiangru Tang

Muhtasham Oblokulov

Christopher Akiki

Marc Marone

Chenghao Mou

Mayank Mishra

Alex Gu

Binyuan Hui

Tri Dao

Armel Zebaze

Olivier Dehaene

Nicolas Patry

Canwen Xu

Julian McAuley

Han Hu

Torsten Scholak

Sebastien Paquet

Jennifer Robinson

Carolyn Jane Anderson

Nicolas Chapados

Mostofa Ali Patwary

Nima Tajbakhsh

Yacine Jernite

Carlos Muñoz Ferrandis

Lingming Zhang

Sean Hughes

Thomas Wolf

Arjun Guha

Leandro Von Werra

Harm de Vries

The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), … (voir plus)introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data sources, such as GitHub pull requests, Kaggle notebooks, and code documentation. This results in a training set that is 4x larger than the first StarCoder dataset. We train StarCoder2 models with 3B, 7B, and 15B parameters on 3.3 to 4.3 trillion tokens and thoroughly evaluate them on a comprehensive set of Code LLM benchmarks. We find that our small model, StarCoder2-3B, outperforms other Code LLMs of similar size on most benchmarks, and also outperforms StarCoderBase-15B. Our large model, StarCoder2- 15B, significantly outperforms other models of comparable size. In addition, it matches or outperforms CodeLlama-34B, a model more than twice its size. Although DeepSeekCoder- 33B is the best-performing model at code completion for high-resource languages, we find that StarCoder2-15B outperforms it on math and code reasoning benchmarks, as well as several low-resource languages. We make the model weights available under an OpenRAIL license and ensure full transparency regarding the training data by releasing the SoftWare Heritage persistent IDentifiers (SWHIDs) of the source code data.

2024-02-29

ArXiv (prépublication)

doi.org

arxiv.org

StarCoder 2 and The Stack v2: The Next Generation

Anton Lozhkov

Raymond Li

Loubna Ben allal

Federico Cassano

Joel Lamy-Poirier

Nouamane Tazi

Ao Tang

Dmytro Pykhtar

Jiawei Liu

Yuxiang Wei

Tianyang Liu

Max Tian

Denis Kocetkov

Arthur Zucker

Younes Belkada

Zijian Wang

Qian Liu

Dmitry Abulkhanov

Indraneil Paul

Zhuang Li … (voir 46 de plus)

Wen-Ding Li

Megan L. Risdal

Jia LI

Jian Zhu

Terry Yue Zhuo

Evgenii Zheltonozhskii

Nii Osae Osae Dade

Wenhao Yu

Lucas Krauss

Naman Jain

Yixuan Su

Xuanli He

Manan Dey

Edoardo Abati

Yekun Chai

Niklas Muennighoff

Xiangru Tang

Muhtasham Oblokulov

Christopher Akiki

Marc Marone

Chenghao Mou

Mayank Mishra

Alex Gu

Binyuan Hui

Tri Dao

Armel Zebaze

Olivier Dehaene

Nicolas Patry

Canwen Xu

Julian McAuley

Han Hu

Torsten Scholak

Sebastien Paquet

Jennifer Robinson

Carolyn Jane Anderson

Nicolas Chapados

Md. Mostofa Ali Patwary

Nima Tajbakhsh

Yacine Jernite

Carlos Muñoz Ferrandis

Lingming Zhang

Sean Hughes

Thomas Wolf

Arjun Guha

Leandro Von Werra

Harm de Vries

The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), … (voir plus)introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data sources, such as GitHub pull requests, Kaggle notebooks, and code documentation. This results in a training set that is 4x larger than the first StarCoder dataset. We train StarCoder2 models with 3B, 7B, and 15B parameters on 3.3 to 4.3 trillion tokens and thoroughly evaluate them on a comprehensive set of Code LLM benchmarks. We find that our small model, StarCoder2-3B, outperforms other Code LLMs of similar size on most benchmarks, and also outperforms StarCoderBase-15B. Our large model, StarCoder2- 15B, significantly outperforms other models of comparable size. In addition, it matches or outperforms CodeLlama-34B, a model more than twice its size. Although DeepSeekCoder- 33B is the best-performing model at code completion for high-resource languages, we find that StarCoder2-15B outperforms it on math and code reasoning benchmarks, as well as several low-resource languages. We make the model weights available under an OpenRAIL license and ensure full transparency regarding the training data by releasing the SoftWare Heritage persistent IDentifiers (SWHIDs) of the source code data.

2024-02-29

ArXiv (prépublication)

doi.org

arxiv.org

The use of dose surface maps as a tool to investigate spatial dose delivery accuracy for the rectum during prostate radiotherapy

Haley Patrick

John Kildea

2024-02-29

Journal of Applied Clinical Medical Physics (publié)

doi.org

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Publications

La recherche en IA au service du monde réel

Boussole des politiques en IA

Vie étudiante et ressources

Mots-clés populaires:

Publications