Publications

auto-fpt: Automating Free Probability Theory Calculations for Machine Learning Theory

Arjun Subramonian

Elvis Dohmatob

2025-04-01

arXiv (published)

doi.org

arxiv.org

Does Generative AI speak Nigerian-Pidgin?: Issues about Representativeness and Bias for Multilingualism in LLMs

David Ifeoluwa Adelani

A. Seza Doğruöz

Iyanuoluwa Shode

Aremu Anuoluwapo

2025-04-01

Findings of the Association for Computational Linguistics: NAACL 2025 (published)

doi.org

arxiv.org

Efficient and scalable construction of clinical variable networks for complex diseases with RAMEN.

Yiwei Xiong

Jingtao Wang

Xiaoxiao Shang

Tingting Chen

Douglas D. Fraser

Gregory Fonseca

Simon Rousseau

Jun Ding

2025-04-01

Cell Reports Methods (published)

doi.org

Evaluating and Enhancing Segmentation Model Robustness with Metamorphic Testing

Seif Mzoughi

Mohamed Elshafeia

Foutse Khomh

2025-04-01

arXiv (published)

doi.org

arxiv.org

Evaluating Numeracy of Language Models as a Natural Language Inference Task.

Rahmad Mahendra

Damiano Spina

Lawrence Cavedon

Karin Verspoor

2025-04-01

Findings of the Association for Computational Linguistics: NAACL 2025 (published)

doi.org

Genetic Analysis of Polyunsaturated Fatty Acids Biosynthesis Pathway Determines Four Distinct Thraustochytrid Types.

Sou-Yu Cheng

Yi-Jing Chen

Hsiu-Chin Lin

Hsin-Yang Chang

Ming-Der Huang

2025-04-01

Environmental Microbiology (published)

doi.org

InfoGain Wavelets: Furthering the Design of Diffusion Wavelets for Graph-Structured Data

David R. Johnson

Smita Krishnaswamy

Michael Perlmutter

Diffusion wavelets extract information from graph signals at different scales of resolution by utilizing graph diffusion operators raised to various powers, known as diffusion scales. Traditionally, the diffusion scales are chosen to be dyadic integers,

2025-04-01

arXiv (published)

doi.org

arxiv.org

Learning from Stochastic Teacher Representations Using Student-Guided Knowledge Distillation

Muhammad Haseeb Aslam

Clara Martinez

Marco Pedersoli

Alessandro Lameiras Koerich

Ali Etemad

Eric Granger

Advances in self-distillation have shown that when knowledge is distilled from a teacher to a student using the same deep learning (DL) architecture, the student performance can surpass the teacher particularly when the network is overparameterized and the teacher is trained with early stopping. Alternatively, ensemble learning also improves performance, although training, storing, and deploying multiple models becomes impractical as the number of models grows. Even distilling an ensemble to a single student model or weight averaging methods first requires training of multiple teacher models and does not fully leverage the inherent stochasticity for generating and distilling diversity in DL models. These constraints are particularly prohibitive in resource-constrained or latency-sensitive applications such as wearable devices. This paper proposes to train only one model and generate multiple diverse teacher representations using distillation-time dropout. However, generating these representations stochastically leads to noisy representations that are misaligned with the learned task. To overcome this problem, a novel stochastic self-distillation (SSD) training strategy is introduced for filtering and weighting teacher representation to distill from task-relevant representations only, using student-guided knowledge distillation (SGKD). The student representation at each distillation step is used as authority to guide the distillation process. Experimental results on real-world affective computing, wearable/biosignal datasets from the UCR Archive, the HAR dataset, and image classification datasets show that the proposed SSD method can outperform state-of-the-art methods without increasing the model size at both training and testing time, and incurs negligible computational complexity compared to state-of-the-art ensemble learning and weight averaging methods.

2025-04-01

arXiv (published)

doi.org

arxiv.org

Leveraging Machine Learning Techniques in Intrusion Detection Systems for Internet of Things

Saeid Jamshidi

Amin Nikanjam

Kawser Wazed Nafi

Foutse Khomh

As the Internet of Things (IoT) continues to expand, ensuring the security of connected devices has become increasingly critical. Traditional Intrusion Detection Systems (IDS) often fall short in managing the dynamic and large-scale nature of IoT networks. This paper explores how Machine Learning (ML) and Deep Learning (DL) techniques can significantly enhance IDS performance in IoT environments. We provide a thorough overview of various IDS deployment strategies and categorize the types of intrusions common in IoT systems. A range of ML methods -- including Support Vector Machines, Naive Bayes, K-Nearest Neighbors, Decision Trees, and Random Forests -- are examined alongside advanced DL models such as LSTM, CNN, Autoencoders, RNNs, and Deep Belief Networks. Each technique is evaluated based on its accuracy, efficiency, and suitability for real-world IoT applications. We also address major challenges such as high false positive rates, data imbalance, encrypted traffic analysis, and the resource constraints of IoT devices. In addition, we highlight the emerging role of Generative AI and Large Language Models (LLMs) in improving threat detection, automating responses, and generating intelligent security policies. Finally, we discuss ethical and privacy concerns, underscoring the need for responsible and transparent implementation. This paper aims to provide a comprehensive framework for developing adaptive, intelligent, and secure IDS solutions tailored for the evolving landscape of IoT.

2025-04-01

arXiv (published)

doi.org

arxiv.org

LitLLMs, LLMs for Literature Review: Are we there yet?

Shubham Agarwal

Gaurav Sahu

Abhay Puri

Issam Hadj Laradji

Krishnamurthy Dj Dvijotham

Jason Stanley

Laurent Charlin