Dimension-adapted Momentum Outscales SGD
Damien Ferbach
Katie Everett
Elliot Paquette
We investigate scaling laws for stochastic momentum algorithms with small batch on the power law random features model, parameterized by data complexity, target complexity, and model size. When trained with a stochastic momentum algorithm, our analysis reveals four distinct loss curve shapes determined by varying data-target complexities. While traditional stochastic gradient descent with momentum (SGD-M) yields identical scaling law exponents to SGD, dimension-adapted Nesterov acceleration (DANA) improves these exponents by scaling momentum hyperparameters based on model size and data complexity. This outscaling phenomenon, which also improves compute-optimal scaling behavior, is achieved by DANA across a broad range of data and target complexities, while traditional methods fall short. Extensive experiments on high-dimensional synthetic quadratics validate our theoretical predictions, and large-scale text experiments with LSTMs show that DANA's improved loss exponents over SGD hold in a practical setting.
Structure-Aligned Protein Language Model
Can Chen
David Heurtel-Depeiges
Robert M. Vernon
Christopher J. Langmead
Quentin Fournier
ImmunoStruct: a multimodal neural network framework for immunogenicity prediction from peptide-MHC sequence, structure, and biochemical properties
Kevin Bijan Givechian
João Felipe Rocha
Edward Yang
Chen Liu
Kerrie Greene
Rex Ying
Etienne Caron
Akiko Iwasaki
Adaptive Cyclic Diffusion for Inference Scaling
Gyubin Lee
Truong Nhat Nguyen Bao
Jaesik Yoon
Dongwoo Lee
Minsu Kim
Sungjin Ahn
Determinants of surgical approach to pediatric appendicitis in Brazil
Ayla Gerk
Paulo Henrique Moreira Melo
Mohsen Amoei
Shreenik Kundu
Luiza Telles
Justina O. Seyi-Olajide
Dunya Moghul
Gabriel Schnitman
Cristina Camargo
David P. Mooney
Joaquim Bustorff-Silva
Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy
Max Schwarzer
Jesse Farebrother
Joshua Greaves
Ekin Dogus Cubuk
Sergei Kalinin
Igor Mordatch
Kevin M Roccapriore
We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM). Our method is data-centric, leveraging data collected on a STEM. The data samples are processed and filtered to produce symbolic representations, which we use to train a neural network to predict transition probabilities. These learned transition dynamics are then leveraged to guide a single silicon atom throughout the lattice to pre-determined target destinations. We present empirical analyses that demonstrate the efficacy and generality of our approach.
Multi‐center benchmarking of cervical spinal cord RF coils for 7 T MRI: A traveling spines study
Eva Alonso‐Ortiz
Daniel Papp
Robert L. Barry
Kyota Poëti
Alan C. Seifert
Kyle M. Gilbert
Nibardo Lopez‐Rios
Jan Paska
Falk Eippert
Nikolaus Weiskopf
Laura Beghini
Nadine N. Graedel
Robert Trampel
Martina F. Callaghan
Christoph S. Aigner
Patrick Freund
Maryam Seif
Aurélien Destruel
Virginie Callot
Johanna Vannesjo et al.
SDLog: A Deep Learning Framework for Detecting Sensitive Information in Software Logs
Roozbeh Aghili
Xingfang Wu
Heng Li