Felix Dangel

Associate Academic Member

Concordia University

Research Topics

Deep Learning

AI for Science

Optimization

Loss Landscapes

Symmetry

Website

Google Scholar

Publications

Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC

Wu Lin

Felix Dangel

Runa Eschenhagen

Kirill Neklyudov

Agustinus Kristiadi

Richard E. Turner

Alireza Makhzani

Second-order methods such as KFAC can be useful for neural net training. However, they are often memory-inefficient since their preconditioning Kronecker factors are dense, and numerically unstable in low precision as they require matrix inversion or decomposition. These limitations render such methods unpopular for modern mixed-precision training. We address them by (i) formulating an inverse-free KFAC update and (ii) imposing structures in the Kronecker factors, resulting in structured inverse-free natural gradient descent (SINGD). On modern neural networks, we show that SINGD is memory-efficient and numerically robust, in contrast to KFAC, and often outperforms AdamW even in half precision. Our work closes a gap between first- and second-order methods in modern low-precision training.

2024-01-01

ICML (published)

proceedings.mlr.press

arxiv.org

