
Gaurav Iyer

Research Master's - McGill
Principal supervisor
Research topics
Deep learning

Publications

Linear Weight Interpolation Leads to Transient Performance Gains
Maximal Initial Learning Rates in Deep ReLU Networks
Training a neural network requires choosing a suitable learning rate, which involves a trade-off between speed and effectiveness of convergence. While there has been considerable theoretical and empirical analysis of how large the learning rate can be, most prior work focuses only on late-stage training. In this work, we introduce the maximal initial learning rate