Gaurav Iyer

Master's Research - McGill University
Supervisor
Research Topics
Deep Learning

Publications

Linear Weight Interpolation Leads to Transient Performance Gains
Maximal Initial Learning Rates in Deep ReLU Networks
Training a neural network requires choosing a suitable learning rate, which involves a trade-off between speed and effectiveness of convergence. While there has been considerable theoretical and empirical analysis of how large the learning rate can be, most prior work focuses only on late-stage training. In this work, we introduce the maximal initial learning rate