Sharan Vaswani

Alumni

Publications

How to make your optimizer generalize better

Sharan Vaswani

Reza Babanezhad

SAIT AI Lab, Montreal

Jose Gallego

Aaron Mishkin

Simon Lacoste-Julien

Nicolas Le Roux

We study the implicit regularization of optimization methods for linear models interpolating the training data in the under-parametrized and over-parametrized regimes. For over-parameterized linear regression, where there are infinitely many interpolating solutions, different optimization methods can converge to solutions with varying generalization performance. In this setting, we show that projections onto linear spans can be used to move between solutions. Furthermore, via a simple reparameterization, we can ensure that an arbitrary optimizer converges to the minimum $\ell_2$-norm solution with favourable generalization properties. For under-parameterized linear classification, optimizers can converge to different decision boundaries separating the data. We prove that for any such classifier, there exists a family of quadratic norms $\|\cdot\|_P$ such that the classifier's direction is the same as that of the maximum $P$-margin solution. We argue that analyzing convergence to the standard maximum $\ell_2$-margin is arbitrary and show that minimizing the norm induced by the data can result in better generalization. We validate our theoretical results via experiments on synthetic and real datasets.
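The over-parameterized regression claim can be illustrated with a small numerical sketch. It assumes the reparameterization w = Xᵀα, which is one standard way to keep the weights in the row span of the data; the abstract does not spell out its construction, so this may differ from what the paper actually does. Plain gradient descent stands in for "an arbitrary optimizer", and all names, sizes, and step sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20                      # fewer samples than features: over-parameterized
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def gradient_descent(grad, x0, step, iters=20000):
    """Plain gradient descent; stands in for an arbitrary optimizer."""
    x = x0.copy()
    for _ in range(iters):
        x = x - step * grad(x)
    return x

# Reference: the minimum l2-norm interpolating solution via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

# (a) Optimize w directly from a random start. The component of the start
# that lies in the null space of X is never changed by the gradient, so the
# limit interpolates the data but is not the minimum-norm solution.
L_w = np.linalg.norm(X, 2) ** 2   # largest eigenvalue of X^T X
w_direct = gradient_descent(
    lambda w: X.T @ (X @ w - y), rng.standard_normal(d), 1.0 / L_w
)

# (b) Assumed reparameterization w = X^T alpha, optimizing over alpha.
# Every iterate lies in the row span of X, and the only interpolating
# solution in that span is the minimum-norm one.
K = X @ X.T
L_a = np.linalg.norm(K, 2) ** 2   # largest eigenvalue of K^2
alpha = gradient_descent(
    lambda a: K @ (K @ a - y), rng.standard_normal(n), 1.0 / L_a
)
w_reparam = X.T @ alpha

print("norm of direct solution:         ", np.linalg.norm(w_direct))
print("norm of reparameterized solution:", np.linalg.norm(w_reparam))
print("norm of min-norm solution:       ", np.linalg.norm(w_min_norm))
```

Both runs fit the training data, but only the reparameterized one recovers the pseudoinverse solution from an arbitrary starting point. The direct run keeps whatever null-space component its initialization had.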

2019-12-31

(published)
