Eduard Gorbunov

High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise.

Eduard Gorbunov

Abdurakhmon Sadiev

Marina Danilova

Samuel Horváth

Gauthier Gidel

Pavel Dvurechensky

Alexander Gasnikov

Peter Richtárik

2024-01-01

International Conference on Machine Learning (published)

doi.org

openreview.net

High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise

Eduard Gorbunov

Abdurakhmon Sadiev

Marina Danilova

Samuel Horváth

Gauthier Gidel

Pavel Dvurechensky

Alexander Gasnikov

Peter Richtárik

2023-10-03

ArXiv (preprint)

doi.org

openreview.net

Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity

Eduard Gorbunov

Adrien Taylor

Samuel Horváth

Gauthier Gidel

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance

Abdurakhmon Sadiev

Marina Danilova

Eduard Gorbunov

Samuel Horváth

Gauthier Gidel

Pavel Dvurechensky

Alexander Gasnikov

Peter Richtárik

During the recent years the interest of optimization and machine learning communities in high-probability convergence of stochastic optimiza… (see more)tion methods has been growing. One of the main reasons for this is that high-probability complexity bounds are more accurate and less studied than in-expectation ones. However, SOTA high-probability non-asymptotic convergence results are derived under strong assumptions such as boundedness of the gradient noise variance or of the objective’s gradient itself. In this paper, we propose several algorithms with high-probability convergence results under less restrictive assumptions. In particular, we derive new high-probability convergence results under the assumption that the gradient/operator noise has bounded central

2023-07-03

Proceedings of the 40th International Conference on Machine Learning (published)

doi.org

openreview.net

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance

Abdurakhmon Sadiev

Marina Danilova

Eduard Gorbunov

Samuel Horváth

Gauthier Gidel

Pavel Dvurechensky

Alexander Gasnikov

Peter Richtárik

During recent years the interest of optimization and machine learning communities in high-probability convergence of stochastic optimization… (see more) methods has been growing. One of the main reasons for this is that high-probability complexity bounds are more accurate and less studied than in-expectation ones. However, SOTA high-probability non-asymptotic convergence results are derived under strong assumptions such as the boundedness of the gradient noise variance or of the objective's gradient itself. In this paper, we propose several algorithms with high-probability convergence results under less restrictive assumptions. In particular, we derive new high-probability convergence results under the assumption that the gradient/operator noise has bounded central

2023-04-24

ICML.cc/2023/Conference (poster)

doi.org

openreview.net

Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity

Eduard Gorbunov

Adrien Taylor

Samuel Horváth

Gauthier Gidel

Algorithms for min-max optimization and variational inequalities are often studied under monotonicity assumptions. Motivated by non-monotone… (see more) machine learning applications, we follow the line of works (Diakonikolas et al., 2021; Lee & Kim, 2021; Pethick et al., 2022; Bohm,2022) aiming at going beyond monotonicity by considering the weaker *negative comonotonicity* assumption. In this work, we provide tight complexity analyses for the Proximal Point (PP), Extragradient (EG), and Optimistic Gradient (OG) methods in this setup, closing several questions on their working guarantees beyond monotonicity. In particular, we derive the first non-asymptotic convergence rates for PP under negative comonotonicity and star-negative comonotonicity and show their tightness via constructing worst-case examples; we also relax the assumptions for the last-iterate convergence guarantees for EG and OG and prove the tightness of the existing best-iterate guarantees for EG and OG via constructing counter-examples.

2023-01-01

ICML (published)

openreview.net

Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top

Eduard Gorbunov

Samuel Horváth

Peter Richtárik

Gauthier Gidel

Byzantine-robustness has been gaining a lot of attention due to the growth of the interest in collaborative and federated learning. However,… (see more) many fruitful directions, such as the usage of variance reduction for achieving robustness and communication compression for reducing communication costs, remain weakly explored in the field. This work addresses this gap and proposes Byz-VR-MARINA - a new Byzantine-tolerant method with variance reduction and compression. A key message of our paper is that variance reduction is key to fighting Byzantine workers more effectively. At the same time, communication compression is a bonus that makes the process more communication efficient. We derive theoretical convergence guarantees for Byz-VR-MARINA outperforming previous state-of-the-art for general non-convex and Polyak-Lojasiewicz loss functions. Unlike the concurrent Byzantine-robust methods with variance reduction and/or compression, our complexity results are tight and do not rely on restrictive assumptions such as boundedness of the gradients or limited compression. Moreover, we provide the first analysis of a Byzantine-tolerant method supporting non-uniform sampling of stochastic gradients. Numerical experiments corroborate our theoretical findings.

2023-01-01

International Conference on Learning Representations (published)

doi.org

openreview.net

Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise

Eduard Gorbunov

Marina Danilova

David Dobre

Pavel Dvurechensky

Alexander Gasnikov

Gauthier Gidel

openreview.net