Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak–Łojasiewicz condition.

Item request has been placed! ×
Item request cannot be made. ×
loading   Processing Request
  • Additional Information
    • Abstract:
      In this work, we establish the linear convergence estimate for the gradient descent involving the delay τ ∈ ℕ when the cost function is μ -strongly convex and L -smooth. This result improves upon the well-known estimates in [Y. Arjevani, O. Shamir and N. Srebro, A tight convergence analysis for stochastic gradient descent with delayed updates, Proc. Mach. Learn. Res. 117 (2020) 111–132; S. U. Stich and S. P. Karimireddy, The error-feedback framework: Better rates for SGD with delayed gradients and compressed updates, J. Mach. Learn. Res. 21(1) (2020) 9613–9648] in the sense that it is non-ergodic and is still established in spite of weaker constraint of cost function. Also, the range of learning rate η can be extended from η ≤ 1 / (1 0 L τ) to η ≤ 1 / (4 L τ) for τ = 1 and η ≤ 3 / (1 0 L τ) for τ ≥ 2 , where L > 0 is the Lipschitz continuity constant of the gradient of cost function. In a further research, we show the linear convergence of cost function under the Polyak–Łojasiewicz (PL) condition, for which the available choice of learning rate is further improved as η ≤ 9 / (1 0 L τ) for the large delay τ. The framework of the proof for this result is also extended to the stochastic gradient descent with time-varying delay under the PL condition. Finally, some numerical experiments are provided in order to confirm the reliability of the analyzed results. [ABSTRACT FROM AUTHOR]
    • Abstract:
      Copyright of Analysis & Applications is the property of World Scientific Publishing Company and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)