Subgradient


The Experts below are selected from a list of 12192 Experts worldwide ranked by ideXlab platform

Lin Xiao - One of the best experts on this subject based on the ideXlab platform.

  • Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
    Journal of Machine Learning Research, 2010
    Co-Authors: Lin Xiao
    Abstract:

    We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term, such as the l1-norm, for promoting sparsity. We develop extensions of Nesterov's dual averaging method that can exploit the regularization structure in an online setting. At each iteration of these methods, the learning variables are adjusted by solving a simple minimization problem that involves the running average of all past subgradients of the loss function and the whole regularization term, not just its subgradient. In the case of l1-regularization, our method is particularly effective in obtaining sparse solutions. We show that these methods achieve the optimal convergence rates or regret bounds that are standard in the literature on stochastic and online convex optimization. For stochastic learning problems in which the loss functions have Lipschitz continuous gradients, we also present an accelerated version of the dual averaging method.
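Under l1-regularization, the RDA update described above has a simple closed form: it soft-thresholds the running average of past subgradients, so coordinates whose averaged subgradient stays below the threshold are set exactly to zero. A minimal sketch, assuming the simple variant with auxiliary term beta_t = gamma * sqrt(t); the function name and toy numbers are illustrative:

```python
import numpy as np

def l1_rda_step(g_bar, t, lam, gamma):
    """One l1-RDA update with beta_t = gamma * sqrt(t).

    g_bar : running average of subgradients g_1, ..., g_t of the loss
    t     : iteration count (t >= 1)
    lam   : l1 regularization weight
    gamma : step-size constant
    """
    # Soft-threshold the averaged subgradient at level lam ...
    shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
    # ... then scale; thresholded coordinates are exactly zero.
    return -(np.sqrt(t) / gamma) * shrunk

g_bar = np.array([0.05, -0.8, 0.3])
x = l1_rda_step(g_bar, t=4, lam=0.1, gamma=1.0)   # first coordinate is exactly 0
```

Note how the whole regularizer, not just one of its subgradients, enters the update: the exact thresholding comes from minimizing the l1 term in closed form, which is what produces sparse iterates.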

  • Dual Averaging Method for Regularized Stochastic Learning and Online Optimization
    Neural Information Processing Systems, 2009
    Co-Authors: Lin Xiao
    Abstract:

    We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term, such as the l1-norm, for promoting sparsity. We develop a new online algorithm, the regularized dual averaging (RDA) method, that can explicitly exploit the regularization structure in an online setting. In particular, at each iteration, the learning variables are adjusted by solving a simple optimization problem that involves the running average of all past subgradients of the loss functions and the whole regularization term, not just its subgradient. Computational experiments show that the RDA method can be very effective for sparse online learning with l1-regularization.

Jason D Lee - One of the best experts on this subject based on the ideXlab platform.

  • Stochastic Subgradient Method Converges on Tame Functions
    Foundations of Computational Mathematics, 2020
    Co-Authors: Damek Davis, Dmitriy Drusvyatskiy, Sham M Kakade, Jason D Lee
    Abstract:

    This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity? We prove that the stochastic subgradient method, on any semialgebraic locally Lipschitz function, produces limit points that are all first-order stationary. More generally, our result applies to any function with a Whitney stratifiable graph. In particular, this work endows the stochastic subgradient method, and its proximal extension, with rigorous convergence guarantees for a wide class of problems arising in data science, including all popular deep learning architectures.
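A minimal numerical sketch of this setting (not the paper's experiments; the function and constants are our illustrative choices): run the stochastic subgradient method on f(x) = ||x| - 1|, which is nonsmooth, nonconvex, and semialgebraic (hence tame), and whose first-order stationary points are exactly {-1, 0, 1}.

```python
import numpy as np

# f(x) = ||x| - 1| is nonsmooth, nonconvex, and semialgebraic;
# its first-order stationary points are exactly {-1, 0, 1}.
rng = np.random.default_rng(0)
x = 2.5
for k in range(1, 50001):
    g = np.sign(x) * np.sign(abs(x) - 1.0)   # a Clarke subgradient of f at x
    g += rng.normal(scale=0.1)               # stochastic noise in the oracle
    x -= (0.5 / np.sqrt(k)) * g              # diminishing step sizes

# the iterate settles near one of the stationary points
dist_to_stationary = min(abs(x - 1.0), abs(x + 1.0), abs(x))
```

With the diminishing steps, the limit points of the iterate sequence are first-order stationary, exactly the guarantee the theorem provides for this function class.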

Jorge Cortes - One of the best experts on this subject based on the ideXlab platform.

  • Distributed Saddle-Point Subgradient Algorithms with Laplacian Averaging
    IEEE Transactions on Automatic Control, 2017
    Co-Authors: David Mateos-Núñez, Jorge Cortes
    Abstract:

    We present distributed subgradient methods for min-max problems with agreement constraints on a subset of the arguments of both the convex and concave parts. Applications include constrained minimization problems where each constraint is a sum of convex functions in the local variables of the agents. In the latter case, the proposed algorithm reduces to primal-dual updates using local subgradients and Laplacian averaging on local copies of the multipliers associated with the global constraints. For the case of general convex-concave saddle-point problems, our analysis establishes the convergence of the running time-averages of the local estimates to a saddle point under periodic connectivity of the communication digraphs. Specifically, choosing the gradient step-sizes in a suitable way, we show that the evaluation error is proportional to $1/\sqrt{t}$, where $t$ is the iteration step. We illustrate our results in simulation for an optimization scenario with nonlinear constraints coupling the decisions of agents that cannot communicate directly.
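The primal-dual structure can be sketched on a two-agent toy problem (all problem data, step sizes, and variable names below are our illustrative choices, not the paper's): minimize (x1-1)^2 + (x2-1)^2 subject to x1 + x2 <= 1, written as sum_i g_i(x_i) <= 0 with g_i(x) = x - 0.5. Each agent keeps a local copy z_i of the multiplier and mixes it with its neighbor's copy through the graph Laplacian.

```python
import numpy as np

# Two agents on a complete graph; Laplacian L = [[1, -1], [-1, 1]].
L = np.array([[1.0, -1.0], [-1.0, 1.0]])
a = np.array([1.0, 1.0])     # minimize sum_i (x_i - a_i)^2  s.t.  x1 + x2 <= 1
x = np.zeros(2)              # local primal variables
z = np.zeros(2)              # local copies of the multiplier
x_avg = np.zeros(2)          # running time-average of the local estimates
for t in range(1, 20001):
    eta = 0.5 / np.sqrt(t)                  # diminishing step size
    gx = 2.0 * (x - a) + z                  # subgradients of the local Lagrangians
    gz = x - 0.5                            # local constraint values g_i(x_i)
    x = x - eta * gx                        # primal descent
    z = np.maximum(z - 0.5 * (L @ z) + eta * gz, 0.0)  # Laplacian averaging + dual ascent
    x_avg += (x - x_avg) / t
# x_avg approaches the solution (0.5, 0.5); z approaches the multiplier value 1
```

The Laplacian term -0.5 * (L @ z) is what drives the local multiplier copies to agreement; with a single global constraint and all-to-all communication, it reduces to averaging the two copies at each step.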

  • Distributed Subgradient Methods for Saddle-Point Problems
    Conference on Decision and Control, 2015
    Co-Authors: David Mateos-Núñez, Jorge Cortes
    Abstract:

    We present provably correct distributed subgradient methods for general min-max problems with agreement constraints on a subset of the arguments of both the convex and concave parts. Applications include separable constrained minimization problems where each constraint is a sum of convex functions of local variables for the agents. The proposed algorithm then reduces to primal-dual updates using local subgradients and Laplacian averaging on local copies of the multipliers associated with the global constraints. The framework also encodes minimization problems with semidefinite constraints, which results in novel distributed strategies that are scalable if the order of the matrix inequalities is independent of the network size. Our analysis establishes, for the case of general convex-concave functions, the convergence of the running time-averages of the local estimates to a saddle point under periodic connectivity of the communication digraphs. Specifically, choosing the gradient step-sizes in a suitable way, we show that the evaluation error is proportional to 1/√t, where t is the iteration step.

Damek Davis - One of the best experts on this subject based on the ideXlab platform.

  • Stochastic Subgradient Method Converges on Tame Functions
    Foundations of Computational Mathematics, 2020
    Co-Authors: Damek Davis, Dmitriy Drusvyatskiy, Sham M Kakade, Jason D Lee
    Abstract:

    This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity? We prove that the stochastic subgradient method, on any semialgebraic locally Lipschitz function, produces limit points that are all first-order stationary. More generally, our result applies to any function with a Whitney stratifiable graph. In particular, this work endows the stochastic subgradient method, and its proximal extension, with rigorous convergence guarantees for a wide class of problems arising in data science, including all popular deep learning architectures.

  • Subgradient Methods for Sharp Weakly Convex Functions
    Journal of Optimization Theory and Applications, 2018
    Co-Authors: Damek Davis, Dmitriy Drusvyatskiy, Kellie J. Macphee, Courtney Paquette
    Abstract:

    Subgradient methods converge linearly on a convex function that grows sharply away from its solution set. In this work, we show that the same is true for sharp functions that are only weakly convex, provided that the subgradient methods are initialized within a fixed tube around the solution set. A variety of statistical and signal processing tasks come equipped with good initialization and provably lead to formulations that are both weakly convex and sharp. Therefore, in such settings, subgradient methods can serve as inexpensive local search procedures. We illustrate the proposed techniques on phase retrieval and covariance estimation problems.
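A minimal sketch of the phase-retrieval illustration under an assumed setup (Gaussian measurements, known minimal value f* = 0, and a Polyak-type step; the data sizes and names are ours, not the paper's): the loss f(x) = mean |(a_i @ x)^2 - b_i| is weakly convex and sharp around the solution set {x_true, -x_true}, and with a good initialization the subgradient method contracts to it rapidly.

```python
import numpy as np

rng = np.random.default_rng(1)
m, d = 200, 10
A = rng.normal(size=(m, d))
x_true = rng.normal(size=d)
b = (A @ x_true) ** 2                    # phase-less (sign-less) measurements

def f_and_subgrad(x):
    """Robust phase-retrieval loss f(x) = mean |(a_i @ x)^2 - b_i| and a subgradient."""
    r = (A @ x) ** 2 - b
    g = (2.0 / m) * (A.T @ (np.sign(r) * (A @ x)))
    return np.mean(np.abs(r)), g

x = x_true + 0.1 * rng.normal(size=d)    # good initialization, inside the "tube"
for _ in range(2000):
    f, g = f_and_subgrad(x)
    gg = g @ g
    if f == 0.0 or gg == 0.0:
        break
    x = x - (f / gg) * g                 # Polyak step: the minimal value 0 is known

# distance to the solution set {x_true, -x_true}
dist = min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true))
```

The sign ambiguity is intrinsic: the measurements (a_i @ x)^2 cannot distinguish x_true from -x_true, which is why distance is measured to the pair.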

Karl Henrik Johansson - One of the best experts on this subject based on the ideXlab platform.

  • Subgradient Methods and Consensus Algorithms for Solving Convex Optimization Problems
    Conference on Decision and Control, 2008
    Co-Authors: Björn Johansson, Tamas Keviczky, Mikael Johansson, Karl Henrik Johansson
    Abstract:

    In this paper we propose a subgradient method for solving coupled optimization problems in a distributed way, given restrictions on the communication topology. The iterative procedure maintains local variables at each node and relies on local subgradient updates in combination with a consensus process. The local subgradient steps are applied simultaneously, as opposed to the standard sequential or cyclic procedure. We study convergence properties of the proposed scheme using results from consensus theory and approximate subgradient methods. The framework is illustrated on an optimal distributed finite-time rendezvous problem.
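A small sketch of the scheme (the graph, weights, and cost functions are our illustrative choices): three nodes on a path graph each hold f_i(x) = |x - c_i|, mix their estimates with doubly stochastic Metropolis weights, and take local subgradient steps simultaneously; all estimates approach the network-wide minimizer of sum_i f_i, the median of the c_i.

```python
import numpy as np

# Path graph 1 - 2 - 3 with Metropolis consensus weights (doubly stochastic).
W = np.array([[2/3, 1/3, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 1/3, 2/3]])
c = np.array([0.0, 1.0, 4.0])   # f_i(x) = |x - c_i|; the sum is minimized at the median, 1
x = np.zeros(3)                  # local estimates, one per node
for k in range(1, 10001):
    g = np.sign(x - c)           # local subgradient steps, applied simultaneously
    x = W @ x - (1.0 / np.sqrt(k)) * g   # consensus mixing + diminishing-step update
# all local estimates end up close to 1, the minimizer of sum_i f_i
```

Note that node 3 alone would drive its estimate to c_3 = 4; it is the consensus mixing through W that pulls every local variable toward the global minimizer.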
