Variance Reduction

The Experts below are selected from a list of 64,206 Experts worldwide, ranked by the ideXlab platform

Eric Po Xing - One of the best experts on this subject based on the ideXlab platform.

  • Variance Reduction in Stochastic Gradient Langevin Dynamics
    2016
    Co-Authors: Kumar Avinava Dubey, Sashank J. Reddi, Barnabás Póczos, Alexander J. Smola, Sinead A Williamson, Eric Po Xing
    Abstract:

    Stochastic gradient-based Monte Carlo methods such as stochastic gradient Langevin dynamics are useful tools for posterior inference on large-scale datasets in many machine learning applications. These methods scale to large datasets by using noisy gradients calculated using a mini-batch or subset of the dataset. However, the high Variance inherent in these noisy gradients degrades performance and leads to slower mixing. In this paper, we present techniques for reducing Variance in stochastic gradient Langevin dynamics, yielding novel stochastic Monte Carlo methods that improve performance by reducing the Variance in the stochastic gradient. We show that our proposed method has better theoretical guarantees on convergence rate than stochastic gradient Langevin dynamics. This is complemented by impressive empirical results obtained on a variety of real-world datasets, and on four different machine learning tasks (regression, classification, independent component analysis and mixture modeling). These theoretical and empirical contributions combine to make a compelling case for using Variance Reduction in stochastic Monte Carlo methods.
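
    A minimal sketch of the idea (not the authors' exact algorithm): an SVRG-style control variate plugged into stochastic gradient Langevin dynamics for a toy Bayesian linear regression. All names, sizes and constants below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: Bayesian linear regression y ~ N(X w, 1) with prior w ~ N(0, I).
    N, d = 1000, 5
    X = rng.normal(size=(N, d))
    y = X @ rng.normal(size=d) + rng.normal(size=N)

    def grad_log_lik_i(w, i):
        # Gradient of log p(y_i | x_i, w) for one observation.
        return X[i] * (y[i] - X[i] @ w)

    def svrg_ld(T=2000, eps=1e-4, batch=10, snapshot_every=100):
        """SGLD whose minibatch gradient is variance-reduced with an SVRG-style snapshot."""
        w = np.zeros(d)
        samples = []
        for t in range(T):
            if t % snapshot_every == 0:
                w_snap = w.copy()
                # Occasional full-data gradient at the snapshot (the expensive pass).
                g_snap = sum(grad_log_lik_i(w_snap, i) for i in range(N))
            idx = rng.integers(0, N, size=batch)
            # Unbiased, lower-variance estimate: exact snapshot gradient plus a noisy correction.
            diff = sum(grad_log_lik_i(w, i) - grad_log_lik_i(w_snap, i) for i in idx)
            grad_est = -w + g_snap + (N / batch) * diff     # -w is the prior gradient
            # Langevin update with injected Gaussian noise.
            w = w + 0.5 * eps * grad_est + np.sqrt(eps) * rng.normal(size=d)
            samples.append(w.copy())
        return np.array(samples)

    post = svrg_ld()
    print("posterior mean estimate:", post[len(post) // 2:].mean(axis=0))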

  • Variance Reduction for Stochastic Gradient Optimization
    2013
    Co-Authors: Chong Wang, Alex Smola, Xi Chen, Eric Po Xing
    Abstract:

    Stochastic gradient optimization is a class of widely used algorithms for training machine learning models. To optimize an objective, it uses the noisy gradient computed from random data samples instead of the true gradient computed from the entire dataset. However, when the Variance of the noisy gradient is large, the algorithm might spend much time bouncing around, leading to slower convergence and worse performance. In this paper, we develop a general approach of using control variates for Variance Reduction in stochastic gradient optimization. Data statistics such as low-order moments (pre-computed or estimated online) are used to form the control variate. We demonstrate how to construct the control variate for two practical problems using stochastic gradient optimization. One is convex (the MAP estimation for logistic regression) and the other is non-convex (stochastic variational inference for latent Dirichlet allocation). On both problems, our approach shows faster convergence and better performance than the classical approach.
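
    A short sketch of a control-variate construction for the logistic-regression (MAP) case, using only precomputed low-order moments of the data. The exact form of the control variate below (a first-order Taylor expansion of the sigmoid around the data mean) is an assumption in the spirit of the abstract, not a transcription of the paper.

    import numpy as np

    rng = np.random.default_rng(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Toy logistic-regression data.
    N, d = 5000, 10
    X = rng.normal(size=(N, d))
    w_true = rng.normal(size=d)
    y = (sigmoid(X @ w_true) > rng.uniform(size=N)).astype(float)

    # Low-order data moments, precomputed once.
    x_bar = X.mean(axis=0)                 # first moment of x
    S = (X.T @ X) / N                      # second moment of x
    m_xy = (X * y[:, None]).mean(axis=0)   # mean of y_i * x_i

    def cv_gradient(w, idx):
        """Variance-reduced minibatch gradient of the mean log-likelihood.

        Control variate: Taylor-expand the sigmoid around w.x_bar, so its exact
        data mean only needs the moments above.  A sketch, not the paper's formulas.
        """
        Xb, yb = X[idx], y[idx]
        g = (Xb * (yb - sigmoid(Xb @ w))[:, None]).mean(axis=0)     # noisy gradient
        s = sigmoid(w @ x_bar)
        h = (Xb * (yb - s - s * (1 - s) * (Xb @ w - x_bar @ w))[:, None]).mean(axis=0)
        Eh = m_xy - s * x_bar - s * (1 - s) * (S @ w - (x_bar @ w) * x_bar)
        return g - h + Eh                                            # unbiased, lower variance

    # MAP estimation by stochastic gradient ascent with a Gaussian prior.
    w, lam, lr, batch = np.zeros(d), 1e-2, 0.5, 20
    for t in range(2000):
        idx = rng.integers(0, N, size=batch)
        w += lr * (cv_gradient(w, idx) - lam * w)
    print("estimated w:", np.round(w, 2))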

Mohammadreza Mohammadi - One of the best experts on this subject based on the ideXlab platform.

  • Dimension and Variance Reduction for Monte Carlo Methods for High-Dimensional Models in Finance
    2015
    Co-Authors: Duyminh Dang, Kenneth R Jackson, Mohammadreza Mohammadi
    Abstract:

    One-way coupling often occurs in multi-dimensional models in finance. In this paper, we present a dimension Reduction technique for Monte Carlo (MC) methods, referred to as drMC, that exploits this structure for pricing plain-vanilla European options under an N-dimensional one-way coupled model, where N is arbitrary. The dimension Reduction also often produces a significant Variance Reduction. The drMC method is a dimension Reduction technique built upon (i) the conditional MC technique applied to one dimension and (ii) the derivation of a closed-form solution, via Fourier transforms, for the conditional Partial Differential Equation (PDE) that arises. In the drMC approach, the option price can be computed simply by taking the expectation of this closed-form solution. Hence, the approach results in a powerful dimension Reduction from N to one, which often results in a significant Variance Reduction as well, since the Variance associated with the other (N-1) factors in the original model is completely removed from the drMC simulation. Moreover, under the drMC framework, hedging parameters, or Greeks, can be computed in a much more efficient way than in traditional MC techniques. A Variance Reduction analysis of the method is presented and numerical results illustrating the method's efficiency are provided.
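
    A toy illustration of the conditioning principle that drives drMC (this is not the drMC method itself): to estimate P(X + Y > c) with X ~ N(0, 1) independent of Y, one can integrate X out in closed form and simulate only Y, which removes all of the Variance contributed by X.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)

    c = 2.0
    M = 100_000
    Y = rng.exponential(scale=1.0, size=M)          # the factor we keep simulating
    X = rng.normal(size=M)                           # the factor we integrate out

    plain = np.mean(X + Y > c)                       # plain Monte Carlo
    cond = np.mean(1.0 - norm.cdf(c - Y))            # E[ P(X > c - Y | Y) ]: closed form in X

    # Both estimators are unbiased; the conditional one has variance
    # Var(E[1{X+Y>c} | Y]) <= Var(1{X+Y>c}), usually dramatically smaller.
    print(f"plain MC: {plain:.4f},  conditional MC: {cond:.4f}")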

Duyminh Dang - One of the best experts on this subject based on the ideXlab platform.

  • A Dimension and Variance Reduction Monte Carlo Method for Option Pricing Under Jump-Diffusion Models
    2017
    Co-Authors: Duyminh Dang, Kenneth R Jackson, Scott Sues
    Abstract:

    We develop a highly efficient MC method for computing plain vanilla European option prices and hedging parameters under a very general jump-diffusion option pricing model which includes stochastic Variance and multi-factor Gaussian interest short rate(s). The focus of our MC approach is Variance Reduction via dimension Reduction. More specifically, the option price is expressed as an expectation of a unique solution to a conditional Partial Integro-Differential Equation (PIDE), which is then solved using a Fourier transform technique. Important features of our approach are (1) the analytical tractability of the conditional PIDE is fully determined by that of the Black–Scholes–Merton model augmented with the same jump component as in our model, and (2) the Variances associated with all the interest rate factors are completely removed when evaluating the expectation via iterated conditioning applied to only the Brownian motion associated with the Variance factor. For certain cases when numerical methods are either needed or preferred, we propose a discrete fast Fourier transform method to numerically solve the conditional PIDE efficiently. Our method can also effectively compute hedging parameters. Numerical results show that the proposed method is highly efficient.
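
    A hedged toy analogue of the conditioning idea (not the paper's jump-diffusion model or its PIDE machinery): a Black-Scholes stock driven by a Brownian motion independent of a Vasicek short rate. Conditional on the rate path, the discounted payoff has the Black-Scholes closed form with the rate replaced by its path average, so only the rate factor needs to be simulated and the stock factor's Variance is removed. All parameters below are assumptions.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(3)

    def bs_call(S0, K, T, r, sigma):
        """Black-Scholes European call with constant rate r (the closed-form piece)."""
        d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
        d2 = d1 - sigma * np.sqrt(T)
        return S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

    S0, K, T, sigma = 100.0, 100.0, 1.0, 0.2
    r0, kappa, theta, eta = 0.03, 1.0, 0.05, 0.01     # Vasicek parameters (assumed)
    n_paths, n_steps = 20_000, 200
    dt = T / n_steps

    # Simulate only the rate factor (Euler scheme, left-endpoint sum for the integral).
    r = np.full(n_paths, r0)
    r_avg = np.zeros(n_paths)
    for _ in range(n_steps):
        r_avg += r * dt
        r = r + kappa * (theta - r) * dt + eta * np.sqrt(dt) * rng.normal(size=n_paths)
    r_avg /= T

    # Expectation of the conditional closed-form solution over the rate paths.
    cond_prices = bs_call(S0, K, T, r_avg, sigma)
    price = cond_prices.mean()
    stderr = cond_prices.std() / np.sqrt(n_paths)
    print(f"conditional-MC price: {price:.4f} +/- {stderr:.4f}")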

  • Dimension and Variance Reduction for Monte Carlo Methods for High-Dimensional Models in Finance
    2015
    Co-Authors: Duyminh Dang, Kenneth R Jackson, Mohammadreza Mohammadi
    Abstract:

    One-way coupling often occurs in multi-dimensional models in finance. In this paper, we present a dimension Reduction technique for Monte Carlo (MC) methods, referred to as drMC, that exploits this structure for pricing plain-vanilla European options under an N-dimensional one-way coupled model, where N is arbitrary. The dimension Reduction also often produces a significant Variance Reduction. The drMC method is a dimension Reduction technique built upon (i) the conditional MC technique applied to one dimension and (ii) the derivation of a closed-form solution, via Fourier transforms, for the conditional Partial Differential Equation (PDE) that arises. In the drMC approach, the option price can be computed simply by taking the expectation of this closed-form solution. Hence, the approach results in a powerful dimension Reduction from N to one, which often results in a significant Variance Reduction as well, since the Variance associated with the other (N-1) factors in the original model is completely removed from the drMC simulation. Moreover, under the drMC framework, hedging parameters, or Greeks, can be computed in a much more efficient way than in traditional MC techniques. A Variance Reduction analysis of the method is presented and numerical results illustrating the method's efficiency are provided.

Changyou Chen - One of the best experts on this subject based on the ideXlab platform.

  • A Convergence Analysis for a Class of Practical Variance-Reduction Stochastic Gradient MCMC
    2019
    Co-Authors: Changyou Chen, Wenlin Wang, Yizhe Zhang, Lawrence Carin
    Abstract:

    Stochastic gradient Markov chain Monte Carlo (SG-MCMC) has been developed as a flexible family of scalable Bayesian sampling algorithms. However, there has been little theoretical analysis of the impact of minibatch size on the algorithm’s convergence rate. In this paper, we prove that at the beginning of an SG-MCMC algorithm, i.e., under a limited computational budget/time, a larger minibatch size leads to a faster decrease of the mean squared error bound. This is due to the prominent noise in small minibatches when calculating stochastic gradients, which motivates the necessity of Variance Reduction in SG-MCMC for practical use. By borrowing ideas from stochastic optimization, we propose a simple and practical Variance-Reduction technique for SG-MCMC that is efficient in both computation and storage. More importantly, we develop the theory to prove that our algorithm induces a faster convergence rate than standard SG-MCMC. A number of large-scale experiments, ranging from Bayesian learning of logistic regression to deep neural networks, validate the theory and demonstrate the superiority of the proposed Variance-Reduction SG-MCMC framework.
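
    A quick empirical check of the mechanism behind the minibatch-size result, in a toy least-squares setting rather than the paper's experiments: the variance of a minibatch gradient of a finite sum scales roughly as 1/m, so early on, when stochastic noise dominates the error, larger minibatches help.

    import numpy as np

    rng = np.random.default_rng(4)

    # Toy finite-sum problem and an arbitrary current iterate w.
    N, d = 10_000, 20
    X = rng.normal(size=(N, d))
    y = X @ rng.normal(size=d) + rng.normal(size=N)
    w = rng.normal(size=d)

    per_example = X * (X @ w - y)[:, None]              # per-example gradients of 0.5*(x.w - y)^2
    full = per_example.mean(axis=0)

    for m in (1, 10, 100, 1000):
        est = np.array([per_example[rng.integers(0, N, size=m)].mean(axis=0)
                        for _ in range(500)])
        mse = np.mean(np.sum((est - full) ** 2, axis=1))
        print(f"minibatch m={m:5d}: E||g_hat - g||^2 = {mse:.4f}")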

  • Variance Reduction in Stochastic Particle-Optimization Sampling
    2018
    Co-Authors: Jianyi Zhang, Yang Zhao, Changyou Chen
    Abstract:

    Stochastic particle-optimization sampling (SPOS) is a recently developed scalable Bayesian sampling framework that unifies stochastic gradient MCMC (SG-MCMC) and Stein variational gradient descent (SVGD) algorithms based on Wasserstein gradient flows. With a rigorous non-asymptotic convergence theory developed recently, SPOS avoids the particle-collapsing pitfall of SVGD. Nevertheless, Variance Reduction in SPOS has never been studied. In this paper, we bridge the gap by presenting several Variance-Reduction techniques for SPOS. Specifically, we propose three variants of Variance-reduced SPOS, called SAGA particle-optimization sampling (SAGA-POS), SVRG particle-optimization sampling (SVRG-POS), and a variant of SVRG-POS which avoids full gradient computations, denoted SVRG-POS+. Importantly, we provide non-asymptotic convergence guarantees for these algorithms in terms of the 2-Wasserstein metric and analyze their complexities. Remarkably, the results show our algorithms yield better convergence rates than existing Variance-reduced variants of stochastic Langevin dynamics, even though more space is required to store the particles in training. Our theory aligns well with experimental results on both synthetic and real datasets.
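
    For concreteness, a generic SAGA-style variance-reduced gradient estimator of the kind named in SAGA-POS, shown on a plain finite-sum least-squares problem; the particle-sampling machinery (SVGD/SPOS) is omitted, so this is only the estimator's bookkeeping, not the proposed algorithm. Note the extra storage: one gradient per data point, matching the abstract's remark about space.

    import numpy as np

    rng = np.random.default_rng(5)

    N, d = 2000, 10
    X = rng.normal(size=(N, d))
    y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=N)

    def grad_i(w, i):
        return X[i] * (X[i] @ w - y[i])                 # gradient of 0.5*(x_i.w - y_i)^2

    w = np.zeros(d)
    table = np.array([grad_i(w, i) for i in range(N)])  # stored per-example gradients
    table_mean = table.mean(axis=0)
    lr = 0.005

    for t in range(60_000):
        j = rng.integers(N)
        g_j = grad_i(w, j)
        # SAGA estimate: fresh gradient minus its stale copy, plus the running mean.
        g_est = g_j - table[j] + table_mean
        w -= lr * g_est
        # Update the table and its mean in O(d).
        table_mean += (g_j - table[j]) / N
        table[j] = g_j

    print("residual norm:", np.linalg.norm(X @ w - y) / np.sqrt(N))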

  • A Convergence Analysis for a Class of Practical Variance-Reduction Stochastic Gradient MCMC
    2017
    Co-Authors: Changyou Chen, Wenlin Wang, Yizhe Zhang, Lawrence Carin
    Abstract:

    Stochastic gradient Markov Chain Monte Carlo (SG-MCMC) has been developed as a flexible family of scalable Bayesian sampling algorithms. However, there has been little theoretical analysis of the impact of minibatch size on the algorithm's convergence rate. In this paper, we prove that under a limited computational budget/time, a larger minibatch size leads to a faster decrease of the mean squared error bound (thus the fastest one corresponds to using full gradients), which motivates the necessity of Variance Reduction in SG-MCMC. Consequently, by borrowing ideas from stochastic optimization, we propose a practical Variance-Reduction technique for SG-MCMC that is efficient in both computation and storage. We develop theory to prove that our algorithm induces a faster convergence rate than standard SG-MCMC. A number of large-scale experiments, ranging from Bayesian learning of logistic regression to deep neural networks, validate the theory and demonstrate the superiority of the proposed Variance-Reduction SG-MCMC framework.

Francis Bach - One of the best experts on this subject based on the ideXlab platform.

  • Dual-Free Stochastic Decentralized Optimization with Variance Reduction
    2020
    Co-Authors: Hadrien Hendrikx, Francis Bach, Laurent Massoulie
    Abstract:

    We consider the problem of training machine learning models on distributed data in a decentralized way. For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with Variance Reduction. Yet, existing decentralized stochastic algorithms either do not obtain the full speedup allowed by stochastic updates, or require oracles that are more expensive than regular gradients. In this work, we introduce a Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only requires computing stochastic gradients of the local functions, and is computationally as fast as a standard stochastic Variance-reduced algorithm run on a 1/n fraction of the dataset, where n is the number of nodes. To derive DVR, we use Bregman coordinate descent on a well-chosen dual problem, and obtain a dual-free algorithm using a specific Bregman divergence. We give an accelerated version of DVR based on the Catalyst framework, and illustrate its effectiveness with simulations on real data.
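
    A generic sketch of the decentralized setting only (this is not DVR and does not use its dual/Bregman derivation): n nodes each hold a shard of a least-squares problem, take local SVRG-style variance-reduced steps on their own data, and average parameters with ring neighbours through a doubly stochastic gossip matrix. All sizes and step sizes are assumptions.

    import numpy as np

    rng = np.random.default_rng(6)

    n_nodes, per_node, d = 8, 250, 5
    X = rng.normal(size=(n_nodes, per_node, d))
    w_true = rng.normal(size=d)
    y = np.einsum('npd,d->np', X, w_true) + 0.1 * rng.normal(size=(n_nodes, per_node))

    # Symmetric, doubly stochastic gossip matrix for a ring topology.
    W = np.zeros((n_nodes, n_nodes))
    for i in range(n_nodes):
        W[i, i] = 0.5
        W[i, (i - 1) % n_nodes] = 0.25
        W[i, (i + 1) % n_nodes] = 0.25

    theta = np.zeros((n_nodes, d))
    lr = 0.005
    for epoch in range(100):
        # Local SVRG snapshot: a full gradient over each node's own shard only.
        snap = theta.copy()
        g_snap = np.einsum('npd,np->nd', X, np.einsum('npd,nd->np', X, snap) - y) / per_node
        for _ in range(per_node):
            j = rng.integers(per_node)   # (for brevity, all nodes draw the same local index)
            resid = np.einsum('nd,nd->n', X[:, j, :], theta) - y[:, j]
            resid_snap = np.einsum('nd,nd->n', X[:, j, :], snap) - y[:, j]
            g = X[:, j, :] * (resid - resid_snap)[:, None] + g_snap
            theta = W @ (theta - lr * g)          # local step followed by gossip averaging

    consensus = theta.mean(axis=0)
    w_star = np.linalg.lstsq(X.reshape(-1, d), y.reshape(-1), rcond=None)[0]
    print("node disagreement:", np.linalg.norm(theta - consensus, axis=1).max())
    print("distance to centralized solution:", np.linalg.norm(consensus - w_star))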

  • Stochastic Variance Reduction Methods for Saddle-Point Problems
    2016
    Co-Authors: Palaniappan Balamurugan, Francis Bach
    Abstract:

    We consider convex-concave saddle-point problems where the objective functions may be split in many components, and extend recent stochastic Variance Reduction methods (such as SVRG or SAGA) to provide the first large-scale linearly convergent algorithms for this class of problems which is common in machine learning. While the algorithmic extension is straightforward, it comes with challenges and opportunities: (a) the convex minimization analysis does not apply and we use the notion of monotone operators to prove convergence, showing in particular that the same algorithm applies to a larger class of problems, such as variational inequalities, (b) there are two notions of splits, in terms of functions, or in terms of partial derivatives, (c) the split does need to be done with convex-concave terms, (d) non-uniform sampling is key to an efficient algorithm, both in theory and practice, and (e) these incremental algorithms can be easily accelerated using a simple extension of the "catalyst" framework, leading to an algorithm which is always superior to accelerated batch algorithms.
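
    A hedged toy sketch of the core idea (not the paper's algorithm, which uses proximal steps and non-uniform sampling): SVRG-style variance reduction applied to the monotone operator of a regularized bilinear saddle-point problem, iterated with a plain forward step.

    import numpy as np

    rng = np.random.default_rng(7)

    # Toy problem: min_x max_y (1/n) sum_i x^T A_i y + (mu/2)||x||^2 - (mu/2)||y||^2,
    # whose unique saddle point is (0, 0).
    n, d = 500, 10
    A = rng.normal(size=(n, d, d)) / np.sqrt(d)
    A_bar = A.mean(axis=0)
    mu = 1.0

    def op_i(x, y, i):
        # The i-th component of the saddle-point (monotone) operator (grad_x, -grad_y).
        return np.concatenate([A[i] @ y + mu * x, -A[i].T @ x + mu * y])

    x, y = np.ones(d), np.ones(d)
    lr = 0.05
    for _ in range(50):                                   # outer epochs
        xs, ys = x.copy(), y.copy()                       # SVRG snapshot
        F_snap = np.concatenate([A_bar @ ys + mu * xs, -A_bar.T @ xs + mu * ys])
        for _ in range(n):
            i = rng.integers(n)
            F_est = op_i(x, y, i) - op_i(xs, ys, i) + F_snap   # variance-reduced operator estimate
            x = x - lr * F_est[:d]
            y = y - lr * F_est[d:]

    print("distance to the saddle point:", np.linalg.norm(np.concatenate([x, y])))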
