Latent Variable Models

The Experts below are selected from a list of 19,803 Experts worldwide, ranked by the ideXlab platform

Ryan P Adams - One of the best experts on this subject based on the ideXlab platform.

  • SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
    International Conference on Learning Representations, 2020
    Co-Authors: Yucen Luo, Ryan P Adams, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ricky T. Q. Chen
    Abstract:

    The standard variational lower bounds used to train Latent Variable Models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for Latent Variable Models based on randomized truncation of an infinite series. If parameterized by an encoder-decoder architecture, the parameters of the encoder can be optimized to minimize the variance of this estimator. We show that Models trained using our estimator give better test-set likelihoods than a standard importance-sampling-based approach for the same average computational cost. This estimator also allows the use of Latent Variable Models for tasks where unbiased estimators, rather than marginal likelihood lower bounds, are preferred, such as minimizing reverse KL divergences and estimating score functions.
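
    A small sketch of the randomized-truncation ("Russian roulette") idea behind such an estimator may help: write log p(x) as a single-sample importance-weighted bound plus an infinite series of increments, truncate the series at a random index K, and divide each kept term by P(K >= k) so the expectation is unchanged. The toy model, proposal, and truncation distribution below are illustrative choices of the editor, not the paper's exact SUMO construction.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy model: z ~ N(0, 1), x | z ~ N(z, 1), so the exact marginal is x ~ N(0, 2).
    def log_joint(x, z):
        return -0.5 * z**2 - 0.5 * (x - z)**2 - np.log(2.0 * np.pi)

    def log_q(z, x):
        # A deliberately imperfect "encoder" / proposal: q(z | x) = N(x / 2, 1).
        return -0.5 * (z - 0.5 * x)**2 - 0.5 * np.log(2.0 * np.pi)

    def rr_log_marginal(x):
        # Random truncation level with P(K >= k) = 1 / k.
        K = int(np.floor(1.0 / (1.0 - rng.uniform())))
        z = rng.normal(0.5 * x, 1.0, size=K + 1)          # samples shared across terms
        log_w = log_joint(x, z) - log_q(z, x)
        # cum[j] is the (j + 1)-sample importance-weighted (IWAE-style) estimate.
        cum = np.logaddexp.accumulate(log_w) - np.log(np.arange(1, K + 2))
        increments = np.diff(cum)                         # telescoping series terms
        weights = np.arange(1, K + 1)                     # 1 / P(K >= k) = k
        return cum[0] + np.sum(weights * increments)

    x = 1.3
    exact = -0.25 * x**2 - 0.5 * np.log(4.0 * np.pi)      # log N(x; 0, 2)
    draws = np.array([rr_log_marginal(x) for _ in range(50000)])
    # The sample mean should land near the exact value; individual draws are noisy
    # and heavy-tailed, which is why the choice of truncation distribution matters.
    print(exact, draws.mean())
    ```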

  • SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
    arXiv: Learning, 2020
    Co-Authors: Yucen Luo, Ryan P Adams, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ricky T. Q. Chen
    Abstract:

    Standard variational lower bounds used to train Latent Variable Models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for Latent Variable Models based on randomized truncation of an infinite series. If parameterized by an encoder-decoder architecture, the parameters of the encoder can be optimized to minimize the variance of this estimator. We show that Models trained using our estimator give better test-set likelihoods than a standard importance-sampling-based approach for the same average computational cost. This estimator also allows the use of Latent Variable Models for tasks where unbiased estimators, rather than marginal likelihood lower bounds, are preferred, such as minimizing reverse KL divergences and estimating score functions.

  • Priors for Diversity in Generative Latent Variable Models
    Neural Information Processing Systems, 2012
    Co-Authors: James T. Kwok, Ryan P Adams
    Abstract:

    Probabilistic Latent Variable Models are one of the cornerstones of machine learning. They offer a convenient and coherent way to specify prior distributions over unobserved structure in data, so that these unknown properties can be inferred via posterior inference. Such Models are useful for exploratory analysis and visualization, for building density Models of data, and for providing features that can be used for later discriminative tasks. A significant limitation of these Models, however, is that draws from the prior are often highly redundant due to i.i.d. assumptions on internal parameters. For example, there is no preference in the prior of a mixture model to make components non-overlapping, or in a topic model to ensure that co-occurring words only appear in a small number of topics. In this work, we revisit these independence assumptions for probabilistic Latent Variable Models, replacing the underlying i.i.d. prior with a determinantal point process (DPP). The DPP allows us to specify a preference for diversity in our Latent Variables using a positive definite kernel function. Using a kernel between probability distributions, we are able to define a DPP on probability measures. We show how to perform MAP inference with DPP priors in Latent Dirichlet allocation and in mixture Models, leading to better intuition for the Latent Variable representation and quantitatively improved unsupervised feature extraction, without compromising the generative aspects of the model.
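
    To make the effect of a DPP prior concrete: the unnormalized log-density of a set of components is the log-determinant of a kernel matrix evaluated between them, which collapses toward minus infinity as components become near-duplicates; in MAP inference this term is simply added to the model's log-likelihood. The snippet below is a small numerical illustration with an RBF kernel on mixture-component means; the paper itself defines the kernel between probability distributions, so treat this as a simplified stand-in.

    ```python
    import numpy as np

    def rbf_kernel(means, lengthscale=1.0):
        # Pairwise squared distances between component means, mapped through an RBF.
        d2 = np.sum((means[:, None, :] - means[None, :, :])**2, axis=-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    def dpp_log_prior(means):
        # Unnormalized DPP log-density: log det of the kernel matrix over components.
        sign, logdet = np.linalg.slogdet(rbf_kernel(means))
        return logdet

    clumped = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])   # nearly overlapping means
    spread  = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])   # well-separated means

    print(dpp_log_prior(clumped))   # strongly negative: rows nearly identical, det ~ 0
    print(dpp_log_prior(spread))    # near 0: kernel matrix close to identity, det ~ 1
    ```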

Jun Zhu - One of the best experts on this subject based on the ideXlab platform.

  • SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
    International Conference on Learning Representations, 2020
    Co-Authors: Yucen Luo, Ryan P Adams, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ricky T. Q. Chen
    Abstract:

    The standard variational lower bounds used to train Latent Variable Models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for Latent Variable Models based on randomized truncation of an infinite series. If parameterized by an encoder-decoder architecture, the parameters of the encoder can be optimized to minimize the variance of this estimator. We show that Models trained using our estimator give better test-set likelihoods than a standard importance-sampling-based approach for the same average computational cost. This estimator also allows the use of Latent Variable Models for tasks where unbiased estimators, rather than marginal likelihood lower bounds, are preferred, such as minimizing reverse KL divergences and estimating score functions.

  • SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
    arXiv: Learning, 2020
    Co-Authors: Yucen Luo, Ryan P Adams, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ricky T. Q. Chen
    Abstract:

    Standard variational lower bounds used to train Latent Variable Models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for Latent Variable Models based on randomized truncation of an infinite series. If parameterized by an encoder-decoder architecture, the parameters of the encoder can be optimized to minimize the variance of this estimator. We show that Models trained using our estimator give better test-set likelihoods than a standard importance-sampling-based approach for the same average computational cost. This estimator also allows the use of Latent Variable Models for tasks where unbiased estimators, rather than marginal likelihood lower bounds, are preferred, such as minimizing reverse KL divergences and estimating score functions.

  • Diversity-Promoting Bayesian Learning of Latent Variable Models
    arXiv: Learning, 2017
    Co-Authors: Pengtao Xie, Jun Zhu, Eric P Xing
    Abstract:

    To address three important issues involved in Latent Variable Models (LVMs), including capturing infrequent patterns, achieving small-sized but expressive Models, and alleviating overfitting, several studies have been devoted to "diversifying" LVMs, which aim at encouraging the components in LVMs to be diverse. Most existing studies fall into a frequentist-style regularization framework, where the components are learned via point estimation. In this paper, we investigate how to "diversify" LVMs in the paradigm of Bayesian learning. We propose two approaches that have complementary advantages. One is to define a diversity-promoting mutual angular prior which assigns larger density to components with larger mutual angles and use this prior to affect the posterior via Bayes' rule. We develop two efficient approximate posterior inference algorithms based on variational inference and MCMC sampling. The other approach is to impose diversity-promoting regularization directly over the post-data distribution of components. We also extend our approach to "diversify" Bayesian nonparametric Models where the number of components is infinite. A sampling algorithm based on slice sampling and Hamiltonian Monte Carlo is developed. We apply these methods to "diversify" the Bayesian mixture of experts model and the infinite Latent feature model. Experiments on various datasets demonstrate the effectiveness and efficiency of our methods.

  • Diversity-Promoting Bayesian Learning of Latent Variable Models
    International Conference on Machine Learning, 2016
    Co-Authors: Pengtao Xie, Jun Zhu, Eric P Xing
    Abstract:

    In learning Latent Variable Models (LVMs), it is important to effectively capture infrequent patterns and shrink model size without sacrificing modeling power. Various studies have been done to "diversify" an LVM, which aim to learn a diverse set of Latent components in LVMs. Most existing studies fall into a frequentist-style regularization framework, where the components are learned via point estimation. In this paper, we investigate how to "diversify" LVMs in the paradigm of Bayesian learning, which has advantages complementary to point estimation, such as alleviating overfitting via model averaging and quantifying uncertainty. We propose two approaches that have complementary advantages. One is to define diversity-promoting mutual angular priors, based on Bayesian networks and the von Mises-Fisher distribution, which assign larger density to components with larger mutual angles, and use these priors to affect the posterior via Bayes' rule. We develop two efficient approximate posterior inference algorithms based on variational inference and Markov chain Monte Carlo sampling. The other approach is to impose diversity-promoting regularization directly over the post-data distribution of components. These two methods are applied to the Bayesian mixture of experts model to encourage the "experts" to be diverse, and experimental results demonstrate the effectiveness and efficiency of our methods.
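
    As a rough illustration of the mutual-angular idea (a simplified stand-in by the editor, not the paper's Bayesian-network / von Mises-Fisher construction), one can score a set of component vectors by how little their directions overlap, so that configurations with larger pairwise angles receive larger unnormalized log-density; in MAP-style inference this score would be added to the data log-likelihood.

    ```python
    import numpy as np

    def mutual_angle_log_prior(components, beta=5.0):
        # components: (m, d) array, one parameter vector per latent component.
        A = components / np.linalg.norm(components, axis=1, keepdims=True)
        cos = A @ A.T                                   # pairwise cosine similarities
        iu = np.triu_indices(len(A), k=1)
        # Penalize alignment: |cos| is 0 for orthogonal pairs, 1 for parallel ones.
        return -beta * np.sum(np.abs(cos[iu]))

    aligned    = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.1]])
    orthogonal = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

    print(mutual_angle_log_prior(aligned))      # low: directions overlap heavily
    print(mutual_angle_log_prior(orthogonal))   # 0, the maximum: fully diverse directions
    ```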

Matus Telgarsky - One of the best experts on this subject based on the ideXlab platform.

  • Tensor Decompositions for Learning Latent Variable Models: A Survey for ALT
    Algorithmic Learning Theory, 2015
    Co-Authors: Anima Anandkumar, Sham M Kakade, Daniel Hsu, Matus Telgarsky
    Abstract:

    This note is a short version of that in [1]. It is intended as a survey for the 2015 Algorithmic Learning Theory (ALT) conference. This work considers a computationally and statistically efficient parameter estimation method for a wide class of Latent Variable Models--including Gaussian mixture Models, hidden Markov Models, and Latent Dirichlet allocation--which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular Latent Variable Models.

  • Tensor Decompositions for Learning Latent Variable Models
    arXiv: Learning, 2012
    Co-Authors: Anima Anandkumar, Rong Ge, Sham M Kakade, Matus Telgarsky
    Abstract:

    This work considers a computationally and statistically efficient parameter estimation method for a wide class of Latent Variable Models---including Gaussian mixture Models, hidden Markov Models, and Latent Dirichlet allocation---which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular Latent Variable Models.
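
    The core computational tool in both versions of this work is the tensor power method with deflation: for a symmetric third-order tensor with an orthogonal decomposition T = sum_i lambda_i v_i (x) v_i (x) v_i, repeated power iterations recover one (lambda_i, v_i) pair at a time. The sketch below runs it on a synthetic tensor; random restarts and the robustness / perturbation machinery analyzed in the paper are omitted.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def tensor_apply(T, u):
        # Contract T(I, u, u), returning a vector.
        return np.einsum('ijk,j,k->i', T, u, u)

    def tensor_power_method(T, n_components, n_iter=200):
        T = T.copy()
        eigvals, eigvecs = [], []
        for _ in range(n_components):
            u = rng.normal(size=T.shape[0])
            u /= np.linalg.norm(u)
            for _ in range(n_iter):                     # power iterations
                u = tensor_apply(T, u)
                u /= np.linalg.norm(u)
            lam = np.einsum('ijk,i,j,k->', T, u, u, u)  # T(u, u, u)
            eigvals.append(lam)
            eigvecs.append(u)
            T = T - lam * np.einsum('i,j,k->ijk', u, u, u)   # deflate
        return np.array(eigvals), np.array(eigvecs)

    # Synthetic orthogonally decomposable tensor with known components.
    d, k = 5, 3
    V, _ = np.linalg.qr(rng.normal(size=(d, k)))        # orthonormal columns v_1..v_k
    lams = np.array([3.0, 2.0, 1.0])
    T = sum(l * np.einsum('i,j,k->ijk', v, v, v) for l, v in zip(lams, V.T))

    est_lams, est_vecs = tensor_power_method(T, k)
    print(np.sort(est_lams)[::-1])                      # approximately [3. 2. 1.]
    ```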

Eric P Xing - One of the best experts on this subject based on the ideXlab platform.

  • Diversity-Promoting Bayesian Learning of Latent Variable Models
    arXiv: Learning, 2017
    Co-Authors: Pengtao Xie, Jun Zhu, Eric P Xing
    Abstract:

    To address three important issues involved in Latent Variable Models (LVMs), including capturing infrequent patterns, achieving small-sized but expressive Models, and alleviating overfitting, several studies have been devoted to "diversifying" LVMs, which aim at encouraging the components in LVMs to be diverse. Most existing studies fall into a frequentist-style regularization framework, where the components are learned via point estimation. In this paper, we investigate how to "diversify" LVMs in the paradigm of Bayesian learning. We propose two approaches that have complementary advantages. One is to define a diversity-promoting mutual angular prior which assigns larger density to components with larger mutual angles and use this prior to affect the posterior via Bayes' rule. We develop two efficient approximate posterior inference algorithms based on variational inference and MCMC sampling. The other approach is to impose diversity-promoting regularization directly over the post-data distribution of components. We also extend our approach to "diversify" Bayesian nonparametric Models where the number of components is infinite. A sampling algorithm based on slice sampling and Hamiltonian Monte Carlo is developed. We apply these methods to "diversify" the Bayesian mixture of experts model and the infinite Latent feature model. Experiments on various datasets demonstrate the effectiveness and efficiency of our methods.

  • Diversity-Promoting Bayesian Learning of Latent Variable Models
    International Conference on Machine Learning, 2016
    Co-Authors: Pengtao Xie, Jun Zhu, Eric P Xing
    Abstract:

    In learning Latent Variable Models (LVMs), it is important to effectively capture infrequent patterns and shrink model size without sacrificing modeling power. Various studies have been done to "diversify" an LVM, which aim to learn a diverse set of Latent components in LVMs. Most existing studies fall into a frequentist-style regularization framework, where the components are learned via point estimation. In this paper, we investigate how to "diversify" LVMs in the paradigm of Bayesian learning, which has advantages complementary to point estimation, such as alleviating overfitting via model averaging and quantifying uncertainty. We propose two approaches that have complementary advantages. One is to define diversity-promoting mutual angular priors, based on Bayesian networks and the von Mises-Fisher distribution, which assign larger density to components with larger mutual angles, and use these priors to affect the posterior via Bayes' rule. We develop two efficient approximate posterior inference algorithms based on variational inference and Markov chain Monte Carlo sampling. The other approach is to impose diversity-promoting regularization directly over the post-data distribution of components. These two methods are applied to the Bayesian mixture of experts model to encourage the "experts" to be diverse, and experimental results demonstrate the effectiveness and efficiency of our methods.

Zhiqiang Ge - One of the best experts on this subject based on the ideXlab platform.

  • Process Data Analytics via Probabilistic Latent Variable Models: A Tutorial Review
    Industrial & Engineering Chemistry Research, 2018
    Co-Authors: Zhiqiang Ge
    Abstract:

    Dimensionality reduction is important because of the high-dimensional nature of data in the process industry, which has made Latent Variable modeling methods popular in recent years. By projecting high-dimensional data into a lower-dimensional space, Latent Variable Models are able to extract key information from process data while simultaneously improving the efficiency of data analytics. From a probabilistic viewpoint, this paper carries out a tutorial review of probabilistic Latent Variable Models for process data analytics. Detailed illustrations of different kinds of basic probabilistic Latent Variable Models (PLVMs) are provided, as well as their research status. Additionally, more counterparts of those basic PLVMs are introduced and discussed for process data analytics. Several perspectives are highlighted for future research on this topic.
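
    As a concrete example of the kind of basic PLVM such a review covers, the sketch below fits probabilistic PCA (x = W z + mu + noise) to synthetic "process" data using its closed-form maximum-likelihood solution and then projects the data into the lower-dimensional latent space. The data and variable names are illustrative, not taken from the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def fit_ppca(X, n_latent):
        # X: (n_samples, n_features). Returns loading matrix W, mean mu, noise variance.
        mu = X.mean(axis=0)
        Xc = X - mu
        cov = Xc.T @ Xc / len(X)
        evals, evecs = np.linalg.eigh(cov)
        evals, evecs = evals[::-1], evecs[:, ::-1]            # sort descending
        sigma2 = evals[n_latent:].mean()                      # ML noise variance
        W = evecs[:, :n_latent] * np.sqrt(evals[:n_latent] - sigma2)
        return W, mu, sigma2

    def latent_posterior_mean(X, W, mu, sigma2):
        # E[z | x] = M^{-1} W^T (x - mu), with M = W^T W + sigma2 I (symmetric).
        M = W.T @ W + sigma2 * np.eye(W.shape[1])
        return (X - mu) @ W @ np.linalg.inv(M)

    # Synthetic high-dimensional "process" data driven by 2 latent factors.
    Z = rng.normal(size=(500, 2))
    A = rng.normal(size=(2, 20))
    X = Z @ A + 0.1 * rng.normal(size=(500, 20))

    W, mu, sigma2 = fit_ppca(X, n_latent=2)
    print(sigma2)                                             # near 0.01 (= 0.1**2)
    print(latent_posterior_mean(X, W, mu, sigma2).shape)      # (500, 2): reduced features
    ```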