Joint PMF

The experts below are selected from a list of 351 experts worldwide, ranked by the ideXlab platform.

David B Dunson - One of the best experts on this subject based on the ideXlab platform.

  • Bayesian Modeling of Temporal Dependence in Large Sparse Contingency Tables
    Journal of the American Statistical Association, 2013
    Co-Authors: Tsuyoshi Kunihama, David B Dunson
    Abstract:

    It is of interest in many applications to study trends over time in relationships among categorical variables, such as age group, ethnicity, religious affiliation, political party, and preference for particular policies. At each time point, a sample of individuals provides responses to a set of questions, with different individuals sampled at each time. In such settings, there tends to be an abundance of missing data and the variables being measured may change over time. At each time point, one obtains a large sparse contingency table, with the number of cells often much larger than the number of individuals being surveyed. To borrow information across time in modeling large sparse contingency tables, we propose a Bayesian autoregressive tensor factorization approach. The proposed model relies on a probabilistic PARAFAC factorization of the joint PMF characterizing the categorical data distribution at each time point, with autocorrelation included across times. We develop efficient computational methods th...

  • Bayesian Modeling of Temporal Dependence in Large Sparse Contingency Tables
    arXiv: Methodology, 2012
    Co-Authors: Tsuyoshi Kunihama, David B Dunson
    Abstract:

    In many applications, it is of interest to study trends over time in relationships among categorical variables, such as age group, ethnicity, religious affiliation, political party and preference for particular policies. At each time point, a sample of individuals provides responses to a set of questions, with different individuals sampled at each time. In such settings, there tends to be abundant missing data and the variables being measured may change over time. At each time point, one obtains a large sparse contingency table, with the number of cells often much larger than the number of individuals being surveyed. To borrow information across time in modeling large sparse contingency tables, we propose a Bayesian autoregressive tensor factorization approach. The proposed model relies on a probabilistic PARAFAC factorization of the joint PMF characterizing the categorical data distribution at each time point, with autocorrelation included across times. Efficient computational methods are developed relying on MCMC. The methods are evaluated through simulation examples and applied to social survey data.
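
The probabilistic PARAFAC factorization mentioned in the abstract can be illustrated with a minimal sketch for a single time point. It assumes the usual latent-class form, in which the joint PMF is a mixture of product distributions; the names `parafac_pmf`, `weights`, and `factors` and all numbers are hypothetical, and the autoregressive coupling across time and the MCMC fitting are not shown.

```python
from itertools import product

def parafac_pmf(weights, factors):
    """Build a joint PMF as a mixture of product distributions.

    weights[h]       : probability of latent class h (assumed to sum to 1)
    factors[n][h][c] : P(variable n = c | latent class h)
    Returns a dict mapping each cell (c_0, ..., c_{N-1}) to its probability.
    """
    sizes = [len(f[0]) for f in factors]
    pmf = {}
    for cell in product(*(range(s) for s in sizes)):
        total = 0.0
        for h, w in enumerate(weights):
            term = w
            for n, c in enumerate(cell):
                term *= factors[n][h][c]
            total += term
        pmf[cell] = total
    return pmf

# Illustrative numbers only: two latent classes, three categorical variables.
weights = [0.6, 0.4]
factors = [
    [[0.9, 0.1], [0.2, 0.8]],            # variable 0: 2 categories
    [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]],  # variable 1: 3 categories
    [[0.7, 0.3], [0.4, 0.6]],            # variable 2: 2 categories
]
pmf = parafac_pmf(weights, factors)
```

The table has 2 × 3 × 2 = 12 cells, yet in this sketch it is described by one weight vector and three small conditional tables, which is what makes the representation attractive when cells far outnumber respondents.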

Nicholas D Sidiropoulos - One of the best experts on this subject based on the ideXlab platform.

  • Tensors, Learning, and "Kolmogorov Extension" for Finite-Alphabet Random Vectors
    IEEE Transactions on Signal Processing, 2018
    Co-Authors: Nikos Kargas, Nicholas D Sidiropoulos
    Abstract:

    Estimating the joint probability mass function (PMF) of a set of random variables lies at the heart of statistical learning and signal processing. Without structural assumptions, such as modeling the variables as a Markov chain, tree, or other graphical model, joint PMF estimation is often considered mission impossible: the number of unknowns grows exponentially with the number of variables. But who gives us the structural model? What if we only observe random subsets of the variables? Can we still reliably estimate the joint PMF of all? This paper shows, perhaps surprisingly, that if the joint PMF of any three variables can be estimated, then the joint PMF of all the variables can be provably recovered under relatively mild conditions. The result is reminiscent of Kolmogorov's extension theorem: consistent specification of lower-dimensional distributions induces a unique probability measure for the entire process. The difference is that for processes of limited complexity (rank of the high-dimensional PMF) it is possible to obtain complete characterization from only three-dimensional distributions. In fact, not all three-dimensional PMFs are needed, and under more stringent conditions even two-dimensional ones will do. Exploiting multilinear (tensor) algebra, this paper proves that such higher-dimensional PMF completion can be guaranteed; several pertinent identifiability results are derived. It also provides a practical and efficient algorithm to carry out the recovery task. Judiciously designed simulations and real-data experiments on movie recommendation and data classification are presented to showcase the effectiveness of the approach.

  • Completing a Joint PMF from Projections: A Low-Rank Coupled Tensor Factorization Approach
    arXiv: Learning, 2017
    Co-Authors: Nikos Kargas, Nicholas D Sidiropoulos
    Abstract:

    There has recently been considerable interest in completing a low-rank matrix or tensor given only a small fraction (or few linear combinations) of its entries. Related approaches have found considerable success in the area of recommender systems, under machine learning. From a statistical estimation point of view, the gold standard is to have access to the joint probability distribution of all pertinent random variables, from which any desired optimal estimator can be readily derived. In practice, high-dimensional joint distributions are very hard to estimate, and only estimates of low-dimensional projections may be available. We show that it is possible to identify higher-order joint PMFs from lower-order marginalized PMFs using coupled low-rank tensor factorization. Our approach features guaranteed identifiability when the full joint PMF is of low-enough rank, and effective approximation otherwise. We provide an algorithmic approach to compute the sought factors, and illustrate the merits of our approach using rating prediction as an example.
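
Both abstracts above rest on the observation that the lower-order marginals of a low-rank joint PMF are coupled through shared factors. The following sketch illustrates only that coupling, under the assumption that the joint PMF is a mixture of product distributions with hypothetical `weights` and `factors`; it is not the authors' recovery algorithm, and every name and number in it is illustrative.

```python
from itertools import product
from math import prod

def pmf_from_factors(weights, factors, keep):
    """PMF of the variables listed in `keep`, built from only their factors.

    weights[h]       : probability of latent state h
    factors[n][h][c] : P(variable n = c | latent state h)
    """
    sizes = [len(factors[n][0]) for n in keep]
    return {
        cell: sum(
            w * prod(factors[n][h][c] for n, c in zip(keep, cell))
            for h, w in enumerate(weights)
        )
        for cell in product(*(range(s) for s in sizes))
    }

def marginalize(pmf, keep):
    """Brute-force marginal: sum the full joint over all variables not in `keep`."""
    out = {}
    for cell, p in pmf.items():
        key = tuple(cell[n] for n in keep)
        out[key] = out.get(key, 0.0) + p
    return out

# Illustrative rank-2 model over four binary variables.
weights = [0.3, 0.7]
factors = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.6, 0.4], [0.3, 0.7]],
    [[0.5, 0.5], [0.9, 0.1]],
    [[0.2, 0.8], [0.7, 0.3]],
]
full = pmf_from_factors(weights, factors, keep=(0, 1, 2, 3))

# Every three-way marginal equals the model built from the same three factors,
# because each conditional table sums to one when its variable is summed out.
triples = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
max_gap = max(
    abs(marginalize(full, t)[cell] - p)
    for t in triples
    for cell, p in pmf_from_factors(weights, factors, keep=t).items()
)
```

Because each three-way marginal involves the same weights and a subset of the same factor tables, fitting the marginals jointly constrains the factors of the full model, which is the sense in which the factorization is coupled.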

Wenwen Tu - One of the best experts on this subject based on the ideXlab platform.

  • On the simulatability condition in key generation over a non-authenticated public channel
    2015 IEEE International Symposium on Information Theory (ISIT), 2015
    Co-Authors: Wenwen Tu
    Abstract:

    The simulatability condition is a fundamental concept in studying key generation over a non-authenticated public channel, in which Eve is active and can intercept, modify and falsify messages exchanged over the non-authenticated public channel. Using this condition, Maurer and Wolf showed a remarkable “all or nothing” result: if the simulatability condition does not hold, the key capacity over the non-authenticated public channel will be the same as that of the case with a passive Eve, while the key capacity over the non-authenticated channel will be zero if the simulatability condition holds. However, two questions remain open so far: 1) for a given joint probability mass function (PMF), are there efficient algorithms (polynomial complexity algorithms) for checking whether the simulatability condition holds or not; and 2) if the simulatability condition holds, are there efficient algorithms for finding the corresponding attack strategy? In this paper, we answer these two open questions affirmatively. In particular, for a given joint PMF, we construct a linear programming (LP) problem and show that the simulatability condition holds if and only if the optimal value obtained from the constructed LP is zero. Furthermore, we construct another LP and show that the minimizer of the newly constructed LP is a valid attack strategy. Both LPs can be solved with polynomial complexity.
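
The abstract does not spell out the condition or the LPs, so the sketch below relies on an assumption: that X is simulatable by Z with respect to Y when some channel from Z to a substitute variable reproduces the joint PMF of (X, Y). Under that assumed definition, the constraints are linear in the channel entries. The code only checks a given candidate channel and does not solve an LP; `simulation_residual`, `p_xyz`, and `channel` are hypothetical names.

```python
def simulation_residual(p_xyz, channel, nx, ny, nz):
    """Largest gap between P_XY and the PMF Eve induces with `channel`.

    p_xyz[(x, y, z)] : joint PMF (missing cells count as zero)
    channel[z][x]    : candidate P(substitute = x | Z = z), rows sum to 1
    A residual of zero means the candidate satisfies the assumed condition.
    """
    p = lambda x, y, z: p_xyz.get((x, y, z), 0.0)
    worst = 0.0
    for x in range(nx):
        for y in range(ny):
            # Target: the true marginal P_XY(x, y).
            target = sum(p(x, y, z) for z in range(nz))
            # Simulated: sum over z of P_YZ(y, z) * channel[z][x].
            simulated = sum(
                sum(p(xx, y, z) for xx in range(nx)) * channel[z][x]
                for z in range(nz)
            )
            worst = max(worst, abs(target - simulated))
    return worst

# Toy PMF: Z is a copy of X, and Y agrees with X with probability 0.9.
p_xyz = {
    (x, y, x): 0.5 * (0.9 if y == x else 0.1)
    for x in range(2)
    for y in range(2)
}

identity = [[1.0, 0.0], [0.0, 1.0]]  # Eve forwards Z unchanged
uniform = [[0.5, 0.5], [0.5, 0.5]]   # Eve ignores Z

res_identity = simulation_residual(p_xyz, identity, 2, 2, 2)
res_uniform = simulation_residual(p_xyz, uniform, 2, 2, 2)
```

In this toy case the identity channel reproduces P_XY exactly, while the uniform channel does not. An LP solver would search over all row-stochastic channels rather than test candidates one at a time.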