Cumulative Distribution

The Experts below are selected from a list of 360 Experts worldwide ranked by ideXlab platform

Brendan J Frey - One of the best experts on this subject based on the ideXlab platform.

  • Cumulative Distribution networks and the derivative sum product algorithm
    arXiv: Learning, 2012
    Co-Authors: Jim C Huang, Brendan J Frey
    Abstract:

    We introduce a new type of graphical model called a "Cumulative Distribution network" (CDN), which expresses a joint Cumulative Distribution as a product of local functions. Each local function can be viewed as providing evidence about possible orderings, or rankings, of variables. Interestingly, we find that the conditional independence properties of CDNs are quite different from other graphical models. We also describe a message-passing algorithm that efficiently computes conditional Cumulative Distributions. Due to the unique independence properties of the CDN, these messages do not in general have a one-to-one correspondence with messages exchanged in standard algorithms, such as belief propagation. We demonstrate the application of CDNs for structured ranking learning using a previously studied multi-player gaming dataset.

  • Cumulative Distribution networks and the derivative sum product algorithm: models and inference for Cumulative Distribution functions on graphs
    Journal of Machine Learning Research, 2011
    Co-Authors: Jim C Huang, Brendan J Frey
    Abstract:

    We present a class of graphical models for directly representing the joint Cumulative Distribution function (CDF) of many random variables, called Cumulative Distribution networks (CDNs). Unlike graphs for probability density and mass functions, for CDFs the marginal probabilities for any subset of variables are obtained by computing limits of functions in the model, and conditional probabilities correspond to computing mixed derivatives. We will show that the conditional independence properties in a CDN are distinct from the conditional independence properties of directed, undirected and factor graphs, but include the conditional independence properties of bi-directed graphs. In order to perform inference in such models, we describe the 'derivative-sum-product' (DSP) message-passing algorithm in which messages correspond to derivatives of the joint CDF. We will then apply CDNs to the problem of learning to rank players in multiplayer team-based games and suggest several future directions for research.
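
    The statement that marginals are obtained by taking limits can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a three-variable chain with two local functions, each chosen (as an assumption) to be a bivariate logistic CDF, and the names `phi`, `joint_cdf` and the evaluation points are hypothetical. It only shows marginalization by limits; the mixed derivatives and the DSP algorithm are not implemented.

```python
import math

def sigmoid(x):
    """Univariate logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))

def phi(x, y):
    """Bivariate logistic CDF, used here as an assumed local function.

    Keep arguments moderate: very negative finite inputs overflow math.exp.
    """
    return 1.0 / (1.0 + math.exp(-x) + math.exp(-y))

def joint_cdf(x1, x2, x3):
    """Toy CDN: F(x1, x2, x3) = phi(x1, x2) * phi(x2, x3)."""
    return phi(x1, x2) * phi(x2, x3)

INF = math.inf  # math.exp(-INF) == 0.0, so the limit is taken exactly

# Marginal CDF of (x1, x2): send x3 to +infinity.
marg_12 = joint_cdf(0.3, 0.5, INF)

# Marginal CDF of (x1, x3): send the shared variable x2 to +infinity.
# The result factorizes, so x1 and x3 are marginally independent here.
marg_13 = joint_cdf(0.3, INF, -1.2)
```

    In this toy chain, `marg_13` equals `sigmoid(0.3) * sigmoid(-1.2)`: two variables that share no local function become independent once the variable between them is marginalized out, which is the kind of independence pattern associated with bi-directed graphs.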

  • structured ranking learning using Cumulative Distribution networks
    Neural Information Processing Systems, 2008
    Co-Authors: Jim C Huang, Brendan J Frey
    Abstract:

    Ranking is at the heart of many information retrieval applications. Unlike standard regression or classification in which we predict outputs independently, in ranking we are interested in predicting structured outputs so that misranking one object can significantly affect whether we correctly rank the other objects. In practice, the problem of ranking involves a large number of objects to be ranked and either approximate structured prediction methods are required, or assumptions of independence between object scores must be made in order to make the problem tractable. We present a probabilistic method for learning to rank using the graphical modelling framework of Cumulative Distribution networks (CDNs), where we can take into account the structure inherent to the problem of ranking by modelling the joint Cumulative Distribution functions (CDFs) over multiple pairwise preferences. We apply our framework to the problem of document retrieval in the case of the OHSUMED benchmark dataset. We will show that the RankNet, ListNet and ListMLE probabilistic models can be viewed as particular instances of CDNs and that our proposed framework allows for the exploration of a broad class of flexible structured loss functionals for learning to rank.

Gustavo K Rohde - One of the best experts on this subject based on the ideXlab platform.

  • parametric signal estimation using the Cumulative Distribution transform
    IEEE Transactions on Signal Processing, 2020
    Co-Authors: Abu Hasnat Mohammad Rubaiyat, Kyla M Hallam, Jonathan M Nichols, Meredith N Hutchinson, Gustavo K Rohde
    Abstract:

    We present a new method for estimating signal model parameters using the Cumulative Distribution Transform (CDT). Our approach minimizes the Wasserstein distance between measured and model signals. We derive some useful properties of the CDT and show that the resulting estimation problem, while nonlinear in the original signal domain, becomes a linear least squares problem in the transform domain. Furthermore, we discuss the properties of the estimator in the presence of noise and present a novel approach for mitigating the impact of the noise on the estimates. The proposed estimation approach is evaluated by applying it to a source localization problem and comparing its performance against traditional approaches.
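
    As a rough illustration of why a shift that is nonlinear in signal space becomes linear after a CDF-based transform, the sketch below uses the quantile function (inverse CDF) of a normalized signal as a simplified stand-in for the CDT. The paper's exact definition, reference signal and noise handling are not reproduced. The Gaussian test signals, the grid and all function names are assumptions made for this example.

```python
import math

def normalized_cdf(signal, dx):
    """Cumulative sum of a non-negative signal, scaled to end at 1."""
    total = sum(signal) * dx
    cdf, acc = [], 0.0
    for s in signal:
        acc += s * dx / total
        cdf.append(acc)
    return cdf

def quantile(xs, cdf, level):
    """Invert the CDF at `level` by linear interpolation on the grid."""
    for i in range(1, len(cdf)):
        if cdf[i] >= level:
            lo, hi = cdf[i - 1], cdf[i]
            t = 0.0 if hi == lo else (level - lo) / (hi - lo)
            return xs[i - 1] + t * (xs[i] - xs[i - 1])
    return xs[-1]

def bump(x, mu):
    """Hypothetical signal model: unit-width Gaussian centred at mu."""
    return math.exp(-0.5 * (x - mu) ** 2)

dx = 0.01
xs = [-10.0 + dx * i for i in range(2001)]

true_shift = 1.5
model = [bump(x, 0.0) for x in xs]            # model signal
measured = [bump(x, true_shift) for x in xs]  # shifted "measurement"

cdf_model = normalized_cdf(model, dx)
cdf_meas = normalized_cdf(measured, dx)

# For a pure translation the two quantile functions differ by a constant,
# so the least-squares estimate of the shift is the mean of the differences.
levels = [k / 20 for k in range(1, 20)]
diffs = [quantile(xs, cdf_meas, p) - quantile(xs, cdf_model, p) for p in levels]
shift_hat = sum(diffs) / len(diffs)
```

    Fitting the shift directly in signal space would require a nonlinear search over `mu`; in the quantile domain the same parameter enters additively, which is the property the abstract refers to.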

  • radon Cumulative Distribution transform subspace modeling for image classification
    arXiv: Computer Vision and Pattern Recognition, 2020
    Co-Authors: Mohammad Shifaterabbi, Soheil Kolouri, Abu Hasnat Mohammad Rubaiyat, Jonathan M Nichols, Xuwang Yin, Akram Aldroubi, Gustavo K Rohde
    Abstract:

    We present a new supervised image classification method for problems where the data at hand conform to certain deformation models applied to unknown prototypes or templates. The method makes use of the previously described Radon Cumulative Distribution Transform (R-CDT) for image data, whose mathematical properties are exploited to express the image data in a form that is more suitable for machine learning. While certain operations such as translation, scaling, and higher-order transformations are challenging to model in native image space, we show the R-CDT can capture some of these variations and thus render the associated image classification problems easier to solve. The method is simple to implement, non-iterative, computationally efficient, has no hyper-parameters to tune, and provides accuracies competitive with state-of-the-art neural networks for many types of classification problems, especially in a learning with few labels setting. Furthermore, we show improvements with respect to neural network-based methods in terms of computational efficiency (it can be implemented without the use of GPUs), number of training samples needed for training, as well as out-of-distribution generalization. The Python code for reproducing our results is available at this https URL.

  • the Cumulative Distribution transform and linear pattern classification
    Applied and Computational Harmonic Analysis, 2018
    Co-Authors: Se Rim Park, Soheil Kolouri, Shinjini Kundu, Gustavo K Rohde
    Abstract:

    Discriminating data classes emanating from sensors is an important problem with many applications in science and technology. We describe a new transform for pattern representation that interprets patterns as probability density functions, and has special properties with regard to classification. The transform, which we denote as the Cumulative Distribution Transform (CDT), is invertible, with well-defined forward and inverse operations. We show that it can be useful in ‘parsing out’ variations (confounds) that are ‘Lagrangian’ (displacement and intensity variations) by converting these to ‘Eulerian’ (intensity variations) in transform space. This conversion is the basis for our main result that describes when the CDT can allow for linear classification to be possible in transform space. We also describe several properties of the transform and show, with computational experiments that used both real and simulated data, that the CDT can help render a variety of real world problems simpler to solve.

  • detecting mammographically occult cancer in women with dense breasts using radon Cumulative Distribution transform: a preliminary analysis
    Medical Imaging 2018: Computer-Aided Diagnosis, 2018
    Co-Authors: Juhun Lee, Robert M Nishikawa, Gustavo K Rohde
    Abstract:

    We propose using novel imaging biomarkers for detecting mammographically-occult (MO) cancer in women with dense breast tissue. MO cancer indicates visually occluded, or very subtle, cancer that radiologists fail to recognize as a sign of cancer. We used the Radon Cumulative Distribution Transform (RCDT) as a novel image transformation to project the difference between left and right mammograms into a space, increasing the detectability of occult cancer. We used a dataset of 617 screening full-field digital mammograms (FFDMs) of 238 women with dense breast tissue. Among 238 women, 173 were normal with 2–4 consecutive screening mammograms, 552 normal mammograms in total, and the remaining 65 women had an MO cancer with a negative screening mammogram. We used Principal Component Analysis (PCA) to find representative patterns in normal mammograms in the RCDT space. We projected all mammograms to the space constructed by the first 30 eigenvectors of the RCDT of normal cases. Under 10-fold cross-validation, we conducted quantitative feature analysis to classify normal mammograms and mammograms with MO cancer. We used receiver operating characteristic (ROC) analysis to evaluate the classifier’s output using the area under the ROC curve (AUC) as the figure of merit. Four eigenvectors were selected via a feature selection method. The mean and standard deviation of the AUC of the trained classifier on the test set were 0.74 and 0.08, respectively. In conclusion, we utilized imaging biomarkers to highlight differences between left and right mammograms to detect MO cancer using novel imaging transformation.

  • the radon Cumulative Distribution transform and its application to image classification
    IEEE Transactions on Image Processing, 2016
    Co-Authors: Soheil Kolouri, Se Rim Park, Gustavo K Rohde
    Abstract:

    Invertible image representation methods (transforms) are routinely employed as low-level image processing operations based on which feature extraction and recognition algorithms are developed. Most transforms in current use (e.g., Fourier, wavelet, and so on) are linear transforms and, by themselves, are unable to substantially simplify the representation of image classes for classification. Here, we describe a nonlinear, invertible, low-level image processing transform based on combining the well-known Radon transform for image data, and the 1D Cumulative Distribution transform proposed earlier. We describe a few of the properties of this new transform, and with both theoretical and experimental results show that it can often render certain problems linearly separable in a transform space.

Jim C Huang - One of the best experts on this subject based on the ideXlab platform.

  • Cumulative Distribution networks and the derivative sum product algorithm
    arXiv: Learning, 2012
    Co-Authors: Jim C Huang, Brendan J Frey

  • Cumulative Distribution networks and the derivative sum product algorithm: models and inference for Cumulative Distribution functions on graphs
    Journal of Machine Learning Research, 2011
    Co-Authors: Jim C Huang, Brendan J Frey

  • Cumulative Distribution networks: inference, estimation and applications of graphical models for Cumulative Distribution functions
    2009
    Co-Authors: Jim C Huang
    Abstract:

    This thesis presents a class of graphical models for directly representing the joint Cumulative Distribution function (CDF) of many random variables, called Cumulative Distribution networks (CDNs). Unlike graphical models for probability density and mass functions, in a CDN, the marginal probabilities for any subset of variables are obtained by computing limits of functions in the model. We will show that the conditional independence properties in a CDN are distinct from the conditional independence properties of directed, undirected and factor graph models, but include the conditional independence properties of bidirected graphical models. As a result, CDNs are a parameterization for bidirected models that allows us to represent complex statistical dependence relationships between observable variables. We will provide a method for constructing a factor graph model with additional latent variables for which graph separation of variables in the corresponding CDN implies conditional independence of the separated variables in both the CDN and in the factor graph with the latent variables marginalized out. This will then allow us to construct multivariate extreme value distributions for which both a CDN and a corresponding factor graph representation exist. In order to perform inference in such graphs, we describe the ‘derivative-sum-product’ (DSP) message-passing algorithm where messages correspond to derivatives of the joint Cumulative Distribution function. We will then apply CDNs to the problem of learning to rank, or estimating parametric models for ranking, where CDNs provide a natural means with which to model multivariate probabilities over ordinal variables such as pairwise preferences. We will show that many previous probability models for rank data, such as the Bradley-Terry and Plackett-Luce models, can be viewed as particular types of CDN. Applications of CDNs will be described for the problems of ranking players in multiplayer team-based games, document retrieval and discovering regulatory sequences in computational biology using the above methods for inference and estimation of CDNs.

  • structured ranking learning using Cumulative Distribution networks
    Neural Information Processing Systems, 2008
    Co-Authors: Jim C Huang, Brendan J Frey

Nanjung Hsu - One of the best experts on this subject based on the ideXlab platform.

  • prediction of spatial Cumulative Distribution functions using subsampling
    Journal of the American Statistical Association, 1999
    Co-Authors: Soumendra N Lahiri, Mark S Kaiser, Noel A Cressie, Nanjung Hsu
    Abstract:

    The spatial Cumulative Distribution function (SCDF) is a random function that provides a statistical summary of a random field over a spatial domain of interest. In this article we develop a spatial subsampling method for predicting an SCDF based on observations made on a hexagonal grid, similar to the one used in the Environmental Monitoring and Assessment Program of the U.S. Environmental Protection Agency. We show that under quite general conditions, the proposed subsampling method provides accurate data-based approximations to the sampling distributions of various functionals of the SCDF predictor. In particular, it produces estimators of different population characteristics, such as the quantiles and weighted mean integrated squared errors of the empirical predictor. As an illustration, we apply the subsampling method to construct large-sample prediction bands for the SCDF of an ecological index for foliage condition of red maple trees in the state of Maine.
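
    To make the idea of an empirical SCDF predictor concrete, the sketch below computes the fraction of sampled sites whose observed value falls at or below a threshold. This is the plain empirical CDF of the observations, offered as an assumption about the simplest form such a predictor can take; the article's sampling design, subsampling scheme and prediction bands are not reproduced. The site values and function names are hypothetical.

```python
def empirical_scdf(values, z):
    """Fraction of observed site values that are <= z."""
    return sum(1 for v in values if v <= z) / len(values)

def empirical_quantile(values, p):
    """Smallest observed value whose empirical SCDF is at least p."""
    ordered = sorted(values)
    for v in ordered:
        if empirical_scdf(values, v) >= p:
            return v
    return ordered[-1]

# Hypothetical index values observed at four sampling sites.
site_values = [0.2, 0.5, 0.5, 0.9]

f_half = empirical_scdf(site_values, 0.5)
median = empirical_quantile(site_values, 0.5)
```

    Quantiles of the predictor, one of the functionals mentioned in the abstract, then follow from inverting this step function as `empirical_quantile` does.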

Soumendra N Lahiri - One of the best experts on this subject based on the ideXlab platform.

  • asymptotic distribution of the empirical Cumulative Distribution function predictor under nonstationarity
    2001
    Co-Authors: Jun Zhu, Soumendra N Lahiri, Noel A Cressie
    Abstract:

    In this paper, we establish a functional central limit theorem for the empirical predictor of a spatial Cumulative Distribution function for a random field with a nonstationary mean structure. The type of spatial asymptotic framework used here is somewhat nonstandard; it is a mixture of the so-called “infill” and “increasing domain” asymptotic structures. The choice of the appropriate scaling sequence for the empirical predictor depends on certain characteristics of the spatial sampling design generating the sampling sites. A precise description of this dependence is given. The results obtained here extend a similar result of (1999) who considered only the stationary case.

  • asymptotic distribution of the empirical spatial Cumulative Distribution function predictor and prediction bands based on a subsampling method
    Probability Theory and Related Fields, 1999
    Co-Authors: Soumendra N Lahiri
    Abstract:

    A spatial Cumulative Distribution function F^∞ (say) is a random distribution function that provides a statistical summary of a random field over a given region. This paper considers the empirical predictor of F^∞ based on a finite set of observations from a region in ℝ^d under a uniform sampling design. A functional central limit theorem is proved for the predictor as a random element of the space D[−∞, ∞]. A striking feature of the result is that the rate of convergence of the predictor to the predictand F^∞ depends on the location of the data-sites specified by the sampling design. A precise description of the dependence is given. Furthermore, a subsampling method is proposed for integral-based functionals of random fields, which is then used to construct large sample prediction bands for F^∞.

  • prediction of spatial Cumulative Distribution functions using subsampling
    Journal of the American Statistical Association, 1999
    Co-Authors: Soumendra N Lahiri, Mark S Kaiser, Noel A Cressie, Nanjung Hsu