Proximity Data

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 140067 Experts worldwide ranked by ideXlab platform

Joachim M Buhmann - One of the best experts on this subject based on the ideXlab platform.

  • optimal cluster preserving embedding of nonmetric Proximity Data
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
    Co-Authors: Volker Roth, Julian Laub, Motoaki Kawanabe, Joachim M Buhmann
    Abstract:

    For several major applications of Data analysis, objects are often not represented as feature vectors in a vector space, but rather by a matrix gathering pairwise proximities. Such pairwise Data often violates metricity and, therefore, cannot be naturally embedded in a vector space. Concerning the problem of unsupervised structure detection or clustering, this paper introduces a new method for embedding pairwise Data into Euclidean vector spaces. We show that all clustering methods that are invariant under additive shifts of the pairwise proximities can be reformulated as grouping problems in Euclidean spaces. The most prominent property of this constant shift embedding framework is the complete preservation of the cluster structure in the embedding space. Restating pairwise clustering problems in vector spaces has several important consequences, such as the statistical description of the clusters by way of cluster prototypes, the generic extension of the grouping procedure to a discriminative prediction rule, and the applicability of standard preprocessing methods like denoising or dimensionality reduction.
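A minimal numerical sketch of the constant-shift idea (the function name and the eigendecomposition route are illustrative assumptions, not the authors' implementation): double-center the dissimilarity matrix, shift its spectrum so it becomes positive semidefinite, and read off Euclidean coordinates.

```python
import numpy as np

def constant_shift_embedding(D, dims=2):
    # D: symmetric, zero-diagonal pairwise dissimilarity matrix.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering projection
    Sc = -0.5 * J @ D @ J                      # centered "Gram" matrix
    w, V = np.linalg.eigh(Sc)
    # A constant additive shift of all off-diagonal dissimilarities only
    # shifts the spectrum of Sc; subtracting the smallest eigenvalue makes
    # Sc positive semidefinite, so a Euclidean embedding exists while the
    # shift-invariant cluster structure is untouched.
    w = w - w.min()
    top = np.argsort(w)[::-1][:dims]
    return V[:, top] * np.sqrt(w[top])
```

If D already consists of squared Euclidean distances, the shift is numerically zero and the procedure reduces to classical metric MDS.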

  • unsupervised texture segmentation in a deterministic annealing framework
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
    Co-Authors: Thomas Hofmann, Jan Puzicha, Joachim M Buhmann
    Abstract:

    We present a novel optimization framework for unsupervised texture segmentation that relies on statistical tests as a measure of homogeneity. Texture segmentation is formulated as a Data clustering problem based on sparse Proximity Data. Dissimilarities of pairs of textured regions are computed from a multiscale Gabor filter image representation. We discuss and compare a class of clustering objective functions that is systematically derived from invariance principles. As a general optimization framework, we propose deterministic annealing based on a mean-field approximation. We present the canonical way to derive clustering algorithms within this framework, as well as an efficient implementation of mean-field annealing and of the closely related Gibbs sampler. We apply both annealing variants to Brodatz-like microtexture mixtures and real-world images.
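The annealing mechanics can be sketched as follows. Note this is a simplified illustration on vector data with k-means-style costs, whereas the paper's setting uses pairwise dissimilarities; all names and the cooling schedule are assumptions.

```python
import numpy as np

def da_cluster(X, k, T0=10.0, Tmin=0.01, rate=0.9, seed=0):
    """Deterministic-annealing clustering sketch: soft Gibbs assignments
    at temperature T, slowly cooled toward hard assignments."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    T = T0
    while T > Tmin:
        # assignment costs and mean-field (Gibbs) probabilities at temperature T
        d = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        p = np.exp(-(d - d.min(axis=1, keepdims=True)) / T)
        p /= p.sum(axis=1, keepdims=True)
        # re-estimate prototypes as probability-weighted means
        mu = (p.T @ X) / (p.sum(axis=0)[:, None] + 1e-12)
        T *= rate                                  # cooling schedule
    return p.argmax(axis=1), mu
```

At high temperature all assignments are nearly uniform; as T decreases the distribution sharpens, which is what gives annealing its robustness against poor local minima.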

  • pairwise Data clustering by deterministic annealing
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
    Co-Authors: Thomas Hofmann, Joachim M Buhmann
    Abstract:

    Partitioning a Data set and extracting hidden structure from the Data arises in different application areas of pattern recognition, speech and image processing. Pairwise Data clustering is a combinatorial optimization method for Data grouping which extracts hidden structure from Proximity Data. We describe a deterministic annealing approach to pairwise clustering which shares the robustness properties of maximum entropy inference. The resulting Gibbs probability distributions are estimated by mean-field approximation. A new structure-preserving algorithm to cluster dissimilarity Data and to simultaneously embed these Data in a Euclidean vector space is discussed; it can be used for dimensionality reduction and Data visualization. The suggested embedding algorithm, which outperforms conventional approaches, has been applied to dissimilarity Data from protein analysis and from linguistics. The algorithm for pairwise Data clustering is used to segment textured images.

  • Data visualization by multidimensional scaling: a deterministic annealing approach
    Pattern Recognition, 1996
    Co-Authors: Hansjoerg Klock, Joachim M Buhmann
    Abstract:

    Multidimensional scaling addresses the problem of how Proximity Data can be faithfully visualized as points in a low-dimensional Euclidean space. The quality of a Data embedding is measured by a stress function which compares Proximity values with Euclidean distances of the respective points. The corresponding minimization problem is non-convex and sensitive to local minima. We present a novel deterministic annealing algorithm for the frequently used objective SSTRESS and for Sammon mapping, derived in the framework of maximum entropy estimation. Experimental results demonstrate the superiority of our optimization technique compared to conventional gradient descent methods.
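The abstract compares deterministic annealing against gradient descent on the stress; for reference, here is a minimal sketch of the Sammon stress that such methods minimize (an illustrative reconstruction, not the paper's code):

```python
import numpy as np

def sammon_stress(D, Y):
    # D: target proximities (assumed to be positive distances),
    # Y: candidate low-dimensional embedding, one row per object.
    d = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices_from(D, k=1)          # count each pair once
    Dij, dij = D[iu], d[iu]
    # Squared mismatch, weighted so that small target distances count more.
    return ((Dij - dij) ** 2 / Dij).sum() / Dij.sum()
```

The stress is zero exactly when the embedding reproduces every pairwise distance; its non-convexity in Y is what motivates the annealing approach.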

Thomas Hofmann - One of the best experts on this subject based on the ideXlab platform.

  • unsupervised texture segmentation in a deterministic annealing framework
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
    Co-Authors: Thomas Hofmann, Jan Puzicha, Joachim M Buhmann
    Abstract:

    We present a novel optimization framework for unsupervised texture segmentation that relies on statistical tests as a measure of homogeneity. Texture segmentation is formulated as a Data clustering problem based on sparse Proximity Data. Dissimilarities of pairs of textured regions are computed from a multiscale Gabor filter image representation. We discuss and compare a class of clustering objective functions that is systematically derived from invariance principles. As a general optimization framework, we propose deterministic annealing based on a mean-field approximation. We present the canonical way to derive clustering algorithms within this framework, as well as an efficient implementation of mean-field annealing and of the closely related Gibbs sampler. We apply both annealing variants to Brodatz-like microtexture mixtures and real-world images.

  • pairwise Data clustering by deterministic annealing
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
    Co-Authors: Thomas Hofmann, Joachim M Buhmann
    Abstract:

    Partitioning a Data set and extracting hidden structure from the Data arises in different application areas of pattern recognition, speech and image processing. Pairwise Data clustering is a combinatorial optimization method for Data grouping which extracts hidden structure from Proximity Data. We describe a deterministic annealing approach to pairwise clustering which shares the robustness properties of maximum entropy inference. The resulting Gibbs probability distributions are estimated by mean-field approximation. A new structure-preserving algorithm to cluster dissimilarity Data and to simultaneously embed these Data in a Euclidean vector space is discussed; it can be used for dimensionality reduction and Data visualization. The suggested embedding algorithm, which outperforms conventional approaches, has been applied to dissimilarity Data from protein analysis and from linguistics. The algorithm for pairwise Data clustering is used to segment textured images.

Kimball A Romney - One of the best experts on this subject based on the ideXlab platform.

  • an equivalence relation between correspondence analysis and classical metric multidimensional scaling for the recovery of euclidean distances
    British Journal of Mathematical and Statistical Psychology, 1997
    Co-Authors: Douglas J Carroll, Ece Kumbasar, Kimball A Romney
    Abstract:

    A theorem is proved showing that a special variant of correspondence analysis (CA), like classical two-way metric multidimensional scaling (MMDS), recovers Euclidean distances exactly (asymptotically, as a certain constant grows large), and in fact yields solutions equivalent up to a similarity transformation to MMDS, even in the case of ‘noisy’ Data. Specifically, a slight modification of a use of CA for the analysis of Proximity Data, proposed independently by Gifi and by Weller & Romney, which depends on a certain additive constant k that should be ‘large’, is shown, as k → ∞, to result in an R-dimensional solution equivalent, up to a scale factor, to that obtained by a certain form of MMDS. It is conjectured that this asymptotic result may account for the apparent success of the closely related ‘Gifi/Weller/Romney’ CA procedure in recovering multidimensional structure underlying Proximity Data.
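The MMDS side of the equivalence can be sketched as classical (Torgerson) metric MDS, which recovers coordinates, and hence the Euclidean distances, up to rotation and translation (an illustrative sketch, not the paper's construction):

```python
import numpy as np

def classical_mds(D, dims):
    # D: matrix of Euclidean distances between n objects.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J        # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:dims]   # leading eigenvalues
    # Clip tiny negative eigenvalues caused by rounding before the sqrt.
    return V[:, top] * np.sqrt(np.clip(w[top], 0, None))
```

When D is exactly Euclidean, the recovered configuration reproduces D; the theorem above says that a suitably modified CA, with a large additive constant, yields an equivalent solution up to a similarity transformation.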

Banerjee Soumya - One of the best experts on this subject based on the ideXlab platform.

  • Dynamic Graph Streaming Algorithm for Digital Contact Tracing
    2020
    Co-Authors: Mahapatra Gautam, Pradhan Priodyuti, Chattaraj Ranjan, Banerjee Soumya
    Abstract:

    Digital contact tracing of an infected person, testing the possible infection of contacted persons, and isolation play a crucial role in alleviating an outbreak. Here, we design a dynamic graph streaming algorithm that can trace contacts under the control of the Public Health Authorities (PHA). Our algorithm receives Proximity Data from mobile devices as contact Data streams and uses a sliding window model to construct a dynamic contact graph sketch. Prominently, we introduce the edge label of the contact graph as a binary contact vector, which acts like a sliding window and holds the latest D days (the incubation period) of temporal social interactions. Notably, the algorithm prepares the direct and indirect (multilevel) contact list from the contact graph sketch for a given set of infected persons. Finally, the algorithm also uses a disjoint set Data structure to construct the infection pathways for the trace list. The present study offers the design of algorithms, with underlying Data structures, for digital contact tracing based on the Proximity Data produced by Bluetooth-enabled mobile devices. Our analysis reveals that, for COVID-19 close contact parameters, maintaining the contact graph of ten million users with 14 days of close contact Data on the PHA server takes 55 gigabytes of memory, and preparing the contact list for a given set of infected persons depends on the size of the infected list. Our centralized digital contact tracing framework is also applicable to other relevant diseases parameterized by an incubation period and a Proximity duration of contacts. (Comment: 13 pages, 6 figures, with supplementary material.)
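The D-day binary contact vector on each edge can be sketched as a per-edge bitmask that slides by one bit per day (the class and method names below are illustrative assumptions, not the paper's API):

```python
from collections import defaultdict

class ContactGraph:
    """Sketch of an edge-labelled contact graph: each edge holds a D-bit
    contact vector, one bit per day of the sliding window (D is the
    incubation period, e.g. 14 for COVID-19)."""

    def __init__(self, window_days=14):
        self.D = window_days
        self.edges = defaultdict(int)          # (u, v) -> D-bit vector

    def record_contact(self, u, v):
        # Mark a contact today (bit 0) on the undirected edge (u, v).
        self.edges[(min(u, v), max(u, v))] |= 1

    def advance_day(self):
        # Slide the window: shift every contact vector by one day and
        # drop edges whose contacts have aged out of the window.
        mask = (1 << self.D) - 1
        for e in list(self.edges):
            self.edges[e] = (self.edges[e] << 1) & mask
            if not self.edges[e]:
                del self.edges[e]

    def direct_contacts(self, infected):
        # First-level trace list for a set of infected persons.
        out = set()
        for u, v in self.edges:
            if u in infected and v not in infected:
                out.add(v)
            if v in infected and u not in infected:
                out.add(u)
        return out
```

Storing one integer per active edge is what keeps the sketch compact; the multilevel trace list would be obtained by iterating `direct_contacts` over the newly traced persons.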

Klaus Obermayer - One of the best experts on this subject based on the ideXlab platform.

  • classification on Proximity Data with lp machines
    International Conference on Artificial Neural Networks, 1999
    Co-Authors: Thore Graepel, Ralf Herbrich, Bernhard Scholkopf, Alexander J Smola, Perry F Bartlett, Klaus-Robert Muller, Klaus Obermayer, Robert C Williamson
    Abstract:

    We provide a new linear program to deal with the classification of Data given in terms of pairwise proximities. This allows us to avoid the problems inherent in using feature spaces with an indefinite metric in support vector machines, since the notion of a margin is needed purely in the input space, where the classification actually occurs. Moreover, in our approach we can enforce sparsity in the Proximity representation by sacrificing training error. This turns out to be favorable for Proximity Data. Similar to ν-SV methods, the only parameter needed in the algorithm is the (asymptotic) number of Data points being classified with a margin. Finally, the algorithm is successfully compared with ν-SV learning in Proximity space and with K-nearest-neighbors on real-world Data from neuroscience and molecular biology.
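A toy reconstruction of the linear-programming idea: treat each object's row of the Proximity matrix as its feature vector, and learn a linear decision function by minimizing an l1 norm on the coefficients plus hinge slacks. The exact formulation here (including the use of `scipy.optimize.linprog` and the trade-off parameter C) is an illustrative assumption, not the authors' precise LP.

```python
import numpy as np
from scipy.optimize import linprog

def lp_machine(P, y, C=10.0):
    """Learn f(x_j) = sum_i a_i P[j, i] + b via a linear program:
    minimize ||a||_1 + C * sum(xi) subject to y_j * f(x_j) >= 1 - xi_j."""
    n = P.shape[0]
    # variables: a+ (n), a- (n), b+, b-, xi (n); all nonnegative,
    # with a = a+ - a- and b = b+ - b- encoding the signed quantities.
    c = np.concatenate([np.ones(2 * n), [0.0, 0.0], C * np.ones(n)])
    # margin constraints rewritten as A_ub @ x <= -1:
    #   -y_j * (P_j @ (a+ - a-) + b+ - b-) - xi_j <= -1
    A = np.hstack([
        -(y[:, None] * P), (y[:, None] * P),
        -y[:, None], y[:, None], -np.eye(n),
    ])
    res = linprog(c, A_ub=A, b_ub=-np.ones(n), bounds=(0, None))
    x = res.x
    return x[:n] - x[n:2 * n], x[2 * n] - x[2 * n + 1]

def predict(P_rows, a, b):
    # P_rows: proximities of test points to the n training points.
    return np.sign(P_rows @ a + b)
```

The l1 objective is what encourages a sparse Proximity representation: most coefficients a_i come out exactly zero at the LP optimum.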

  • a stochastic self organizing map for Proximity Data
    Neural Computation, 1999
    Co-Authors: Thore Graepel, Klaus Obermayer
    Abstract:

    We derive an efficient algorithm for topographic mapping of Proximity Data (TMP), which can be seen as an extension of Kohonen's self-organizing map to arbitrary distance measures. The TMP cost function is derived in a Bayesian framework of folded Markov chains for the description of autoencoders. It incorporates the Data via a dissimilarity matrix and the topographic neighborhood via a matrix of transition probabilities. From the principle of maximum entropy, a nonfactorizing Gibbs distribution is obtained, which is approximated in a mean-field fashion. This allows for maximum likelihood estimation using an expectation-maximization algorithm. In analogy to the transition from topographic vector quantization to the self-organizing map, we suggest an approximation to TMP that is computationally more efficient. In order to prevent convergence to local minima, an annealing scheme in the temperature parameter is introduced, for which the critical temperature of the first phase transition is calculated in terms o...

  • classification on pairwise Proximity Data
    Neural Information Processing Systems, 1998
    Co-Authors: Thore Graepel, Ralf Herbrich, Peter Bollmann-Sdorra, Klaus Obermayer
    Abstract:

    We investigate the problem of learning a classification task on Data represented in terms of their pairwise proximities. This representation does not refer to an explicit feature representation of the Data items and is thus more general than the standard approach of using Euclidean feature vectors, from which pairwise proximities can always be calculated. Our first approach is based on a combined linear embedding and classification procedure resulting in an extension of the Optimal Hyperplane algorithm to pseudo-Euclidean Data. As an alternative we present another approach based on a linear threshold model in the Proximity values themselves, which is optimized using Structural Risk Minimization. We show that prior knowledge about the problem can be incorporated by the choice of distance measures and examine different metrics w.r.t. their generalization. Finally, the algorithms are successfully applied to protein structure Data and to Data from the cat's cerebral cortex. They show better performance than K-nearest-neighbor classification.
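The pseudo-Euclidean embedding underlying the first approach can be sketched via an eigendecomposition of the (possibly indefinite) similarity matrix, keeping negative eigenvalues through a signature vector (an illustrative sketch, not the paper's implementation):

```python
import numpy as np

def pseudo_euclidean_embed(S):
    """Embed a symmetric similarity matrix S with an indefinite spectrum:
    returns coordinates X and a +1/-1 signature per axis, such that the
    inner products are recovered as X @ diag(sig) @ X.T."""
    w, V = np.linalg.eigh(S)
    keep = np.abs(w) > 1e-10               # drop numerically null axes
    w, V = w[keep], V[:, keep]
    X = V * np.sqrt(np.abs(w))             # coordinates per object
    sig = np.sign(w)                       # metric signature of each axis
    return X, sig
```

Axes with signature -1 are exactly the directions in which the Data violates metricity; a Euclidean embedding would have to discard or shift them.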