Correspondence Analysis

The Experts below are selected from a list of 41166 Experts worldwide ranked by ideXlab platform

Michael Greenacre - One of the best experts on this subject based on the ideXlab platform.

  • The contributions of rare objects in Correspondence Analysis
    Ecology, 2013
    Co-Authors: Michael Greenacre
    Abstract:

    Correspondence Analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has been frequently criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of Correspondence Analysis and canonical Correspondence Analysis, both to the determination of the principal axes and to the chi-square distance. It is a fact that rare objects are often positioned as outliers in Correspondence Analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the Correspondence Analysis solution, the contribution biplot, is proposed as a way of mapping the results in order to avoid the problem of outlying and low contributing rare objects.
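
The contribution calculation described in this abstract can be sketched as follows. This is a minimal illustration using the standard SVD formulation of Correspondence Analysis, not the authors' code; the function name `ca_column_contributions` and the `counts` table are hypothetical and made up for the example.

```python
import numpy as np

def ca_column_contributions(table):
    """Masses, principal inertias and column contributions for a simple CA."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # standardized residuals; their sum of squares is the total inertia
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    axis_contrib = Vt.T ** 2             # column j's share of axis k; each axis sums to 1
    inertia_contrib = (S ** 2).sum(axis=0) / (S ** 2).sum()
    return c, sv ** 2, axis_contrib, inertia_contrib

# Hypothetical abundance table: 4 samples (rows) x 4 species (columns).
# The last species is rare: it occurs only 3 times in total.
counts = np.array([[30, 20, 10, 0],
                   [25, 25, 12, 1],
                   [10, 30, 30, 0],
                   [ 5, 15, 40, 2]])

masses, inertias, axis_contrib, inertia_contrib = ca_column_contributions(counts)
for j in range(counts.shape[1]):
    print(f"species {j}: mass={masses[j]:.3f}, "
          f"axis-1 contribution={axis_contrib[j, 0]:.3f}, "
          f"share of total inertia={inertia_contrib[j]:.3f}")
```

Comparing the mass of the rare species with its contribution to the first axis is the kind of check the abstract refers to; the numbers printed here only describe the invented table.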

  • Correspondence Analysis of raw data
    Ecology, 2010
    Co-Authors: Michael Greenacre
    Abstract:

    Correspondence Analysis has found extensive use in ecology, archeology, linguistics and the social sciences as a method for visualizing the patterns of association in a table of frequencies or nonnegative ratio-scale data. Inherent to the method is the expression of the data in each row or each column relative to their respective totals, and it is these sets of relative values (called profiles) that are visualized. This ‘relativization’ of the data makes perfect sense when the margins of the table represent samples from sub-populations of inherently different sizes. But in some ecological applications sampling is performed on equal areas or equal volumes so that the absolute levels of the observed occurrences may be of relevance, in which case relativization may not be required. In this paper we define the Correspondence Analysis of the raw ‘unrelativized’ data and discuss its properties, comparing this new method to regular Correspondence Analysis and to a related variant of non-symmetric Correspondence Analysis.
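
The relativization step mentioned in this abstract, expressing each row relative to its total, can be sketched as follows. The `counts` table is invented for illustration, and the snippet shows only the profile calculation of regular Correspondence Analysis, not the raw-data variant the paper defines.

```python
import numpy as np

# Hypothetical table: 3 samples (rows) x 3 species (columns).
# Sample 1 has ten times the abundance of sample 0 in every species.
counts = np.array([[ 20.,  10.,  10.],
                   [200., 100., 100.],
                   [  5.,  30.,  15.]])

# Row profiles: each row divided by its own total, so every profile sums to 1.
row_profiles = counts / counts.sum(axis=1, keepdims=True)

print(row_profiles)
# Samples 0 and 1 have identical profiles, so regular CA cannot tell them
# apart even though their absolute abundance levels differ tenfold.
print(np.allclose(row_profiles[0], row_profiles[1]))
```

When sampling is done on equal areas or volumes, that tenfold difference may itself be of interest, which is the situation the abstract describes.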

  • Power transformations in Correspondence Analysis
    Computational Statistics & Data Analysis, 2009
    Co-Authors: Michael Greenacre
    Abstract:

    Power transformations of positive data tables, prior to applying the Correspondence Analysis algorithm, are shown to open up a family of methods with direct connections to the analysis of log-ratios. Two variations of this idea are illustrated. The first approach is simply to power transform the original data and perform a Correspondence Analysis; this method is shown to converge to unweighted log-ratio analysis as the power parameter tends to zero. The second approach is to apply the power transformation to the contingency ratios, that is, the values in the table relative to expected values based on the marginals; this method converges to weighted log-ratio analysis, or the spectral map. Two applications are described: first, a matrix of population genetic data which is inherently two-dimensional, and second, a larger cross-tabulation with higher dimensionality, from a linguistic analysis of several books.
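
The first approach can be checked numerically with a small sketch. It assumes that unweighted log-ratio analysis is taken as the SVD of the double-centred log table with equal weights 1/I and 1/J, and that the CA singular values are rescaled by 1/alpha; the function names and the `table` values are hypothetical.

```python
import numpy as np

def ca_singular_values(table):
    """Singular values of the standardized residuals of a simple CA."""
    P = table / table.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    return np.linalg.svd(S, compute_uv=False)

def unweighted_lra_singular_values(table):
    """Singular values of the double-centred log table, equal weights 1/I and 1/J."""
    L = np.log(table)
    centred = (L - L.mean(axis=1, keepdims=True)
                 - L.mean(axis=0, keepdims=True) + L.mean())
    n_rows, n_cols = table.shape
    return np.linalg.svd(centred, compute_uv=False) / np.sqrt(n_rows * n_cols)

# Hypothetical strictly positive table (3 rows x 4 columns).
table = np.array([[10., 20., 30., 40.],
                  [40., 10., 20., 30.],
                  [ 5., 50., 10., 20.]])

lra = unweighted_lra_singular_values(table)
for alpha in (1.0, 0.1, 0.001):
    rescaled = ca_singular_values(table ** alpha) / alpha
    # only the first two dimensions are non-trivial for a 3 x 4 table
    print(alpha, np.round(rescaled[:2], 4), np.round(lra[:2], 4))
```

As alpha shrinks, the rescaled CA singular values should approach those of the log-ratio analysis, which is the convergence the abstract states.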

  • Canonical Correspondence Analysis in social science research
    SSRN Electronic Journal, 2009
    Co-Authors: Michael Greenacre
    Abstract:

    The use of simple and multiple Correspondence Analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical Correspondence Analysis, which is a Correspondence Analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of Correspondence Analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical Correspondence Analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple Correspondence Analysis of a set of concatenated (or “stacked”) tables. Then we show how canonical Correspondence Analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple Correspondence Analysis.

  • Log-Ratio Analysis Is a Limiting Case of Correspondence Analysis
    Mathematical Geosciences, 2009
    Co-Authors: Michael Greenacre
    Abstract:

    It is common practice in compositional data analysis to perform the log-ratio transformation in order to preserve sub-compositional coherence in the analysis. Correspondence Analysis is an alternative approach to analyzing ratio-scale data and is often contrasted with log-ratio analysis. It turns out that if one introduces a power transformation into the Correspondence Analysis algorithm, then the limit of the power-transformed Correspondence Analysis, as the power parameter tends to zero, is exactly the log-ratio analysis. Depending on how the power transformation is applied, we can obtain as limiting cases either Aitchison’s unweighted log-ratio analysis or the weighted form called “spectral mapping”. The upshot of this is that one can come as close as one likes to the log-ratio analysis, weighted or unweighted, using Correspondence Analysis.
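
The limit described in this abstract rests on an elementary property of the power transformation. The display below is a sketch of that one step only, assuming the Box-Cox form of the transformation; the symbols $x$ (a positive table entry or contingency ratio) and $\alpha$ (the power parameter) are notation introduced here, not taken from the abstract.

```latex
\frac{x^{\alpha}-1}{\alpha}
  = \frac{e^{\alpha \log x}-1}{\alpha}
  = \log x + \frac{\alpha}{2}\,(\log x)^{2} + O(\alpha^{2})
  \;\longrightarrow\; \log x
  \qquad (\alpha \to 0,\; x > 0).
```

Because the transformed values differ from the logarithms only by a term that vanishes with $\alpha$, an analysis built on them can be brought arbitrarily close to one built on the logarithms themselves.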

Martin Vingron - One of the best experts on this subject based on the ideXlab platform.

  • Correspondence Analysis applied to microarray data
    Proceedings of the National Academy of Sciences of the United States of America, 2001
    Co-Authors: Kurt Fellenberg, Nicole C Hauser, Benedikt Brors, Albert Neutzner, Jorg D Hoheisel, Martin Vingron
    Abstract:

    Correspondence Analysis is an explorative computational method for the study of associations between variables. Much like principal component analysis, it displays a low-dimensional projection of the data, e.g., into a plane. It does this, though, for two variables simultaneously, thus revealing associations between them. Here, we demonstrate the applicability and high value of Correspondence Analysis for the analysis of microarray data, displaying associations between genes and experiments. To introduce the method, we show its application to the well-known Saccharomyces cerevisiae cell-cycle synchronization data by Spellman et al. [Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R., Anders, K., Eisen, M. B., Brown, P. O., Botstein, D. & Futcher, B. (1998) Mol. Biol. Cell 9, 3273–3297], allowing for comparison with their visualization of this data set. Furthermore, we apply Correspondence Analysis to a non-time-series data set of our own, thus supporting its general applicability to microarray data of different complexity, underlying structure, and experimental strategy (both two-channel fluorescence-tag and radioactive labeling).

Vartan Choulakian - One of the best experts on this subject based on the ideXlab platform.

  • Some new aspects of taxicab Correspondence Analysis
    Statistical Methods and Applications, 2014
    Co-Authors: Vartan Choulakian, Biagio Simonetti
    Abstract:

    Correspondence Analysis (CA) and nonsymmetric Correspondence Analysis are based on generalized singular value decomposition, and, in general, they are not equivalent. Taxicab Correspondence Analysis (TCA) is an $L_1$ variant of CA, and it is based on the generalized taxicab singular value decomposition (GTSVD). Our aim is to study the taxicab variant of nonsymmetric Correspondence Analysis. We find that for diagonal metric matrices GTSVDs of a given data set are equivalent, from which we deduce the equivalence of TCA and taxicab nonsymmetric Correspondence Analysis. We also attempt to show that TCA stays as close as possible to the original correspondence matrix without calculating a dissimilarity (or similarity) measure between rows or columns. Further, we discuss some new geometric and distance aspects of TCA.

  • Graph Partitioning by Correspondence Analysis and Taxicab Correspondence Analysis
    Journal of Classification, 2013
    Co-Authors: Vartan Choulakian, Jules Tibeiro
    Abstract:

    We consider Correspondence Analysis (CA) and taxicab Correspondence Analysis (TCA) of relational datasets that can mathematically be described as weighted loopless graphs. Such data appear in particular in network analysis. We present CA and TCA as relaxation methods for the graph partitioning problem. Examples of real datasets are provided.

  • Multiple taxicab Correspondence Analysis
    Advanced Data Analysis and Classification, 2008
    Co-Authors: Vartan Choulakian
    Abstract:

    We compare the statistical analysis of multidimensional contingency tables by multiple Correspondence Analysis (MCA) and multiple taxicab Correspondence Analysis (MTCA). We show four results. First, MTCA and MCA can produce different results. Second, taxicab Correspondence Analysis of a Burt table is equivalent to centroid Correspondence Analysis of the indicator matrix. Third, along the first principal axis, the projected response patterns in MTCA are clustered, and the number of cluster points is less than or equal to one plus the number of variables. Fourth, visual maps produced by MTCA seem to be clearer and more readable in the presence of rarely occurring categories of the variables than the graphical displays produced by MCA. Two well-known data sets are analyzed.
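
The two data layouts named in this abstract, the indicator matrix and the Burt table, can be built as follows. This is a minimal sketch with invented survey responses; the function name `indicator_matrix` is hypothetical.

```python
import numpy as np

def indicator_matrix(responses):
    """One 0/1 column per category of each variable, side by side."""
    responses = np.asarray(responses)
    blocks, labels = [], []
    for q in range(responses.shape[1]):
        column = responses[:, q]
        categories = np.unique(column)            # sorted distinct categories
        blocks.append((column[:, None] == categories[None, :]).astype(int))
        labels += [(q, str(cat)) for cat in categories]
    return np.hstack(blocks), labels

# Hypothetical survey: 5 respondents, 2 categorical variables.
responses = [["a", "x"],
             ["b", "x"],
             ["a", "y"],
             ["a", "y"],
             ["b", "y"]]

Z, labels = indicator_matrix(responses)
burt = Z.T @ Z          # Burt table: all pairwise cross-tabulations

print(labels)
print(Z)
print(burt)
```

The diagonal blocks of the Burt table hold the category counts of each variable, and the off-diagonal blocks are the two-way contingency tables between pairs of variables.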

  • Taxicab Correspondence Analysis
    Psychometrika, 2006
    Co-Authors: Vartan Choulakian
    Abstract:

    Taxicab Correspondence Analysis is based on the taxicab singular value decomposition of a contingency table, and it shares some similar properties with Correspondence Analysis. It is more robust than the ordinary Correspondence Analysis, because it gives uniform weights to all the points. The visual map constructed by taxicab Correspondence Analysis has a larger sweep and clearer perspective than the map obtained by Correspondence Analysis. Two examples are provided.
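
A brute-force sketch of the first step of a taxicab singular value decomposition follows. It assumes that the first taxicab axis is the sign vector u maximizing the L1 norm of Xu, with X the matrix of residuals from independence; that definition is an assumption about the method, not something stated in the abstract. The function name and the `counts` table are hypothetical, and the exhaustive search is only practical for a handful of columns.

```python
import itertools
import numpy as np

def first_taxicab_axis(X):
    """Search all sign vectors u in {-1, +1}^J for the one maximizing ||X u||_1."""
    X = np.asarray(X, dtype=float)
    best_value, best_u = -1.0, None
    for signs in itertools.product((-1.0, 1.0), repeat=X.shape[1]):
        u = np.array(signs)
        value = np.abs(X @ u).sum()      # L1 norm of the projected rows
        if value > best_value:
            best_value, best_u = value, u
    v = np.sign(X @ best_u)              # matching sign vector on the row side
    v[v == 0] = 1.0
    return best_value, best_u, v

# Hypothetical contingency table (4 rows x 4 columns).
counts = np.array([[30., 20., 10., 0.],
                   [25., 25., 12., 1.],
                   [10., 30., 30., 0.],
                   [ 5., 15., 40., 2.]])
P = counts / counts.sum()
X = P - np.outer(P.sum(axis=1), P.sum(axis=0))   # residuals from independence

dispersion, u, v = first_taxicab_axis(X)
print(dispersion, u, v)
```

Every column enters the criterion with a coefficient of plus or minus one, which is one way to read the abstract's remark about uniform weights.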

Gastao Coelho Gomes - One of the best experts on this subject based on the ideXlab platform.

  • Joint Correspondence Analysis versus multiple Correspondence Analysis: a solution to an undetected problem
    Classification and Data Mining, 2013
    Co-Authors: Sergio Camiz, Gastao Coelho Gomes
    Abstract:

    The problem of the proper dimension of the solution of a Multiple Correspondence Analysis (MCA) is discussed, based on both the re-evaluation of the explained inertia sensu Benzecri (Les Cahiers de l’Analyse des Donnees 4:377–379, 1979) and Greenacre (Multiple Correspondence Analysis and related methods, Chapman and Hall (Kluwer), Dordrecht, 2006) and a test proposed by Ben Ammou and Saporta (Revue de Statistique Appliquee 46:21–35, 1998). This leads us to consider a better reconstruction of the off-diagonal sub-tables of the Burt table, which cross the nominal characters taken into account. Thus, Joint Correspondence Analysis (JCA; Greenacre, Biometrika 75:457–467, 1988) is introduced, the results obtained on an application are shown, and the quality of reconstruction of both the MCA and JCA solutions is compared to that of a series of Simple Correspondence Analyses run on the whole set of two-way tables. It turns out that JCA’s reduced-dimensional reconstruction is much better than that of MCA, which proves highly biased and non-monotone, and also better than that of the re-evaluated MCA, as suggested by Greenacre (Multiple Correspondence Analysis and related methods, Chapman and Hall (Kluwer), Dordrecht, 2006).

  • Joint Correspondence Analysis Versus Multiple Correspondence Analysis: A Solution to an Undetected Problem
    Classification and Data Mining, 2012
    Co-Authors: Sergio Camiz, Gastao Coelho Gomes
    Abstract:

    The problem of the proper dimension of the solution of a Multiple Correspondence Analysis (MCA) is discussed, based on both the re-evaluation of the explained inertia sensu Benzecri (Les Cahiers de l’Analyse des Donnees 4:377–379, 1979) and Greenacre (Multiple Correspondence Analysis and related methods, Chapman and Hall (Kluwer), Dordrecht, 2006) and a test proposed by Ben Ammou and Saporta (Revue de Statistique Appliquee 46:21–35, 1998). This leads us to consider a better reconstruction of the off-diagonal sub-tables of the Burt table, which cross the nominal characters taken into account. Thus, Joint Correspondence Analysis (JCA; Greenacre, Biometrika 75:457–467, 1988) is introduced, the results obtained on an application are shown, and the quality of reconstruction of both the MCA and JCA solutions is compared to that of a series of Simple Correspondence Analyses run on the whole set of two-way tables. It turns out that JCA’s reduced-dimensional reconstruction is much better than that of MCA, which proves highly biased and non-monotone, and also better than that of the re-evaluated MCA, as suggested by Greenacre (Multiple Correspondence Analysis and related methods, Chapman and Hall (Kluwer), Dordrecht, 2006).

Biagio Simonetti - One of the best experts on this subject based on the ideXlab platform.

  • Some new aspects of taxicab Correspondence Analysis
    Statistical Methods and Applications, 2014
    Co-Authors: Vartan Choulakian, Biagio Simonetti
    Abstract:

    Correspondence Analysis (CA) and nonsymmetric Correspondence Analysis are based on generalized singular value decomposition, and, in general, they are not equivalent. Taxicab Correspondence Analysis (TCA) is an $L_1$ variant of CA, and it is based on the generalized taxicab singular value decomposition (GTSVD). Our aim is to study the taxicab variant of nonsymmetric Correspondence Analysis. We find that for diagonal metric matrices GTSVDs of a given data set are equivalent, from which we deduce the equivalence of TCA and taxicab nonsymmetric Correspondence Analysis. We also attempt to show that TCA stays as close as possible to the original correspondence matrix without calculating a dissimilarity (or similarity) measure between rows or columns. Further, we discuss some new geometric and distance aspects of TCA.

  • A European perception of food using two methods of Correspondence Analysis
    Food Quality and Preference, 2011
    Co-Authors: Rosaria Lombardo, Biagio Simonetti
    Abstract:

    In a recent issue of this journal, Guerrero et al. (2010) studied an interesting data set involving the analysis of consumer-driven associations to the word “Traditional”, from a food perspective, in six European countries. As part of their analysis, they demonstrated the sources of association between the words studied and the country of origin of those interviewed using Correspondence Analysis. In this paper, we focus on this association by assuming that the country of origin is a predictor of the words associated with “Traditional”. This analysis is performed using another member of the Correspondence Analysis family, non-symmetric Correspondence Analysis. This paper will also explore the use of both these Correspondence Analysis techniques on their data and consider the dendrogram and the semantic differential plot as alternative approaches to visually summarising the association.

  • Taxicab non-symmetrical Correspondence Analysis
    2008
    Co-Authors: Biagio Simonetti
    Abstract:

    Non-Symmetrical Correspondence Analysis (NSCA) is a variant of classical Correspondence Analysis (CA) for analyzing a two-way contingency table with a dependence structure between the two variables. To overcome the influence of outliers, this paper presents Taxicab Non-Symmetrical Correspondence Analysis (TNSCA), based on the taxicab singular value decomposition. It is shown that TNSCA is more robust than ordinary NSCA because it gives uniform weights to all the points. The visual map constructed by TNSCA offers a clearer perspective than the map obtained by Correspondence Analysis. Examples are provided.