Data Mining Tool

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 31914 Experts worldwide ranked by ideXlab platform

Philippe Marc - One of the best experts on this subject based on the ideXlab platform.

  • yMGV: a cross-species expression data mining tool
    Nucleic Acids Research, 2004
    Co-Authors: Gaëlle Lelandais, Stéphane Le Crom, Frédéric Devaux, Stéphane Vialette, George M. Church, Claude Jacq, Philippe Marc
    Abstract:

    The yeast Microarray Global Viewer (yMGV @ http://transcriptome.ens.fr/ymgv) was created 3 years ago as a database that houses a collection of Saccharomyces cerevisiae and Schizosaccharomyces pombe microarray data sets published in 82 different articles. yMGV couples data mining tools with a user-friendly web interface so that, with a few mouse clicks, one can identify the conditions that affect the expression of a gene or list of genes regulated in a set of experiments. One of the major new features we present here is a set of tools that allows for inter-organism comparisons. This should enable the fission yeast community to take advantage of the large amount of available information on the budding yeast transcriptome. New tools and ongoing developments are also presented here.

Shenghsuan Wang - One of the best experts on this subject based on the ideXlab platform.

  • Pattern recognition in time series database: a case study on financial database
    Expert Systems With Applications, 2007
    Co-Authors: Yanping Huang, Chungchian Hsu, Shenghsuan Wang
    Abstract:

    Today, there are more and more time series data that coexist with other data. These data exist in useful and understandable patterns. Data management of time series data must take into account an integrated approach. However, many studies deal with numeric data attributes. Therefore, the need for a time series data mining tool has become extremely important. The purpose of this paper is to provide a novel pattern mining architecture with mixed attributes that uses a systematic approach to information mining in financial databases. The time series pattern mining (TSPM) architecture combines the extended visualization-induced self-organizing map algorithm and the extended Naive Bayesian algorithm. This mining architecture can simulate human intelligence and discover patterns automatically. The TSPM approach also demonstrates good returns in pattern research.

Gaëlle Lelandais - One of the best experts on this subject based on the ideXlab platform.

  • yMGV: a cross-species expression data mining tool
    Nucleic Acids Research, 2004
    Co-Authors: Gaëlle Lelandais, Stéphane Le Crom, Frédéric Devaux, Stéphane Vialette, George M. Church, Claude Jacq, Philippe Marc
    Abstract:

    The yeast Microarray Global Viewer (yMGV @ http://transcriptome.ens.fr/ymgv) was created 3 years ago as a database that houses a collection of Saccharomyces cerevisiae and Schizosaccharomyces pombe microarray data sets published in 82 different articles. yMGV couples data mining tools with a user-friendly web interface so that, with a few mouse clicks, one can identify the conditions that affect the expression of a gene or list of genes regulated in a set of experiments. One of the major new features we present here is a set of tools that allows for inter-organism comparisons. This should enable the fission yeast community to take advantage of the large amount of available information on the budding yeast transcriptome. New tools and ongoing developments are also presented here.

Cheikh Loucoubar - One of the best experts on this subject based on the ideXlab platform.

  • An Exhaustive, Non-Euclidean, Non-Parametric Data Mining Tool for Unraveling the Complexity of Biological Systems – Novel Insights into Malaria
    PLoS ONE, 2011
    Co-Authors: Cheikh Loucoubar, Richard Paul, Avner Bar-Hen, Augustin Huret, Adama Tall, Cheikh Sokhna, Jean-François Trape, Alioune Badara Ly, Joseph Faye, Abdoulaye Badiane
    Abstract:

    Complex, high-dimensional data sets pose significant analytical challenges in the post-genomic era. Such data sets are not exclusive to genetic analyses and are also pertinent to epidemiology. There has been considerable effort to develop hypothesis-free data mining and machine learning methodologies. However, current methodologies lack exhaustivity and general applicability. Here we use a novel non-parametric, non-euclidean data mining tool, HyperCube®, to explore exhaustively a complex epidemiological malaria data set by searching for over density of events in m-dimensional space. Hotspots of over density correspond to strings of variables, rules, that determine, in this case, the occurrence of Plasmodium falciparum clinical malaria episodes. The data set contained 46,837 outcome events from 1,653 individuals and 34 explanatory variables. The best predictive rule contained 1,689 events from 148 individuals and was defined as: individuals present during 1992–2003, aged 1–5 years old, having hemoglobin AA, and having had previous Plasmodium malariae malaria parasite infection ≤10 times. These individuals had 3.71 times more P. falciparum clinical malaria episodes than the general population. We validated the rule in two different cohorts. We compared and contrasted the HyperCube® rule with the rules using variables identified by both traditional statistical methods and non-parametric regression tree methods. In addition, we tried all possible sub-stratified quantitative variables. No other model with equal or greater representativity gave a higher Relative Risk. Although three of the four variables in the rule were intuitive, the effect of number of P. malariae episodes was not. HyperCube® efficiently sub-stratified quantitative variables to optimize the rule and was able to identify interactions among the variables, tasks not easy to perform using standard data mining methods.
    Search of local over density in m-dimensional space, explained by easily interpretable rules, is thus seemingly ideal for generating hypotheses for large datasets to unravel the complexity inherent in biological systems.
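
The abstract above scores a rule by its Relative Risk against the general population. A minimal Python sketch of that idea follows, assuming the Relative Risk of a rule is the outcome rate among records matching the rule divided by the outcome rate over all records. The function `relative_risk`, the field names (`age`, `hb`, `episode`) and the toy records are hypothetical illustrations; they are not taken from the paper, its data set, or the HyperCube® software, and the sketch does not perform the exhaustive rule search itself.

```python
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]


def relative_risk(
    records: Iterable[Record],
    rule: Callable[[Record], bool],
    outcome: str = "episode",
) -> float:
    """Outcome rate among records matching `rule`, divided by the overall rate.

    Assumption: this ratio is used here as a simple stand-in for the
    Relative Risk of a rule versus the general population.
    """
    records = list(records)
    covered = [r for r in records if rule(r)]
    if not records or not covered:
        raise ValueError("need at least one record and one record matching the rule")
    overall_rate = sum(bool(r[outcome]) for r in records) / len(records)
    if overall_rate == 0:
        raise ValueError("no outcome events in the data; the ratio is undefined")
    rule_rate = sum(bool(r[outcome]) for r in covered) / len(covered)
    return rule_rate / overall_rate


# Made-up toy records, for illustration only.
toy_records = [
    {"age": 3, "hb": "AA", "episode": True},
    {"age": 4, "hb": "AA", "episode": True},
    {"age": 2, "hb": "AS", "episode": False},
    {"age": 30, "hb": "AA", "episode": False},
    {"age": 25, "hb": "AA", "episode": True},
    {"age": 40, "hb": "AS", "episode": False},
    {"age": 5, "hb": "AA", "episode": False},
    {"age": 50, "hb": "AA", "episode": False},
]


def young_aa(record: Record) -> bool:
    """Hypothetical rule: aged 1 to 5 years and hemoglobin AA."""
    return 1 <= record["age"] <= 5 and record["hb"] == "AA"


rr = relative_risk(toy_records, young_aa)
print(f"Relative risk of the toy rule: {rr:.2f}")
```

Expressing a rule as a plain predicate keeps it easy to read, in the spirit of the "easily interpretable rules" mentioned in the abstract; a search procedure would generate many such predicates and rank them by this ratio together with the number of records each one covers.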

Yanping Huang - One of the best experts on this subject based on the ideXlab platform.

  • Pattern recognition in time series database: a case study on financial database
    Expert Systems With Applications, 2007
    Co-Authors: Yanping Huang, Chungchian Hsu, Shenghsuan Wang
    Abstract:

    Today, there are more and more time series data that coexist with other data. These data exist in useful and understandable patterns. Data management of time series data must take into account an integrated approach. However, many studies deal with numeric data attributes. Therefore, the need for a time series data mining tool has become extremely important. The purpose of this paper is to provide a novel pattern mining architecture with mixed attributes that uses a systematic approach to information mining in financial databases. The time series pattern mining (TSPM) architecture combines the extended visualization-induced self-organizing map algorithm and the extended Naive Bayesian algorithm. This mining architecture can simulate human intelligence and discover patterns automatically. The TSPM approach also demonstrates good returns in pattern research.