Data Standardization

The experts below are selected from a list of 71,160 experts worldwide, ranked by the ideXlab platform.

Jun Wang - One of the best experts on this subject based on the ideXlab platform.

  • Authenticating cherry tomato juices—Discussion of different Data Standardization and fusion approaches based on electronic nose and tongue
    Food Research International, 2014
    Co-Authors: Xuezhen Hong, Jun Wang, Shanshan Qiu
    Abstract:

    The study presented six approaches (two e-nose measurements, an e-tongue measurement, and three fusion approaches using both instruments) for the recognition and quantitative analysis of four tomato juice groups: unadulterated juice and three juices adulterated at different levels. Recognition of the juices was performed by principal component analysis (PCA) and cluster analysis (CA). Quantitative calibration with respect to pH and soluble solids content (SSC) was performed using four regression methods: principal components regression (PCR) based on stepwise selection, and multiple linear regression (MLR) based on the raw feature vector, on forward-selected features, and on stepwise-selected features. CA results obtained with different data standardization and distance calculation methods were compared, and a precision-recall measure was applied to quantify the clustering outcomes. The results imply that the optimum standardization and distance calculation methods should be explored for every dataset prior to CA. The effect of humidity was also explored; employing a desiccant for the e-nose measurement brought no improvement. The fusion dataset consisting of variables selected by analysis of variance (ANOVA) showed the best authentication ability, and the quality indices correlated highly with this dataset.
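
As a concrete illustration of why the choice of standardization and distance method matters before CA, the following Python sketch compares raw, z-score, and min-max scaling against three distance metrics on simulated data, scoring each clustering with a pairwise precision-recall measure. This is an illustrative reconstruction, not the authors' code: `X` and `y` are hypothetical stand-ins for the e-nose/e-tongue feature matrix and the four juice groups, and pairwise co-assignment is only one plausible way to instantiate the precision-recall measure the abstract mentions.

```python
# Illustrative sketch (not the authors' code): comparing data
# standardization and distance methods before cluster analysis (CA),
# as the abstract recommends doing for every dataset.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))    # hypothetical sensor feature matrix
y = np.repeat([0, 1, 2, 3], 10)  # hypothetical juice group labels

def standardize(X, method):
    if method == "zscore":   # zero mean, unit variance per feature
        return (X - X.mean(0)) / X.std(0)
    if method == "minmax":   # rescale each feature to [0, 1]
        return (X - X.min(0)) / (X.max(0) - X.min(0))
    return X                 # "raw": no standardization

for method in ["raw", "zscore", "minmax"]:
    for metric in ["euclidean", "cityblock", "correlation"]:
        D = pdist(standardize(X, method), metric=metric)
        labels = fcluster(linkage(D, method="average"),
                          t=4, criterion="maxclust")
        # Pairwise precision/recall: how often the clustering puts two
        # samples together when the true groups do (and vice versa).
        same_true = y[:, None] == y[None, :]
        same_pred = labels[:, None] == labels[None, :]
        iu = np.triu_indices(len(y), k=1)
        tp = np.sum(same_true[iu] & same_pred[iu])
        precision = tp / max(np.sum(same_pred[iu]), 1)
        recall = tp / np.sum(same_true[iu])
        print(f"{method:7s} {metric:11s} P={precision:.2f} R={recall:.2f}")
```

Running the grid makes the abstract's point tangible: the same linkage algorithm can yield quite different partitions depending on how the features are scaled and compared.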

Xuezhen Hong - One of the best experts on this subject based on the ideXlab platform.

  • Authenticating cherry tomato juices—Discussion of different Data Standardization and fusion approaches based on electronic nose and tongue
    Food Research International, 2014
    Co-Authors: Xuezhen Hong, Jun Wang, Shanshan Qiu
    Abstract:

    The study presented six approaches (two e-nose measurements, an e-tongue measurement, and three fusion approaches using both instruments) for the recognition and quantitative analysis of four tomato juice groups: unadulterated juice and three juices adulterated at different levels. Recognition of the juices was performed by principal component analysis (PCA) and cluster analysis (CA). Quantitative calibration with respect to pH and soluble solids content (SSC) was performed using four regression methods: principal components regression (PCR) based on stepwise selection, and multiple linear regression (MLR) based on the raw feature vector, on forward-selected features, and on stepwise-selected features. CA results obtained with different data standardization and distance calculation methods were compared, and a precision-recall measure was applied to quantify the clustering outcomes. The results imply that the optimum standardization and distance calculation methods should be explored for every dataset prior to CA. The effect of humidity was also explored; employing a desiccant for the e-nose measurement brought no improvement. The fusion dataset consisting of variables selected by analysis of variance (ANOVA) showed the best authentication ability, and the quality indices correlated highly with this dataset.
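
The fusion approach the abstract singles out, keeping only the fused e-nose and e-tongue variables that a one-way ANOVA finds discriminative across the juice groups, can be sketched as follows. This is an illustrative sketch rather than the published pipeline: the feature matrices are simulated and the 0.05 significance threshold is an assumption.

```python
# Illustrative sketch (not the authors' code): building a fused
# e-nose + e-tongue dataset and keeping only the variables that a
# one-way ANOVA finds discriminative across the juice groups.
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(1)
X_nose = rng.normal(size=(40, 8))    # hypothetical e-nose features
X_tongue = rng.normal(size=(40, 7))  # hypothetical e-tongue features
y = np.repeat([0, 1, 2, 3], 10)      # hypothetical juice group labels

X_fused = np.hstack([X_nose, X_tongue])  # low-level feature fusion
F, p = f_classif(X_fused, y)             # one-way ANOVA F-test per feature
keep = p < 0.05                          # retain group-discriminative variables
X_selected = X_fused[:, keep]
print(f"kept {keep.sum()} of {X_fused.shape[1]} fused variables")
```

The selected matrix `X_selected` would then feed the recognition (PCA, CA) and calibration (PCR, MLR) steps in place of the raw fused features.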

John H. Zhang - One of the best experts on this subject based on the ideXlab platform.

  • Data Standardization and Quality Management
    Translational Stroke Research, 2018
    Co-Authors: Paul A. Lapchak, John H. Zhang
    Abstract:

    Important questions regarding the conduct of scientific research and data transparency have been raised in various scientific forums over the last 10 years. It is becoming clear that, in spite of published RIGOR guidelines, greater transparency in scientific research is required to focus the discovery and drug development process so that a treatment can be provided to stroke patients. We have the unique privilege of conducting research using animal models of a disease so that we can address the development of a new therapy, and we should do so with great care and vigilance. This document identifies valuable resources to help researchers become Good Laboratory Practices compliant and improve data transparency, and it provides guidelines for accurate data management to continue to propel the translational stroke research field forward while recognizing that there is a shortage of research funds worldwide. While data audits are being considered worldwide by funding agencies and are used extensively by industry, they remain quite controversial for basic researchers. Given the exploratory nature of basic and translational science research, the current challenging funding environment, and independent, individualized laboratory activities, it is debatable whether the current individualized, non-standardized approach to data management and monitoring is the best one. We therefore propose steps to prepare research study data in a form acceptable for archival purposes, so that standards for translational research data can be comparable to those accepted and adhered to by the clinical community. If all translational research laboratories institute and follow these guidelines when conducting translational research, data from all sources may become more comparable and reliable.
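
One concrete way to put study data into an archivable, auditable form, offered here as an illustrative sketch rather than the authors' prescribed procedure, is to write a checksummed manifest alongside the raw files. The directory name is hypothetical.

```python
# Illustrative sketch (not the authors' prescribed procedure): writing a
# checksummed manifest so archived study data can later be audited for
# completeness and integrity.
import hashlib, json, pathlib, datetime

def build_manifest(data_dir):
    data_dir = pathlib.Path(data_dir)
    entries = []
    for f in sorted(data_dir.rglob("*")):
        if f.is_file():
            digest = hashlib.sha256(f.read_bytes()).hexdigest()
            entries.append({"path": str(f.relative_to(data_dir)),
                            "sha256": digest,
                            "bytes": f.stat().st_size})
    return {"created": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "files": entries}

manifest = build_manifest("study_data")  # hypothetical data directory
pathlib.Path("manifest.json").write_text(json.dumps(manifest, indent=2))
```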

Hongwei Zhu - One of the best experts on this subject based on the ideXlab platform.

  • Data Standardization and Quality Degradation of Human-readable Data (Research-in-Progress)
    ICIS, 2015
    Co-Authors: Hongwei Zhu
    Abstract:

    Data standardization is a widely recommended solution for improving data quality. Despite the potential benefits, we examine whether it has any unintended, especially undesirable, side effects on data quality. The eXtensible Business Reporting Language (XBRL) is an XML-based open standard that aims to facilitate the preparation, exchange, and comparison of financial reports. Leveraging the unique opportunity created by the exogenous, mandatory XBRL adoption enforced by the U.S. SEC, we use a difference-in-differences (DID) research design to establish a causal relationship between XBRL adoption and the quality of HTML-formatted financial reports, an important source of firms' financial information for investors and analysts. Surprisingly, we find that mandatory XBRL adoption has degraded the quality of adopting firms' HTML-formatted financial reports, as measured by a number of data quality metrics, including spelling errors and readability. The U.S. SEC and adopting firms need to design appropriate policies to minimize these undesirable side effects.
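
The DID design reduces to an interaction term in a regression of report quality on adopter status and a post-mandate indicator. The sketch below, with simulated data and hypothetical variable names, shows where the causal estimate lives; the actual study's quality metrics and controls are richer.

```python
# Illustrative sketch of a difference-in-differences (DID) estimate:
# quality ~ treated + post + treated:post, where the interaction
# coefficient captures the effect of mandatory adoption on report
# quality. All variable names and values are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),  # 1 = XBRL-adopting firm
    "post": rng.integers(0, 2, n),     # 1 = after the SEC mandate
})
# Simulated quality score that drops for adopters after the mandate.
df["quality"] = (10 - 1.5 * df.treated * df.post
                 + rng.normal(scale=1.0, size=n))

model = smf.ols("quality ~ treated * post", data=df).fit()
print(model.params["treated:post"])  # the DID estimate (about -1.5 here)
```

A negative interaction coefficient is the pattern the paper reports: quality fell for adopting firms after the mandate, relative to the control trend.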

Paul A. Lapchak - One of the best experts on this subject based on the ideXlab platform.

  • Data Standardization and Quality Management
    Translational Stroke Research, 2018
    Co-Authors: Paul A. Lapchak, John H. Zhang
    Abstract:

    Important questions regarding the conduct of scientific research and data transparency have been raised in various scientific forums over the last 10 years. It is becoming clear that, in spite of published RIGOR guidelines, greater transparency in scientific research is required to focus the discovery and drug development process so that a treatment can be provided to stroke patients. We have the unique privilege of conducting research using animal models of a disease so that we can address the development of a new therapy, and we should do so with great care and vigilance. This document identifies valuable resources to help researchers become Good Laboratory Practices compliant and improve data transparency, and it provides guidelines for accurate data management to continue to propel the translational stroke research field forward while recognizing that there is a shortage of research funds worldwide. While data audits are being considered worldwide by funding agencies and are used extensively by industry, they remain quite controversial for basic researchers. Given the exploratory nature of basic and translational science research, the current challenging funding environment, and independent, individualized laboratory activities, it is debatable whether the current individualized, non-standardized approach to data management and monitoring is the best one. We therefore propose steps to prepare research study data in a form acceptable for archival purposes, so that standards for translational research data can be comparable to those accepted and adhered to by the clinical community. If all translational research laboratories institute and follow these guidelines when conducting translational research, data from all sources may become more comparable and reliable.
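
Complementing the manifest sketch given earlier under this paper, a data audit could then verify an archive against its manifest. This is again an illustrative sketch with hypothetical file names, not the authors' procedure.

```python
# Illustrative sketch (not the authors' procedure): auditing an archive
# against a previously written manifest, flagging missing or altered files.
import hashlib, json, pathlib

def audit(data_dir, manifest_path):
    data_dir = pathlib.Path(data_dir)
    manifest = json.loads(pathlib.Path(manifest_path).read_text())
    problems = []
    for entry in manifest["files"]:
        f = data_dir / entry["path"]
        if not f.exists():
            problems.append(f"missing: {entry['path']}")
        elif hashlib.sha256(f.read_bytes()).hexdigest() != entry["sha256"]:
            problems.append(f"altered: {entry['path']}")
    return problems

print(audit("study_data", "manifest.json") or "archive matches manifest")
```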