Improving Data Quality

The experts below are selected from a list of 155,715 experts worldwide, ranked by the ideXlab platform.

Cornelis P Balde - One of the best experts on this subject based on the ideXlab platform.

  • Enhancing e-waste estimates: Improving Data Quality by multivariate Input–Output Analysis
    Waste Management, 2013
    Co-Authors: Feng Wang, Jaco Huisman, A L N Stevels, Cornelis P Balde
    Abstract:

    Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input–Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies.
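
The link between the three IOA pillars named in the abstract (sales, stock and lifespan) can be illustrated with a minimal sketch. It assumes a Weibull lifespan distribution with fixed parameters, which the abstract does not specify, and every name and figure below (`weibull_cdf`, `waste_generated`, `stock`, the sales numbers, `SHAPE`, `SCALE`) is hypothetical and chosen only for illustration. It is not the authors' model, which uses time-varying lifespan parameters.

```python
import math

# Hypothetical units sold per year; illustrative figures, not data from the paper.
sales = {2000: 100.0, 2001: 120.0, 2002: 90.0}

# Assumed Weibull lifespan parameters (shape, scale in years); not from the paper.
SHAPE, SCALE = 2.0, 8.0


def weibull_cdf(age, shape, scale):
    """Share of a sales cohort that has been discarded by the given age (years)."""
    if age <= 0:
        return 0.0
    return 1.0 - math.exp(-((age / scale) ** shape))


def waste_generated(sales, year, shape, scale):
    """Units discarded during `year`, summed over all earlier sales cohorts."""
    total = 0.0
    for sold_year, units in sales.items():
        age = year - sold_year
        if age < 0:
            continue  # cohort not yet on the market
        # Share of the cohort discarded between age and age + 1.
        share = weibull_cdf(age + 1, shape, scale) - weibull_cdf(age, shape, scale)
        total += units * share
    return total


def stock(sales, year, shape, scale):
    """Units still in use at the end of `year`."""
    return sum(
        units * (1.0 - weibull_cdf(year - sold_year + 1, shape, scale))
        for sold_year, units in sales.items()
        if sold_year <= year
    )


# Mass balance: cumulative sales minus cumulative waste should equal the stock.
cumulative_waste = sum(waste_generated(sales, y, SHAPE, SCALE) for y in range(2000, 2011))
balance_gap = sum(sales.values()) - cumulative_waste - stock(sales, 2010, SHAPE, SCALE)
```

Because the three quantities are tied together by this balance, knowing any two of them constrains the third, which is the kind of relationship a consolidation step can use to cross-check incomplete time series.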

Feng Wang - One of the best experts on this subject based on the ideXlab platform.

  • Enhancing e-waste estimates: Improving Data Quality by multivariate Input–Output Analysis
    Waste Management, 2013
    Co-Authors: Feng Wang, Jaco Huisman, A L N Stevels, Cornelis P Balde
    Abstract:

    Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input–Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies.

Tapan S Parikh - One of the best experts on this subject based on the ideXlab platform.

  • Usher: Improving Data Quality with Dynamic Forms
    IEEE Transactions on Knowledge and Data Engineering, 2011
    Co-Authors: Kuang Chen, Harr Chen, Neil Conway, Joseph M Hellerstein, Tapan S Parikh
    Abstract:

    Data quality is a critical problem in modern databases. Data-entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose Usher, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, Usher learns a probabilistic model over the questions of the form. Usher then applies this model at every step of the data-entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible and reduces the complexity of error-prone questions. During entry, it dynamically adapts the form to the values being entered by providing real-time interface feedback, reasking questions with dubious responses, and simplifying questions by reformulating them. After entry, it revisits question responses that it deems likely to have been entered incorrectly by reasking the question or a reformulation thereof. We evaluate these components of Usher using two real-world data sets. Our results demonstrate that Usher can improve data quality considerably at a reduced cost when compared to current practice.

  • USHER: Improving Data Quality with dynamic forms
    2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), 2010
    Co-Authors: Kuang Chen, Harr Chen, Neil Conway, Joseph M Hellerstein, Tapan S Parikh
    Abstract:

    Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose USHER, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, USHER learns a probabilistic model over the questions of the form. USHER then applies this model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible. During entry, it dynamically adapts the form to the values being entered, and enables real-time feedback to guide the data enterer toward their intended values. After entry, it re-asks questions that it deems likely to have been entered incorrectly. We evaluate all three components of USHER using two real-world data sets. Our results demonstrate that each component has the potential to improve data quality considerably, at a reduced cost when compared to current practice.

  • Improving Data Quality with dynamic forms
    2009 International Conference on Information and Communication Technologies and Development (ICTD), 2009
    Co-Authors: Kuang Chen, Harr Chen, Neil Conway, Joseph M Hellerstein, Heather Dolan, Tapan S Parikh
    Abstract:

    Organizations in developing regions want to efficiently collect digital data, but standard data gathering practices from the developed world are often inappropriate. Traditional techniques for form design and data quality are expensive and labour-intensive. We propose a new data-driven approach to form design, execution (filling) and quality assurance. We demonstrate USHER, an end-to-end system that automatically generates data entry forms that enforce and maintain data quality constraints during execution. The system features a probabilistic engine that drives form-user interactions to encourage correct answers.
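
The re-asking idea described in the Usher abstracts above (learn a probabilistic model from previous submissions, then question unlikely answers) can be sketched in a few lines. This is a deliberately simplified stand-in: it uses a smoothed conditional frequency table over two fields rather than the model the Usher system actually learns, and the field names, sample history, threshold and function names (`fit_conditional`, `probability`, `should_reask`) are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical previous form submissions; illustrative values only.
history = (
    [{"age_group": "child", "occupation": "student"}] * 8
    + [{"age_group": "adult", "occupation": "farmer"}] * 6
    + [{"age_group": "adult", "occupation": "teacher"}] * 4
)


def fit_conditional(submissions, given, target):
    """Count how often each target value occurs alongside each conditioning value."""
    counts = defaultdict(Counter)
    for row in submissions:
        counts[row[given]][row[target]] += 1
    return counts


def probability(counts, given_value, target_value, n_values, alpha=1.0):
    """Laplace-smoothed estimate of P(target = target_value | given = given_value)."""
    seen = counts.get(given_value, Counter())
    return (seen[target_value] + alpha) / (sum(seen.values()) + alpha * n_values)


def should_reask(counts, given_value, target_value, n_values, threshold=0.2):
    """Flag an answer for re-asking when the model considers it unlikely."""
    return probability(counts, given_value, target_value, n_values) < threshold


counts = fit_conditional(history, given="age_group", target="occupation")
n_occupations = 3  # student, farmer, teacher

flag_unusual = should_reask(counts, "child", "farmer", n_occupations)
flag_common = should_reask(counts, "child", "student", n_occupations)
```

With this history, a "child" recorded as "farmer" falls below the assumed threshold and would be re-asked, while "student" would not. The threshold trades extra questions against missed errors and would need tuning on real submissions.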

Jaco Huisman - One of the best experts on this subject based on the ideXlab platform.

  • Enhancing e-waste estimates: Improving Data Quality by multivariate Input–Output Analysis
    Waste Management, 2013
    Co-Authors: Feng Wang, Jaco Huisman, A L N Stevels, Cornelis P Balde
    Abstract:

    Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input–Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies.

Kuang Chen - One of the best experts on this subject based on the ideXlab platform.

  • Usher: Improving Data Quality with Dynamic Forms
    IEEE Transactions on Knowledge and Data Engineering, 2011
    Co-Authors: Kuang Chen, Harr Chen, Neil Conway, Joseph M Hellerstein, Tapan S Parikh
    Abstract:

    Data quality is a critical problem in modern databases. Data-entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose Usher, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, Usher learns a probabilistic model over the questions of the form. Usher then applies this model at every step of the data-entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible and reduces the complexity of error-prone questions. During entry, it dynamically adapts the form to the values being entered by providing real-time interface feedback, reasking questions with dubious responses, and simplifying questions by reformulating them. After entry, it revisits question responses that it deems likely to have been entered incorrectly by reasking the question or a reformulation thereof. We evaluate these components of Usher using two real-world data sets. Our results demonstrate that Usher can improve data quality considerably at a reduced cost when compared to current practice.

  • USHER: Improving Data Quality with dynamic forms
    2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), 2010
    Co-Authors: Kuang Chen, Harr Chen, Neil Conway, Joseph M Hellerstein, Tapan S Parikh
    Abstract:

    Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose USHER, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, USHER learns a probabilistic model over the questions of the form. USHER then applies this model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible. During entry, it dynamically adapts the form to the values being entered, and enables real-time feedback to guide the data enterer toward their intended values. After entry, it re-asks questions that it deems likely to have been entered incorrectly. We evaluate all three components of USHER using two real-world data sets. Our results demonstrate that each component has the potential to improve data quality considerably, at a reduced cost when compared to current practice.

  • Improving Data Quality with dynamic forms
    2009 International Conference on Information and Communication Technologies and Development (ICTD), 2009
    Co-Authors: Kuang Chen, Harr Chen, Neil Conway, Joseph M Hellerstein, Heather Dolan, Tapan S Parikh
    Abstract:

    Organizations in developing regions want to efficiently collect digital data, but standard data gathering practices from the developed world are often inappropriate. Traditional techniques for form design and data quality are expensive and labour-intensive. We propose a new data-driven approach to form design, execution (filling) and quality assurance. We demonstrate USHER, an end-to-end system that automatically generates data entry forms that enforce and maintain data quality constraints during execution. The system features a probabilistic engine that drives form-user interactions to encourage correct answers.