Data Quality Dimension

The experts below are selected from a list of 75 experts worldwide, ranked by the ideXlab platform.

Mohammad Shamsul Islam - One of the best experts on this subject based on the ideXlab platform.

Marijn Janssen - One of the best experts on this subject based on the ideXlab platform.

  • Relating Big Data and Data Quality in Financial Service Organizations
    2018
    Co-Authors: Agung Wahyudi, Adiska Farhani, Marijn Janssen
    Abstract:

    Today’s financial service organizations face a Data deluge. A number of V’s are often used to characterize big Data, whereas traditional Data Quality is characterized by a number of Dimensions. Our objective is to investigate the complex relationship between big Data and Data Quality, which we do by comparing big Data characteristics with Data Quality Dimensions. Data Quality has been researched for decades, and well-defined Dimensions from that literature were adopted, whereas big Data was characterized by eleven V’s. A literature review and ten cases in financial service organizations were investigated to analyze the relationship between Data Quality and big Data. Whereas big Data characteristics and Data Quality have been viewed as separate domains, our findings show that these domains are intertwined and closely related. The findings suggest that variety is the most dominant big Data characteristic, relating to most Data Quality Dimensions: accuracy, objectivity, believability, understandability, interpretability, consistent representation, accessibility, ease of operations, relevance, completeness, timeliness, and value-added. Not surprisingly, the most dominant Data Quality Dimension is value-added, which relates to variety, validity, visibility, and vast resources. The most frequently mentioned pair of big Data characteristic and Data Quality Dimension is Velocity-Timeliness. Our findings suggest that the term ‘big Data’ is misleading, as volume (‘big’) was mostly not an issue, while variety, validity, and veracity were found to be more important. (A toy sketch of this characteristic-to-Dimension mapping follows the two entries in this list.)

  • I3E - Relating big Data and Data Quality in financial service organizations
    Lecture Notes in Computer Science, 2018
    Co-Authors: Agung Wahyudi, Adiska Farhani, Marijn Janssen
    Abstract: identical to the preceding entry.
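The mapping reported in the two entries above lends itself to a small illustration. The sketch below encodes only the characteristic-to-Dimension links explicitly named in the abstract (the full paper maps all eleven V's, which are not reproduced here) and recovers the "most dominant" characteristic and Dimension by simple counting.

```python
# Toy encoding of the characteristic-to-dimension links named in the abstract
# above. Only explicitly stated relations are included; this is not the
# paper's full mapping.
from collections import Counter

links = {
    ("variety", d)
    for d in [
        "accuracy", "objectivity", "believability", "understandability",
        "interpretability", "consistent representation", "accessibility",
        "ease of operations", "relevance", "completeness", "timeliness",
        "value-added",
    ]
} | {
    ("validity", "value-added"),
    ("visibility", "value-added"),
    ("vast resources", "value-added"),
    ("velocity", "timeliness"),
}

# Count how many dimensions each characteristic touches, and vice versa.
characteristic_counts = Counter(c for c, _ in links)
dimension_counts = Counter(d for _, d in links)

print(characteristic_counts.most_common(1))  # [('variety', 12)]
print(dimension_counts.most_common(1))       # [('value-added', 4)]
```

Counting shared pairs this way mirrors how the study arrives at variety and value-added as the dominant characteristic and Dimension.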

Markus Helfert - One of the best experts on this subject based on the ideXlab platform.

Agung Wahyudi - One of the best experts on this subject based on the ideXlab platform.

  • Relating Big Data and Data Quality in Financial Service Organizations
    2018
    Co-Authors: Agung Wahyudi, Adiska Farhani, Marijn Janssen
    Abstract: identical to the entry of the same title under Marijn Janssen above.

  • I3E - Relating big Data and Data Quality in financial service organizations
    Lecture Notes in Computer Science, 2018
    Co-Authors: Agung Wahyudi, Adiska Farhani, Marijn Janssen
    Abstract: identical to the entry of the same title under Marijn Janssen above.

Bernd Heinrich - One of the best experts on this subject based on the ideXlab platform.

  • Data Quality in recommender systems: the impact of completeness of item content Data on prediction accuracy of recommender systems
    Electronic Markets, 2019
    Co-Authors: Bernd Heinrich, Marcus Hopf, Daniel Lohninger, Alexander Schiller, Michael Szubartowicz
    Abstract:

    Recommender systems strive to guide users, especially in the field of e-commerce, to their individually best choice when a large number of alternatives is available. In general, the literature suggests that the Quality of the Data a recommender system is based on may have an important impact on recommendation Quality. In this paper, we focus on the Data Quality Dimension completeness of item content Data (i.e., features of items and their feature values) and investigate its impact on the prediction accuracy of recommender systems. In particular, we examine the increase in completeness per item, per user, and per feature as moderators of this impact. To this end, we present a theoretical model based on the literature and derive ten hypotheses. We test these hypotheses on two real-world Data sets, one from two leading web portals for restaurant reviews and another from a movie review portal. The results strongly support that, in general, prediction accuracy is positively influenced by increased completeness. However, contrary to existing literature, the results also reveal that, among other findings, increasing completeness by adding features which differ significantly from already existing features (i.e., high diversity) does not positively influence the prediction accuracy of recommender systems. (A sketch of per-item, per-user, and per-feature completeness measures appears after the entries in this list.)

  • Ein metrikbasierter Ansatz zur Messung der Aktualität von Daten in Informationssystemen (A metric-based approach for measuring the currency of Data in information systems)
    Zeitschrift für Betriebswirtschaft, 2012
    Co-Authors: Bernd Heinrich, Mathias Klier, Quirin Görz
    Abstract:

    Due to the importance of using up-to-date Data in information systems, this paper analyzes how the Data Quality Dimension currency can be measured. We design a probability-based metric that allows for an objective and largely automated assessment of Data’s currency. In contrast to existing approaches, the resulting values of the new metric meet important requirements such as ratio scale and can be interpreted as probabilities; hence, they can enter expected-value calculations for decision-making in a methodically well-founded manner. Moreover, the metric can be configured to a particular application context, taking into account both attribute-specific characteristics and supplemental Data stored in the information system. The approach is evaluated, on the one hand, against six general requirements for Data Quality metrics; on the other hand, a real-world case from a mobile services provider demonstrates the new metric’s instantiability and applicability as well as its practical benefit. (A sketch of such a probability-based currency metric appears after the entries in this list.)

  • A procedure to develop metrics for currency and its application in CRM
    Journal of Data and Information Quality, 2009
    Co-Authors: Bernd Heinrich, Mathias Klier, Marcus Kaiser
    Abstract:

    Due to the importance of using up-to-date Data in information systems, this article analyzes how the Data Quality Dimension currency can be quantified. Based on several requirements (e.g., normalization and interpretability) and a literature review, we design a procedure to develop probability-based metrics for currency which can be adjusted to the specific characteristics of Data attribute values. We evaluate the presented procedure with regard to the requirements and illustrate its applicability as well as its practical benefit. In cooperation with a major German mobile services provider, the procedure was applied in the field of campaign management in order to improve both success rates and profits. (An expected-value sketch of this use of currency in campaign selection appears below.)
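The recommender-systems entry above moderates the completeness effect per item, per user, and per feature. The sketch below shows one plausible operationalization of those three ratios (filled feature values over possible ones) on a toy item-content table; the definitions are assumptions for illustration, not the paper's exact measures.

```python
# Toy completeness ratios over an item-content table; None marks a missing
# feature value. The (filled / possible) ratios are assumed
# operationalizations, not the paper's exact definitions.
from typing import Optional

# item -> feature -> value (None = missing)
item_content: dict[str, dict[str, Optional[str]]] = {
    "restaurant_a": {"cuisine": "italian", "price": "$$", "wifi": None},
    "restaurant_b": {"cuisine": None, "price": "$", "wifi": "yes"},
    "restaurant_c": {"cuisine": "thai", "price": None, "wifi": None},
}

def completeness_per_item(item: str) -> float:
    """Share of this item's features that carry a value."""
    feats = item_content[item]
    return sum(v is not None for v in feats.values()) / len(feats)

def completeness_per_feature(feature: str) -> float:
    """Share of items for which this feature is filled."""
    values = [feats.get(feature) for feats in item_content.values()]
    return sum(v is not None for v in values) / len(values)

def completeness_per_user(rated_items: list[str]) -> float:
    """Average item completeness over the items a user has interacted with."""
    return sum(completeness_per_item(i) for i in rated_items) / len(rated_items)

print(completeness_per_item("restaurant_a"))                    # 2/3
print(completeness_per_feature("wifi"))                         # 1/3
print(completeness_per_user(["restaurant_a", "restaurant_b"]))  # 2/3
```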
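For the currency metric in the Heinrich, Klier, and Görz entry, the abstract states that values are ratio-scaled and interpretable as probabilities. A common construction in this line of work, assumed here for illustration, models an attribute's shelf life as exponentially distributed, so that currency = exp(-decline_rate * age); the decline rate and figures below are illustrative.

```python
import math

def currency(age_years: float, decline_rate: float) -> float:
    """Probability that an attribute value acquired `age_years` ago is still
    up to date, assuming an exponentially distributed shelf life with the
    given decline rate (share of values becoming outdated per year)."""
    return math.exp(-decline_rate * age_years)

# Illustrative: ~20% of customers change address per year; value is 2 years old.
print(currency(age_years=2.0, decline_rate=0.2))  # ~0.67
```

Because the result is a probability, it can feed directly into expected-value calculations, as the next sketch shows.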
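The CRM entry applies exactly that idea in campaign management: the expected value of contacting a customer is weighted by the probability that the underlying Data is still current. A minimal sketch with assumed figures (the multiplicative model and all numbers are illustrative, not the paper's calibration):

```python
# Hypothetical campaign selection: expected profit of a contact attempt,
# discounted by the probability that the stored contact Data is still
# current. All figures are illustrative assumptions.

def expected_profit(profit_if_reached: float, response_rate: float,
                    data_currency: float) -> float:
    """Expected value of a contact: the offer only pays off if the stored
    Data is still valid AND the customer responds."""
    return profit_if_reached * response_rate * data_currency

fresh = expected_profit(profit_if_reached=100.0, response_rate=0.05, data_currency=0.95)
stale = expected_profit(profit_if_reached=100.0, response_rate=0.05, data_currency=0.55)
print(fresh, stale)  # 4.75 2.75 -> prioritize contacts whose Data is current
```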