Data Vault Model

The experts below are selected from a list of 54 experts worldwide, ranked by the ideXlab platform

Vladan Jovanovic - One of the best experts on this subject based on the ideXlab platform.

  • Conceptual Model for the New Generation of Data Warehouse System Catalog
    Lecture Notes in Networks and Systems, 2019
    Co-Authors: Danijela Jakšić, Patrizia Poščić, Vladan Jovanovic
    Abstract:

    This paper introduces a formal definition of the Data Vault model and a conceptual data model of a new data warehouse (DW) system catalog (MetaData Vault Repository - MDV), which is based on the Data Vault (DV) method for database modeling. The goal of this conceptual MDV model is to serve as a basis for the future development of a new generation of DW temporal system catalogs – catalogs that will track and manage changes in the DW data and metadata, as well as in its schemas. The main contributions of this paper are: (a) a formal definition of the DV model and its main concepts, (b) a conceptual MDV model, (c) a final set of fundamental changes over the DW schema, and (d) a formal algebra for DW schema maintenance.
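
    The Data Vault method's three core constructs - hubs (business keys), links (relationships between hubs) and satellites (historized descriptive attributes) - are well documented in the Data Vault literature. The minimal Python sketch below only makes those concepts concrete for readers unfamiliar with them; it is not the paper's formal definition, and all field names are illustrative assumptions.

    ```python
    # Minimal sketch of the three core Data Vault constructs.
    # Field names are illustrative assumptions, not the paper's formal definition.
    from dataclasses import dataclass
    from datetime import datetime
    from typing import Dict, List


    @dataclass
    class Hub:
        """A unique business key plus load metadata."""
        hub_key: str            # surrogate/hash key
        business_key: str       # e.g. a customer number
        load_ts: datetime
        record_source: str


    @dataclass
    class Link:
        """An association between two or more hubs."""
        link_key: str
        hub_keys: List[str]     # keys of the connected hubs
        load_ts: datetime
        record_source: str


    @dataclass
    class Satellite:
        """Descriptive attributes of a hub or link, historized by load time."""
        parent_key: str         # hub_key or link_key being described
        load_ts: datetime
        attributes: Dict[str, str]
        record_source: str
    ```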

  • Domain/Mapping Model: A Novel Data Warehouse Data Model
    International Journal of Computers Communications & Control, 2017
    Co-Authors: Ivan Bojicic, Zoran Marjanovic, Nina Turajlić, Marko Petrovic, Milica Vuckovic, Vladan Jovanovic
    Abstract:

    In order for a data warehouse to be able to adequately fulfill its integrative and historical purpose, its data model must enable the appropriate and consistent representation of the different states of a system. In effect, a DW data model, representing the physical structure of the DW, must be general enough to be able to consume data from heterogeneous data sources and reconcile the semantic differences of the data source models, and, at the same time, be resilient to the constant changes in the structure of the data sources. One of the main problems related to DW development is the absence of a standardized DW data model. In this paper a comparative analysis of the four most prominent DW data models (namely the relational/normalized model, the Data Vault model, the anchor model and the dimensional model) is given. On the basis of the results of [1], the new DW data model (the Domain/Mapping Model - DMM), which would more adequately fulfill the posed requirements, is presented.

  • A direct approach to physical Data Vault design
    Computer Science and Information Systems, 2016
    Co-Authors: Dragoljub Krneta, Vladan Jovanovic, Zoran Marjanovic
    Abstract:

    The paper presents a novel agile approach to the large-scale design of enterprise data warehouses based on the Data Vault model. An original, simple and direct algorithm is defined for the incremental design of physical Data Vault type enterprise data warehouses, using a source data meta-model and rules; the algorithm is used in developing a prototype CASE tool for Data Vault design. This approach solves the primary requirement for a system of record, that is, preservation of all source information, and fully addresses flexibility and scalability expectations. Our approach benefits from the Data Vault's minimization of dependencies and its opportunities for rapid loads, enabling greatly simplified ETL transformations in a way not possible with traditional (i.e. non Data Vault based) data warehouse designs. The approach is illustrated using a realistic example from the healthcare domain.
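
    The abstract does not reproduce the algorithm itself, so the sketch below only illustrates the commonly cited mapping heuristic that such source-metamodel-driven rules encode: source tables with business keys become hubs, foreign-key relationships become links, and the remaining descriptive columns become satellites. The toy metamodel (patient/admission tables) is a hypothetical healthcare-flavored example, not the paper's case study or its actual algorithm.

    ```python
    # Simplified illustration of deriving a Data Vault structure from a
    # relational source meta-model. Table and column names are hypothetical;
    # this is NOT the algorithm defined in the paper, only the commonly
    # cited hub/link/satellite mapping heuristic.
    from typing import Dict

    # Toy source meta-model: table -> {"pk": [...], "fks": {col: ref_table}, "cols": [...]}
    SOURCE_METAMODEL: Dict[str, dict] = {
        "patient":   {"pk": ["patient_id"], "fks": {}, "cols": ["name", "birth_date"]},
        "admission": {"pk": ["admission_id"],
                      "fks": {"patient_id": "patient"},
                      "cols": ["admitted_at", "ward"]},
    }


    def derive_data_vault(metamodel: Dict[str, dict]) -> dict:
        """Map each source table to a hub, each FK to a link, and the
        remaining descriptive columns to a satellite."""
        hubs, links, satellites = [], [], []
        for table, meta in metamodel.items():
            hubs.append({"hub": f"hub_{table}", "business_key": meta["pk"]})
            satellites.append({"sat": f"sat_{table}", "hub": f"hub_{table}",
                               "attributes": meta["cols"]})
            for fk_col, ref_table in meta["fks"].items():
                links.append({"link": f"link_{table}_{ref_table}",
                              "hubs": [f"hub_{table}", f"hub_{ref_table}"],
                              "via": fk_col})
        return {"hubs": hubs, "links": links, "satellites": satellites}


    if __name__ == "__main__":
        from pprint import pprint
        pprint(derive_data_vault(SOURCE_METAMODEL))
    ```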

  • A comparative analysis of data warehouse data models
    2016 6th International Conference on Computers Communications and Control (ICCCC), 2016
    Co-Authors: Ivan Bojicic, Zoran Marjanovic, Nina Turajlić, Marko Petrovic, Milica Vuckovic, Vladan Jovanovic
    Abstract:

    The main purpose of data warehouses (DW) is to maintain large volumes of historical data (originating from multiple heterogeneous data sources and representing the different states of a system caused by various business events or activities) in a format that facilitates its analysis in order to support timelier and better decision-making, at both the operational and strategic level. In order for a data warehouse to be able to adequately fulfill this purpose, its data model must enable the appropriate and consistent representation of the different states of a system. In effect, a DW data model, representing the physical structure of the DW, must be general enough to be able to consume data from heterogeneous data sources (where all of the data should be treated as relevant and it must be possible to trace it back to its source) and reconcile the semantic differences of the data source models, and, at the same time, be resilient to the constant changes in the structure of the data sources. One of the main problems related to DW development is the absence of a standardized DW data model. In this paper a comparative analysis of the four most prominent DW data models (namely the relational/normalized model, the Data Vault model, the anchor model and the dimensional model) is given. These models are analyzed and compared on the basis of the following criteria: (1) semantics (i.e. the fundamental concepts), (2) resilience of the data model with regard to changes in the structure of the data sources, (3) temporal aspects and (4) completeness and traceability of the data. By identifying the strengths and weaknesses of each of these models it would be possible to establish the foundation for a new DW data model which would more adequately fulfill the posed requirements.
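
    To make criterion (2), resilience to changes in source structure, concrete: when a source adds a descriptive attribute, a normalized or dimensional table is typically altered in place, whereas the Data Vault (and, similarly, anchor) style can absorb the change by adding a new satellite without touching already-loaded structures. The sketch below contrasts the two reactions on a hypothetical customer structure; it reflects the models' publicly known conventions, not the paper's analysis or results.

    ```python
    # Contrast of how two modeling styles absorb a new source attribute
    # ("loyalty_tier" on customer). Structures are hypothetical and only
    # illustrate the resilience criterion discussed above.

    # Normalized/dimensional style: the existing table definition changes.
    dim_customer_v1 = ["customer_key", "customer_id", "name", "city"]
    dim_customer_v2 = dim_customer_v1 + ["loyalty_tier"]   # ALTER TABLE needed

    # Data Vault style: the existing hub and satellite stay untouched;
    # the new attribute arrives as an additional satellite.
    hub_customer = ["customer_hash_key", "customer_id", "load_ts", "record_source"]
    sat_customer_core = ["customer_hash_key", "load_ts", "name", "city"]
    sat_customer_loyalty = ["customer_hash_key", "load_ts", "loyalty_tier"]  # new object only
    ```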

  • MIPRO - Data modeling styles in data warehousing
    2014 37th International Convention on Information and Communication Technology Electronics and Microelectronics (MIPRO), 2014
    Co-Authors: Vladan Jovanovic, Danijela Subotic, Stevan Mrdalj
    Abstract:

    The paper presents a coordinated set of data modeling styles relevant for data warehouse design in the context of relational databases. The scope of the presented models covers: a) entity-relationship models of existing relational DB sources, b) a logical Data Vault model for integrated enterprise data warehouses as a system of record, c) a dimensional fact model for analysis leading to query prototyping and dimensional models, and d) a dimensional model for data marts. There is no claim of sufficiency, uniqueness, and/or universality of the selected set of styles beyond its utility. The principal contributions of the paper are: a definition of data modeling styles as distinct modeling mechanisms, and an initial coordination of the selected complementary styles.
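
    To complement the Data Vault sketch given earlier, the following is a minimal sketch of the dimensional style used for data marts (style d): a fact table whose rows reference surrogate keys of dimension tables. The table and column names are assumptions chosen only for illustration.

    ```python
    # Minimal sketch of a dimensional (star schema) data mart structure:
    # one fact table whose rows reference surrogate keys of dimension tables.
    # Names are illustrative only.
    from dataclasses import dataclass


    @dataclass
    class DimDate:
        date_key: int       # surrogate key, e.g. 20160131
        year: int
        month: int
        day: int


    @dataclass
    class DimCustomer:
        customer_key: int
        customer_id: str
        name: str


    @dataclass
    class FactSales:
        date_key: int       # -> DimDate.date_key
        customer_key: int   # -> DimCustomer.customer_key
        quantity: int
        amount: float
    ```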

Stefano Rizzi - One of the best experts on this subject based on the ideXlab platform.

  • ADBIS - Starry Vault: Automating Multidimensional Modeling from Data Vaults
    Advances in Databases and Information Systems, 2016
    Co-Authors: Matteo Golfarelli, Simone Graziani, Stefano Rizzi
    Abstract:

    The Data Vault model natively supports data and schema evolution, so it is often adopted to create operational data stores. However, it can hardly be used directly for OLAP querying. In this paper we propose an approach called Starry Vault for finding a multidimensional structure in Data Vaults. Starry Vault builds on the specific features of the Data Vault model to automate multidimensional modeling, and uses approximate functional dependencies to discover, out of the data, the information necessary to infer the structure of multidimensional hierarchies. The manual intervention by the user is limited to some editing of the resulting multidimensional schemata, which makes the overall process simple and quick enough to be compatible with the situational analysis needs of a data scientist.
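
    The key mechanism named in the abstract, approximate functional dependencies (AFDs), can be made concrete with a small check: X approximately determines Y if, for almost every value of X, a single value of Y dominates. The sketch below computes a violation-based error measure over in-memory rows (in the spirit of the g3 measure used in the FD-discovery literature) and accepts the dependency under a threshold; it is a generic illustration, not the Starry Vault implementation.

    ```python
    # Generic check for an approximate functional dependency X -> Y over
    # in-memory rows: the error is the fraction of rows that would have to
    # be removed for the dependency to hold exactly. This illustrates the
    # concept only; it is not the Starry Vault code.
    from collections import Counter, defaultdict
    from typing import Dict, List


    def afd_error(rows: List[Dict[str, str]], x: str, y: str) -> float:
        groups: Dict[str, Counter] = defaultdict(Counter)
        for row in rows:
            groups[row[x]][row[y]] += 1
        total = len(rows)
        # Rows kept = for each X value, the count of its most frequent Y value.
        kept = sum(counter.most_common(1)[0][1] for counter in groups.values())
        return (total - kept) / total if total else 0.0


    def holds_approximately(rows, x, y, threshold=0.05) -> bool:
        return afd_error(rows, x, y) <= threshold


    if __name__ == "__main__":
        data = [
            {"city": "Bologna", "country": "Italy"},
            {"city": "Bologna", "country": "Italy"},
            {"city": "Milan", "country": "Italy"},
            {"city": "Milan", "country": "Italia"},  # one dirty row
        ]
        print(afd_error(data, "city", "country"))                           # 0.25
        print(holds_approximately(data, "city", "country", threshold=0.3))  # True
    ```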

Timo Raitalaakso - One of the best experts on this subject based on the ideXlab platform.

  • Data Vault mappings to dimensional model using schema matching
    International Conference on Research and Practical Issues of Enterprise Information Systems, 2019
    Co-Authors: Mikko Puonti, Timo Raitalaakso
    Abstract:

    In data warehousing, business-driven development defines the data requirements needed to fulfill reporting needs. A data warehouse stores current and historical data in one single place. A data warehouse architecture consists of several layers, each with its own purpose: a staging layer is a data storage area that assists data loading, a Data Vault modelled layer is the persistent storage that integrates data and stores history, whereas the publish layer presents data using a vocabulary that is familiar to the information users. Following a process that is driven by business requirements and starts from the publish layer structure creates a situation where the manual work requires a specialist who knows the Data Vault model. Our goal is to reduce the number of entities that can be selected in a transformation so that an individual developer does not need to know the whole solution, but can focus on a subset of entities (a partial schema). In this paper, we present two different schema matchers, one based on attribute names and another based on data flow mapping information. Schema matching based on data flow mappings is a novel addition to the current schema matching literature. Through the example of Northwind, we show how these two matchers affect the formation of a partial schema for transformation source entities. Based on our experiment with Northwind we conclude that combining schema matching algorithms produces the correct entities in the partial schema.
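
    A minimal sketch of the first kind of matcher described above, attribute-name matching: attributes of a publish-layer (dimensional) entity are paired with Data Vault attributes by normalized string similarity. The entity names, the similarity measure (difflib's SequenceMatcher) and the 0.6 threshold are assumptions for illustration only; the data-flow-mapping matcher, which is the paper's novel contribution, is not reproduced here.

    ```python
    # Toy attribute-name schema matcher: pair attributes of a target
    # (publish-layer) entity with Data Vault attributes by normalized
    # string similarity. The 0.6 threshold and the similarity measure are
    # illustrative assumptions, not the paper's configuration.
    from difflib import SequenceMatcher
    from typing import List, Tuple


    def normalize(name: str) -> str:
        return name.lower().replace("_", "").replace(" ", "")


    def match_attributes(target_attrs: List[str],
                         source_attrs: List[str],
                         threshold: float = 0.6) -> List[Tuple[str, str, float]]:
        matches = []
        for t in target_attrs:
            best, best_score = None, 0.0
            for s in source_attrs:
                score = SequenceMatcher(None, normalize(t), normalize(s)).ratio()
                if score > best_score:
                    best, best_score = s, score
            if best is not None and best_score >= threshold:
                matches.append((t, best, round(best_score, 2)))
        return matches


    if __name__ == "__main__":
        dim_customer = ["customer_name", "city", "postal_code"]
        sat_customer = ["CompanyName", "City", "PostalCode", "Phone"]
        print(match_attributes(dim_customer, sat_customer))
    ```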
