Data Warehouse Design

The Experts below are selected from a list of 8577 Experts worldwide, as ranked by the ideXlab platform.

Elisa Turricchia - One of the best experts on this subject based on the ideXlab platform.

  • DaWaK - Sprint planning optimization in agile Data Warehouse Design
    Data Warehousing and Knowledge Discovery, 2012
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Agile methods have been increasingly adopted to make Data Warehouse Design faster and nimbler. They divide a Data Warehouse project into sprints (iterations), and include a sprint planning phase that is critical to ensure the project success. Several factors impact on the optimality of a sprint plan, e.g., the estimated complexity, business value, and affinity of the elemental functionalities included in each sprint, which makes the planning problem difficult. In this paper we formalize the planning problem and propose an optimization model that, given the estimates made by the project team and a set of development constraints, produces an optimal sprint plan that maximizes the business value perceived by users. The planning problem is converted into a multi-knapsack problem with constraints, given a linear programming formulation, and solved using the IBM ILOG CPLEX Optimizer. Finally, the proposed approach is validated through effectiveness and efficiency tests.
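
    As a rough illustration of the multi-knapsack view described in this abstract, the sketch below assigns functionalities to sprints by exhaustive search. It is not the authors' model: the function `plan_sprints`, the sample stories, and the capacities are hypothetical, affinity and development constraints are left out, and brute force stands in for the linear programming formulation solved with CPLEX.

```python
from itertools import product


def plan_sprints(stories, capacities):
    """Exhaustive multi-knapsack search (illustrative only).

    stories:    list of (name, complexity, business_value) tuples
    capacities: list with the complexity budget of each sprint
    Returns (best_value, plan), where plan[i] is the sprint index
    assigned to stories[i], or None if the story is left out.
    """
    options = [None] + list(range(len(capacities)))
    best_value, best_plan = -1, None
    # Try every way of placing each story in a sprint (or in none).
    for plan in product(options, repeat=len(stories)):
        load = [0] * len(capacities)
        value = 0
        for (_, complexity, business_value), sprint in zip(stories, plan):
            if sprint is not None:
                load[sprint] += complexity
                value += business_value
        # Keep the plan only if no sprint exceeds its capacity.
        fits = all(used <= cap for used, cap in zip(load, capacities))
        if fits and value > best_value:
            best_value, best_plan = value, plan
    return best_value, best_plan


# Hypothetical estimates: (functionality, complexity, business value).
stories = [
    ("sales cube", 5, 10),
    ("customer dimension", 3, 6),
    ("ETL audit", 4, 4),
    ("monthly report", 2, 5),
]
value, plan = plan_sprints(stories, capacities=[5, 5])
print(value, plan)
```

    The search grows exponentially with the number of functionalities, which is one reason a real planner would hand the same formulation to an integer programming solver instead.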

  • DaWaK - Modern software engineering methodologies meet Data Warehouse Design: 4WD
    Data Warehousing and Knowledge Discovery, 2011
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Data Warehouse systems are characterized by a long and expensive development process that hardly meets the ambitious requirements of today's market. This suggests that some further investigation on the methodological issues related to Data Warehouse Design is necessary, aimed at improving the development process from different points of view. In this paper we analyze the potential advantages arising from the application of modern software engineering methodologies to a Data Warehouse project and we propose 4WD, a Design methodology that couples the main principles emerging from these methodologies to the peculiarities of Data Warehouse projects. The principles underlying 4WD are risk-based iteration, evolutionary and incremental prototyping, user involvement, component reuse, formal and light documentation, and automated schema transformation.

  • Modern software engineering methodologies meet Data Warehouse Design: 4WD
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2011
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Data Warehouse systems are characterized by a long and expensive development process that hardly meets the ambitious requirements of today’s market. This suggests that some further investigation on the methodological issues related to Data Warehouse Design is necessary, aimed at improving the development process from different points of view. In this paper we analyze the potential advantages arising from the application of modern software engineering methodologies to a Data Warehouse project and we propose 4WD, a Design methodology that couples the main principles emerging from these methodologies to the peculiarities of Data Warehouse projects. The principles underlying 4WD are risk-based iteration, evolutionary and incremental prototyping, user involvement, component reuse, formal and light documentation, and automated schema transformation.

Matteo Golfarelli - One of the best experts on this subject based on the ideXlab platform.

  • DaWaK - Sprint planning optimization in agile Data Warehouse Design
    Data Warehousing and Knowledge Discovery, 2012
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Agile methods have been increasingly adopted to make Data Warehouse Design faster and nimbler. They divide a Data Warehouse project into sprints (iterations), and include a sprint planning phase that is critical to ensure the project success. Several factors impact on the optimality of a sprint plan, e.g., the estimated complexity, business value, and affinity of the elemental functionalities included in each sprint, which makes the planning problem difficult. In this paper we formalize the planning problem and propose an optimization model that, given the estimates made by the project team and a set of development constraints, produces an optimal sprint plan that maximizes the business value perceived by users. The planning problem is converted into a multi-knapsack problem with constraints, given a linear programming formulation, and solved using the IBM ILOG CPLEX Optimizer. Finally, the proposed approach is validated through effectiveness and efficiency tests.

  • DaWaK - Modern software engineering methodologies meet Data Warehouse Design: 4WD
    Data Warehousing and Knowledge Discovery, 2011
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Data Warehouse systems are characterized by a long and expensive development process that hardly meets the ambitious requirements of today's market. This suggests that some further investigation on the methodological issues related to Data Warehouse Design is necessary, aimed at improving the development process from different points of view. In this paper we analyze the potential advantages arising from the application of modern software engineering methodologies to a Data Warehouse project and we propose 4WD, a Design methodology that couples the main principles emerging from these methodologies to the peculiarities of Data Warehouse projects. The principles underlying 4WD are risk-based iteration, evolutionary and incremental prototyping, user involvement, component reuse, formal and light documentation, and automated schema transformation.

  • Modern software engineering methodologies meet Data Warehouse Design: 4WD
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2011
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Data Warehouse systems are characterized by a long and expensive development process that hardly meets the ambitious requirements of today’s market. This suggests that some further investigation on the methodological issues related to Data Warehouse Design is necessary, aimed at improving the development process from different points of view. In this paper we analyze the potential advantages arising from the application of modern software engineering methodologies to a Data Warehouse project and we propose 4WD, a Design methodology that couples the main principles emerging from these methodologies to the peculiarities of Data Warehouse projects. The principles underlying 4WD are risk-based iteration, evolutionary and incremental prototyping, user involvement, component reuse, formal and light documentation, and automated schema transformation.

  • Data Warehouse Design: Modern Principles and Methodologies
    2009
    Co-Authors: Matteo Golfarelli, Stefano Rizzi
    Abstract:

    Chapter 1. Introduction to Data Warehousing
    Chapter 2. Data Warehouse System Lifecycle
    Chapter 3. Analysis and Reconciliation of Data Sources
    Chapter 4. User Requirement Analysis
    Chapter 5. Conceptual Modeling
    Chapter 6. Conceptual Design
    Chapter 7. Workload and Data Volume
    Chapter 8. Logical Modeling
    Chapter 9. Logical Design
    Chapter 10. Data-staging Design
    Chapter 11. Indexes for the Data Warehouse
    Chapter 12. Physical Design
    Chapter 13. Data Warehouse Project Documentation
    Chapter 14. A Case Study
    Chapter 15. Business Intelligence: Beyond the Data Warehouse
    Glossary
    Bibliography
    Index

Stefano Rizzi - One of the best experts on this subject based on the ideXlab platform.

  • DaWaK - Sprint planning optimization in agile Data Warehouse Design
    Data Warehousing and Knowledge Discovery, 2012
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Agile methods have been increasingly adopted to make Data Warehouse Design faster and nimbler. They divide a Data Warehouse project into sprints (iterations), and include a sprint planning phase that is critical to ensure the project success. Several factors impact on the optimality of a sprint plan, e.g., the estimated complexity, business value, and affinity of the elemental functionalities included in each sprint, which makes the planning problem difficult. In this paper we formalize the planning problem and propose an optimization model that, given the estimates made by the project team and a set of development constraints, produces an optimal sprint plan that maximizes the business value perceived by users. The planning problem is converted into a multi-knapsack problem with constraints, given a linear programming formulation, and solved using the IBM ILOG CPLEX Optimizer. Finally, the proposed approach is validated through effectiveness and efficiency tests.

  • DaWaK - Modern software engineering methodologies meet Data Warehouse Design: 4WD
    Data Warehousing and Knowledge Discovery, 2011
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Data Warehouse systems are characterized by a long and expensive development process that hardly meets the ambitious requirements of today's market. This suggests that some further investigation on the methodological issues related to Data Warehouse Design is necessary, aimed at improving the development process from different points of view. In this paper we analyze the potential advantages arising from the application of modern software engineering methodologies to a Data Warehouse project and we propose 4WD, a Design methodology that couples the main principles emerging from these methodologies to the peculiarities of Data Warehouse projects. The principles underlying 4WD are risk-based iteration, evolutionary and incremental prototyping, user involvement, component reuse, formal and light documentation, and automated schema transformation.

  • Modern software engineering methodologies meet Data Warehouse Design: 4WD
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2011
    Co-Authors: Matteo Golfarelli, Stefano Rizzi, Elisa Turricchia
    Abstract:

    Data Warehouse systems are characterized by a long and expensive development process that hardly meets the ambitious requirements of today’s market. This suggests that some further investigation on the methodological issues related to Data Warehouse Design is necessary, aimed at improving the development process from different points of view. In this paper we analyze the potential advantages arising from the application of modern software engineering methodologies to a Data Warehouse project and we propose 4WD, a Design methodology that couples the main principles emerging from these methodologies to the peculiarities of Data Warehouse projects. The principles underlying 4WD are risk-based iteration, evolutionary and incremental prototyping, user involvement, component reuse, formal and light documentation, and automated schema transformation.

  • Data Warehouse Design: Modern Principles and Methodologies
    2009
    Co-Authors: Matteo Golfarelli, Stefano Rizzi
    Abstract:

    Chapter 1. Introduction to Data Warehousing
    Chapter 2. Data Warehouse System Lifecycle
    Chapter 3. Analysis and Reconciliation of Data Sources
    Chapter 4. User Requirement Analysis
    Chapter 5. Conceptual Modeling
    Chapter 6. Conceptual Design
    Chapter 7. Workload and Data Volume
    Chapter 8. Logical Modeling
    Chapter 9. Logical Design
    Chapter 10. Data-staging Design
    Chapter 11. Indexes for the Data Warehouse
    Chapter 12. Physical Design
    Chapter 13. Data Warehouse Project Documentation
    Chapter 14. A Case Study
    Chapter 15. Business Intelligence: Beyond the Data Warehouse
    Glossary
    Bibliography
    Index

Jean-François Ethier - One of the best experts on this subject based on the ideXlab platform.

  • DEXA (2) - Past Indeterminacy in Data Warehouse Design
    Lecture Notes in Computer Science, 2017
    Co-Authors: Christina Khnaisser, Luc Lavoie, Anita Burgun, Jean-François Ethier
    Abstract:

    Traditional Data Warehouse Design methods do not fully address some important challenges, particularly temporal ones. Among them, past indeterminacy is not handled systematically and uniformly. Furthermore, most methods published until now present transformation approaches by providing examples rather than general and systematic transformation rules. As a result, real-world applications require manual adaptations and implementations. This hinders scalability and long-term maintenance, and increases the risk of inconsistency in case of manual implementation. This article extends the Unified Bitemporal Historicization Framework with a set of specifications and a deterministic process that defines simple steps for transforming a non-historical Database schema into a historical schema allowing Data evolution and traceability, including past and future indeterminacy. The primary aim of this work is to help Data Warehouse Designers model historicized schemas based on a sound theory that ensures sound temporal semantics, Data integrity and query expressiveness.

  • Data Warehouse Design Methods Review: Trends, Challenges and Future Directions for the Healthcare Domain
    Advances in Databases and Information Systems, 2015
    Co-Authors: Christina Khnaisser, Luc Lavoie, Jean-François Ethier, Hassan Diab
    Abstract:

    In the secondary Data use context, traditional Data Warehouse Design methods don’t address many of today’s challenges, particularly in the healthcare domain, where semantics plays an essential role in achieving an effective and implementable heterogeneous Data integration while satisfying core requirements. Forty papers were selected based on seven core requirements: Data integrity, sound temporal schema Design, query expressiveness, heterogeneous Data integration, knowledge/source evolution integration, traceability and guided automation. Proposed methods were compared based on twenty-two comparison criteria. Analysis of the results shows important trends and challenges, among them: (1) a growing number of methods unify knowledge with source structure to obtain a well-defined Data Warehouse schema built on semantic integration; (2) none of the published methods covers all the core requirements as a whole; and (3) their potential in the real world has not yet been demonstrated.

  • ADBIS (Short Papers and Workshops) - Data Warehouse Design Methods Review: Trends, Challenges and Future Directions for the Healthcare Domain
    Communications in Computer and Information Science, 2015
    Co-Authors: Christina Khnaisser, Luc Lavoie, Hassan Diab, Jean-François Ethier
    Abstract:

    In the secondary Data use context, traditional Data Warehouse Design methods don’t address many of today’s challenges, particularly in the healthcare domain, where semantics plays an essential role in achieving an effective and implementable heterogeneous Data integration while satisfying core requirements. Forty papers were selected based on seven core requirements: Data integrity, sound temporal schema Design, query expressiveness, heterogeneous Data integration, knowledge/source evolution integration, traceability and guided automation. Proposed methods were compared based on twenty-two comparison criteria. Analysis of the results shows important trends and challenges, among them: (1) a growing number of methods unify knowledge with source structure to obtain a well-defined Data Warehouse schema built on semantic integration; (2) none of the published methods covers all the core requirements as a whole; and (3) their potential in the real world has not yet been demonstrated.

Filippo Tangorra - One of the best experts on this subject based on the ideXlab platform.

  • DaWaK - Evaluation of Data Warehouse Design Methodologies in the Context of Big Data
    Big Data Analytics and Knowledge Discovery, 2017
    Co-Authors: Francesco Di Tria, Ezio Lefons, Filippo Tangorra
    Abstract:

    Data Warehouse Design methodologies require a novel approach in the Big Data context, because they have to provide solutions to the issues related to the 5 Vs (Volume, Velocity, Variety, Veracity, and Value). It is therefore mandatory to support the Designer through automatic techniques able to quickly produce a multidimensional schema by using and integrating several Data sources, which can also be unstructured and, therefore, need ontology-based reasoning. Accordingly, the methodologies have to adopt agile techniques, in order to change the multidimensional schema as the business requirements change, without repeating the complete Design process. Furthermore, hybrid approaches must be used instead of the traditional Data-driven or requirement-driven approaches, in order to preserve adherence to user requirements and to produce a valuable multidimensional schema compliant with the Data sources. In the paper, we perform a metric comparison among different methodologies, in order to demonstrate that methodologies classified as hybrid, ontology-based, automatic, and agile are tailored for the Big Data context.

  • Cost-benefit analysis of Data Warehouse Design methodologies
    Information Systems, 2017
    Co-Authors: Francesco Di Tria, Ezio Lefons, Filippo Tangorra
    Abstract:

    Methodologies for Data Warehouse Design have multiplied in recent years, and each of them proposes a different point of view. Among all the methodologies present in the literature, the promising ones are the hybrid methodologies (because they represent the only way to ensure that a multidimensional schema is both consistent with Data sources and adherent to user business goals) and those able to support the Designer by providing some kind of automation. However, the results obtainable by the methodologies can differ substantially in terms of schema quality and required effort. In this paper, we provide metrics for evaluating the quality of multidimensional schemata in reference to the effort spent in the Design process and the automation degree of the methodology. As a case study, we apply our evaluation to the major emerging hybrid methodologies for Data Warehouse schema Design.

  • Academic Data Warehouse Design using a hybrid methodology
    Computer Science and Information Systems, 2015
    Co-Authors: Francesco Di Tria, Ezio Lefons, Filippo Tangorra
    Abstract:

    In recent years, Data warehousing has attracted attention from Universities, which are now adopting business intelligence solutions in order to analyze crucial aspects of the academic context. In this paper, we present the architecture of a Business Intelligence system for academic organizations. Then, we illustrate the Design process of the Data Warehouse devoted to the analysis of the main factors affecting the importance and the quality level of every University, such as the evaluation of the Research and the Didactics. The Design process we describe is based on a hybrid methodology that is largely automatic and relies on an ontological approach for the integration of the different Data sources.