Data Warehouse Environment

The Experts below are selected from a list of 1170 Experts worldwide ranked by ideXlab platform

Brian Lings - One of the best experts on this subject based on the ideXlab platform.

  • implementation and comparative evaluation of maintenance policies in a Data Warehouse Environment
    British National Conference on Databases, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings
    Abstract:

    Data Warehouse maintenance is the task of updating a materialised view to reflect changes to autonomous, heterogeneous and distributed sources. Selection of a maintenance policy has been shown to depend on source and view properties, and on user-specified criteria (such as staleness and response time), which are mapped on to evaluation criteria. In our previous work, we have analysed source and view characteristics, and user requirements, to derive a cost model. Maintenance policy selection has thus been cast as an optimisation problem. This paper takes a complementary approach to evaluating maintenance policies, by implementing a test-bed which allows us to vary source characteristics and wrapper location. The test-bed is instrumented to allow costs associated with a policy to be measured. An actual DBMS (InterBase) has been used as a relational source and an XML web server has been used as a non-relational source. The experiments clearly show that maintenance policy performance can be highly sensitive to source capabilities, which can therefore significantly affect policy selection. They have further substantiated some of the conjectures found in the literature. Some of the lessons learnt from this test-bed implementation and evaluation are reviewed.

  • BNCOD - Implementation and Comparative Evaluation of Maintenance Policies in a Data Warehouse Environment
    Lecture Notes in Computer Science, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings

  • a systematic approach to selecting maintenance policies in a Data Warehouse Environment
    Extending Database Technology, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings
    Abstract:

    Most work on Data warehousing addresses aspects related to the internal operation of a Data Warehouse server, such as selection of views to materialise, maintenance of aggregate views and performance of OLAP queries. Issues related to Data Warehouse maintenance, i.e. how changes to autonomous sources should be detected and propagated to a Warehouse, have been addressed in a fragmented manner. Although Data propagation policies, source Database capabilities, and user requirements have been addressed individually, their co-dependencies and relationships have not been explored. In this paper, we present a comprehensive framework for evaluating Data propagation policies against Data Warehouse requirements and source capabilities. We formalize Data Warehouse specification along the dimensions of staleness, response time, storage, and computation cost, and classify source Databases according to their Data propagation capabilities. A detailed cost-model is presented for a representative set of policies. A prototype tool has been developed to allow an exploration of the various trade-offs.

  • EDBT - A Systematic Approach to Selecting Maintenance Policies in a Data Warehouse Environment
    Advances in Database Technology — EDBT 2002, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings

  • a benchmark comparison of maintenance policies in a Data Warehouse Environment
    2001
    Co-Authors: Henrik Engstrom, Gionata Gelati, Brian Lings
    Abstract:

    A Data Warehouse contains Data originating from autonomous sources. Various maintenance policies have been suggested which specify when and how changes to a source should be propagated to the Data Warehouse. Engstrom et al. (HS-IDA-TR-00-001) present a cost-based model which makes it possible to compare and select policies based on quality of service as well as system properties. This paper presents a simulation Environment for benchmarking maintenance policies. The main aim is to compare benchmark results with predictions from the cost-model. We report results from a set of experiments, all of which correspond closely with the cost-model predictions. The process of developing the simulation Environment and conducting experiments has, in addition, given us valuable insights into the maintenance problem, which are reported in the paper.
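
The abstracts above cast maintenance policy selection as an optimisation problem over staleness, response time, storage, and computation cost. A minimal sketch of that idea follows, assuming a simple weighted sum as the objective. The policy names, the cost figures, and the weighting scheme are all hypothetical placeholders for illustration; they are not taken from the papers' cost model or from any measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyCost:
    """Estimated costs of one maintenance policy (hypothetical units)."""
    name: str
    staleness: float
    response_time: float
    storage: float
    computation: float


# Illustrative figures only: invented for this sketch, not measured.
POLICIES = [
    PolicyCost("immediate", staleness=0.0, response_time=1.0, storage=2.0, computation=9.0),
    PolicyCost("periodic", staleness=5.0, response_time=1.0, storage=2.0, computation=4.0),
    PolicyCost("on_demand", staleness=0.0, response_time=8.0, storage=0.0, computation=3.0),
]


def weighted_cost(policy: PolicyCost, weights: dict) -> float:
    """Combine the four cost dimensions using user-supplied weights."""
    return (
        weights["staleness"] * policy.staleness
        + weights["response_time"] * policy.response_time
        + weights["storage"] * policy.storage
        + weights["computation"] * policy.computation
    )


def select_policy(policies, weights) -> PolicyCost:
    """Return the policy with the lowest weighted cost."""
    return min(policies, key=lambda p: weighted_cost(p, weights))


# A user who cares mostly about fresh data.
freshness_first = {"staleness": 10.0, "response_time": 1.0, "storage": 0.0, "computation": 0.1}
# A user who cares mostly about processing load.
load_first = {"staleness": 0.1, "response_time": 0.1, "storage": 0.0, "computation": 1.0}

print(select_policy(POLICIES, freshness_first).name)
print(select_policy(POLICIES, load_first).name)
```

Changing the weights changes the selected policy, which is the kind of trade-off exploration the prototype tool mentioned above is said to support.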

J Yang - One of the best experts on this subject based on the ideXlab platform.

  • An evolutionary approach to materialized views selection in a Data Warehouse Environment
    IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews), 2001
    Co-Authors: C Zhang, J Yang
    Abstract:

    A Data Warehouse (DW) contains multiple views accessed by queries. One of the most important decisions in designing a DW is selecting views to materialize for the purpose of efficiently supporting decision making. The search space for possible materialized views is exponentially large. Therefore heuristics have been used to search for a near optimal solution. In this paper, we explore the use of an evolutionary algorithm for materialized view selection based on multiple global processing plans for queries. We apply a hybrid evolutionary algorithm to solve three related problems. The first is to optimize queries. The second is to choose the best global processing plan from multiple global processing plans. The third is to select materialized views from a given global processing plan. Our experiment shows that the hybrid evolutionary algorithm delivers better performance than either the evolutionary algorithm or heuristics used alone in terms of the minimal query and maintenance cost and the evaluation cost to obtain the minimal cost.

  • materialized view evolution support in Data Warehouse Environment
    Database Systems for Advanced Applications, 1999
    Co-Authors: C Zhang, J Yang
    Abstract:

    At a sufficiently abstract level, the Data in the Data Warehouse can be seen as a set of materialized views, where the base Data resides at the information sources. These materialized views are designed based on the users' requirements (e.g., frequently asked queries). However, a Data Warehouse is a dynamic Environment, i.e., when user query requirements change, the existing materialized views should evolve to meet the new requirements. These changes will demand schema changes at the Warehouse and should be handled with as little disruption or modification to other components of the warehousing system as possible. We propose a framework to determine whether the existing materialized views will be affected by the requirement changes, and how they are affected. Algorithms are proposed to deal with the situation when a new query is added. The aim of the algorithms is to efficiently obtain the new materialized views by analysing the relationships among queries using MVPP, a specification for a query processing plan.

  • DASFAA - Materialized view evolution support in Data Warehouse Environment
    Proceedings. 6th International Conference on Advanced Systems for Advanced Applications, 1999
    Co-Authors: C Zhang, J Yang
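
The first abstract in this list describes searching an exponentially large space of candidate materialized views with an evolutionary algorithm, minimising query cost plus maintenance cost. The sketch below shows a plain genetic algorithm over bit strings, where bit i means "materialize view i". It is not the authors' hybrid algorithm and does not use MVPP; the cost tables and all function names are hypothetical and chosen only to make the example self-contained.

```python
import random

# Hypothetical inputs for illustration.
MAINTENANCE = [4.0, 6.0, 3.0, 5.0]        # cost of keeping each view up to date
BASE_QUERY_COST = [20.0, 15.0, 30.0]      # cost of each query from base relations
VIEW_QUERY_COST = [                       # query -> {view index: cost using that view}
    {0: 5.0, 1: 8.0},
    {1: 4.0},
    {2: 10.0, 3: 6.0},
]


def total_cost(selection):
    """Maintenance cost of selected views plus the cheapest way to answer each query."""
    maintenance = sum(m for m, bit in zip(MAINTENANCE, selection) if bit)
    query = 0.0
    for base, options in zip(BASE_QUERY_COST, VIEW_QUERY_COST):
        usable = [cost for view, cost in options.items() if selection[view]]
        query += min([base] + usable)
    return maintenance + query


def evolve(generations=50, population_size=20, mutation_rate=0.1, seed=0):
    """Elitist genetic algorithm: keep the better half, refill by crossover and mutation."""
    rng = random.Random(seed)
    n = len(MAINTENANCE)
    population = [
        tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=total_cost)
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)                 # one-point crossover
            child = a[:cut] + b[cut:]
            child = tuple(
                bit ^ int(rng.random() < mutation_rate)  # flip each bit with small probability
                for bit in child
            )
            children.append(child)
        population = parents + children
    return min(population, key=total_cost)


best = evolve()
print(best, total_cost(best))
```

Because the better half of each generation is carried over unchanged, the best cost found never gets worse from one generation to the next; on a search space this small, exhaustive enumeration would of course also work.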

Henrik Engstrom - One of the best experts on this subject based on the ideXlab platform.

  • implementation and comparative evaluation of maintenance policies in a Data Warehouse Environment
    British National Conference on Databases, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings

  • BNCOD - Implementation and Comparative Evaluation of Maintenance Policies in a Data Warehouse Environment
    Lecture Notes in Computer Science, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings

  • a systematic approach to selecting maintenance policies in a Data Warehouse Environment
    Extending Database Technology, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings

  • EDBT - A Systematic Approach to Selecting Maintenance Policies in a Data Warehouse Environment
    Advances in Database Technology — EDBT 2002, 2002
    Co-Authors: Henrik Engstrom, Sharma Chakravarthy, Brian Lings

  • a benchmark comparison of maintenance policies in a Data Warehouse Environment
    2001
    Co-Authors: Henrik Engstrom, Gionata Gelati, Brian Lings

Masaki Hasegawa - One of the best experts on this subject based on the ideXlab platform.

  • parallel generation of base relation snapshots for materialised view maintenance in Data Warehouse Environment
    Computational Science and Engineering, 2007
    Co-Authors: S. Saeki, Subhash Bhalla, Masaki Hasegawa
    Abstract:

    A Data Warehouse organises and stores Data needed for informational, analytical processing over a long historical time perspective. It keeps a materialised view (such as historical Data), and user queries are processed using this view. The view has to be maintained to reflect the updates done against the base relations stored at the various distributed Data sources. For unsophisticated sources, in practice it is often necessary to infer modifications by periodically comparing snapshots and backup copies of Data from the source. This study considers the materialised view and its maintenance. Various differential snapshot algorithms have been compared.

  • parallel generation of base relation snapshots for materialized view maintenance in Data Warehouse Environment
    International Conference on Parallel Processing, 2002
    Co-Authors: S. Saeki, Subhash Bhalla, Masaki Hasegawa
    Abstract:

    Data Warehouses are used in many applications that depend on distributed systems. A Data Warehouse supports information processing by providing a single platform of integrated, historical Data for analysis. Data Warehouses provide the facility for integration in a world of unintegrated application systems. The contents of a Data Warehouse evolve in a step-at-a-time fashion. A Data Warehouse organizes and stores the Data needed for informational, analytical processing over a long historical time perspective. Data Warehouses keep a materialized view (such as historical Data), and user queries are processed using this view. The view has to be maintained to reflect the updates done against the base relations stored at the various distributed Data sources. Detecting and extracting modifications from information sources is an integral part of a Data Warehouse. For unsophisticated sources, in practice it is often necessary to infer modifications by periodically comparing snapshots and backup copies of Data from the source. This study considers the materialized view and its maintenance. Implementations of various differential snapshot algorithms have been compared, and their performance evaluated, to identify suitable alternatives.
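
Both abstracts above mention inferring modifications by comparing periodic snapshots of a source. A naive in-memory sketch of such a differential snapshot comparison is shown below. It assumes each snapshot fits in memory as a mapping from primary key to row; the function name and the sample rows are hypothetical, and the parallel algorithms studied in the papers are not reproduced here.

```python
def diff_snapshots(old, new):
    """Compare two snapshots (dict: primary key -> row) and return the changes.

    Returns three dicts:
      inserts: key -> new row, for keys only in the new snapshot
      deletes: key -> old row, for keys only in the old snapshot
      updates: key -> (old row, new row), for keys whose row changed
    """
    inserts = {key: new[key] for key in new.keys() - old.keys()}
    deletes = {key: old[key] for key in old.keys() - new.keys()}
    updates = {
        key: (old[key], new[key])
        for key in old.keys() & new.keys()
        if old[key] != new[key]
    }
    return inserts, deletes, updates


# Hypothetical snapshots of a source relation, keyed by an id column.
snapshot_monday = {1: ("widget", 10), 2: ("gadget", 5), 3: ("gizmo", 7)}
snapshot_tuesday = {1: ("widget", 12), 3: ("gizmo", 7), 4: ("doohickey", 1)}

inserts, deletes, updates = diff_snapshots(snapshot_monday, snapshot_tuesday)
print("inserts:", inserts)
print("deletes:", deletes)
print("updates:", updates)
```

The three change sets are what a warehouse would then apply to its materialized view instead of recomputing the view from scratch.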

C Zhang - One of the best experts on this subject based on the ideXlab platform.

  • An evolutionary approach to materialized views selection in a Data Warehouse Environment
    IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews), 2001
    Co-Authors: C Zhang, J Yang

  • materialized view evolution support in Data Warehouse Environment
    Database Systems for Advanced Applications, 1999
    Co-Authors: C Zhang, J Yang

  • DASFAA - Materialized view evolution support in Data Warehouse Environment
    Proceedings. 6th International Conference on Advanced Systems for Advanced Applications, 1999
    Co-Authors: C Zhang, J Yang