Data Warehouse System

The Experts below are selected from a list of 16494 Experts worldwide ranked by ideXlab platform

Derya Birant - One of the best experts on this subject based on the ideXlab platform.

  • ST-DBSCAN: An algorithm for clustering spatial–temporal data
    Data and Knowledge Engineering, 2007
    Co-Authors: Derya Birant
    Abstract:

    This paper presents a new density-based clustering algorithm, ST-DBSCAN, which is based on DBSCAN. We propose three marginal extensions to DBSCAN related to the identification of (i) core objects, (ii) noise objects, and (iii) adjacent clusters. In contrast to existing density-based clustering algorithms, our algorithm can discover clusters according to the non-spatial, spatial, and temporal values of the objects. We also present a spatial-temporal data warehouse system designed for storing and clustering a wide range of spatial-temporal data. We show an implementation of our algorithm using this data warehouse and present the data mining results.
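
The idea of density-based clustering over both space and time can be illustrated with a minimal sketch. This is not the paper's ST-DBSCAN: it only requires neighbors to be close in space and in time, and it omits the non-spatial value handling and the adjacent-cluster checks the abstract mentions. The names `eps_space`, `eps_time`, and `min_pts` and the `(x, y, t)` point layout are assumptions made for illustration.

```python
from math import hypot

NOISE = -1


def st_neighbors(points, i, eps_space, eps_time):
    """Indices of points close to points[i] in both space and time."""
    x, y, t = points[i]
    return [
        j
        for j, (xj, yj, tj) in enumerate(points)
        if j != i
        and hypot(x - xj, y - yj) <= eps_space
        and abs(t - tj) <= eps_time
    ]


def st_cluster(points, eps_space, eps_time, min_pts):
    """DBSCAN-style labelling; returns one label per point, NOISE for noise."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neigh = st_neighbors(points, i, eps_space, eps_time)
        if len(neigh) + 1 < min_pts:  # not a core object (count includes i)
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = list(neigh)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            more = st_neighbors(points, j, eps_space, eps_time)
            if len(more) + 1 >= min_pts:  # j is itself a core object
                queue.extend(more)
        cluster += 1
    return labels


# Hypothetical sample: three nearby observations, one far in time, one far in space.
sample = [(0, 0, 0), (0.1, 0, 1), (0, 0.1, 2), (0.1, 0.1, 50), (5, 5, 0)]
print(st_cluster(sample, eps_space=0.5, eps_time=3, min_pts=3))
```

The fourth point is spatially close to the first three but falls outside the temporal window, so a purely spatial DBSCAN would have merged it into the cluster while this sketch labels it as noise.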

  • An algorithm to discover spatial–temporal distributions of physical seawater characteristics and a case study in Turkish seas
    Journal of Marine Science and Technology, 2006
    Co-Authors: Derya Birant
    Abstract:

    Clustering is one of the major data mining methods to obtain a number of clues about how the physical properties of the water are distributed in a marine environment. It is a difficult problem, especially when we consider the task for spatial–temporal marine data. This study introduces a new clustering algorithm to discover regions that have similar physical seawater characteristics. In contrast to the existing density-based clustering algorithms, our algorithm has the ability of discovering clusters according to the nonspatial, spatial, and temporal values of the objects. Our algorithm also overcomes three drawbacks of existing clustering algorithms: problems in the identification of core objects, noise objects, and adjacent clusters. This paper also presents a spatial–temporal marine data warehouse system designed for storing and clustering physical data from Turkish seas. Special functions were developed for data integration, data conversion, querying, visualization, analysis, and management. User-friendly interfaces were also developed, allowing relatively inexperienced users to operate the system. As a case study, we show the spatial–temporal distributions of sea surface temperature, sea surface height residual, and significant wave height values in Turkish seas to demonstrate our algorithm.

Jürgen Pleiss - One of the best experts on this subject based on the ideXlab platform.

  • DWARF – a data warehouse system for analyzing protein families
    BMC Bioinformatics, 2006
    Co-Authors: Markus Fischer, Quan K Thai, Melanie Grieb, Jürgen Pleiss
    Abstract:

    The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data, and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification to homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering Database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering.
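
A relational model with protein, sequence, and structure sections, as the abstract describes, could look roughly like the sketch below. Every table name, column name, and sample row here is hypothetical and chosen only for illustration; the actual DWARF schema is not given in the abstract.

```python
import sqlite3

# Hypothetical three-section schema: protein -> sequence -> structure.
SCHEMA = """
CREATE TABLE protein (
    protein_id        INTEGER PRIMARY KEY,
    name              TEXT NOT NULL,
    source_organism   TEXT,
    homologous_family TEXT,
    superfamily       TEXT
);
CREATE TABLE protein_sequence (
    sequence_id INTEGER PRIMARY KEY,
    protein_id  INTEGER NOT NULL REFERENCES protein(protein_id),
    residues    TEXT NOT NULL
);
CREATE TABLE protein_structure (
    structure_id        INTEGER PRIMARY KEY,
    sequence_id         INTEGER NOT NULL REFERENCES protein_sequence(sequence_id),
    secondary_structure TEXT
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up entry per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO protein VALUES (1, 'example lipase', 'example organism', "
        "'family A', 'superfamily 1')"
    )
    conn.execute("INSERT INTO protein_sequence VALUES (1, 1, 'MKT')")
    conn.execute("INSERT INTO protein_structure VALUES (1, 1, 'HHE')")
    return conn


def sequences_with_structure(conn, superfamily):
    """Join the three sections for every protein in one superfamily."""
    return conn.execute(
        """
        SELECT p.name, s.residues, st.secondary_structure
        FROM protein AS p
        JOIN protein_sequence AS s ON s.protein_id = p.protein_id
        JOIN protein_structure AS st ON st.sequence_id = s.sequence_id
        WHERE p.superfamily = ?
        """,
        (superfamily,),
    ).fetchall()


if __name__ == "__main__":
    print(sequences_with_structure(build_demo_db(), "superfamily 1"))
```

Keeping sequence and structure in separate tables lets a sequence exist without an experimentally determined structure, which matches the abstract's ratio of many sequences to far fewer structures.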

Markus Fischer - One of the best experts on this subject based on the ideXlab platform.

  • DWARF – a data warehouse system for analyzing protein families
    BMC Bioinformatics, 2006
    Co-Authors: Markus Fischer, Quan K Thai, Melanie Grieb, Jürgen Pleiss
    Abstract:

    The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data, and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification to homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering Database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering.

Hua Yang Lin - One of the best experts on this subject based on the ideXlab platform.

  • A fuzzy-based decision-making procedure for data warehouse system selection
    Expert Systems with Applications, 2007
    Co-Authors: Hua Yang Lin, Ping-yu Hsu, Gwo-ji Sheen
    Abstract:

    The increase in the number of companies seeking data warehousing solutions, in order to gain significant business advantages, has created the need for a decision-aid approach in choosing appropriate data warehouse (DW) systems. Owing to the vague concepts frequently represented in decision environments, we have proposed a fuzzy multi-criteria decision-making procedure to facilitate data warehouse system selection, with consideration given to both technical and managerial criteria. The procedure can systematically construct the objectives of DW systems selection to support the business goals and requirements of an organization, and identify the appropriate attributes or criteria for evaluation. In the fuzzy-based method, the weight of each criterion and the rating of each alternative are described using linguistic terms, which can also be expressed as triangular fuzzy numbers. The fuzzy algorithm aggregated the decision-makers' preference ratings for criteria, and the suitability of data warehouse alternatives versus the selection criteria, to calculate fuzzy appropriateness indices, through which the most suitable data warehouse system was determined. A case study of a Bar Code Implementation Project for Agricultural Products in Taiwan was conducted to illustrate this method's effectiveness.
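
The aggregation of linguistic weights and ratings into a fuzzy appropriateness index can be sketched as follows. The linguistic scales, the criteria names, the componentwise product used to multiply triangular fuzzy numbers, and the centroid used for ranking are all assumptions for illustration; the abstract does not state which scales or ranking method the authors used.

```python
from statistics import fmean

# Hypothetical linguistic scales as triangular fuzzy numbers (low, mode, high).
WEIGHT = {"low": (0.0, 0.2, 0.4), "medium": (0.3, 0.5, 0.7), "high": (0.6, 0.8, 1.0)}
RATING = {"poor": (0, 2, 4), "fair": (3, 5, 7), "good": (6, 8, 10)}


def mean_tfn(tfns):
    """Componentwise average of triangular fuzzy numbers."""
    return tuple(fmean(component) for component in zip(*tfns))


def mul_tfn(a, b):
    """Common componentwise approximation of the product of nonnegative TFNs."""
    return tuple(x * y for x, y in zip(a, b))


def appropriateness(weight_terms, rating_terms):
    """Fuzzy appropriateness index of one alternative.

    Both arguments map a criterion name to the list of linguistic terms
    given by the decision-makers (one term per decision-maker).
    """
    products = []
    for criterion, terms in weight_terms.items():
        weight = mean_tfn([WEIGHT[t] for t in terms])
        rating = mean_tfn([RATING[t] for t in rating_terms[criterion]])
        products.append(mul_tfn(weight, rating))
    return mean_tfn(products)


def centroid(tfn):
    """Defuzzify a TFN so that alternatives can be ranked (assumed method)."""
    return sum(tfn) / 3


# Two decision-makers, two criteria, two hypothetical alternatives.
weights = {"cost": ["high", "high"], "performance": ["medium", "high"]}
system_a = {"cost": ["good", "fair"], "performance": ["fair", "fair"]}
system_b = {"cost": ["poor", "fair"], "performance": ["good", "good"]}

scores = {
    "A": centroid(appropriateness(weights, system_a)),
    "B": centroid(appropriateness(weights, system_b)),
}
print(max(scores, key=scores.get), scores)
```

With these made-up inputs, alternative A ranks higher because both decision-makers weight cost as highly important and A is rated better on cost.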

  • Application of the Analytic Hierarchy Process on Data Warehouse System Selection Decisions for Small and Large Enterprises in Taiwan
    2007
    Co-Authors: Hua Yang Lin, Ping-yu Hsu
    Abstract:

    The study investigates the practice of data warehouse system selection decisions for small and medium-sized enterprises (SMEs) and large enterprises (LEs) in Taiwan. The increasing number of companies looking for data warehousing solutions to gain a significant business advantage has created a need for a systematic method for choosing an appropriate data warehouse system. The aim of this study is to determine the significant factors that influence the data warehouse system selection of LEs and SMEs. Both technical and managerial factors are considered in structuring an evaluation hierarchy based on the Analytic Hierarchy Process (AHP). The AHP is used to examine the relative importance of data warehouse system selection criteria and subcriteria. The results indicate that SMEs selecting data warehouse systems mainly concentrate on cost and vendor criteria, while LEs focus on technical criteria.
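
How the AHP turns pairwise judgments into criterion weights can be shown with a small sketch. It uses the row geometric mean method, a common approximation of the principal-eigenvector weights, together with Saaty's consistency ratio. The three criteria echo the abstract, but the pairwise judgments are invented for illustration and are not the study's data.

```python
from math import prod

# Saaty's random consistency index for small matrices.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Priority weights via the row geometric mean approximation."""
    n = len(matrix)
    geometric_means = [prod(row) ** (1 / n) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio; values below about 0.1 are usually accepted."""
    n = len(matrix)
    weighted = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(x / w for x, w in zip(weighted, weights)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments: entry [i][j] says how much more important
# criterion i is than criterion j on Saaty's 1-9 scale.
criteria = ["cost", "vendor", "technical"]
judgments = [
    [1, 2, 6],
    [1 / 2, 1, 3],
    [1 / 6, 1 / 3, 1],
]

weights = ahp_weights(judgments)
for name, weight in zip(criteria, weights):
    print(f"{name}: {weight:.2f}")
print(f"consistency ratio: {consistency_ratio(judgments, weights):.3f}")
```

The example matrix is perfectly consistent by construction, so the consistency ratio is essentially zero; real survey judgments rarely are, which is why the ratio is checked before the weights are interpreted.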

  • Application of the AHP in data warehouse system selection decisions for SMEs in Taiwan
    International Journal of Management and Enterprise Development, 2006
    Co-Authors: Hua Yang Lin, Ping-yu Hsu, Yung Tai Yeh
    Abstract:

    The study investigates the practice of data warehouse system selection decisions for small and medium-sized enterprises (SMEs) in Taiwan. As data warehouse system evaluation and selection are often costly and time-consuming, the need for a decision-aid approach to software selection is obvious, especially for SMEs. The aim of this study is to determine significant factors that influence the data warehouse system selection of SMEs. Both technical and managerial factors are considered in structuring an evaluation hierarchy based on the Analytic Hierarchy Process (AHP). The results indicate that SMEs selecting data warehouse systems focus mainly on cost and vendor criteria.

Shiming Huang - One of the best experts on this subject based on the ideXlab platform.

  • A Space-Efficient Protocol for Consistency of External View Maintenance on Data Warehouse Systems
    Data Warehousing and Mining, 2008
    Co-Authors: Shiming Huang, David C. Yen, Hsiang-yuan Hsueh
    Abstract:

    The materialized view approach is widely adopted in implementations of data warehouse systems for efficiency purposes. In the construction of a materialized data warehouse system, some managerial problems still exist for most developers and users, particularly in the area of view resource maintenance. Resource redundancy and data inconsistency among materialized views in a data warehouse system are problems that many developers and users struggle with. In this article, a space-efficient protocol for materialized view maintenance with a global data view on data warehouses with embedded proxies is proposed. In the protocol set, multilevel proxy-based protocols with a data compensating mechanism are provided to certify the consistency and uniqueness of materialized data among data resources and materialized views. The authors also provide a set of evaluation experiences and derivations to verify the feasibility of the proposed protocols and mechanisms. With such protocols as proxy services, the performance and space utilization of the materialized view approach will be improved. Furthermore, the consistency issue among materialized data warehouses and heterogeneous data sources can be properly addressed by applying a dynamic compensating and synchronization mechanism. The trade-off between efficiency, storage consumption, and data validity for view maintenance tasks can be properly balanced.

  • Intelligent Cache Management for Mobile Data Warehouse Systems
    Data Warehousing and Mining, 2008
    Co-Authors: Shiming Huang, Binshan Lin, Qun-shi Deng
    Abstract:

    This research proposes an intelligent cache mechanism for a data warehouse system in a mobile environment. Because mobile devices can often be disconnected from the host server and due to the low bandwidth of wireless networks, it is more efficient to store query results from a mobile device in the cache. For more personal use of mobile devices, we use a data mining technique to determine the pattern from a record of previous queries. Then the data, which will be retrieved by the user, are prefetched and stored in the cache, thus improving the query efficiency. We demonstrate the feasibility of the proposed approach with experiments using simulation. Comparison of our approach with a standard approach indicates that there is a significant advantage to using mobile data warehouse systems.
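
The prefetching idea can be sketched as follows. The abstract does not name the data mining technique, so this sketch substitutes a very simple one: counting which query most often follows each query in the history. The class name, method names, and query strings are hypothetical.

```python
from collections import Counter, defaultdict


class PrefetchingCache:
    """Query cache that prefetches the most likely next query (a sketch)."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable: query -> result, e.g. a server round trip
        self.cache = {}
        self.successors = defaultdict(Counter)
        self.hits = 0

    def learn(self, history):
        """Count, for each query in the log, which query followed it."""
        for current, following in zip(history, history[1:]):
            self.successors[current][following] += 1

    def query(self, q):
        """Answer q from the cache if possible, then prefetch its likely successor."""
        if q in self.cache:
            self.hits += 1
        else:
            self.cache[q] = self.fetch(q)
        followers = self.successors.get(q)
        if followers:
            predicted = followers.most_common(1)[0][0]
            if predicted not in self.cache:
                self.cache[predicted] = self.fetch(predicted)
        return self.cache[q]


# Hypothetical usage: record which queries actually reach the server.
server_calls = []


def remote_fetch(q):
    server_calls.append(q)
    return f"result of {q}"


cache = PrefetchingCache(remote_fetch)
cache.learn(["sales_q1", "sales_q2", "stock", "sales_q1", "sales_q2", "stock"])
cache.query("sales_q1")  # miss; also prefetches "sales_q2"
cache.query("sales_q2")  # served from the cache
print(cache.hits, server_calls)
```

A real mobile cache would also need a size limit and an eviction policy, which this sketch leaves out to keep the prefetch step visible.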

  • A Space-Efficient Protocol for Consistency of External View Maintenance on Data Warehouse Systems: A Proxy Approach
    Journal of Database Management, 2007
    Co-Authors: Shiming Huang, David C. Yen, Hsiang-yuan Hsueh
    Abstract:

    The materialized view approach is widely adopted in implementations of data warehouse systems for efficiency purposes. In the construction of a materialized data warehouse system, some managerial problems still exist for most developers and users, particularly in the area of view resource maintenance. Resource redundancy and data inconsistency among materialized views in a data warehouse system are problems that many developers and users struggle with. In this article, a space-efficient protocol for materialized view maintenance with a global data view on data warehouses with embedded proxies is proposed. In the protocol set, multilevel proxy-based protocols with a data compensating mechanism are provided to certify the consistency and uniqueness of materialized data among data resources and materialized views. The authors also provide a set of evaluation experiences and derivations to verify the feasibility of the proposed protocols and mechanisms. With such protocols as proxy services, the performance and space utilization of the materialized view approach will be improved. Furthermore, the consistency issue among materialized data warehouses and heterogeneous data sources can be properly addressed by applying a dynamic compensating and synchronization mechanism. The trade-off between efficiency, storage consumption, and data validity for view maintenance tasks can be properly balanced.

  • IDEAL - The Development of an XML-Based Data Warehouse System
    Intelligent Data Engineering and Automated Learning — IDEAL 2002, 2002
    Co-Authors: Shiming Huang
    Abstract:

    With enterprise globalization and the popularization of the Internet, the Internet-based data warehouse system (DWS) has gradually replaced the traditional DWS and become the mainstream structure. Managers can easily obtain and share data on the distribution system using the Internet. By collecting data from multiple sources, the quality and breadth of the DWS can be increased, helping managers to make more decisive policies. However, utilizing the basic client/server structure of DWS can increase many tolerance and cost-based problems. This paper uses XML to establish the Internet-based DWS and utilizes its flexibility, self-definition, self-description and low cost to remedy the unavoidable defects of the client/server DWS. We also use pull and push approaches to determine what information can be shared on the Internet or delivered through e-mail. In this work, we show that the DWS architecture can not only improve scalability and speed but also enhance system security. In addition, it can be applied to both traditional client/server DWS and web-based DWS. We present a case study to prove the validity of this system architecture and create a prototype system to show its feasibility.
