Data Integration System

The Experts below are selected from a list of 2622 Experts worldwide ranked by the ideXlab platform.

Sun Zhihui - One of the best experts on this subject based on the ideXlab platform.

Yacine Sam - One of the best experts on this subject based on the ideXlab platform.

  • Querying a Semi-Automated Data Integration System
    DEXA (2), Lecture Notes in Computer Science, 2012
    Co-Authors: Cheikh Niang, Béatrice Bouchou-Markhoff, Yacine Sam
    Abstract:

    A data integration system enables users to query a unified view of data sources through a global schema. We consider a semi-automatically built data integration system in the semantic web context, where data sources are annotated with ontologies. The global schema is also an ontology, expressed in DL-\(Lite_{\cal A}\). After the semi-automated building of the global schema in a previous work, we focus here on the second part of this system, dedicated to the query answering process. We show how it can rely on either global-as-view (GAV) or local-as-view (LAV) mappings, which are computed automatically. We present algorithms for both cases, and we discuss the properties of this semi-automatically built data integration system.
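
The GAV case mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm and it ignores ontology reasoning entirely; it only shows the general idea that, under GAV mappings, each global relation is defined as a view over the sources, so a query on the global schema can be answered by unfolding that definition. The relation names (`Author`, `s1_author`, `s2_writer`) and the sample rows are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Row = Tuple[str, ...]

# Hypothetical source relations (illustrative data, not from the paper).
SOURCES: Dict[str, List[Row]] = {
    "s1_author": [("Niang", "p1"), ("Sam", "p1")],
    "s2_writer": [("Sam", "p1"), ("Weld", "p2")],
}

# GAV mapping: each global relation is defined as a union of source relations.
GAV: Dict[str, List[str]] = {
    "Author": ["s1_author", "s2_writer"],
}


def unfold(global_relation: str) -> Set[Row]:
    """Answer a scan of a global relation by unfolding its GAV definition."""
    rows: Set[Row] = set()
    for source_name in GAV[global_relation]:
        rows.update(SOURCES[source_name])  # set union removes duplicates
    return rows


def authors_of(paper_id: str) -> List[str]:
    """A selection query on the global schema, evaluated after unfolding."""
    return sorted(name for name, pid in unfold("Author") if pid == paper_id)
```

Calling `authors_of("p1")` queries only the global relation `Author`; the caller never names a source. LAV mappings describe each source as a view over the global schema instead, which generally calls for query rewriting rather than simple unfolding.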

Zhang Xia-ning - One of the best experts on this subject based on the ideXlab platform.

  • Dynamic composition of services in Data Integration System
    Journal of Computer Applications, 2010
    Co-Authors: Zhang Xia-ning
    Abstract:

    With the growing number of services in data integration systems, it is necessary to compose existing services dynamically according to the service request. Service similarity is measured through the similarity of the service ontologies. An optimized graph for service composition is constructed based on service similarity, so that once the graph is converted into a tree, the service composition problem becomes a tree search problem. An efficient algorithm based on the search tree is then presented to accomplish service composition. Simulation shows that, compared with existing methods, the method ensures quality and efficiency while composing services automatically according to the service request.

Daniel S. Weld - One of the best experts on this subject based on the ideXlab platform.

  • The Nimble XML Data Integration System
    International Conference on Data Engineering, 2001
    Co-Authors: Denise L. Draper, Alon Halevy, Daniel S. Weld
    Abstract:

    For better or for worse, XML has emerged as a de facto standard for data interchange. This consensus is likely to lead to increased demand for technology that allows users to integrate data from a variety of applications, repositories, and partners, which are located across the corporate intranet or on the Internet. Nimble Technology has spent two years developing a product to service this market. Originally conceived after decades of person-years of research on data integration, the product is now being deployed at several Fortune-500 beta-customer sites. The article reports on the key challenges faced in the design of our product and highlights some issues which require more attention from the research community. In particular, we address architectural issues arising from designing a product to support XML as its core representation, choices in the design of the underlying algebra, on-the-fly data cleaning, and caching and materialization policies.

  • An Adaptive Query Execution System for Data Integration
    International Conference on Management of Data, 1999
    Co-Authors: Zachary G Ives, Daniela Florescu, Marc Friedman, Alon Y Levy, Daniel S. Weld
    Abstract:

    Query processing in data integration occurs over network-bound, autonomous data sources. This requires extensions to traditional optimization and execution techniques for three reasons: there is an absence of quality statistics about the data, data transfer rates are unpredictable and bursty, and slow or unavailable data sources can often be replaced by overlapping or mirrored sources. This paper presents the Tukwila data integration system, designed to support adaptivity at its core using a two-pronged approach. Interleaved planning and execution with partial optimization allows Tukwila to quickly recover from decisions based on inaccurate estimates. During execution, Tukwila uses adaptive query operators such as the double pipelined hash join, which produces answers quickly, and the dynamic collector, which robustly and efficiently computes unions across overlapping data sources. We demonstrate that the Tukwila architecture extends previous innovations in adaptive execution (such as query scrambling, mid-execution re-optimization, and choose nodes), and we present experimental evidence that our techniques result in behavior desirable for a data integration system.
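
The double pipelined hash join named in the abstract can be sketched as a symmetric hash join: both inputs get their own hash table, and every arriving tuple is inserted into its side's table and immediately probed against the other side's table, so results appear as soon as a matching pair has arrived. The following is a minimal in-memory sketch under stated assumptions, not Tukwila's implementation: it ignores memory overflow handling, and the `arrivals` stream of `("L" | "R", row)` pairs is a hypothetical stand-in for tuples arriving from two network sources in arbitrary order.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Iterator, List, Tuple

Row = Tuple[Hashable, ...]


def double_pipelined_hash_join(
    arrivals: Iterable[Tuple[str, Row]],
    key_left: int,
    key_right: int,
) -> Iterator[Row]:
    """Yield joined rows incrementally as tuples arrive from either input."""
    left_table: DefaultDict[Hashable, List[Row]] = defaultdict(list)
    right_table: DefaultDict[Hashable, List[Row]] = defaultdict(list)

    for side, row in arrivals:
        if side == "L":
            key = row[key_left]
            left_table[key].append(row)          # build on own side
            for match in right_table.get(key, []):  # probe the other side
                yield row + match
        elif side == "R":
            key = row[key_right]
            right_table[key].append(row)
            for match in left_table.get(key, []):
                yield match + row                # keep left columns first
        else:
            raise ValueError(f"unknown input side: {side!r}")


# Usage: tuples from the two inputs arrive interleaved.
stream = [("L", (1, "a")), ("R", (2, "x")), ("R", (1, "y")), ("L", (2, "b"))]
joined = list(double_pipelined_hash_join(stream, key_left=0, key_right=0))
```

Because neither input has to be fully read before the first result is produced, a slow source delays only the results that depend on its tuples, which is the property that matters for the unpredictable transfer rates the abstract describes.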

Ashraf Aboulnaga - One of the best experts on this subject based on the ideXlab platform.

  • SIGMOD Conference - Schema clustering and retrieval for multi-domain pay-as-you-go Data Integration Systems
    Proceedings of the 2010 international conference on Management of data - SIGMOD '10, 2010
    Co-Authors: Hatem A. Mahmoud, Ashraf Aboulnaga
    Abstract:

    A data integration system offers a single interface to multiple structured data sources. Many application contexts (e.g., searching structured data on the web) involve the integration of large numbers of structured data sources. At web scale, it is impractical to use manual or semi-automatic data integration methods, so a pay-as-you-go approach is more appropriate. A pay-as-you-go approach entails using a fully automatic approximate data integration technique to provide an initial data integration system (i.e., an initial mediated schema, and initial mappings from source schemas to the mediated schema), and then refining the system as it gets used. Previous research has investigated automatic approximate data integration techniques, but all existing techniques require the schemas being integrated to belong to the same conceptual domain. At web scale, it is impractical to classify schemas into domains manually or semi-automatically, which limits the applicability of these techniques. In this paper, we present an approach for clustering schemas into domains without any human intervention and based only on the names of attributes in the schemas. Our clustering approach deals with uncertainty in assigning schemas to domains using a probabilistic model. We also propose a query classifier that determines, for a given keyword query, the most relevant domains to this query. We experimentally demonstrate the effectiveness of our schema clustering and query classification techniques.
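
As a rough illustration of clustering schemas using only attribute names, the sketch below groups schemas whose attribute-name sets overlap. It deliberately swaps in a much simpler technique than the one in the paper: Jaccard similarity with a greedy, threshold-based assignment, and no probabilistic model. The schema names, attribute sets, and the threshold value are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity between two sets of attribute names."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_schemas(
    schemas: Dict[str, FrozenSet[str]], threshold: float
) -> List[List[str]]:
    """Greedily assign each schema to the first cluster that contains a
    sufficiently similar member; otherwise start a new cluster."""
    clusters: List[List[str]] = []
    for name in sorted(schemas):  # sorted for a deterministic result
        for cluster in clusters:
            if any(
                jaccard(schemas[name], schemas[member]) >= threshold
                for member in cluster
            ):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical schemas described only by their attribute names.
example = {
    "books_a": frozenset({"title", "author", "isbn"}),
    "books_b": frozenset({"title", "author", "publisher"}),
    "cars_a": frozenset({"make", "model", "year"}),
    "cars_b": frozenset({"make", "model", "price"}),
}
domains = cluster_schemas(example, threshold=0.4)
```

A hard threshold forces each schema into exactly one cluster, whereas the probabilistic model described in the abstract is meant to represent uncertainty about which domain a schema belongs to.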

  • ICDE - μBE: User Guided Source Selection and Schema Mediation for Internet Scale Data Integration
    2007 IEEE 23rd International Conference on Data Engineering, 2007
    Co-Authors: Ashraf Aboulnaga, Kareem El Gebaly, Daniel Wong
    Abstract:

    The typical approach to data integration is to start by defining a common mediated schema, and then to map the data sources being integrated to this schema. In Internet-scale data integration tasks, where there may be hundreds or thousands of data sources providing data of relevance to a particular domain, a better approach is to allow the user to discover the mediated schema and the set of sources to use through an iterative exploration of the space of possible schemas and sources. In this paper, we present μBE, a data integration tool that helps in this iterative exploratory process by automatically choosing the data sources to include in a data integration system and defining a mediated schema on these sources. The data integration system desired by the user may depend on several subjective and objective criteria, and the user guides μBE towards finding this system by iteratively solving a series of constrained non-linear optimization problems, and modifying the parameters and constraints of the problem in the next iteration based on the solution found in the current iteration. Our formulation of the optimization problem is designed to make it easy for the user to provide such feedback. A simple, intuitive user interface helps the user in this process. We experimentally demonstrate that μBE is efficient and finds high-quality data integration solutions.