Referential Integrity


The Experts below are selected from a list of 1569 Experts worldwide ranked by ideXlab platform

Sebastian Link - One of the best experts on this subject based on the ideXlab platform.

  • Referential Integrity under uncertain data
    Conference on Advanced Information Systems Engineering, 2021
    Co-Authors: Sebastian Link, Ziheng Wei
    Abstract:

    Together with domain and entity Integrity, Referential Integrity embodies the Integrity principles of information systems. While relational databases address applications for data that is certain, modern applications require the handling of uncertain data. In particular, the veracity of big data and the complex integration of data from heterogeneous sources leave Referential Integrity vulnerable. We apply possibility theory to introduce the class of possibilistic inclusion dependencies. We show that our class inherits good computational properties from relational inclusion dependencies. In particular, we show that the associated implication problem is PSPACE-complete, but fixed-parameter tractable in the input arity. Combined with possibilistic keys and functional dependencies, our framework makes it possible to quantify the degree of trust in entities and relationships.
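One way to picture the layered semantics of possibilistic inclusion dependencies is the following Python sketch. This is an illustrative reading, not the authors' formalism: tuples carry a possibility degree from a small finite scale, and the classical inclusion dependency is tested on each "at least degree d" slice of the data; the function name and the three-level scale are assumptions.

```python
def pid_violation_degree(referencing, referenced, scale=(3, 2, 1)):
    """Return the highest possibility level at which the inclusion dependency
    fails, or 0 if it holds at every level.

    referencing, referenced: iterables of (value, degree) pairs, where degree
    comes from a finite scale listed from most to least possible."""
    for d in scale:
        # Restrict both relations to tuples that are possible to degree >= d,
        # then test the classical (relational) inclusion dependency there.
        lhs = {val for val, deg in referencing if deg >= d}
        rhs = {val for val, deg in referenced if deg >= d}
        if not lhs <= rhs:
            return d  # inclusion already fails among the most-possible tuples
    return 0
```

Under this reading, a violation among highly possible tuples is more severe than one among barely possible tuples, which is how the framework quantifies trust in relationships.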

  • entity Integrity Referential Integrity and query optimization with embedded uniqueness constraints
    International Conference on Data Engineering, 2019
    Co-Authors: Ziheng Wei, Uwe Leck, Sebastian Link
    Abstract:

    Embedded uniqueness constraints represent unique column combinations embedded in complete fragments of incomplete data. In contrast to SQL UNIQUE constraints, they offer a principled separation of completeness and uniqueness requirements and are capable of exploiting more resource-conscious index structures. The latter help relational database systems to be more efficient in enforcing entity and Referential Integrity, and in evaluating common types of queries.
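One plausible reading of an embedded uniqueness constraint, sketched in Python (the names and the dict-based row encoding are assumptions, not the paper's notation): uniqueness is only required on the rows that are complete on the embedded columns.

```python
def satisfies_euc(rows, embedding, unique_cols):
    """Check an embedded uniqueness constraint: within the fragment of rows
    that are complete (non-null) on the embedding columns, the unique_cols
    combination must contain no duplicates."""
    fragment = [r for r in rows if all(r[c] is not None for c in embedding)]
    seen = set()
    for r in fragment:
        key = tuple(r[c] for c in unique_cols)
        if key in seen:
            return False  # duplicate inside the complete fragment
        seen.add(key)
    return True
```

Rows with a null on any embedded column fall outside the fragment, so they never trigger a violation; this is the principled separation of completeness and uniqueness requirements the abstract refers to.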

  • index design for enforcing partial Referential Integrity efficiently
    Extending Database Technology, 2015
    Co-Authors: Mozhgan Memari, Sebastian Link
    Abstract:

    Referential Integrity is fundamental for data processing and data quality. The SQL standard proposes different semantics under which Referential Integrity can be enforced in practice. Under simple semantics, only total foreign key values must be matched by some referenced key values. Under partial semantics, total and partial foreign key values must be matched by some referenced key values. Support for simple semantics is extensive and widespread across different database management systems but, surprisingly, partial semantics does not enjoy any native support in any known system. Previous research has left open the questions whether partial Referential Integrity is useful for any real-world applications and whether it can enjoy efficient support at the systems level. As our first contribution we show that efficient support for partial Referential Integrity can provide database users with intelligent query and update services. Indeed, we regard partial semantics as an effective imputation technique for missing data in query answers and update operations, which increases the quality of these services. As our second contribution we show how partial Referential Integrity can be enforced efficiently for real-world foreign keys. For that purpose we propose triggers and exploit different index structures. Our experiments with synthetic and benchmark data sets confirm that our index structures not only boost the performance of the state-of-the-art recommendation for enforcing partial semantics in real-world foreign keys, but also show trends that are similar to enforcing simple semantics.
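The contrast between simple and partial semantics can be sketched directly from the definitions above (a minimal Python illustration; the function name is hypothetical):

```python
def fk_satisfied(fk, ref_keys, semantics="simple"):
    """fk: a foreign key tuple, possibly containing None (SQL null).
    ref_keys: set of total referenced key tuples."""
    if all(v is None for v in fk):
        return True                      # fully null foreign keys pass either way
    if semantics == "simple":
        if any(v is None for v in fk):
            return True                  # partial values are not checked at all
        return fk in ref_keys            # total values need an exact match
    # partial semantics: some referenced key must agree on every non-null component
    return any(all(f is None or f == k for f, k in zip(fk, key))
               for key in ref_keys)
```

A partially null foreign key like `("a", None)` always passes under simple semantics, but under partial semantics it passes only if some referenced key has `"a"` in its first component, which is why partial semantics yields cleaner data at a higher checking cost.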

  • static analysis of partial Referential Integrity for better quality sql data
    Americas Conference on Information Systems, 2013
    Co-Authors: Sebastian Link, Mozhgan Memari
    Abstract:

    Referential Integrity ensures the consistency of data between database relations. The SQL standard proposes different semantics to deal with partial information under Referential Integrity. Simple semantics neglects tuples with nulls, and enjoys built-in support by commercial database systems. Partial semantics does check tuples with nulls, but does not enjoy built-in support. We investigate this mismatch between the SQL standard and real database systems. Indeed, insight is gained into the trade-off between cleaner data under partial semantics and the efficiency of checking simple semantics. The cost for Referential Integrity checking is evaluated for various dataset sizes, indexing structures and degrees of cleanliness. While the cost of partial semantics exceeds that of simple semantics, their performance trends follow similar patterns under growing database sizes. Applying multiple index structures and exploiting appropriate validation mechanisms increase the efficiency of checking partial semantics.

  • ER - Collection type constructors in entity-relationship modeling
    Conceptual Modeling - ER 2007, 2007
    Co-Authors: Sven Hartmann, Sebastian Link
    Abstract:

    Collections play an important part in everyday life. Therefore, conceptual data models should support collection types to make data modeling as natural as possible for their users. Based on the fundamental properties of whether a collection endorses order and multiplicity of its elements, we introduce the collection types of rankings, lists, sets and bags into the framework of Entity-Relationship modeling. This provides users with easy-to-use constructors that naturally model different kinds of collections. Moreover, we propose a transformation of extended ER schemata into relational database schemata. The transformation is intuitive and invertible, introducing surrogate attributes that preserve the semantics of the collection. Furthermore, it is a proper extension of previous transformations, and results in a relational database schema that is in Inclusion Dependency Normal Form. In addition, we introduce a uniqueness constraint that identifies collections uniquely and guarantees Referential Integrity at the same time.

Carlos Ordonez - One of the best experts on this subject based on the ideXlab platform.

  • Evaluating Join Performance on Relational Database Systems
    2016
    Co-Authors: Carlos Ordonez, Javier García-garcía
    Abstract:

    The join operator is fundamental in relational database systems. Evaluating join queries on large tables is challenging because records need to be efficiently matched based on a given key. In this work, we analyze join queries in SQL with large tables in which a foreign key may be null, invalid or valid, given a Referential Integrity constraint. We conduct an extensive join performance evaluation on three DBMSs. Specifically, we study join queries varying table sizes, row size and key probabilistic distribution, inserting null, invalid or valid foreign key values. We also benchmark three well-known query optimizations: view materialization, secondary index and join reordering. Our experiments show certain optimizations perform well across DBMSs, whereas other optimizations depend on the DBMS architecture. General Terms: query optimization; performance evaluation

  • repairing olap queries in databases with Referential Integrity errors
    Data Warehousing and OLAP, 2010
    Co-Authors: Javier Garciagarcia, Carlos Ordonez
    Abstract:

    Many database applications and OLAP tools dynamically generate SQL queries involving join operators and aggregate functions and send these queries to a database server for execution. This dynamically generated SQL code normally assumes the underlying tables and columns are clean, and lacks the necessary robustness to deal with foreign keys having null, invalid or undefined values, which are ubiquitous in databases with inconsistent or incomplete content. The outcome is that several issues arise at query time, mostly as inconsistencies in answer sets that are difficult for users of OLAP tools to detect and explain. In this article, we present an automated query rewriting method for automatically generated OLAP queries that are executed over tables with foreign key columns having potentially null or invalid values. Our method is applicable to queries that use join operators and aggregate functions obeying the summarizability property (e.g. sum(), count()). If the user of an OLAP tool requests it, our method rewrites the queries that use join operators, warns the user about the Referential Integrity condition of the underlying database, and, when aggregate functions are involved, presents alternative consistent results in the answer sets. Preliminary experimental evaluation shows that rewritten queries provide valuable information on Referential Integrity and take almost the same time as the original queries, indicating that efficiency is good and overhead is minimal.
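A minimal illustration of this kind of rewriting, using SQLite (the paper's actual method and queries are more elaborate; this sketch only shows how an outer join with COALESCE surfaces rows that an inner join silently drops):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim(dk INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact(fk INTEGER, amount REAL);
INSERT INTO dim VALUES (1,'A'),(2,'B');
-- one NULL and one dangling (invalid) foreign key among the facts
INSERT INTO fact VALUES (1,10),(2,20),(NULL,5),(99,7);
""")

# The original inner-join aggregation silently drops the NULL/invalid rows.
inner = con.execute(
    "SELECT d.name, SUM(f.amount) FROM fact f JOIN dim d ON f.fk = d.dk "
    "GROUP BY d.name ORDER BY d.name").fetchall()

# The rewritten query keeps every fact row and reports the unmatched total
# separately, making the Referential Integrity problem visible to the user.
rewritten = con.execute(
    "SELECT COALESCE(d.name, '<unmatched>') AS g, SUM(f.amount) "
    "FROM fact f LEFT JOIN dim d ON f.fk = d.dk "
    "GROUP BY g ORDER BY g").fetchall()
```

Here `inner` misses 12 units of `amount` with no indication anything was dropped, while `rewritten` exposes them under an explicit `<unmatched>` group.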

  • Extended aggregations for databases with Referential Integrity issues
    Data & Knowledge Engineering, 2010
    Co-Authors: Javier García-garcía, Carlos Ordonez
    Abstract:

    Querying inconsistent databases remains a broad and difficult problem. In this work, we study how to improve aggregations computed on databases with Referential errors in the context of database integration, where each source database has different tables, columns with similar content across multiple databases, but different Referential Integrity constraints. Thus, a query in an integrated database may involve tables and columns with Referential Integrity errors. In a data warehouse, even though the ETL processes fix Referential Integrity errors, this is generally done by inserting ''dummy'' records into the dimension tables corresponding to such invalid foreign keys, thereby artificially enforcing Referential Integrity. When two tables are joined and aggregations are computed, rows with an invalid or null foreign key value are skipped, effectively eliminating potentially valuable information. With that motivation in mind, we extend SQL aggregate functions computed over tables with Referential Integrity issues to return complete answer sets in the sense that no row is excluded. We associate with each referenced key in the dimension table a probability that invalid or null foreign keys refer to it. Our main idea is to compute aggregations over joined tables including rows with invalid or null references by distributing their contribution to aggregation totals, based on probabilities computed over correct foreign keys. Experiments with real and synthetic databases evaluate the usefulness, accuracy and performance of our extended aggregations.
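The distribution idea can be sketched as follows, assuming for simplicity that each referenced key's probability is estimated from the frequencies of the correct foreign keys (a single-column sketch; the paper's estimator may differ):

```python
from collections import defaultdict

def distributed_sums(fact_rows, valid_keys):
    """fact_rows: (fk, measure) pairs; valid_keys: set of referenced keys.
    Returns per-key SUM totals with the measure of null/invalid rows spread
    across keys in proportion to the observed frequency of correct keys."""
    totals, counts = defaultdict(float), defaultdict(int)
    bad = 0.0
    for fk, m in fact_rows:
        if fk in valid_keys:
            totals[fk] += m
            counts[fk] += 1
        else:
            bad += m                      # null or dangling reference
    n = sum(counts.values())
    # distribute the unmatched total using the empirical key distribution
    return {k: totals[k] + bad * counts[k] / n for k in totals}
```

No row is excluded: the 8 units carried by the null reference in the test below are split 2:1 between keys 1 and 2, matching how often each key occurs among correct foreign keys.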

  • a Referential Integrity browser for distributed databases
    International Workshop on the Web and Databases, 2009
    Co-Authors: Carlos Ordonez, Javier Garciagarcia, Rogelio Monterocampos, Carlos Garciaalvarado
    Abstract:

    We demonstrate a program that can inspect a distributed relational database on the Internet to discover and quantify Referential Integrity issues for integration purposes. The program computes data quality metrics for Referential Integrity at four granularity levels: database, table, column and value, going from a global to a detailed view, exhibiting specific evidence about Referential errors. Two orthogonal data quality dimensions are considered: completeness and consistency. Each table is stored at one primary site and it can be replicated at multiple sites, having foreign key references to tables at the same site or at different sites. The user can choose alternative query evaluation strategies to efficiently compute Referential error metrics. Our proposal can be used in data integration, data warehousing and data quality assurance.
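At the column granularity, the two quality dimensions might be computed along these lines (an assumed formulation for illustration, not the tool's exact metrics):

```python
def ri_metrics(fks, ref_keys):
    """Column-level Referential Integrity metrics for a foreign key column.
    completeness: share of foreign key values that are non-null.
    consistency: share of non-null values that reference an existing key."""
    non_null = [v for v in fks if v is not None]
    completeness = len(non_null) / len(fks) if fks else 1.0
    consistency = (sum(v in ref_keys for v in non_null) / len(non_null)
                   if non_null else 1.0)
    return completeness, consistency
```

Aggregating such ratios over all foreign key columns of a table, and over all tables of a database, would yield the coarser table- and database-level views the demo describes.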

  • estimating and bounding aggregations in databases with Referential Integrity errors
    Data Warehousing and OLAP, 2008
    Co-Authors: Javier Garciagarcia, Carlos Ordonez
    Abstract:

    Database integration builds on tables coming from multiple databases by creating a single view of all these data. Each database has different tables, columns with similar content across databases and different Referential Integrity constraints. Thus, a query in an integrated database is likely to involve tables and columns with Referential Integrity errors. In a data warehouse environment, even though the ETL processes take care of the Referential Integrity errors, in many scenarios this is generally done by including 'dummy' records in the dimension tables used to relate to the fact tables with Referential errors. When two tables are joined and aggregations are computed, the tuples with an undefined foreign key value are aggregated in a group marked as undefined, effectively discarding potentially valuable information. With that motivation in mind, we extend aggregate functions computed over tables with Referential Integrity errors on OLAP databases to return complete answer sets in the sense that no tuple is excluded. We associate with each valid reference the probability that an invalid reference actually stands for it. The main idea of our work is that, in certain contexts, tuples with invalid references can still be used by taking this probability into account. This way, improved answer sets are obtained from aggregate queries in settings where a database violates Referential Integrity constraints.

Doo Kwon Baik - One of the best experts on this subject based on the ideXlab platform.

  • a translation algorithm for effective rdb to xml schema conversion considering Referential Integrity information
    Journal of Information Science and Engineering, 2009
    Co-Authors: Jinhyung Kim, Dongwon Jeong, Doo Kwon Baik
    Abstract:

    In this paper, we propose a new relational schema (R-schema) to XML Schema translation algorithm that analyzes the cardinality between data values and patterns of user queries to resolve the implicit Referential Integrity issue. Many translation methods have been developed taking into account structural and/or semantic aspects. However, earlier methods have considered only the explicit Referential Integrity specified by the R-schema during translation, or reflect the implicit Referential Integrity only partially. This causes several problems, such as incorrect translations, abnormal relational model transitions, and so on. In addition, many conventional translation algorithms support an XML document type declaration (DTD) as the final translation result. However, a DTD is insufficient to represent the information of the R-schema exactly. The VQT algorithm analyzes the value cardinality and user query patterns, and extracts the implicit Referential integrities by using the cardinality property of foreign key constraints between columns and the equi-join characteristic in user queries. The VQT algorithm can apply the extracted implicit Referential Integrity relation information to the R-schema and create an XML Schema as the final result. Therefore, the VQT algorithm prevents the R-schema from being incorrectly converted into the XML Schema, and it richly and powerfully represents all the information in the R-schema by creating an XML Schema, instead of an XML DTD, as the translation result.
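One reason to target XML Schema rather than DTD is that XML Schema can express Referential Integrity directly via xs:key/xs:keyref identity constraints. A hypothetical helper illustrating that mapping for a single-column foreign key (not the VQT algorithm itself; names are made up):

```python
def fk_to_keyref(ref_table, ref_col, fk_table, fk_col):
    """Render a relational foreign key fk_table(fk_col) -> ref_table(ref_col)
    as an XML Schema key/keyref pair, which a DTD cannot express."""
    return (
        f'<xs:key name="{ref_table}PK">\n'
        f'  <xs:selector xpath="{ref_table}"/><xs:field xpath="{ref_col}"/>\n'
        f'</xs:key>\n'
        f'<xs:keyref name="{fk_table}FK" refer="{ref_table}PK">\n'
        f'  <xs:selector xpath="{fk_table}"/><xs:field xpath="{fk_col}"/>\n'
        f'</xs:keyref>'
    )
```

Any implicit Referential Integrity relation the algorithm extracts could, in principle, be emitted through the same mechanism, so nothing is lost in the translated schema.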

  • VQT: value cardinality and query pattern based R-schema to XML schema translation with implicit Referential Integrity
    Journal of Zhejiang University-SCIENCE A, 2008
    Co-Authors: Jinhyung Kim, Dongwon Jeong, Doo Kwon Baik
    Abstract:

    In this paper, we propose a new relational schema (R-schema) to XML schema translation algorithm, VQT, which analyzes the value cardinality and user query patterns and extracts the implicit Referential integrities by using the cardinality property of foreign key constraints between columns and the equi-join characteristic in user queries. The VQT algorithm can apply the extracted implicit Referential Integrity relation information to the R-schema and create an XML schema as the final result. Therefore, the VQT algorithm prevents the R-schema from being incorrectly converted into the XML schema, and it richly and powerfully represents all the information in the R-schema by creating an XML schema, instead of an XML DTD, as the translation result.

  • an algorithm for extracting Referential Integrity relations using similarity during rdb to xml translation
    Computational Intelligence and Security, 2007
    Co-Authors: Jangwon Kim, Jinhyung Kim, Dongwon Jeong, Doo Kwon Baik
    Abstract:

    XML is rapidly becoming a standard technology for information exchange and representation. This raises many research issues, such as semantic modeling methods, conversion for interoperability with other models, and so on. The most important issue in practice is how to achieve interoperability between the XML model and the relational database model. Until now, many methods have been proposed to achieve this, yet several problems remain. Above all, existing methods do not consider implicit Referential Integrity relations, which causes loss of information. This paper proposes an algorithm for extracting Referential Integrity relations during RDB to XML translation. The key point of our method is how to find implicit Referential Integrity relations among columns that have different names but represent the same semantics. To resolve this, we define an enhanced extraction algorithm based on a widely used ontology, WordNet. The proposed algorithm reduces the time needed to compare columns across RDB tables and prevents loss of information.
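A toy version of the similarity step, with difflib and a small synonym table standing in for the WordNet lookup (everything here is an illustrative assumption; the paper's algorithm and thresholds differ):

```python
from difflib import SequenceMatcher

# Tiny synonym table standing in for WordNet (assumption: the real method
# queries WordNet; this dictionary exists only for illustration).
SYNONYMS = {"emp_id": {"employee_id"}, "ssn": {"social_security_number"}}

def column_similarity(a, b):
    a, b = a.lower(), b.lower()
    if b in SYNONYMS.get(a, ()) or a in SYNONYMS.get(b, ()):
        return 1.0                      # known synonyms count as a full match
    return SequenceMatcher(None, a, b).ratio()

def candidate_fk_pairs(cols_r, cols_s, threshold=0.8):
    """Column pairs whose names are similar enough to suggest an implicit
    Referential Integrity relation, to be verified against the actual data."""
    return [(c, d) for c in cols_r for d in cols_s
            if column_similarity(c, d) >= threshold]
```

Name similarity alone only proposes candidates; the extracted pairs would still need to be confirmed (e.g. by checking value inclusion between the columns) before being treated as Referential Integrity relations.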

  • an algorithm for Referential Integrity relations extraction using similarity comparison of rdb
    Journal of the Korea Society for Simulation, 2006
    Co-Authors: Jangwon Kim, Jinhyung Kim, Dongwon Jeong, Doo Kwon Baik
    Abstract:

    XML is rapidly becoming a standard technology for information exchange and representation. This raises many research issues, such as semantic modeling methods, security, conversion for interoperability with other models, and so on. The most important issue for its practical application is how to achieve interoperability between the XML model and the relational model. Until now, many approaches have been proposed to achieve this, yet several problems remain. Most of all, the existing methods do not consider implicit Referential Integrity relations, which causes incorrect data delivery. One method has been proposed with the restriction that one semantic concept is defined by only one name in a given database. In real databases, this restriction limits applicability and extensibility. This paper proposes a novel RDB-to-XML conversion algorithm based on a similarity checking technique. The key point of our method is how to find implicit Referential Integrity relations between different field names that represent the same semantics. To resolve this, we define an enhanced implicit Referential Integrity relation extraction algorithm based on a widely used ontology, WordNet. The proposed conversion algorithm is more practical than the previous similar approach.

  • an algorithm for translation from rdb schema model to xml schema model considering implicit Referential Integrity
    Journal of KIISE:Databases, 2006
    Co-Authors: Jinhyung Kim, Dongwon Jeong, Doo Kwon Baik
    Abstract:

    The most representative approach for efficient storing of XML data is to store XML data in relational databases. The merit of this approach is that it easily accommodates the realistic situation that most data are still stored in relational databases. This approach needs to convert XML data into relational data or relational data into XML data. The most important issue in the translation is to reflect the structural and semantic relations of the RDB in the XML schema model exactly. Many studies have been done to resolve this issue, but those methods have several problems: they do not cover structural semantics, or they support only explicit Referential Integrity relations. In this paper, we propose an algorithm for extracting implicit Referential integrities automatically. We also design and implement the suggested algorithm, and execute comparative evaluations using translated XML documents. The proposed algorithm provides several benefits, such as improving semantic information extraction and conversion, securing sufficient Referential Integrity of the target databases, and so on. By using the suggested algorithm, we can guarantee not only the explicit Referential integrities but also the implicit Referential integrities of the initial relational schema model completely. That is, we can create a more exact XML schema model through the suggested algorithm.

Mozhgan Memari - One of the best experts on this subject based on the ideXlab platform.

  • index design for enforcing partial Referential Integrity efficiently
    Extending Database Technology, 2015
    Co-Authors: Mozhgan Memari, Sebastian Link
    Abstract:

    Referential Integrity is fundamental for data processing and data quality. The SQL standard proposes different semantics under which Referential Integrity can be enforced in practice. Under simple semantics, only total foreign key values must be matched by some referenced key values. Under partial semantics, total and partial foreign key values must be matched by some referenced key values. Support for simple semantics is extensive and widespread across different database management systems but, surprisingly, partial semantics does not enjoy any native support in any known system. Previous research has left open the questions whether partial Referential Integrity is useful for any real-world applications and whether it can enjoy efficient support at the systems level. As our first contribution we show that efficient support for partial Referential Integrity can provide database users with intelligent query and update services. Indeed, we regard partial semantics as an effective imputation technique for missing data in query answers and update operations, which increases the quality of these services. As our second contribution we show how partial Referential Integrity can be enforced efficiently for real-world foreign keys. For that purpose we propose triggers and exploit different index structures. Our experiments with synthetic and benchmark data sets confirm that our index structures not only boost the performance of the state-of-the-art recommendation for enforcing partial semantics in real-world foreign keys, but also show trends that are similar to enforcing simple semantics.

  • static analysis of partial Referential Integrity for better quality sql data
    Americas Conference on Information Systems, 2013
    Co-Authors: Sebastian Link, Mozhgan Memari
    Abstract:

    Referential Integrity ensures the consistency of data between database relations. The SQL standard proposes different semantics to deal with partial information under Referential Integrity. Simple semantics neglects tuples with nulls, and enjoys built-in support by commercial database systems. Partial semantics does check tuples with nulls, but does not enjoy built-in support. We investigate this mismatch between the SQL standard and real database systems. Indeed, insight is gained into the trade-off between cleaner data under partial semantics and the efficiency of checking simple semantics. The cost for Referential Integrity checking is evaluated for various dataset sizes, indexing structures and degrees of cleanliness. While the cost of partial semantics exceeds that of simple semantics, their performance trends follow similar patterns under growing database sizes. Applying multiple index structures and exploiting appropriate validation mechanisms increase the efficiency of checking partial semantics.

Jixue Liu - One of the best experts on this subject based on the ideXlab platform.

  • preserving Referential Integrity constraints in xml data transformation
    International Conference on Database Theory, 2009
    Co-Authors: Md Sumon Shahriar, Jixue Liu
    Abstract:

    We study the transformation and preservation of XML Referential Integrity constraints in XML data transformation for integration purposes in this paper. In transformation and preservation, we consider XML inclusion dependency and XML foreign key. We show how XML Referential constraints should be transformed and preserved using important transformation operations with sufficient conditions.

  • checking satisfactions of xml Referential Integrity constraints
    Active Media Technology, 2009
    Co-Authors: Md Sumon Shahriar, Jixue Liu
    Abstract:

    Recently we proposed Referential Integrity constraints for XML. In defining two important Referential constraints, namely the XML inclusion dependency and the XML foreign key, we considered an ordered XML data model to capture the correct semantics of data when tuples are produced. In this paper, we report on the performance of checking both the XML inclusion dependency and the XML foreign key. We show that both constraints can be checked in time linear in the number of tuples and the number of paths.
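The linear-time flavor of such a check can be illustrated for a simplified, single-path XML inclusion dependency (a sketch, not the authors' tuple-based definition): collect the value set at each path in one pass, then test set inclusion.

```python
import xml.etree.ElementTree as ET

def xid_holds(doc, lhs_path, rhs_path):
    """Simplified XML inclusion dependency: every text value found at
    lhs_path must also occur at rhs_path. One pass per path, so the check
    is linear in the document size."""
    tree = ET.fromstring(doc)
    lhs = {e.text for e in tree.findall(lhs_path)}
    rhs = {e.text for e in tree.findall(rhs_path)}
    return lhs <= rhs

doc = """<db>
  <order><cust>c1</cust></order>
  <order><cust>c2</cust></order>
  <customer><id>c1</id></customer>
  <customer><id>c2</id></customer>
  <customer><id>c3</id></customer>
</db>"""
```

Here every order's `cust` value occurs among the customer `id` values, so the dependency holds in the referencing direction but not in the reverse one.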

  • Towards a Definition of Referential Integrity Constraints for XML
    2009
    Co-Authors: Jixue Liu
    Abstract:

    In the relational data model, two important Referential Integrity constraints are the inclusion dependency (ID) and the foreign key (FK). In the last decade, with the growing use of XML as a data representation and exchange format over the web, the issue of Integrity constraints in XML has gained great importance in the database community. In this paper, we propose the XML Inclusion Dependency (XID) and the XML Foreign Key (XFK). We show how both XID and XFK can be defined over a Document Type Definition (DTD) and satisfied by XML documents. We introduce a novel concept, the tuple, which produces semantically correct values in XML documents when satisfaction is checked. We also show that an XFK is defined as the combination of an XID and an XML Key.

  • on defining Referential Integrity for xml
    Computer Science and its Applications, 2008
    Co-Authors: Md Sumon Shahriar, Jixue Liu
    Abstract:

    Referential Integrity is one of the Integrity constraints of any data model. In the relational data model, the inclusion dependency (ID) and the foreign key (FK) are well studied and widely used. In the last decade, with the growing use of XML as a data representation and exchange format over the web, the issue of Integrity constraints in XML has gained great importance in the database community. In this paper, we propose the XML inclusion dependency (XID) and the XML foreign key (XFK). We show how both XID and XFK can be defined over a Document Type Definition (DTD) and satisfied by XML documents. We introduce a novel concept, the tuple, which produces semantically correct values in XML documents when satisfaction is checked. We also show that an XFK is defined as the combination of an XID and an XML key.