Database Schema

The Experts below are selected from a list of 25,434 Experts worldwide, ranked by the ideXlab platform.

Anthony Cleve - One of the best experts on this subject based on the ideXlab platform.

  • adapting queries to Database Schema changes in hybrid polystores
    Source Code Analysis and Manipulation, 2020
    Co-Authors: Jerome Fink, Maxime Gobert, Anthony Cleve
    Abstract:

    Database Schema change has long been recognized as a complex, time-consuming and risky process. It requires not only the modification of Database structures and contents, but also the joint evolution of related application programs. This coevolution process mainly consists of converting Database queries expressed on the source Database Schema into equivalent queries expressed on the target Database Schema. Several approaches, techniques and tools have been proposed to address this problem by considering software systems relying on a single Database. In this paper, we propose an automated approach to query adaptation for Schema changes in hybrid polystores, i.e., data-intensive systems relying on several, possibly heterogeneous, Databases. The proposed approach takes advantage of a conceptual modeling language for representing the polystore Schema, and considers a generic query language for expressing queries on top of this Schema. Given a source polystore Schema, a set of input queries and a list of Schema change operators, our approach (1) identifies those input queries that cannot be transformed into equivalent queries expressed on the target Schema, (2) automatically transforms those input queries that can be adapted to the target Schema, and (3) generates warnings for those output queries requiring further manual inspection.
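
    The three outcomes listed above (adapt, reject, warn) can be pictured with a minimal sketch of how a single Schema change operator might be folded into a set of queries. This is an illustration only, not the paper's tool: the operator encoding, the Query class and the attribute names are assumptions.

    ```python
    from dataclasses import dataclass
    from typing import List

    # Toy stand-in for a generic query over the conceptual polystore schema:
    # just the entity it targets and the attributes it reads.
    @dataclass
    class Query:
        entity: str
        attributes: List[str]

    def adapt_queries(queries, change):
        """Apply one schema change operator; return (adapted, rejected, warnings)."""
        adapted, rejected, warnings = [], [], []
        for q in queries:
            if change["op"] == "rename_attribute" and q.entity == change["entity"]:
                attrs = [change["new"] if a == change["old"] else a for a in q.attributes]
                adapted.append(Query(q.entity, attrs))
            elif (change["op"] == "drop_attribute" and q.entity == change["entity"]
                  and change["attribute"] in q.attributes):
                rejected.append(q)  # no equivalent query exists on the target schema
                warnings.append(f"{q.entity}.{change['attribute']} was dropped; inspect manually")
            else:
                adapted.append(q)
        return adapted, rejected, warnings

    queries = [Query("Customer", ["name", "zip"]), Query("Order", ["total"])]
    change = {"op": "rename_attribute", "entity": "Customer", "old": "zip", "new": "postal_code"}
    print(adapt_queries(queries, change))
    ```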

  • zero downtime sql Database Schema evolution for continuous deployment
    International Conference on Software Engineering, 2017
    Co-Authors: Michael De Jong, Arie Van Deursen, Anthony Cleve
    Abstract:

    When a web service or application evolves, its Database Schema --- tables, constraints, and indices --- often needs to evolve along with it. Depending on the Database, some of these changes require a full table lock, preventing the service from accessing the tables under change. To deal with this, web services are typically taken offline momentarily to modify the Database Schema. However, with the introduction of concepts like Continuous Deployment, web services are deployed into their production environments every time the source code is modified. Having to take the service offline --- potentially several times a day --- to perform Schema changes is undesirable. In this paper we introduce QuantumDB --- a tool-supported approach that abstracts this evolution process away from the web service without locking tables. This allows us to redeploy a web service without needing to take it offline, even when a Database Schema change is necessary. In addition, QuantumDB puts no restrictions on the method of deployment, supports Schema changes to multiple tables using changesets, and does not subvert foreign key constraints during the evolution process. We evaluate QuantumDB by applying 19 synthetic and 95 industrial evolution scenarios to our open source implementation of QuantumDB. These experiments demonstrate that QuantumDB realizes zero-downtime migrations at the cost of acceptable overhead, and is applicable in industrial continuous deployment contexts.
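
    The underlying expand/contract pattern --- create the target table next to the live one, migrate the rows, then switch over --- can be sketched in plain SQL. This is a generic illustration of the pattern, not QuantumDB's actual mechanism; the users table, the name split and the use of SQLite are assumptions, and the synchronization triggers a real tool would install while old and new application versions run side by side are omitted.

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT);
        INSERT INTO users VALUES (1, 'Ada Lovelace');

        -- expand: the target schema splits full_name into two columns
        CREATE TABLE users_v2 (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);

        -- migrate existing rows (a real tool does this in the background)
        INSERT INTO users_v2 (id, first_name, last_name)
        SELECT id,
               substr(full_name, 1, instr(full_name, ' ') - 1),
               substr(full_name, instr(full_name, ' ') + 1)
        FROM users;

        -- contract: once no client uses the old schema, drop it and rename
        DROP TABLE users;
        ALTER TABLE users_v2 RENAME TO users;
    """)
    print(conn.execute("SELECT * FROM users").fetchall())  # [(1, 'Ada', 'Lovelace')]
    ```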

  • detecting and preventing program inconsistencies under Database Schema evolution
    2016 IEEE International Conference on Software Quality Reliability and Security (QRS), 2016
    Co-Authors: Loup Meurice, Csaba Nagy, Anthony Cleve
    Abstract:

    Nowadays, data-intensive applications tend to access their underlying Database in an increasingly dynamic way. The queries that they send to the Database server are usually built at runtime, through String concatenation or Object-Relational-Mapping (ORM) frameworks. This level of dynamicity significantly complicates the task of adapting application programs to Database Schema changes. Failing to correctly adapt programs to an evolving Database Schema results in program inconsistencies, which in turn may cause program failures. In this paper, we present a tool-supported approach that allows developers to (1) analyze how the source code and Database Schema co-evolved in the past and (2) simulate a Database Schema change and automatically determine the set of source code locations that would be impacted by this change. Developers are then provided with recommendations about what they should modify at those source code locations in order to avoid inconsistencies. The approach has been designed to deal with Java systems that use dynamic data access frameworks such as JDBC, Hibernate and JPA. We motivate and evaluate the proposed approach based on three real-life systems of different size and nature.
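
    Step (2) --- simulating a Schema change and locating the source code it would impact --- can be approximated by a deliberately rough sketch that only greps Java sources for a table/column pair. The actual approach resolves concatenated query strings and ORM mappings; the src/main/java path and the customers.zip_code example are made up.

    ```python
    import re
    from pathlib import Path

    def impacted_locations(src_root, table, column):
        """List Java source lines mentioning the given table/column pair."""
        pattern = re.compile(rf"\b{table}\b.*\b{column}\b|\b{column}\b.*\b{table}\b",
                             re.IGNORECASE)
        hits = []
        for path in Path(src_root).rglob("*.java"):
            for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
                if pattern.search(line):
                    hits.append((str(path), lineno, line.strip()))
        return hits

    # Simulate renaming customers.zip_code and list code locations to revisit.
    for path, lineno, line in impacted_locations("src/main/java", "customers", "zip_code"):
        print(f"{path}:{lineno}: {line}")
    ```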

  • understanding Database Schema evolution
    Science of Computer Programming, 2015
    Co-Authors: Anthony Cleve, Loup Meurice, Maxime Gobert, Jerome Maes, Jens H Weber
    Abstract:

    Database reverse engineering (DRE) has traditionally been carried out by considering three main information sources: (1) the Database Schema, (2) the stored data, and (3) the application programs. Not all of these information sources are always available, or of sufficient quality to inform the DRE process. For example, getting access to real-world data is often extremely problematic for information systems that maintain private data. In recent years, the analysis of the evolution history of software programs has gained an increasing role in reverse engineering in general, but comparatively little such research has been carried out in the context of Database reverse engineering. The goal of this paper is to contribute to narrowing this gap by exploring the use of the Database evolution history as an additional information source to aid Database Schema reverse engineering. We present a tool-supported method for analyzing the evolution history of legacy Databases, and we report on a large-scale case study of reverse engineering a complex information system, which we curate as a benchmark for future research efforts within the community. We present a tool-supported method to analyze the history of a Database Schema. The method makes use of mining software repositories (MSR) techniques. We report on the application of the method to a large-scale case study.
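
    A minimal sketch of mining such a history, assuming the DDL lives in a version-controlled db/schema.sql file (a made-up path) and that plain git is available, could look like the following; it only tracks tables appearing and disappearing, whereas the paper's method reconstructs much richer Schema histories.

    ```python
    import re
    import subprocess

    SCHEMA_FILE = "db/schema.sql"  # assumed location of the versioned DDL

    def schema_versions(repo="."):
        """Yield (commit, table names) for every commit touching the DDL file."""
        commits = subprocess.run(
            ["git", "-C", repo, "log", "--reverse", "--format=%H", "--", SCHEMA_FILE],
            capture_output=True, text=True, check=True).stdout.split()
        for sha in commits:
            ddl = subprocess.run(["git", "-C", repo, "show", f"{sha}:{SCHEMA_FILE}"],
                                 capture_output=True, text=True, check=True).stdout
            yield sha[:8], set(re.findall(r"CREATE TABLE\s+`?(\w+)", ddl, re.IGNORECASE))

    # Report tables added or removed at each step of the history.
    previous = set()
    for sha, tables in schema_versions():
        added, removed = tables - previous, previous - tables
        if added or removed:
            print(sha, "added:", sorted(added), "removed:", sorted(removed))
        previous = tables
    ```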

  • dahlia a visual analyzer of Database Schema evolution
    Conference on Software Maintenance and Reengineering, 2014
    Co-Authors: Loup Meurice, Anthony Cleve
    Abstract:

    In a continuously changing environment, software evolution becomes an unavoidable activity. The mining software repositories (MSR) field studies the valuable data available in software repositories such as source code version-control systems, issue/bug-tracking systems, or communication archives. In recent years, many researchers have used MSR techniques as a way to support software understanding and evolution. While many software systems are data-intensive, i.e., their central artefact is a Database, little attention has been devoted to the analysis of this important system component in the context of software evolution. The goal of our work is to reduce this gap by considering the Database evolution history as an additional information source to aid software evolution. We present DAHLIA (Database Schema EvoLutIon Analysis), a visual analyzer of Database Schema evolution. Our tool mines the Database Schema evolution history from the software repository and allows its interactive, visual analysis. We describe DAHLIA and present our novel approach supporting data-intensive software evolution.

Carlo Zaniolo - One of the best experts on this subject based on the ideXlab platform.

  • Automating the Database Schema evolution process
    The VLDB Journal, 2013
    Co-Authors: Carlo Curino, Hyun Jin Moon, Alin Deutsch, Carlo Zaniolo
    Abstract:

    Supporting Database Schema evolution represents a long-standing challenge of practical and theoretical importance for modern information systems. In this paper, we describe techniques and systems for automating the critical tasks of migrating the Database and rewriting the legacy applications. In addition to labor saving, the benefits delivered by these advances are many and include reliable prediction of outcome, minimization of downtime, system-produced documentation, and support for archiving, historical queries, and provenance. The PRISM/PRISM++ system delivers these benefits by solving the difficult problem of automating the migration of Databases and the rewriting of queries and updates. In this paper, we present the PRISM/PRISM++ system and the novel technology that made it possible. In particular, we focus on the difficult and previously unsolved problem of supporting legacy queries and updates under Schema and integrity constraints evolution. The PRISM/PRISM++ approach consists of providing users with a set of SQL-based Schema Modification Operators (SMOs), which describe how the tables in the old Schema are modified into those in the new Schema. In order to support updates, SMOs are extended with integrity constraints modification operators. By using recent results on Schema mapping, the paper (i) characterizes the impact on integrity constraints of structural Schema changes, (ii) devises representations that enable the rewriting of updates, and (iii) develops a unified approach for query and update rewriting under constraints. We complement the system with two novel tools: the first automatically collects and provides statistics on Schema evolution histories, whereas the second derives equivalent sequences of SMOs from the migration scripts that were used for Schema upgrades. These tools were used to produce an extensive testbed containing 15 evolution histories of scientific Databases and web information systems, providing over 100 years of aggregate evolution histories and almost 2,000 Schema evolution steps.
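
    The core mechanism --- folding a recorded sequence of Schema Modification Operators into a legacy query so that it runs on the newest Schema version --- can be sketched as follows. The operator encoding and the purely textual rewriting are simplifications invented for illustration; they do not reproduce PRISM/PRISM++'s actual SMO syntax or its rewriting algorithm.

    ```python
    import re

    # A legacy query and a schema history expressed as (operator, arguments)
    # pairs -- crude stand-ins for SQL-based SMOs.
    legacy_query = "SELECT name, zip FROM customer WHERE zip = '90210'"
    smo_history = [
        ("RENAME COLUMN", {"table": "customer", "old": "zip", "new": "postal_code"}),
        ("RENAME TABLE",  {"old": "customer", "new": "customers"}),
    ]

    def rewrite(query, history):
        """Replay the SMO history over a legacy query (textual, for illustration)."""
        for op, args in history:
            if op in ("RENAME COLUMN", "RENAME TABLE"):
                query = re.sub(rf"\b{args['old']}\b", args["new"], query)
        return query

    print(rewrite(legacy_query, smo_history))
    # SELECT name, postal_code FROM customers WHERE postal_code = '90210'
    ```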

  • Update rewriting and integrity constraint maintenance in a Schema evolution support system: PRISM++
    Proceedings of the VLDB Endowment, 2010
    Co-Authors: Carlo Curino, Hyun Jin Moon, Alin Deutsch, Carlo Zaniolo
    Abstract:

    Supporting legacy applications when the Database Schema evolves represents a long-standing challenge of practical and theoretical importance. Recent work has produced algorithms and systems that automate the process of data migration and query adaptation; however, the problems of evolving integrity constraints and supporting legacy updates under Schema and integrity constraints evolution are significantly more difficult and have thus far remained unsolved. In this paper, we address this issue by introducing a formal evolution model for the Database Schema structure and its integrity constraints, and use it to derive update mapping techniques akin to the rewriting techniques used for queries. Thus, we (i) propose a new set of Integrity Constraints Modification Operators (ICMOs), (ii) characterize the impact on integrity constraints of structural Schema changes, (iii) devise representations that enable the rewriting of updates, and (iv) develop a unified approach for query and update rewriting under constraints. We then describe the implementation of these techniques provided by our PRISM++ system. The effectiveness of PRISM++ and its enabling technology has been verified on a testbed containing evolution histories of several scientific Databases and web information systems, including the Genetic DB Ensembl (410+ Schema versions in 9 years), and Wikipedia (240+ Schema versions in 6 years).

  • automating Database Schema evolution in information system upgrades
    International Workshop on Hot Topics in Software Upgrades, 2009
    Co-Authors: Carlo Curino, Hyun Jin Moon, Carlo Zaniolo
    Abstract:

    The complexity, cost, and down-time currently created by the Database Schema evolution process are the source of incessant problems in the life of information systems and a major stumbling block that prevents graceful upgrades. Furthermore, our studies show that the serious problems encountered by traditional information systems are now further exacerbated in web information systems and cooperative scientific Databases, where the frequency of Schema changes has increased while tolerance for downtimes has nearly disappeared. The PRISM project seeks to develop the methods and tools that turn this error-prone and time-consuming process into one that is controllable, predictable and avoids down-time. Toward this goal, we have assembled a large testbed of Schema evolution histories, and developed a language of Schema Modification Operators (SMO) to express these histories concisely. Using this language, the Database administrator can specify new Schema changes, and then rely on PRISM to (i) predict the effect of these changes on current applications, (ii) translate old queries and updates to work on the new Schema version, (iii) perform data migration, and (iv) generate full documentation of intervened changes. Furthermore, PRISM achieves good usability and scalability by incorporating recent advances on mapping composition and invertibility in the implementation of (ii). The progress in automating Schema evolution so achieved provides the enabling technology for other advances, such as light-weight Database design methodologies that embrace changes as the regular state of software. While these topics remain largely unexplored, and thus provide rich opportunities for future research, an important area that we have investigated is that of archival information systems, where PRISM query mapping techniques were used to support flashback and historical queries for Database archives under Schema evolution.

  • The PRISM Workwench: Database Schema Evolution without Tears
    2009 IEEE 25th International Conference on Data Engineering, 2009
    Co-Authors: Carlo Curino, Hyun Jin Moon, Carlo Zaniolo
    Abstract:

    Information Systems are subject to perpetual evolution, which is particularly pressing in Web Information Systems, due to their distributed and often collaborative nature. Such a continuous adaptation process comes with a very high cost, because of the intrinsic complexity of the task and the serious ramifications of such changes upon Database-centric Information System software. Therefore, there is a need to automate and simplify the Schema evolution process and to ensure predictability and logical independence upon Schema changes. Current relational technology makes it easy to change the Database content or to revise the underlying storage and indexes, but does little to support logical Schema evolution, which nowadays remains poorly supported by commercial tools. The PRISM system demonstrates a major new advance toward automating Schema evolution (including query mapping and Database conversion) by improving predictability, logical independence, and auditability of the process. In fact, PRISM exploits recent theoretical results on mapping composition, invertibility and query rewriting to provide DB Administrators with an intuitive, operational workbench usable in their everyday activities, thus enabling graceful Schema evolution. In this demonstration, we will show (i) the functionality of PRISM and its supportive AJAX interface, (ii) its architecture built upon a simple SQL-inspired language of Schema Modification Operators, and (iii) we will allow conference participants to directly interact with the system to test its capabilities. Finally, some of the most interesting evolution steps of popular Web Information Systems, such as Wikipedia, will be reviewed in a brief "Saga of Famous Schema Evolutions".

  • graceful Database Schema evolution the prism workbench
    Very Large Data Bases, 2008
    Co-Authors: Carlo Curino, Hyun Jin Moon, Carlo Zaniolo
    Abstract:

    Supporting graceful Schema evolution represents an unsolved problem for traditional information systems that is further exacerbated in web information systems, such as Wikipedia and public scientific Databases: in these projects, based on multiparty cooperation, the frequency of Database Schema changes has increased while tolerance for downtimes has nearly disappeared. As of today, Schema evolution remains an error-prone and time-consuming undertaking, because the DB Administrator (DBA) lacks the methods and tools needed to manage and automate this endeavor by (i) predicting and evaluating the effects of the proposed Schema changes, (ii) rewriting queries and applications to operate on the new Schema, and (iii) migrating the Database. Our PRISM system takes a big first step toward addressing this pressing need by providing: (i) a language of Schema Modification Operators to express complex Schema changes concisely, (ii) tools that allow the DBA to evaluate the effects of such changes, (iii) optimized translation of old queries to work on the new Schema version, (iv) automatic data migration, and (v) full documentation of intervened changes as needed to support data provenance, Database flashback, and historical queries. PRISM solves these problems by integrating recent theoretical advances on mapping composition and invertibility into a design that also achieves usability and scalability. Wikipedia and its 170+ Schema versions provided an invaluable testbed for validating PRISM tools and their ability to support legacy queries.

Henri Prade - One of the best experts on this subject based on the ideXlab platform.

  • Relational Database Schema design for uncertain data
    Information Systems, 2019
    Co-Authors: Sebastian Link, Henri Prade
    Abstract:

    Driven by the dominance of the relational model, we investigate how the requirements of applications on the certainty of functional dependencies can improve the outcomes of relational Database Schema design. For that purpose, we assume that tuples are assigned a degree of possibility with which they occur in a relation, and that functional dependencies are assigned a dual degree of certainty which says to which tuples they apply. A design theory is developed for functional dependencies with degrees of certainty, including efficient axiomatic and algorithmic characterizations of their implication problem. Naturally, the possibility degrees of tuples bring forward different degrees of data redundancy, caused by functional dependencies with the dual degree of certainty. Variants of the classical syntactic Boyce–Codd and Third Normal Forms are established. They are justified semantically in terms of eliminating data redundancy and update anomalies of given degrees, and minimizing data redundancy of given degrees across all dependency-preserving decompositions, respectively. As a practical outcome of our results, designers can simply fix the degree of certainty they target, and then apply classical decomposition and synthesis to the set of functional dependencies whose associated degree of certainty meets the target. Hence, by fixing the certainty degree a designer controls which integrity requirements will be enforced for the application and which data will be processed by the application. The choice of the certainty degree also balances the classical trade-off between query and update efficiency on future Database instances. Our experiments confirm the effectiveness of our control parameter, and provide original insight into classical normalization strategies and their implementations.
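
    The practical recipe at the end of the abstract --- fix a target certainty degree, keep only the functional dependencies that meet it, then apply classical machinery --- can be sketched on a toy FD set. The attributes, the degree values and the closure routine are illustrative assumptions, not the paper's formal framework.

    ```python
    # Functional dependencies annotated with a certainty degree (1.0 = fully certain).
    fds = [
        ({"emp_id"},  {"dept"},    1.0),
        ({"dept"},    {"manager"}, 0.8),
        ({"manager"}, {"office"},  0.4),
    ]

    def fds_at_least(fds, target):
        """Keep only the dependencies whose certainty meets the chosen degree."""
        return [(lhs, rhs) for lhs, rhs, certainty in fds if certainty >= target]

    def closure(attrs, crisp_fds):
        """Classical attribute closure over the selected (crisp) dependencies."""
        result, changed = set(attrs), True
        while changed:
            changed = False
            for lhs, rhs in crisp_fds:
                if lhs <= result and not rhs <= result:
                    result |= rhs
                    changed = True
        return result

    selected = fds_at_least(fds, target=0.8)      # designer-chosen certainty threshold
    print(sorted(closure({"emp_id"}, selected)))  # ['dept', 'emp_id', 'manager']
    ```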

  • Relational Database Schema design for uncertain data
    2016
    Co-Authors: Sebastian Link, Henri Prade
    Abstract:

    We investigate the impact of uncertainty on relational Database Schema design. Uncertainty is modeled qualitatively by assigning to tuples a degree of possibility with which they occur, and assigning to functional dependencies a degree of certainty which says to which tuples they apply. A design theory is developed for possibilistic functional dependencies, including efficient axiomatic and algorithmic characterizations of their implication problem. Naturally, the possibility degrees of tuples result in a scale of different degrees of data redundancy. Scaled versions of the classical syntactic Boyce-Codd and Third Normal Forms are established and semantically justified in terms of avoiding data redundancy of different degrees. Classical decomposition and synthesis techniques are scaled as well. Therefore, possibilistic functional dependencies do not just enable designers to control the levels of data integrity and losslessness targeted but also to balance the classical trade-off between query and update efficiency. Extensive experiments confirm the efficiency of our framework and provide original insight into relational Schema design.

Jun Zhang - One of the best experts on this subject based on the ideXlab platform.

  • adaptive Database Schema design for multi tenant data management
    IEEE Transactions on Knowledge and Data Engineering, 2014
    Co-Authors: Lijun Wang, Jianhua Feng, Jun Zhang
    Abstract:

    Multi-tenant data management is a major application of Software as a Service (SaaS). For example, many companies want to outsource their data to a third party that hosts a multi-tenant Database system to provide data management services. The multi-tenant Database system needs to have high performance, low space requirements, and excellent scalability. One big challenge is devising a high-quality Database Schema. Independent Tables Shared Instances (ITSI) and Shared Tables Shared Instances (STSI) are two state-of-the-art approaches to designing the Schema. However, they suffer from some limitations. ITSI has poor scalability since it needs to maintain large numbers of tables. STSI achieves good scalability at the expense of poor performance and high space overhead. Thus, an effective Schema design method that addresses these problems is needed. In this paper, we propose an adaptive Database Schema design method for multi-tenant applications. We trade off ITSI against STSI and find a balance between them to achieve good scalability and high performance with low space requirements. To this end, we identify the important attributes and use them to generate an appropriate number of base tables. For the remaining attributes, we construct supplementary tables. We discuss how to use the kernel matrix to determine the number of base tables, apply graph-partitioning algorithms to construct the base tables, and evaluate the importance of attributes using the well-known PageRank algorithm. We propose a cost-based model to adaptively generate the base tables and supplementary tables. Our method has the following advantages. First, our method achieves high scalability. Second, our method achieves high performance and can trade off performance against space requirements. Third, our method can be easily applied to existing Databases (e.g., MySQL) with minor revisions. Fourth, our method can adapt to any Schema and query workload, including both OLAP and OLTP applications. Experimental results on both real and synthetic datasets show that our method achieves high performance and good scalability with low space requirements and outperforms state-of-the-art methods.
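
    The basic split described here --- rank the attributes by importance, keep the most important ones in base tables and push the rest into supplementary tables --- can be sketched with a simple frequency count standing in for the paper's PageRank-based importance measure and cost model. The workload, the attribute names and the single base table are made up for illustration.

    ```python
    from collections import Counter

    # Tenant workload reduced to the attributes each query touches.
    workload = [
        ["name", "email", "plan"],
        ["name", "email"],
        ["plan", "billing_address"],
        ["name", "email", "signup_source"],
    ]

    def split_attributes(workload, base_size):
        """Rank attributes by access frequency (stand-in for importance) and
        assign the top base_size of them to the base table."""
        counts = Counter(attr for query in workload for attr in query)
        ranked = [attr for attr, _ in counts.most_common()]
        return ranked[:base_size], ranked[base_size:]

    base, supplementary = split_attributes(workload, base_size=3)
    print("base table attributes:", base)        # ['name', 'email', 'plan']
    print("supplementary table attributes:", supplementary)
    ```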

  • adapt adaptive Database Schema design for multi tenant applications
    Conference on Information and Knowledge Management, 2012
    Co-Authors: Jun Zhang, Jianhua Feng
    Abstract:

    Multi-tenant data management is a major application of Software as a Service (SaaS). Many companies outsource their data to a third party that hosts a multi-tenant Database system to provide data management services. The system should have high performance, low space overhead, and excellent scalability. One big challenge is to devise a high-quality Database Schema. Independent Tables Shared Instances and Shared Tables Shared Instances are two state-of-the-art methods. However, the former has poor scalability, while the latter achieves good scalability at the expense of poor performance and high space overhead. In this paper, we trade off between the two methods and propose an adaptive Database Schema design approach to achieve good scalability and high performance with low space overhead. To this end, we identify the important attributes and use them to generate a base table. For the other attributes, we construct supplementary tables. We propose a cost-based model to adaptively generate the above tables. Our method has the following advantages. First, our method achieves high scalability. Second, our method can trade off performance against space requirements. Third, our method can be easily applied to existing Databases (e.g., MySQL) with minor revisions. Fourth, our method can adapt to any Schema and query workload. Experimental results show that our method achieves high performance and good scalability with low space overhead and outperforms state-of-the-art methods.