Database Systems

Stavros Harizopoulos - One of the best experts on this subject based on the ideXlab platform.

  • The Design and Implementation of Modern Column-Oriented Database Systems
    Foundations and Trends in Databases, 2013
    Co-Authors: Daniel J Abadi, Peter A. Boncz, Stavros Harizopoulos
    Abstract:

    Database system performance is directly related to the efficiency of the system at storing data on primary storage (for example, disk) and moving it into CPU registers for processing. For this reason, there is a long history in the Database community of research exploring physical storage alternatives, including sophisticated indexing, materialized views, and vertical and horizontal partitioning. In recent years, there has been renewed interest in so-called column-oriented Systems, sometimes also called column-stores. Column-store Systems completely vertically partition a Database into a collection of individual columns that are stored separately. By storing each column separately on disk, these column-based Systems enable queries to read just the attributes they need, rather than having to read entire rows from disk and discard unneeded attributes once they are in memory. The Design and Implementation of Modern Column-Oriented Database Systems discusses modern column-stores, their architecture and evolution, as well as the benefits they can bring in data analytics. There is a specific focus on three influential research prototypes: MonetDB, MonetDB/X100, and C-Store. These Systems have formed the basis for several well-known commercial column-store implementations. Their similarities and differences are described, and they are discussed in terms of their specific architectural features for compression, late materialization, join processing, vectorization, and adaptive indexing (Database cracking). The Design and Implementation of Modern Column-Oriented Database Systems is an excellent reference on the topic for Database researchers and practitioners.
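
    To make the layout difference concrete, the following is a minimal Python sketch of the row-store versus column-store organizations described above. It is an illustration only, not code from MonetDB, MonetDB/X100, or C-Store; the RowStore and ColumnStore classes and the toy run-length encoder are invented, but they show why a columnar scan touches only the requested attributes and why contiguously stored column values compress well.

    # Hypothetical sketch: row-wise vs. column-wise storage of the same table.
    class RowStore:
        """Stores complete records contiguously, one after the other."""
        def __init__(self, rows):
            self.rows = rows                              # list of dicts, one per record

        def scan(self, columns):
            # Every full row is read; unneeded attributes are discarded afterwards.
            return [{c: row[c] for c in columns} for row in self.rows]

    class ColumnStore:
        """Stores each attribute in its own densely packed array."""
        def __init__(self, rows):
            self.columns = {}
            for row in rows:
                for name, value in row.items():
                    self.columns.setdefault(name, []).append(value)

        def scan(self, columns):
            # Only the requested columns are touched; other attributes are never read.
            wanted = [self.columns[c] for c in columns]
            return [dict(zip(columns, values)) for values in zip(*wanted)]

    def rle_encode(column):
        """Toy run-length encoding; contiguous column values compress well."""
        runs, prev, count = [], object(), 0
        for value in column:
            if value == prev:
                count += 1
            else:
                if count:
                    runs.append((prev, count))
                prev, count = value, 1
        if count:
            runs.append((prev, count))
        return runs

    if __name__ == "__main__":
        rows = [{"id": i, "region": "EU" if i < 3 else "US", "amount": 10 * i}
                for i in range(6)]
        store = ColumnStore(rows)
        print(store.scan(["amount"]))                     # reads only the amount column
        print(rle_encode(store.columns["region"]))        # [('EU', 3), ('US', 3)]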

  • Column-oriented Database Systems
    Proceedings of the VLDB Endowment, 2009
    Co-Authors: Daniel J Abadi, Peter A. Boncz, Stavros Harizopoulos
    Abstract:

    Column-oriented Database Systems (column-stores) have attracted a lot of attention in the past few years. Column-stores, in a nutshell, store each Database table column separately, with attribute values belonging to the same column stored contiguously, compressed, and densely packed, as opposed to traditional Database Systems that store entire records (rows) one after the other. Reading a subset of a table’s columns becomes faster, at the potential expense of excessive disk-head seeking from column to column for scattered reads or updates. After several dozen research papers and at least a dozen new column-store start-ups, several questions remain. Are these a new breed of Systems or simply old wine in new bottles? How easily can a major row-based system achieve column-store performance? Are column-stores the answer to effortlessly supporting large-scale data-intensive applications? What are the new, exciting system research problems to tackle? What are the new applications that can potentially be enabled by column-stores? In this tutorial, we present an overview of column-oriented Database system technology and address these and other related questions.

Daniel J Abadi - One of the best experts on this subject based on the ideXlab platform.

  • Lazy Evaluation of Transactions in Database Systems
    International Conference on Management of Data, 2014
    Co-Authors: Jose M. Faleiro, Alexander Thomson, Daniel J Abadi
    Abstract:

    Existing Database Systems employ an eager transaction processing scheme: upon receiving a transaction request, the system executes all the operations entailed in running the transaction (which typically include reading Database records, executing user-specified transaction logic, and logging updates and writes) before reporting to the client that the transaction has completed. We introduce a lazy transaction execution engine, in which a transaction may be considered durably completed after only partial execution, while the bulk of its operations (notably all reads from the Database and all execution of transaction logic) may be deferred until an arbitrary future time, such as when a user attempts to read some element of the transaction's write-set, all without modifying the semantics of the transaction or sacrificing ACID guarantees. Lazy transactions are processed deterministically, so that the final state of the Database is guaranteed to be equivalent to what the state would have been had all transactions been executed eagerly. Our prototype of a lazy transaction execution engine improves temporal locality when executing related transactions, reduces peak provisioning requirements by deferring more non-urgent work until off-peak load times, and reduces the contention footprint of concurrent transactions. However, we find that certain queries suffer increased latency, and therefore lazy Database Systems may not be appropriate for read-latency-sensitive applications.
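
    The deferral idea can be illustrated with a small, hypothetical Python sketch (not the authors' prototype): a transaction is acknowledged after a cheap now phase that only records its logic and write-set, and the deferred work is substantiated, in arrival order, the first time a read touches a key that a pending transaction may have written, so the final state matches what eager execution would have produced.

    # Hypothetical toy of lazy transaction execution; not the authors' prototype.
    class LazyEngine:
        def __init__(self):
            self.db = {}        # key -> value
            self.pending = []   # deferred (write_set, logic) pairs, in arrival order

        def submit(self, write_set, logic):
            # "Now phase": durably record the transaction's logic and write-set
            # (durability is stubbed out here) and acknowledge it to the client
            # without executing any of its reads or logic.
            self.pending.append((set(write_set), logic))
            return "acknowledged"

        def read(self, key):
            # "Late phase": find the last pending transaction that might write
            # `key` and substantiate the whole prefix up to it in arrival order.
            # Prefix substantiation is coarser than real dependency tracking but
            # keeps execution deterministic and equivalent to the eager schedule.
            last = max((i for i, (ws, _) in enumerate(self.pending) if key in ws),
                       default=-1)
            for _write_set, logic in self.pending[:last + 1]:
                self.db.update(logic(self.db))            # logic: db -> dict of updates
            self.pending = self.pending[last + 1:]
            return self.db.get(key)

    if __name__ == "__main__":
        engine = LazyEngine()
        engine.submit({"alice"}, lambda db: {"alice": db.get("alice", 0) + 100})
        engine.submit({"alice", "bob"},
                      lambda db: {"alice": db.get("alice", 0) - 30,
                                  "bob": db.get("bob", 0) + 30})
        print(engine.read("bob"))    # forces both deferred transactions to run: 30
        print(engine.read("alice"))  # already substantiated: 70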

  • The Design and Implementation of Modern Column-Oriented Database Systems
    Foundations and Trends in Databases, 2013
    Co-Authors: Daniel J Abadi, Peter A. Boncz, Stavros Harizopoulos
    Abstract:

    Database system performance is directly related to the efficiency of the system at storing data on primary storage (for example, disk) and moving it into CPU registers for processing. For this reason, there is a long history in the Database community of research exploring physical storage alternatives, including sophisticated indexing, materialized views, and vertical and horizontal partitioning. In recent years, there has been renewed interest in so-called column-oriented Systems, sometimes also called column-stores. Column-store Systems completely vertically partition a Database into a collection of individual columns that are stored separately. By storing each column separately on disk, these column-based Systems enable queries to read just the attributes they need, rather than having to read entire rows from disk and discard unneeded attributes once they are in memory. The Design and Implementation of Modern Column-Oriented Database Systems discusses modern column-stores, their architecture and evolution, as well as the benefits they can bring in data analytics. There is a specific focus on three influential research prototypes: MonetDB, MonetDB/X100, and C-Store. These Systems have formed the basis for several well-known commercial column-store implementations. Their similarities and differences are described, and they are discussed in terms of their specific architectural features for compression, late materialization, join processing, vectorization, and adaptive indexing (Database cracking). The Design and Implementation of Modern Column-Oriented Database Systems is an excellent reference on the topic for Database researchers and practitioners.

  • Invisible Loading: Access-Driven Data Transfer from Raw Files into Database Systems
    Extending Database Technology, 2013
    Co-Authors: Azza Abouzied, Daniel J Abadi, Avi Silberschatz
    Abstract:

    Commercial analytical Database Systems suffer from a high "time-to-first-analysis": before data can be processed, it must be modeled and schematized (a human effort), transferred into the Database's storage layer, and optionally clustered and indexed (a computational effort). For many types of structured data, this upfront effort is unjustifiable, so the data are processed directly over the file system using the Hadoop framework, despite the cumulative performance benefits of processing this data in an analytical Database system. In this paper we describe a system that achieves the immediate gratification of running MapReduce jobs directly over a file system, while still making progress towards the long-term performance benefits of Database Systems. The basic idea is to piggyback on MapReduce jobs, leveraging their parsing and tuple extraction operations to incrementally load and organize tuples into a Database system, while simultaneously processing the file system data. We call this scheme Invisible Loading, as we load fractions of the data at a time at almost no marginal cost in query latency, while still allowing future queries to run much faster.
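
    A rough Python sketch of the piggybacking idea follows, under assumed details: a two-column CSV schema, a fixed per-scan load fraction, and SQLite standing in for the analytical Database. Each scan that answers a query over the raw file also loads the next slice of parsed tuples into the Database as a side effect.

    # Hypothetical sketch of piggybacking an incremental load on a raw-file scan.
    import csv
    import sqlite3

    class InvisibleLoader:
        def __init__(self, db_path=":memory:", load_fraction=0.25):
            self.conn = sqlite3.connect(db_path)
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS events (user_id INTEGER, amount REAL)")
            self.load_fraction = load_fraction
            self.loaded_rows = 0      # how far into the raw file we have loaded so far

        def scan_raw_file(self, path, predicate):
            # Answer a query by scanning the raw file (as a MapReduce job would),
            # and as a side effect load the next `load_fraction` of its rows.
            results, batch = [], []
            with open(path, newline="") as f:
                rows = list(csv.reader(f))
            budget = int(len(rows) * self.load_fraction)
            for i, (user_id, amount) in enumerate(rows):   # assumed 2-column schema
                tup = (int(user_id), float(amount))
                if predicate(tup):
                    results.append(tup)                    # the query's own answer
                if self.loaded_rows <= i < self.loaded_rows + budget:
                    batch.append(tup)                      # piggybacked incremental load
            self.conn.executemany("INSERT INTO events VALUES (?, ?)", batch)
            self.conn.commit()
            self.loaded_rows += len(batch)
            return results

    if __name__ == "__main__":
        with open("events.csv", "w", newline="") as f:
            f.write("1,50.0\n2,120.0\n3,75.0\n4,300.0\n")
        loader = InvisibleLoader(load_fraction=0.5)
        print(loader.scan_raw_file("events.csv", lambda t: t[1] > 100.0))
        # Half of the file is now also queryable inside SQLite:
        print(loader.conn.execute("SELECT COUNT(*) FROM events").fetchone())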

  • Column-oriented Database Systems
    Proceedings of the VLDB Endowment, 2009
    Co-Authors: Daniel J Abadi, Peter A. Boncz, Stavros Harizopoulos
    Abstract:

    Column-oriented Database Systems (column-stores) have attracted a lot of attention in the past few years. Column-stores, in a nutshell, store each Database table column separately, with attribute values belonging to the same column stored contiguously, compressed, and densely packed, as opposed to traditional Database Systems that store entire records (rows) one after the other. Reading a subset of a table’s columns becomes faster, at the potential expense of excessive disk-head seeking from column to column for scattered reads or updates. After several dozen research papers and at least a dozen new column-store start-ups, several questions remain. Are these a new breed of Systems or simply old wine in new bottles? How easily can a major row-based system achieve column-store performance? Are column-stores the answer to effortlessly supporting large-scale data-intensive applications? What are the new, exciting system research problems to tackle? What are the new applications that can potentially be enabled by column-stores? In this tutorial, we present an overview of column-oriented Database system technology and address these and other related questions.

Peter A. Boncz - One of the best experts on this subject based on the ideXlab platform.

  • The Design and Implementation of Modern Column-Oriented Database Systems
    Foundations and Trends in Databases, 2013
    Co-Authors: Daniel J Abadi, Peter A. Boncz, Stavros Harizopoulos
    Abstract:

    Database system performance is directly related to the efficiency of the system at storing data on primary storage (for example, disk) and moving it into CPU registers for processing. For this reason, there is a long history in the Database community of research exploring physical storage alternatives, including sophisticated indexing, materialized views, and vertical and horizontal partitioning. In recent years, there has been renewed interest in so-called column-oriented Systems, sometimes also called column-stores. Column-store Systems completely vertically partition a Database into a collection of individual columns that are stored separately. By storing each column separately on disk, these column-based Systems enable queries to read just the attributes they need, rather than having to read entire rows from disk and discard unneeded attributes once they are in memory. The Design and Implementation of Modern Column-Oriented Database Systems discusses modern column-stores, their architecture and evolution, as well as the benefits they can bring in data analytics. There is a specific focus on three influential research prototypes: MonetDB, MonetDB/X100, and C-Store. These Systems have formed the basis for several well-known commercial column-store implementations. Their similarities and differences are described, and they are discussed in terms of their specific architectural features for compression, late materialization, join processing, vectorization, and adaptive indexing (Database cracking). The Design and Implementation of Modern Column-Oriented Database Systems is an excellent reference on the topic for Database researchers and practitioners.

  • Column-oriented Database Systems
    Proceedings of the VLDB Endowment, 2009
    Co-Authors: Daniel J Abadi, Peter A. Boncz, Stavros Harizopoulos
    Abstract:

    Column-oriented Database Systems (column-stores) have attracted a lot of attention in the past few years. Column-stores, in a nutshell, store each Database table column separately, with attribute values belonging to the same column stored contiguously, compressed, and densely packed, as opposed to traditional Database Systems that store entire records (rows) one after the other. Reading a subset of a table’s columns becomes faster, at the potential expense of excessive disk-head seeking from column to column for scattered reads or updates. After several dozen research papers and at least a dozen new column-store start-ups, several questions remain. Are these a new breed of Systems or simply old wine in new bottles? How easily can a major row-based system achieve column-store performance? Are column-stores the answer to effortlessly supporting large-scale data-intensive applications? What are the new, exciting system research problems to tackle? What are the new applications that can potentially be enabled by column-stores? In this tutorial, we present an overview of column-oriented Database system technology and address these and other related questions.

Kenneth Salem - One of the best experts on this subject based on the ideXlab platform.

  • A Taxonomy of Partitioned Replicated Cloud-based Database Systems
    2020
    Co-Authors: Divy Agrawal, Amr El Abbadi, Kenneth Salem
    Abstract:

    The advent of the cloud computing paradigm has given rise to many innovative and novel proposals for managing large-scale, fault-tolerant and highly available data management Systems. This paper proposes a taxonomy of large-scale partitioned replicated transactional Databases with the goal of providing a principled understanding of the growing space of scalable and highly available Database Systems. The taxonomy is based on the relationship between transaction management and replica management. We illustrate specific instances of the taxonomy using several recent partitioned replicated Database Systems.

  • Workload-Aware CPU Performance Scaling for Transactional Database Systems
    International Conference on Management of Data, 2018
    Co-Authors: Mustafa Korkmaz, Kenneth Salem, Martin Karsten, Semih Salihoglu
    Abstract:

    Natural short-term fluctuations in the load of transactional data Systems present an opportunity for power savings. For example, a system handling 1000 requests per second on average can expect more than 1000 requests in some seconds, fewer in others. By quickly adjusting processing capacity to match such fluctuations, power consumption can be reduced. Many Systems do this already, using dynamic voltage and frequency scaling (DVFS) to reduce processor performance and power consumption when the load is low. DVFS is typically controlled by frequency governors in the operating system, or by the processor itself. In this paper, we show that transactional Database Systems can manage DVFS more effectively than the underlying operating system. This is because the Database system has more information about the workload, and more control over that workload, than is available to the operating system. We present a technique called POLARIS for reducing the power consumption of transactional Database Systems. POLARIS directly manages processor DVFS and controls Database transaction scheduling. Its goal is to minimize power consumption while ensuring that transactions are completed within a specified latency target. POLARIS is workload-aware, and can accommodate concurrent workloads with different characteristics and latency budgets. We show that POLARIS can simultaneously reduce power consumption and missed latency targets, relative to operating-system-based DVFS governors.
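
    The core decision can be sketched in a few lines of Python; the frequency list, per-transaction cycle estimates, and earliest-deadline-first ordering below are illustrative assumptions, not details taken from POLARIS. The idea is simply to pick the lowest DVFS state at which every queued transaction is still predicted to meet its latency target.

    # Hypothetical sketch of workload-aware frequency selection; the frequency
    # list, cycle estimates, and EDF ordering are assumptions, not POLARIS itself.
    from dataclasses import dataclass

    FREQUENCIES_GHZ = [1.2, 1.8, 2.4, 3.0]    # assumed available DVFS states

    @dataclass
    class Txn:
        work_cycles: float    # estimated CPU cycles needed (e.g. from past runs)
        deadline_s: float     # latency target, in seconds from now

    def lowest_feasible_frequency(queue, now=0.0):
        # Return the lowest frequency at which an earliest-deadline-first schedule
        # of the queued transactions meets every latency target; fall back to the
        # highest frequency if none does.
        ordered = sorted(queue, key=lambda t: t.deadline_s)
        for freq in FREQUENCIES_GHZ:                       # try lowest power first
            finish, feasible = now, True
            for txn in ordered:
                finish += txn.work_cycles / (freq * 1e9)   # seconds at this frequency
                if finish > txn.deadline_s:
                    feasible = False
                    break
            if feasible:
                return freq
        return FREQUENCIES_GHZ[-1]

    if __name__ == "__main__":
        queue = [Txn(work_cycles=3e6, deadline_s=0.010),
                 Txn(work_cycles=9e6, deadline_s=0.004)]
        print(lowest_feasible_frequency(queue))   # 2.4: the 4 ms deadline rules out 1.2 and 1.8 GHz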

  • Main Memory Database Systems: An Overview
    IEEE Transactions on Knowledge and Data Engineering, 1992
    Co-Authors: Hector Garcia-Molina, Kenneth Salem
    Abstract:

    Main memory Database Systems (MMDBs) store their data in main physical memory and provide very high-speed access. Conventional Database Systems are optimized for the particular characteristics of disk storage mechanisms. Memory-resident Systems, on the other hand, use different optimizations to structure and organize data, as well as to make it reliable. The authors survey the major memory-residence optimizations and briefly discuss some of the MMDBs that have been designed or implemented.

B. Eaglestone - One of the best experts on this subject based on the ideXlab platform.

  • An Integrity Constraint for Database Systems Containing Embedded Static Neural Networks
    International Journal of Intelligent Systems, 2001
    Co-Authors: I. Millns, B. Eaglestone
    Abstract:

    Static neural networks are used in some Database Systems to classify objects, but like traditional statistical classifiers they often misclassify. For some applications, it is necessary to bound the proportion of misclassified objects. This is clearly an integrity problem. We describe a new integrity constraint for Database Systems with embedded static neural networks, with which a Database administrator can enforce a bound on the proportion of misclassifications in a class. The approach is based upon mapping probabilities generated by a probabilistic neural network to the likely percentage of misclassifications.
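
    One way such a constraint could be evaluated is sketched below in Python, as a hypothetical illustration rather than the paper's mechanism: treat 1 - p, where p is the classifier's posterior probability for the assigned class, as an object's expected chance of being misclassified, and reject any insert that would push a class's expected misclassification proportion over the bound set by the Database administrator.

    # Hypothetical sketch; the probability source (a probabilistic neural network)
    # is assumed rather than implemented, and the class/label names are invented.
    class MisclassificationConstraint:
        def __init__(self, max_misclassified_fraction):
            self.bound = max_misclassified_fraction
            self.members = {}    # class label -> posterior probabilities of its members

        def try_insert(self, label, posterior_prob):
            # Treat 1 - p as the object's expected chance of being misclassified and
            # admit it only if the class's expected misclassification proportion
            # stays within the bound set by the Database administrator.
            candidate = self.members.get(label, []) + [posterior_prob]
            expected_errors = sum(1.0 - p for p in candidate)
            if expected_errors / len(candidate) > self.bound:
                return False     # constraint violation: reject the insert
            self.members[label] = candidate
            return True

    if __name__ == "__main__":
        constraint = MisclassificationConstraint(max_misclassified_fraction=0.10)
        print(constraint.try_insert("defect", 0.97))   # True  (expected error rate 3%)
        print(constraint.try_insert("defect", 0.60))   # False (would rise to 21.5%)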

  • An integrity constraint for Database Systems containing embedded neural networks
    Proceedings of the Ninth International Workshop on Database and Expert Systems Applications, 1998
    Co-Authors: I. Millns, B. Eaglestone
    Abstract:

    Neural networks are used in some Database Systems to classify objects, but like traditional statistical classifiers they often misclassify. For some applications, it is necessary to bound the proportion of misclassified objects. This is clearly an integrity problem. We describe a new integrity constraint for Database Systems with embedded neural networks, with which a Database administrator can enforce a bound on the proportion of misclassifications in a class. The approach is based upon mapping probabilities generated by a probabilistic neural network to the likely percentage of misclassifications.