Query Optimization

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 25008 Experts worldwide ranked by ideXlab platform

Christoph Koch - One of the best experts on this subject based on the ideXlab platform.

  • Multi-Objective Parametric Query Optimization
    Communications of The ACM, 2017
    Co-Authors: Immanuel Trummer, Christoph Koch
    Abstract:

    We propose a generalization of the classical database query optimization problem: multi-objective parametric query (MPQ) optimization. MPQ compares alternative processing plans according to multiple execution cost metrics. It also models missing pieces of information on which plan costs depend as parameters. Both features are crucial to model query processing on modern data processing platforms. MPQ generalizes previously proposed query optimization variants, such as multi-objective query optimization, parametric query optimization, and traditional query optimization. We show, however, that the MPQ problem has different properties than prior variants and that solving it requires novel methods. We present an algorithm that solves the MPQ problem and finds, for a given query, the set of all relevant query plans. This set contains all plans that realize optimal execution cost tradeoffs for any combination of parameter values. Our algorithm is based on dynamic programming and recursively constructs relevant query plans by combining relevant plans for query parts. We assume that all plan execution cost functions are piecewise-linear in the parameters. We use linear programming to compare alternative plans and to identify plans that are not relevant. We present a complexity analysis of our algorithm and experimentally evaluate its performance.

  • Multi-Objective Parametric Query Optimization
    Very Large Data Bases, 2014
    Co-Authors: Immanuel Trummer, Christoph Koch
    Abstract:

    Classical query optimization compares query plans according to one cost metric and associates each plan with a constant cost value. In this paper, we introduce the Multi-Objective Parametric Query Optimization (MPQ) problem, where query plans are compared according to multiple cost metrics and the cost of a given plan according to a given metric is modeled as a function that depends on multiple parameters. The cost metrics may, for instance, include execution time or monetary fees; a parameter may represent the selectivity of a query predicate that is unspecified at optimization time. MPQ generalizes parametric query optimization (which allows multiple parameters but only one cost metric) and multi-objective query optimization (which allows multiple cost metrics but no parameters). We formally analyze the novel MPQ problem and show why existing algorithms are inapplicable. We present a generic algorithm for MPQ and a specialized version for MPQ with piecewise-linear plan cost functions. We prove that both algorithms find all relevant query plans and experimentally evaluate the performance of our second algorithm in a cloud computing scenario.
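
    The idea of comparing plans on several cost metrics that each depend on a parameter can be illustrated with a small sketch. It is not the algorithm from the paper: it assumes a single parameter x in [0, 1] (say, a predicate selectivity), cost functions that are linear rather than piecewise-linear, and it only discards a plan when one other plan is at least as good over the whole parameter range. The plan names and numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    costs: dict  # metric name -> (intercept, slope); cost(x) = intercept + slope * x

    def cost(self, metric, x):
        intercept, slope = self.costs[metric]
        return intercept + slope * x


def dominates_everywhere(p, q, lo=0.0, hi=1.0):
    """True if p is no worse than q on every metric for all x in [lo, hi]
    and strictly better on at least one metric somewhere in that range.

    The difference of two linear functions is linear, so checking the
    two endpoints of the interval is sufficient."""
    points = (lo, hi)
    no_worse = all(p.cost(m, x) <= q.cost(m, x) for m in p.costs for x in points)
    better = any(p.cost(m, x) < q.cost(m, x) for m in p.costs for x in points)
    return no_worse and better


def prune(plans):
    """Keep every plan that no single other plan dominates over the whole range."""
    return [q for q in plans
            if not any(dominates_everywhere(p, q) for p in plans if p is not q)]


# Hypothetical plans: time grows with selectivity x, fees are constant.
plans = [
    Plan("index_scan",    {"time": (1, 20), "fees": (2, 0)}),
    Plan("full_scan",     {"time": (10, 2), "fees": (2, 0)}),
    Plan("parallel_scan", {"time": (3, 1),  "fees": (8, 0)}),
    Plan("slow_scan",     {"time": (12, 3), "fees": (3, 0)}),
]
kept = [p.name for p in prune(plans)]
print(kept)  # ['index_scan', 'full_scan', 'parallel_scan']
```

    In this toy setting, slow_scan is dropped because full_scan is cheaper on both metrics for every x, while the other three survive because each is best for some metric or some part of the parameter range.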

Beng Chin Ooi - One of the best experts on this subject based on the ideXlab platform.

  • CrowdOp: Query Optimization for Declarative Crowdsourcing Systems
    International Conference on Data Engineering, 2016
    Co-Authors: Ju Fan, Meihui Zhang, Stanley Kok, Beng Chin Ooi
    Abstract:

    We propose CROWDOP, a cost-based query optimization approach for declarative crowdsourcing systems. CROWDOP considers both cost and latency in the query optimization objectives and generates query plans that provide a good balance between cost and latency. We develop efficient algorithms in CROWDOP for optimizing three types of queries: selection, join, and complex selection-join queries. We validate our approach via extensive experiments, both by simulation and with the real crowd on Amazon Mechanical Turk.

  • CrowdOp: Query Optimization for Declarative Crowdsourcing Systems
    IEEE Transactions on Knowledge and Data Engineering, 2015
    Co-Authors: Ju Fan, Meihui Zhang, Stanley Kok, Beng Chin Ooi
    Abstract:

    We study the query optimization problem in declarative crowdsourcing systems. Declarative crowdsourcing is designed to hide the complexities and relieve the user of the burden of dealing with the crowd. The user is only required to submit an SQL-like query, and the system takes responsibility for compiling the query, generating the execution plan, and evaluating it in the crowdsourcing marketplace. A given query can have many alternative execution plans, and the difference in crowdsourcing cost between the best and the worst plans may be several orders of magnitude. Therefore, as in relational database systems, query optimization is important to crowdsourcing systems that provide declarative query interfaces. In this paper, we propose CrowdOp, a cost-based query optimization approach for declarative crowdsourcing systems. CrowdOp considers both cost and latency in the query optimization objectives and generates query plans that provide a good balance between cost and latency. We develop efficient algorithms in CrowdOp for optimizing three types of queries: selection queries, join queries, and complex selection-join queries. We validate our approach via extensive experiments, both by simulation and with the real crowd on Amazon Mechanical Turk.
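
    To see why cost and latency pull in different directions for a crowdsourced selection query, consider a rough sketch under a deliberately simple model that is an assumption here, not CrowdOp's actual cost model: every (item, condition) check is one paid crowd task, latency is counted in rounds of crowd work, and each condition has a known selectivity.

```python
def parallel_plan(num_items, selectivities):
    """Ask every condition about every item in a single round."""
    cost = num_items * len(selectivities)
    latency = 1
    return cost, latency


def sequential_plan(num_items, selectivities):
    """Ask one condition per round, only about items that passed so far."""
    cost = 0.0
    remaining = float(num_items)
    for s in selectivities:
        cost += remaining   # one crowd task per surviving item
        remaining *= s      # expected number of survivors of this condition
    latency = len(selectivities)
    return cost, latency


# Illustrative numbers: 1000 items, three conditions.
print(parallel_plan(1000, [0.25, 0.5, 0.5]))    # (3000, 1)
print(sequential_plan(1000, [0.25, 0.5, 0.5]))  # (1375.0, 3)
```

    Under these assumptions the sequential plan pays for fewer than half as many tasks but takes three rounds instead of one, which is the kind of tradeoff an optimizer that balances cost and latency has to weigh.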

  • Query Optimization for Massively Parallel Data Processing
    Symposium on Cloud Computing, 2011
    Co-Authors: Sharad Mehrotra, Beng Chin Ooi
    Abstract:

    MapReduce has been widely recognized as an efficient tool for large-scale data analysis. It achieves high performance by exploiting parallelism among processing nodes while providing a simple interface for upper-layer applications. Some vendors have enhanced their data warehouse systems by integrating MapReduce into them. However, existing MapReduce-based query processing systems, such as Hive, fall short of the query optimization and competency of conventional database systems. Given an SQL query, Hive translates the query into a set of MapReduce jobs sentence by sentence. This design assumes that users can optimize their queries before submitting them to the system. Unfortunately, manual query optimization is time-consuming and difficult, even for an experienced database user or administrator. In this paper, we propose a query optimization scheme for MapReduce-based processing systems. Specifically, we embed into Hive a query optimizer that is designed to generate an efficient query plan based on our proposed cost model. Experiments carried out on our in-house cluster confirm the effectiveness of our query optimizer.

  • SoCC - Query Optimization for Massively Parallel Data Processing
    Proceedings of the 2nd ACM Symposium on Cloud Computing - SOCC '11, 2011
    Co-Authors: Sharad Mehrotra, Beng Chin Ooi
    Abstract:

    MapReduce has been widely recognized as an efficient tool for large-scale data analysis. It achieves high performance by exploiting parallelism among processing nodes while providing a simple interface for upper-layer applications. Some vendors have enhanced their data warehouse systems by integrating MapReduce into them. However, existing MapReduce-based query processing systems, such as Hive, fall short of the query optimization and competency of conventional database systems. Given an SQL query, Hive translates the query into a set of MapReduce jobs sentence by sentence. This design assumes that users can optimize their queries before submitting them to the system. Unfortunately, manual query optimization is time-consuming and difficult, even for an experienced database user or administrator. In this paper, we propose a query optimization scheme for MapReduce-based processing systems. Specifically, we embed into Hive a query optimizer that is designed to generate an efficient query plan based on our proposed cost model. Experiments carried out on our in-house cluster confirm the effectiveness of our query optimizer.

Yannis E Ioannidis - One of the best experts on this subject based on the ideXlab platform.

  • Distributed Query Optimization by Query Trading
    Lecture Notes in Computer Science, 2004
    Co-Authors: Fragkiskos Pentaris, Yannis E Ioannidis
    Abstract:

    Large-scale distributed environments, where each node is completely autonomous and offers services to its peers through external communication, pose significant challenges to query processing and optimization. Autonomy is the main source of the problem, as it results in a lack of knowledge about any particular node with respect to the information it can produce and its characteristics. Internode competition is another source of the problem, as it results in potentially inconsistent behavior of the nodes at different times. In this paper, inspired by e-commerce technology, we recognize queries (and query answers) as commodities and model query optimization as a trading negotiation process. Query parts (and their answers) are traded between nodes until deals are struck with some nodes for all of them. We identify the key parameters of this framework and suggest several potential alternatives for each one. Finally, we conclude with some experiments that demonstrate the scalability and performance characteristics of our approach compared to those of traditional query optimization.
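
    A single round of such a trade might look like the sketch below. It is only an illustration of treating query parts as commodities: the buyer collects asking prices from autonomous seller nodes and strikes a deal with the cheapest seller for each part. The node names, part names, and prices are hypothetical, and the negotiation framework described in the abstract is iterative and has more parameters than this one-shot selection.

```python
def trade_query_parts(parts, offers):
    """Pick, for each query part, the cheapest offer among the sellers.

    parts:  list of query-part identifiers the buyer needs answered
    offers: dict mapping seller name -> dict of part -> asking price
    Returns (deals, unsold), where deals maps part -> (seller, price).
    """
    deals, unsold = {}, []
    for part in parts:
        bids = [(prices[part], seller)
                for seller, prices in offers.items() if part in prices]
        if bids:
            price, seller = min(bids)
            deals[part] = (seller, price)
        else:
            unsold.append(part)  # would need another negotiation round
    return deals, unsold


offers = {
    "node_a": {"scan_orders": 5, "join_result": 9},
    "node_b": {"scan_orders": 4, "scan_customers": 6},
}
deals, unsold = trade_query_parts(
    ["scan_orders", "scan_customers", "join_result"], offers)
print(deals)
print(unsold)
```

    Here node_b wins both scans and node_a wins the join; a part that nobody offers stays in the unsold list until a later round.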

  • Parametric Query Optimization
    The VLDB Journal The International Journal on Very Large Data Bases, 1997
    Co-Authors: Yannis E Ioannidis, Raymond T. Ng, Kyuseok Shim, Timos K. Sellis
    Abstract:

    In most database systems, the values of many important run-time parameters of the system, the data, or the query are unknown at query optimization time. Parametric query optimization attempts to identify at compile time several execution plans, each one of which is optimal for a subset of all possible values of the run-time parameters. The goal is that at run time, when the actual parameter values are known, the appropriate plan should be identifiable with essentially no overhead. We present a general formulation of this problem and study it primarily for the buffer size parameter. We adopt randomized algorithms as the main approach to this style of optimization and enhance them with a sideways information passing feature that increases their effectiveness in the new task. Experimental results of these enhanced algorithms show that they optimize queries for large numbers of buffer sizes in the same time needed by their conventional versions for a single buffer size, without much sacrifice in the output quality and with essentially zero run-time overhead.
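
    The run-time half of this scheme can be sketched as a table lookup. The sketch assumes that compile-time optimization has already produced a list of plans, each optimal for a contiguous range of buffer sizes; the breakpoints and plan names below are illustrative placeholders, not results from the paper.

```python
import bisect

# Compile time: plan i is assumed optimal for buffer sizes from
# breakpoints[i] up to, but not including, breakpoints[i + 1].
breakpoints = [0, 64, 512]  # buffer pages; illustrative values
plans = ["nested_loop_join", "sort_merge_join", "hash_join"]


def pick_plan(buffer_pages):
    """Run time: look up the precomputed plan for the actual buffer size."""
    if buffer_pages < breakpoints[0]:
        raise ValueError("buffer size below the covered range")
    index = bisect.bisect_right(breakpoints, buffer_pages) - 1
    return plans[index]


print(pick_plan(10))    # nested_loop_join
print(pick_plan(64))    # sort_merge_join
print(pick_plan(4096))  # hash_join
```

    The lookup is a binary search over a handful of breakpoints, which matches the stated goal of selecting the appropriate plan at run time with essentially no overhead.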

  • A Genetic Algorithm for Database Query Optimization
    ICGA, 1991
    Co-Authors: K Bennett, Michael C Ferris, Yannis E Ioannidis
    Abstract:

    Current query optimization techniques are inadequate to support some of the emerging database applications. In this paper, we outline a database query optimization problem and describe the adaptation of a genetic algorithm to the problem. We present a method for encoding arbitrary binary trees as chromosomes and describe several crossover operators for such chromosomes. Preliminary computational comparisons with the current best-known method for query optimization indicate this to be a promising approach. In particular, the output quality and the time needed to produce such solutions are comparable to, and in general better than, those of the current method.
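
    The abstract does not spell out the tree encoding or the crossover operators, so the sketch below substitutes a simpler, well-known setup: a left-deep join order encoded as a permutation of relation names, combined with the standard order crossover. It is meant only to show what a crossover operator on a query-plan chromosome has to guarantee, namely that the child is again a valid plan. The relation names are hypothetical.

```python
import random


def order_crossover(parent_a, parent_b, rng):
    """Copy a random slice from parent_a, then fill the remaining positions
    with the missing relations in the order they appear in parent_b."""
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n + 1), 2))  # slice bounds, i < j
    child = [None] * n
    child[i:j] = parent_a[i:j]
    kept = set(parent_a[i:j])
    fill = [r for r in parent_b if r not in kept]
    for pos in range(n):
        if child[pos] is None:
            child[pos] = fill.pop(0)
    return child


rng = random.Random(0)  # fixed seed so runs are repeatable
parent_a = ["orders", "customers", "lineitem", "nation", "region"]
parent_b = ["region", "nation", "customers", "orders", "lineitem"]
child = order_crossover(parent_a, parent_b, rng)
print(child)
```

    Whatever slice is chosen, the child contains every relation exactly once, so it can be evaluated by the same cost function as its parents.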

Immanuel Trummer - One of the best experts on this subject based on the ideXlab platform.

  • Multi-Objective Parametric Query Optimization
    Communications of The ACM, 2017
    Co-Authors: Immanuel Trummer, Christoph Koch
    Abstract:

    We propose a generalization of the classical database query optimization problem: multi-objective parametric query (MPQ) optimization. MPQ compares alternative processing plans according to multiple execution cost metrics. It also models missing pieces of information on which plan costs depend as parameters. Both features are crucial to model query processing on modern data processing platforms. MPQ generalizes previously proposed query optimization variants, such as multi-objective query optimization, parametric query optimization, and traditional query optimization. We show, however, that the MPQ problem has different properties than prior variants and that solving it requires novel methods. We present an algorithm that solves the MPQ problem and finds, for a given query, the set of all relevant query plans. This set contains all plans that realize optimal execution cost tradeoffs for any combination of parameter values. Our algorithm is based on dynamic programming and recursively constructs relevant query plans by combining relevant plans for query parts. We assume that all plan execution cost functions are piecewise-linear in the parameters. We use linear programming to compare alternative plans and to identify plans that are not relevant. We present a complexity analysis of our algorithm and experimentally evaluate its performance.

  • Multi-Objective Parametric Query Optimization
    Very Large Data Bases, 2014
    Co-Authors: Immanuel Trummer, Christoph Koch
    Abstract:

    Classical query optimization compares query plans according to one cost metric and associates each plan with a constant cost value. In this paper, we introduce the Multi-Objective Parametric Query Optimization (MPQ) problem, where query plans are compared according to multiple cost metrics and the cost of a given plan according to a given metric is modeled as a function that depends on multiple parameters. The cost metrics may, for instance, include execution time or monetary fees; a parameter may represent the selectivity of a query predicate that is unspecified at optimization time. MPQ generalizes parametric query optimization (which allows multiple parameters but only one cost metric) and multi-objective query optimization (which allows multiple cost metrics but no parameters). We formally analyze the novel MPQ problem and show why existing algorithms are inapplicable. We present a generic algorithm for MPQ and a specialized version for MPQ with piecewise-linear plan cost functions. We prove that both algorithms find all relevant query plans and experimentally evaluate the performance of our second algorithm in a cloud computing scenario.

Suk I. Yoo - One of the best experts on this subject based on the ideXlab platform.

  • DOOD - Semantic Query Optimization for Object Queries
    Deductive and Object-Oriented Databases, 1995
    Co-Authors: Young-whun Lee, Suk I. Yoo
    Abstract:

    Semantic query optimization is an approach to query optimization that utilizes semantic knowledge to reformulate a query into one that generates the same set of answers in a more efficient way. Semantic query optimization becomes more important in OODB systems, where object queries are complex due to the presence of object-oriented concepts such as subclassing relationships, referential relationships, and object identifiers. In this paper, we investigate the representation of semantic constraints and present a semantic query optimization technique for object queries. We describe useful semantic transformation heuristics in an object-oriented context. Using an object query optimizer prototype, MagicVol, we conduct preliminary optimization experiments to verify that semantic query optimization for object queries is reasonable and useful.
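
    One of the simplest semantic transformations can be sketched as follows. The sketch assumes a hypothetical integrity constraint that bounds an attribute for every object of a class (for example, every Manager has a salary of at least 50000) and uses it either to tighten a range predicate or to detect that the query cannot return anything. It is a generic illustration, not one of the heuristics implemented in MagicVol.

```python
INF = float("inf")


def rewrite_with_constraint(query_range, constraint_range):
    """Intersect the query's range predicate on an attribute with a range
    that an integrity constraint guarantees for every object of the class.

    Ranges are (low, high) tuples with inclusive bounds.
    Returns None if no object can qualify, else the tightened range."""
    low = max(query_range[0], constraint_range[0])
    high = min(query_range[1], constraint_range[1])
    if low > high:
        return None  # contradiction: the answer is empty, skip the scan
    return (low, high)


# Hypothetical constraint: Manager.salary >= 50000.
manager_salary = (50000, INF)

# "Managers with salary <= 30000" contradicts the constraint.
print(rewrite_with_constraint((-INF, 30000), manager_salary))  # None

# "Managers with salary between 40000 and 80000" is tightened.
print(rewrite_with_constraint((40000, 80000), manager_salary))  # (50000, 80000)
```

    In the first case the reformulated query returns the same (empty) answer without touching any objects; in the second, the tightened lower bound lets the system check a narrower range.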

  • DEXA - Semantic Query Optimization in OODB Systems
    Lecture Notes in Computer Science, 1994
    Co-Authors: Young-whun Lee, Suk I. Yoo
    Abstract:

    Semantic query optimization becomes more important in OODB systems, where object queries are complex due to the presence of object-oriented concepts such as subclassing relationships, is-part-of relationships, and object identifiers. However, there has been little research on the topic. In this paper, we investigate the representation and maintenance of semantic constraints and present a semantic query optimization technique for OODB systems. Our semantic query optimizer deals with semantic transformations of the object query graph. We also develop new transformation heuristics that guide useful semantic transformations.