Query Optimizer

The Experts below are selected from a list of 360 Experts worldwide ranked by ideXlab platform

Jayant R Haritsa - One of the best experts on this subject based on the ideXlab platform.

  • Query Optimizer plan diagrams: production, reduction and applications
    International Conference on Data Engineering, 2011
    Co-Authors: Jayant R Haritsa
    Abstract:

    The automated optimization of declarative SQL queries is a classical problem that has been diligently addressed by the database community over several decades. However, due to its inherent complexities and challenges, the topic has largely remained a “black art”, and the quality of the Query Optimizer continues to be a key differentiator between competing database products, with large technical teams involved in their design and implementation.

  • The Picasso database Query Optimizer visualizer
    Very Large Data Bases, 2010
    Co-Authors: Jayant R Haritsa
    Abstract:

    Modern database systems employ a Query Optimizer module to automatically identify the most efficient strategies for executing the declarative SQL queries submitted by users. The efficiency of these strategies, called "plans", is measured in terms of "costs" that are indicative of Query response times. Optimization is a mandatory exercise since the difference between the cost of the best execution plan and that of a random choice could be orders of magnitude. The role of Query Optimizers has become especially critical during this decade due to the high degree of processing complexity characterizing current data warehousing and mining applications, as exemplified by the TPC-H and TPC-DS decision support benchmarks [20, 21].

  • Efficiently approximating Query Optimizer plan diagrams
    Very Large Data Bases, 2008
    Co-Authors: Atreyee Dey, Sourjya Bhaumik, D Harish, Jayant R Haritsa
    Abstract:

    Given a parametrized n-dimensional SQL Query template and a choice of Query Optimizer, a plan diagram is a color-coded pictorial enumeration of the execution plan choices of the Optimizer over the Query parameter space. These diagrams have proved to be a powerful metaphor for the analysis and redesign of modern Optimizers, and are gaining currency in diverse industrial and academic institutions. However, their utility is adversely impacted by the impractically large computational overheads incurred when standard brute-force exhaustive approaches are used for producing fine-grained diagrams on high-dimensional Query templates. In this paper, we investigate strategies for efficiently producing close approximations to complex plan diagrams. Our techniques are customized to the features available in the Optimizer's API, ranging from the generic Optimizers that provide only the optimal plan for a Query, to those that also support costing of sub-optimal plans and enumerating rank-ordered lists of plans. The techniques collectively feature both random and grid sampling, as well as inference techniques based on nearest-neighbor classifiers, parametric Query optimization and plan cost monotonicity. Extensive experimentation with a representative set of TPC-H and TPC-DS-based Query templates on industrial-strength Optimizers indicates that our techniques are capable of delivering 90% accurate diagrams while incurring less than 15% of the computational overheads of the exhaustive approach. In fact, for full-featured Optimizers, we can guarantee zero error with less than 10% overheads. These approximation techniques have been implemented in the publicly available Picasso Optimizer visualization tool.
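
    The combination of grid sampling and nearest-neighbor inference mentioned in the abstract can be illustrated with a small sketch. This is not the Picasso implementation: `toy_optimizer` is a hypothetical stand-in for a real Optimizer call, the 2-D integer grid stands in for a selectivity space, and the function names are invented for illustration.

```python
from typing import Callable, Dict, Tuple

Point = Tuple[int, int]
Optimize = Callable[[int, int], str]


def toy_optimizer(x: int, y: int) -> str:
    """Hypothetical stand-in for an optimizer call: returns the label of
    the plan chosen at grid point (x, y). A real call is far costlier."""
    if x + y < 10:
        return "P1"
    if x > y:
        return "P2"
    return "P3"


def exact_plan_diagram(res: int, optimize: Optimize) -> Dict[Point, str]:
    """Brute force: invoke the optimizer at every point of a res x res grid."""
    return {(x, y): optimize(x, y) for x in range(res) for y in range(res)}


def approximate_plan_diagram(res: int, step: int, optimize: Optimize):
    """Grid sampling plus nearest-neighbor inference.

    The optimizer is invoked only at every step-th point in each dimension;
    every other point inherits the plan of its nearest sampled point.
    Returns the approximate diagram and the number of optimizer calls.
    """
    samples = {
        (x, y): optimize(x, y)
        for x in range(0, res, step)
        for y in range(0, res, step)
    }
    max_sample = ((res - 1) // step) * step  # largest sampled coordinate

    def nearest(v: int) -> int:
        # Round to the closest multiple of step, clamped to the sampled range.
        return min(((v + step // 2) // step) * step, max_sample)

    diagram = {
        (x, y): samples[(nearest(x), nearest(y))]
        for x in range(res)
        for y in range(res)
    }
    return diagram, len(samples)


def accuracy(approx: Dict[Point, str], exact: Dict[Point, str]) -> float:
    """Fraction of grid points whose inferred plan matches the exact one."""
    return sum(approx[p] == exact[p] for p in exact) / len(exact)


if __name__ == "__main__":
    approx, calls = approximate_plan_diagram(20, 4, toy_optimizer)
    exact = exact_plan_diagram(20, toy_optimizer)
    print(calls, "optimizer calls instead of", len(exact))
    print("accuracy:", accuracy(approx, exact))
```

    On this toy grid the approximation needs 25 optimizer calls instead of 400. Errors are confined to points near plan boundaries, which is where a finer sampling step or the other inference techniques listed in the abstract would be applied.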

Cesar A Galindolegaria - One of the best experts on this subject based on the ideXlab platform.

  • Query optimization in Microsoft SQL Server PDW
    International Conference on Management of Data, 2012
    Co-Authors: Srinath Shankar, Rimma V Nehme, Josep Aguilarsaborit, Andrew Chung, Mostafa Elhemali, Alan Halverson, Eric R Robinson, Mahadevan Sankara Subramanian, David J Dewitt, Cesar A Galindolegaria
    Abstract:

    In recent years, Massively Parallel Processors have increasingly been used to manage and Query vast amounts of data. Dramatic performance improvements are achieved through distributed execution of queries across many nodes. Query optimization for such systems is a challenging and important problem. In this paper we describe the Query Optimizer inside the SQL Server Parallel Data Warehouse product (PDW QO). We leverage existing QO technology in Microsoft SQL Server to implement a cost-based Optimizer for distributed Query execution. By properly abstracting metadata we can readily reuse existing logic for Query simplification, space exploration and cardinality estimation. Unlike earlier approaches that simply parallelize the best serial plan, our Optimizer considers a rich space of execution alternatives, and picks one based on a cost-model for the distributed execution environment. The result is a high-quality, effective Query Optimizer for distributed Query processing in an MPP.

  • Counting, enumerating, and sampling of execution plans in a cost-based Query Optimizer
    Report - Information systems, 1999
    Co-Authors: Florian Waas, Cesar A Galindolegaria
    Abstract:

    Testing an SQL database system by running large sets of deterministic or stochastic SQL statements is common practice in commercial database development. However, code defects often remain undetected because the Query Optimizer's choice of an execution plan depends not only on the Query but also, and strongly, on a large number of parameters describing the database and the hardware environment. Modifying these parameters in order to steer the Optimizer to select other plans is difficult since this means anticipating often complex search strategies implemented in the Optimizer. In this paper we devise algorithms for counting, exhaustive generation, and uniform sampling of plans from the complete search space. Our techniques allow extensive validation of both generation of alternatives, and execution algorithms with plans other than the optimized one: if two candidate plans fail to produce the same results, then either the Optimizer considered an invalid plan, or the execution code is faulty. When the space of alternatives becomes too large for exhaustive testing, which can occur even with a handful of joins, uniform random sampling provides a mechanism for unbiased testing. The technique is implemented in Microsoft's SQL Server, where it is an integral part of the validation and testing process.
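
    The link between counting and uniform sampling can be shown on a toy search space. The sketch below is not the algorithm from the paper: it assumes a simplified space of ordered binary join trees over a set of relations, with cross products allowed, and all function names are hypothetical. Once the number of plans under each subproblem is known, every plan can be assigned a unique rank, and drawing a rank uniformly at random yields a uniformly random plan.

```python
import random
from functools import lru_cache
from itertools import combinations


def splits(rels: frozenset):
    """Yield every ordered split of rels into two non-empty parts."""
    items = sorted(rels)
    for k in range(1, len(items)):
        for left in combinations(items, k):
            left_set = frozenset(left)
            yield left_set, rels - left_set


@lru_cache(maxsize=None)
def count_plans(rels: frozenset) -> int:
    """Count the ordered binary join trees over the relations in rels."""
    if len(rels) == 1:
        return 1
    return sum(count_plans(l) * count_plans(r) for l, r in splits(rels))


def unrank(rels: frozenset, rank: int):
    """Return the plan with the given rank, 0 <= rank < count_plans(rels).

    A plan is a relation name (leaf) or a (left, right) tuple (join).
    """
    if len(rels) == 1:
        return next(iter(rels))
    for l, r in splits(rels):
        n_right = count_plans(r)
        block = count_plans(l) * n_right  # plans rooted at this split
        if rank < block:
            # Decompose the local rank into independent sub-ranks.
            return (unrank(l, rank // n_right), unrank(r, rank % n_right))
        rank -= block
    raise ValueError("rank out of range")


def sample_plan(rels: frozenset, rng: random.Random):
    """Draw one plan uniformly at random from the toy search space."""
    return unrank(rels, rng.randrange(count_plans(rels)))


if __name__ == "__main__":
    relations = frozenset({"A", "B", "C", "D"})
    print(count_plans(relations), "plans in the toy space")
    print(sample_plan(relations, random.Random(0)))
```

    Iterating `unrank` over every rank gives exhaustive generation, while `sample_plan` covers the case where the space is too large to enumerate. A real Optimizer would count over its own search space, which this sketch does not model.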

Sharad Mehrotra - One of the best experts on this subject based on the ideXlab platform.

  • Query optimization for massively parallel data processing
    Symposium on Cloud Computing, 2011
    Co-Authors: Sharad Mehrotra
    Abstract:

    MapReduce has been widely recognized as an efficient tool for large-scale data analysis. It achieves high performance by exploiting parallelism among processing nodes while providing a simple interface for upper-layer applications. Some vendors have enhanced their data warehouse systems by integrating MapReduce into the systems. However, existing MapReduce-based Query processing systems, such as Hive, fall short of the Query optimization and competency of conventional database systems. Given an SQL Query, Hive translates the Query into a set of MapReduce jobs sentence by sentence. This design assumes that the user can optimize his Query before submitting it to the system. Unfortunately, manual Query optimization is time consuming and difficult, even for an experienced database user or administrator. In this paper, we propose a Query optimization scheme for MapReduce-based processing systems. Specifically, we embed into Hive a Query Optimizer which is designed to generate an efficient Query plan based on our proposed cost model. Experiments carried out on our in-house cluster confirm the effectiveness of our Query Optimizer.

  • Query optimization in encrypted database systems
    Database Systems for Advanced Applications, 2005
    Co-Authors: Hakan Hacigumus, Bala Iyer, Sharad Mehrotra
    Abstract:

    To ensure the privacy of data in the relational databases, prior work has given techniques to support data encryption and execute SQL queries over the encrypted data. However, the problem of how to put these techniques together in an optimum manner was not addressed, which is equivalent to having an RDBMS without a Query Optimizer. This paper models and solves that optimization problem.

C Egyhazy - One of the best experts on this subject based on the ideXlab platform.

  • Developing a Query Optimizer for a federated database system
    Intelligent Information Systems, 1997
    Co-Authors: C Egyhazy
    Abstract:

    This paper concerns our experiences in developing a Query Optimizer for the Cyrano prototype federated database system developed at Virginia Tech. We used a bottom-up evaluation method commonly seen in deductive systems. In Cyrano, queries and stored information are represented in classes, as in object-oriented systems. Consequently, the Optimizer evaluates a Query by repeatedly cycling through all base classes, and the base classes of the base classes. Experiments showed that in the processing of recursive queries, the implementation of a semi-naive algorithm produced considerable improvements. We concluded that the performance of the Optimizer, based on the implementation of the semi-naive algorithm, is adequate for most typical queries, but that it could be improved further by rewriting classes using a magic-sets rewrite algorithm.
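
    For readers unfamiliar with semi-naive evaluation, the following minimal sketch applies it to the textbook recursive query, transitive closure over an edge relation. It is not Cyrano's implementation, and the function name is hypothetical. The key idea is that each round joins only the facts derived in the previous round (the delta) against the base relation, rather than re-deriving everything from the full result.

```python
from typing import Set, Tuple

Edge = Tuple[int, int]


def semi_naive_closure(edges: Set[Edge]) -> Set[Edge]:
    """Bottom-up transitive closure using semi-naive evaluation.

    Rule being evaluated:
        path(a, d) :- path(a, b), edge(b, d).
    """
    total = set(edges)  # all path facts derived so far
    delta = set(edges)  # facts that were new in the previous round
    while delta:
        # Join only the new facts with the base relation.
        derived = {(a, d) for (a, b) in delta for (c, d) in edges if b == c}
        delta = derived - total  # keep only facts not seen before
        total |= delta
    return total


if __name__ == "__main__":
    print(sorted(semi_naive_closure({(1, 2), (2, 3), (3, 4)})))
```

    A naive evaluator would instead join all of `total` with `edges` in every round, repeating work that earlier rounds already did. The loop terminates on cyclic inputs as well, because a round that derives nothing new leaves `delta` empty.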

Sharma Chakravarthy - One of the best experts on this subject based on the ideXlab platform.

  • Plan before you execute: a cost-based Query Optimizer for attributed graph databases
    International Conference on Big Data, 2016
    Co-Authors: Soumyava Das, Ankur Goyal, Sharma Chakravarthy
    Abstract:

    Proliferation of NoSQL and graph databases indicates a move towards alternate forms of data representation beyond the traditional relational data model. This raises the question of processing queries efficiently over these representations. Graphs have become one of the preferred ways to represent and store data related to social networks and other domains where relationships and their labels need to be captured explicitly. Currently, for querying graph databases, users have to either learn a new graph Query language (e.g. Metaweb Query language or MQL [6]) for posing their queries or use customized searches of specific substructures [14]. Hence, there is a clear need for posing queries using the same representation as that of a graph database, generating and evaluating alternate plans, developing cost metrics for evaluating plans, and pruning the search space to converge on a good plan that can be evaluated directly over the graph database.

  • Divide and conquer: a basis for augmenting a conventional Query Optimizer with multiple Query processing capabilities
    International Conference on Data Engineering, 1991
    Co-Authors: Sharma Chakravarthy
    Abstract:

    An approach for adding a new component without radically changing an existing single-Query Optimizer (SQO) is proposed. A new way of organizing the strategy space of a set of queries being optimized is proposed for developing a multiple-Query Optimizer (MQO) architecture. The architecture relies on the generation of two strategy spaces using subsumption and equivalence of subexpressions at the logical level. Heuristics for pruning the space of multistrategies are also presented. It is shown that the partitioned organisation of the strategy space not only reduces the size of the strategy space but also lends itself to division of labor, thereby leading to a simpler MQO design. Clear separation of the module specific to multistrategy generation provides an easy migration path from SQOs to MQOs. In the decomposition algorithm, selections are propagated down the operator tree (counterintuitively) enabling the detection and creation of larger common subexpressions.