Query Execution

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 20,946 Experts worldwide ranked by the ideXlab platform

Jeffrey F Naughton - One of the best experts on this subject based on the ideXlab platform.

  • Uncertainty Aware Query Execution Time Prediction
    arXiv: Databases, 2014
    Co-Authors: Hakan Hacigümuş, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is a fundamental issue underlying many database management tasks. Existing predictors rely on information such as cardinality estimates and system performance constants that are difficult to know exactly; as a result, accurate prediction remains elusive for many queries. Moreover, existing predictors provide a single point estimate of the true execution time but fail to characterize the uncertainty in the prediction. In this paper, we take a first step towards providing uncertainty information along with query execution time predictions. We use the query optimizer's cost model to represent the query execution time as a function of the selectivities of operators in the query plan as well as the constants that describe the cost of CPU and I/O operations in the system. By treating these quantities as random variables rather than constants, we show that with low overhead we can infer the distribution of likely prediction errors. We further show that the prediction errors estimated by our proposed techniques are strongly correlated with the actual prediction errors.
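
The approach described above can be sketched as a small Monte Carlo propagation: treat operator selectivities and the CPU/I/O cost constants as random variables, push each draw through an optimizer-style cost formula, and read the prediction uncertainty off the resulting distribution. The cost function, the chosen distributions, and all numbers below are illustrative assumptions, not the paper's actual model.

```python
import random

def predict_with_uncertainty(plan_cost_fn, selectivity_dists, cost_unit_dists,
                             n_samples=1000):
    """Sample selectivities and cost constants, propagate each draw through
    the optimizer-style cost function, and summarize the distribution of
    predicted execution times."""
    samples = []
    for _ in range(n_samples):
        sels = {op: dist() for op, dist in selectivity_dists.items()}
        units = {k: dist() for k, dist in cost_unit_dists.items()}
        samples.append(plan_cost_fn(sels, units))
    samples.sort()
    return {
        "median": samples[n_samples // 2],
        "p05": samples[int(0.05 * n_samples)],
        "p95": samples[int(0.95 * n_samples)],
    }

# Toy cost function: a sequential scan plus a filter, with invented constants.
def toy_cost(sels, units):
    rows = 1_000_000
    io_cost = rows / 100 * units["io_per_page"]          # 100 rows per page
    cpu_cost = rows * sels["filter"] * units["cpu_per_tuple"]
    return io_cost + cpu_cost

random.seed(0)
est = predict_with_uncertainty(
    toy_cost,
    {"filter": lambda: random.betavariate(2, 8)},        # uncertain selectivity
    {"io_per_page": lambda: random.gauss(1.0, 0.1),      # uncertain cost units
     "cpu_per_tuple": lambda: random.gauss(0.001, 0.0001)},
)
```

The 5th/95th percentile band is the uncertainty information a point-estimate predictor cannot provide.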

  • Towards predicting Query Execution time for concurrent and dynamic database workloads
    Proceedings of the VLDB Endowment, 2013
    Co-Authors: Yun Chi, Hakan Hacigümuş, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is crucial for many database management tasks, including admission control, query scheduling, and progress monitoring. While a number of recent papers have explored this problem, the bulk of the existing work either considers prediction for a single query or prediction for a static workload of concurrent queries, where by "static" we mean that the queries to be run are fixed and known. In this paper, we consider the more general problem of dynamic concurrent workloads. Unlike most previous work on query execution time prediction, our proposed framework is based on analytic modeling rather than machine learning. We first use the optimizer's cost model to estimate the I/O and CPU requirements for each pipeline of each query in isolation, and then use a combined queueing model and buffer pool model that merges the I/O and CPU requests from concurrent queries to predict running times. We compare the proposed approach with a machine-learning-based approach that is a variant of previous work. Our experiments show that our analytic-model-based approach achieves competitive and often better prediction accuracy than its machine-learning-based counterpart.
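
A drastically simplified sketch of the analytic idea: start from each query's isolated CPU and I/O demands (as the optimizer's cost model would supply them) and stretch them under contention. This assumes a crude fair processor-sharing model rather than the paper's combined queueing and buffer pool models, and the demand figures are invented.

```python
def predict_concurrent_runtimes(queries):
    """queries maps a query name to its isolated resource demands (seconds
    of CPU and of I/O service time). Under a fair processor-sharing
    assumption, n concurrent queries each receive 1/n of the CPU and of
    the disk, so every phase stretches by a factor of n; CPU and I/O
    phases are assumed not to overlap."""
    n = len(queries)
    return {name: n * (d["cpu"] + d["io"]) for name, d in queries.items()}

# Two queries sharing the machine: each runs twice as slowly as when alone.
preds = predict_concurrent_runtimes({
    "q1": {"cpu": 1.0, "io": 2.0},   # 3 s alone -> 6 s predicted
    "q2": {"cpu": 0.5, "io": 0.5},   # 1 s alone -> 2 s predicted
})
```

A real model must also capture buffer pool hits (which remove I/O requests entirely) and queueing delay at the disk, which is where the paper's machinery comes in.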

  • Predicting Query Execution time: Are optimizer cost models really unusable?
    Proceedings - International Conference on Data Engineering, 2013
    Co-Authors: Wentao Wu, Junichi Tatemura, Hakan Hacigümuş, Shenghuo Zhu, Yun Chi, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is useful for many database management tasks, including admission control, query scheduling, progress monitoring, and system sizing. Recently, the research community has been exploring the use of statistical machine learning approaches to build predictive models for this task. An implicit assumption behind this work is that the cost models used by query optimizers are insufficient for query execution time prediction. In this paper, we challenge this assumption and show that while the simple approach of scaling the optimizer's estimated cost indeed fails, a properly calibrated optimizer cost model is surprisingly effective. However, even a well-tuned optimizer cost model will fail in the presence of errors in cardinality estimates. Accordingly, we investigate the novel idea of spending extra resources to refine estimates for the query plan after it has been chosen by the optimizer but before execution. In our experiments, we find that a well-calibrated query optimizer model along with cardinality estimation refinement provides a low-overhead way to produce estimates that are always competitive with, and often much better than, the best reported numbers from the machine learning approaches.
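
The calibration step can be illustrated as a least-squares fit: profile a few queries, count how many of each optimizer cost unit (e.g. pages read, tuples processed) each query consumes, and solve for the per-unit time constants. The two-unit model and the synthetic profiling numbers below are assumptions for illustration, not the paper's experimental setup.

```python
def calibrate_cost_units(feature_rows, observed_times):
    """feature_rows[i] = (n_pages_i, n_tuples_i): counts of each cost unit
    the optimizer assigns to profiled query i; observed_times[i] is its
    measured runtime in seconds. Solves the 2x2 normal equations of
    ordinary least squares to recover per-page and per-tuple constants."""
    s_pp = s_pt = s_tt = s_py = s_ty = 0.0
    for (p, t), y in zip(feature_rows, observed_times):
        s_pp += p * p; s_pt += p * t; s_tt += t * t
        s_py += p * y; s_ty += t * y
    det = s_pp * s_tt - s_pt * s_pt          # assumes non-degenerate profiling data
    c_page = (s_py * s_tt - s_ty * s_pt) / det
    c_tuple = (s_ty * s_pp - s_py * s_pt) / det
    return c_page, c_tuple

# Synthetic profiling data generated with c_page=0.01, c_tuple=0.0005.
rows = [(100, 1000), (500, 200), (50, 5000)]
times = [0.01 * p + 0.0005 * t for p, t in rows]
c_page, c_tuple = calibrate_cost_units(rows, times)
```

With calibrated constants in hand, the optimizer's abstract cost becomes a wall-clock prediction; the remaining error source is then the cardinality estimates themselves, which the paper attacks with pre-execution refinement.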

Hakan Hacigümuş - One of the best experts on this subject based on the ideXlab platform.

  • Uncertainty Aware Query Execution Time Prediction
    arXiv: Databases, 2014
    Co-Authors: Hakan Hacigümuş, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is a fundamental issue underlying many database management tasks. Existing predictors rely on information such as cardinality estimates and system performance constants that are difficult to know exactly; as a result, accurate prediction remains elusive for many queries. Moreover, existing predictors provide a single point estimate of the true execution time but fail to characterize the uncertainty in the prediction. In this paper, we take a first step towards providing uncertainty information along with query execution time predictions. We use the query optimizer's cost model to represent the query execution time as a function of the selectivities of operators in the query plan as well as the constants that describe the cost of CPU and I/O operations in the system. By treating these quantities as random variables rather than constants, we show that with low overhead we can infer the distribution of likely prediction errors. We further show that the prediction errors estimated by our proposed techniques are strongly correlated with the actual prediction errors.

  • Towards predicting Query Execution time for concurrent and dynamic database workloads
    Proceedings of the VLDB Endowment, 2013
    Co-Authors: Yun Chi, Hakan Hacigümuş, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is crucial for many database management tasks, including admission control, query scheduling, and progress monitoring. While a number of recent papers have explored this problem, the bulk of the existing work either considers prediction for a single query or prediction for a static workload of concurrent queries, where by "static" we mean that the queries to be run are fixed and known. In this paper, we consider the more general problem of dynamic concurrent workloads. Unlike most previous work on query execution time prediction, our proposed framework is based on analytic modeling rather than machine learning. We first use the optimizer's cost model to estimate the I/O and CPU requirements for each pipeline of each query in isolation, and then use a combined queueing model and buffer pool model that merges the I/O and CPU requests from concurrent queries to predict running times. We compare the proposed approach with a machine-learning-based approach that is a variant of previous work. Our experiments show that our analytic-model-based approach achieves competitive and often better prediction accuracy than its machine-learning-based counterpart.

  • Predicting Query Execution time: Are optimizer cost models really unusable?
    Proceedings - International Conference on Data Engineering, 2013
    Co-Authors: Wentao Wu, Junichi Tatemura, Hakan Hacigümuş, Shenghuo Zhu, Yun Chi, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is useful for many database management tasks, including admission control, query scheduling, progress monitoring, and system sizing. Recently, the research community has been exploring the use of statistical machine learning approaches to build predictive models for this task. An implicit assumption behind this work is that the cost models used by query optimizers are insufficient for query execution time prediction. In this paper, we challenge this assumption and show that while the simple approach of scaling the optimizer's estimated cost indeed fails, a properly calibrated optimizer cost model is surprisingly effective. However, even a well-tuned optimizer cost model will fail in the presence of errors in cardinality estimates. Accordingly, we investigate the novel idea of spending extra resources to refine estimates for the query plan after it has been chosen by the optimizer but before execution. In our experiments, we find that a well-calibrated query optimizer model along with cardinality estimation refinement provides a low-overhead way to produce estimates that are always competitive with, and often much better than, the best reported numbers from the machine learning approaches.

Yun Chi - One of the best experts on this subject based on the ideXlab platform.

  • Towards predicting Query Execution time for concurrent and dynamic database workloads
    Proceedings of the VLDB Endowment, 2013
    Co-Authors: Yun Chi, Hakan Hacigümuş, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is crucial for many database management tasks, including admission control, query scheduling, and progress monitoring. While a number of recent papers have explored this problem, the bulk of the existing work either considers prediction for a single query or prediction for a static workload of concurrent queries, where by "static" we mean that the queries to be run are fixed and known. In this paper, we consider the more general problem of dynamic concurrent workloads. Unlike most previous work on query execution time prediction, our proposed framework is based on analytic modeling rather than machine learning. We first use the optimizer's cost model to estimate the I/O and CPU requirements for each pipeline of each query in isolation, and then use a combined queueing model and buffer pool model that merges the I/O and CPU requests from concurrent queries to predict running times. We compare the proposed approach with a machine-learning-based approach that is a variant of previous work. Our experiments show that our analytic-model-based approach achieves competitive and often better prediction accuracy than its machine-learning-based counterpart.

  • Predicting Query Execution time: Are optimizer cost models really unusable?
    Proceedings - International Conference on Data Engineering, 2013
    Co-Authors: Wentao Wu, Junichi Tatemura, Hakan Hacigümuş, Shenghuo Zhu, Yun Chi, Jeffrey F Naughton
    Abstract:

    Predicting query execution time is useful for many database management tasks, including admission control, query scheduling, progress monitoring, and system sizing. Recently, the research community has been exploring the use of statistical machine learning approaches to build predictive models for this task. An implicit assumption behind this work is that the cost models used by query optimizers are insufficient for query execution time prediction. In this paper, we challenge this assumption and show that while the simple approach of scaling the optimizer's estimated cost indeed fails, a properly calibrated optimizer cost model is surprisingly effective. However, even a well-tuned optimizer cost model will fail in the presence of errors in cardinality estimates. Accordingly, we investigate the novel idea of spending extra resources to refine estimates for the query plan after it has been chosen by the optimizer but before execution. In our experiments, we find that a well-calibrated query optimizer model along with cardinality estimation refinement provides a low-overhead way to produce estimates that are always competitive with, and often much better than, the best reported numbers from the machine learning approaches.

Angela Demke Brown - One of the best experts on this subject based on the ideXlab platform.

  • Speeding up spatial database query execution using GPUs
    Procedia Computer Science (ICCS), 2012
    Co-Authors: Bogdan Simion, Suprio Ray, Angela Demke Brown
    Abstract:

    Spatial databases are used in a wide variety of real-world applications, such as land surveying, urban planning, and environmental assessments, as well as geospatial Web services. As uses of spatial databases become more widespread, there is a growing need for good performance of spatial applications. In spatial workloads, queries tend to be computationally intensive due to the complex processing of geometric relationships. Furthermore, a significant fraction of spatial query execution time is spent on CPU stalls due to memory accesses, caused by the ever-increasing processor-memory speed gap. With the advent of massively parallel graphics-processing hardware (GPUs) and frameworks like CUDA, opportunities for speeding up spatial processing have emerged. In addition to massive parallelism, GPUs can also better hide memory latency. We aim to speed up spatial query execution using CUDA and recent GPU cards. One of the main challenges in using GPUs is the transfer time from main memory to GPU memory. We implement a set of six typical spatial queries and achieve a baseline speedup (excluding the transfer cost) of 62-318x over the CPU counterparts. We show that the transfer cost can be amortized over the execution of each individual query. For simpler spatial queries, the transfer time is a significant fraction of the query execution time, but we still achieve a 6-10x speedup. For more complex spatial queries, the transfer time becomes negligible compared to the processing time, and we obtain a 62-240x speedup.
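
The transfer-cost trade-off reduces to simple arithmetic: end-to-end speedup divides CPU time by kernel time plus host-to-device transfer time, while the "baseline" speedup ignores the transfer. The timings below are invented for illustration, not the paper's measurements.

```python
def effective_speedup(cpu_time, gpu_kernel_time, transfer_time):
    """End-to-end GPU speedup once the host-to-device transfer is counted.
    All arguments are in seconds."""
    return cpu_time / (gpu_kernel_time + transfer_time)

# Complex query (hypothetical): transfer is negligible next to processing,
# so the end-to-end speedup stays close to the kernel-only figure.
complex_q = effective_speedup(cpu_time=120.0, gpu_kernel_time=0.5, transfer_time=0.3)

# Simple query (hypothetical): transfer dominates and the speedup shrinks,
# mirroring the 6-10x vs. 62-240x gap the abstract reports.
simple_q = effective_speedup(cpu_time=2.0, gpu_kernel_time=0.02, transfer_time=0.25)
```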

Rik Van De Walle - One of the best experts on this subject based on the ideXlab platform.

  • Query Execution optimization for clients of triple pattern fragments
    European Semantic Web Conference, 2015
    Co-Authors: Joachim Van Herwegen, Ruben Verborgh, Erik Mannens, Rik Van De Walle
    Abstract:

    In order to reduce the server-side cost of publishing queryable Linked Data, Triple Pattern Fragments (TPF) were introduced as a simple interface to RDF triples. They allow for SPARQL query execution at low server cost by partially shifting the load from servers to clients. The previously proposed query execution algorithm uses more HTTP requests than necessary and only makes partial use of the available metadata. In this paper, we propose a new query execution algorithm for a client communicating with a TPF server. In contrast to a greedy solution, we maintain an overview of the entire query to find the optimal steps for solving a given query. We show multiple cases in which our algorithm reaches solutions with far fewer HTTP requests, without significantly increasing the cost in other cases. This improves the efficiency of common SPARQL queries against TPF interfaces, augmenting their viability compared to the more powerful, but more costly, SPARQL interface.
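
One concrete piece of metadata-aware planning: each TPF response carries an estimated total triple count for its pattern, and a client can use those counts to evaluate the most selective pattern first, minimizing HTTP page requests and binding variables early. This greedy ordering is only a sketch of one common heuristic, not the paper's full algorithm; the patterns and counts below are invented.

```python
def plan_order(patterns, count_metadata):
    """Order triple patterns by the estimated count the TPF server
    reported for each, cheapest (most selective) first."""
    return sorted(patterns, key=lambda p: count_metadata[p])

# Hypothetical BGP for "films by directors born in Belgium".
patterns = [
    "?film dbo:director ?dir",
    "?film rdf:type dbo:Film",
    "?dir dbo:birthPlace dbr:Belgium",
]
counts = {
    "?film dbo:director ?dir": 90_000,
    "?film rdf:type dbo:Film": 120_000,
    "?dir dbo:birthPlace dbr:Belgium": 800,
}
ordered = plan_order(patterns, counts)
```

Starting from the 800-result pattern means at most a few result pages are fetched before the remaining patterns can be evaluated with variables already bound; the paper goes further by re-planning over the whole query rather than committing greedily.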
