Parallel Algorithms

14,000,000 Leading Edge Experts on the ideXlab platform

The experts below are selected from a list of 161,607 experts worldwide ranked by the ideXlab platform.

Francesco Scarcello - One of the best experts on this subject based on the ideXlab platform.

  • Tree projections and constraint optimization problems: Fixed-parameter tractability and parallel algorithms
    Journal of Computer and System Sciences, 2017
    Co-Authors: Georg Gottlob, Gianluigi Greco, Francesco Scarcello
    Abstract:

    The problem of computing optimal solutions to Constraint Satisfaction Problem (CSP) instances parameterized by the size of the objective function is considered, and fixed-parameter polynomial-time algorithms are proposed within the structure-based framework of tree projections. The algorithms compute the desired optimal (or best k) solutions whenever there exists a tree projection for the given instance; otherwise, the algorithms report that there is no tree projection. For the case where a tree projection is available, parallel algorithms are also proposed and analyzed. Structural decomposition methods based on acyclic, bounded treewidth, and bounded generalized hypertree-width hypergraphs, extensively considered in the CSP setting, as well as in conjunctive database query evaluation and optimization, are all covered as special cases of the tree projection framework.

  • Tree projections and constraint optimization problems: Fixed-parameter tractability and parallel algorithms
    arXiv: Artificial Intelligence, 2017
    Co-Authors: Georg Gottlob, Gianluigi Greco, Francesco Scarcello
    Abstract:

    Tree projections provide a unifying framework to deal with most structural decomposition methods of constraint satisfaction problems (CSPs). Within this framework, a CSP instance is decomposed into a number of sub-problems, called views, whose solutions are either already available or can be computed efficiently. The goal is to arrange portions of these views in a tree-like structure, called a tree projection, which determines an efficiently solvable CSP instance equivalent to the original one. Deciding whether a tree projection exists is NP-hard. Solution methods have therefore been proposed in the literature that do not require a tree projection to be given, and that either correctly decide whether the given CSP instance is satisfiable, or return that a tree projection actually does not exist. These approaches had not been generalized so far to CSP extensions for optimization problems, where the goal is to compute a solution of maximum value/minimum cost. The paper fills the gap by exhibiting a fixed-parameter polynomial-time algorithm that either disproves the existence of tree projections or computes an optimal solution, with the parameter being the size of the expression of the objective function to be optimized over all possible solutions (and not the size of the whole constraint formula, used in related works). Tractability results are also established for the problem of returning the best K solutions. Finally, parallel algorithms for such optimization problems are proposed and analyzed. Given that the classes of acyclic hypergraphs, hypergraphs of bounded treewidth, and hypergraphs of bounded generalized hypertree width are all covered as special cases of the tree projection framework, the results in this paper directly apply to these classes. These classes are extensively considered in the CSP setting, as well as in conjunctive database query evaluation and optimization.
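
To make the acyclic special case mentioned above concrete, here is a minimal sketch of cost minimization on a tree-structured binary CSP by bottom-up dynamic programming. It is not the authors' algorithm and does not compute tree projections. The function name `solve_tree_csp`, the unary cost table, and the dictionary-based input encoding are all assumptions made for illustration. The sketch is sequential; the only point it shares with the parallel setting is that sibling subtrees are independent of each other.

```python
from math import inf


def solve_tree_csp(domains, cost, children, allowed, root):
    """Minimum-cost solution of a tree-structured binary CSP (illustrative).

    domains:  dict variable -> list of candidate values
    cost:     dict (variable, value) -> number (hypothetical unary objective)
    children: dict variable -> list of child variables (the constraint tree)
    allowed:  dict (parent, child) -> set of permitted (parent_val, child_val)
    Returns (best_cost, assignment), or (inf, None) if no solution exists.
    """
    best = {}    # best[(v, a)]: cheapest cost of v's subtree when v = a
    choice = {}  # choice[(v, a, c)]: value picked for child c when v = a

    def visit(v):
        kids = children.get(v, [])
        for c in kids:  # sibling subtrees do not depend on each other
            visit(c)
        for a in domains[v]:
            total = cost[(v, a)]
            for c in kids:
                options = [(best[(c, b)], b) for b in domains[c]
                           if (a, b) in allowed[(v, c)]]
                sub, b = min(options, key=lambda t: t[0], default=(inf, None))
                choice[(v, a, c)] = b
                total += sub
            best[(v, a)] = total

    visit(root)
    top_cost, top_value = min(((best[(root, a)], a) for a in domains[root]),
                              key=lambda t: t[0], default=(inf, None))
    if top_cost == inf:
        return inf, None

    # Top-down pass: follow the recorded choices to rebuild one optimal solution.
    assignment = {root: top_value}
    stack = [root]
    while stack:
        v = stack.pop()
        for c in children.get(v, []):
            assignment[c] = choice[(v, assignment[v], c)]
            stack.append(c)
    return top_cost, assignment
```

Each variable is visited once and each constraint is scanned once per parent value, so the running time is polynomial in the size of the domains and constraints for this restricted tree-shaped input.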

Julian Shun - One of the best experts on this subject based on the ideXlab platform.

  • Fast parallel algorithms for Euclidean minimum spanning tree and hierarchical spatial clustering
    International Conference on Management of Data, 2021
    Co-Authors: Yiqiu Wang, Julian Shun
    Abstract:

    This paper presents new parallel algorithms for generating Euclidean minimum spanning trees and spatial clustering hierarchies (known as HDBSCAN*). Our approach is based on generating a well-separated pair decomposition followed by using Kruskal's minimum spanning tree algorithm and bichromatic closest pair computations. We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN*. We also give a new parallel divide-and-conquer algorithm for computing the dendrogram and reachability plots, which are used in visualizing clusters of different scale that arise for both EMST and HDBSCAN*. We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time). We implement our algorithms and propose a memory optimization that requires only a subset of well-separated pairs to be computed and materialized, leading to savings in both space (up to 10x) and time (up to 8x). Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13-55.89x, and existing parallel algorithms by at least an order of magnitude.

  • Fast parallel algorithms for Euclidean minimum spanning tree and hierarchical spatial clustering
    arXiv: Data Structures and Algorithms, 2021
    Co-Authors: Yiqiu Wang, Julian Shun
    Abstract:

    This paper presents new parallel algorithms for generating Euclidean minimum spanning trees and spatial clustering hierarchies (known as HDBSCAN*). Our approach is based on generating a well-separated pair decomposition followed by using Kruskal's minimum spanning tree algorithm and bichromatic closest pair computations. We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN*. We also present a parallel approximate algorithm for OPTICS based on a recent sequential algorithm by Gan and Tao. Finally, we give a new parallel divide-and-conquer algorithm for computing the dendrogram and reachability plots, which are used in visualizing clusters of different scale that arise for both EMST and HDBSCAN*. We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time). We implement our algorithms and propose a memory optimization that requires only a subset of well-separated pairs to be computed and materialized, leading to savings in both space (up to 10x) and time (up to 8x). Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13-55.89x, and existing parallel algorithms by at least an order of magnitude.
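
As a point of reference for the Kruskal step named in the abstract, the following is a minimal sequential baseline that computes a Euclidean minimum spanning tree by sorting all point pairs and merging components with union-find. It deliberately omits the well-separated pair decomposition and the bichromatic closest pair computations, so it examines all O(n^2) pairs and is only suitable for small inputs. The function name `euclidean_mst` is an assumption for illustration.

```python
from itertools import combinations
from math import dist


def euclidean_mst(points):
    """Return MST edges (i, j, length) over `points` using Kruskal's algorithm.

    Naive baseline: considers every pair of points as a candidate edge.
    """
    n = len(points)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All candidate edges, shortest first.
    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))

    tree = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # edge joins two different components
            parent[ri] = rj
            tree.append((i, j, length))
            if len(tree) == n - 1:
                break
    return tree
```

The decomposition described in the abstract exists precisely to avoid materializing this quadratic edge list; the sketch only shows what the final Kruskal pass computes.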

  • Parallel wavelet tree construction
    Data Compression Conference, 2015
    Co-Authors: Julian Shun
    Abstract:

    We present parallel algorithms for wavelet tree construction with polylogarithmic depth, improving upon the linear depth of the recent parallel algorithms by Fuentes-Sepulveda et al. We experimentally show that on a 40-core machine with two-way hyper-threading, we outperform the existing parallel algorithms by 1.3-5.6x and achieve up to 27x speedup over the sequential algorithm on a variety of real-world and artificial inputs. Our algorithms show good scalability with increasing thread count, input size and alphabet size. We also discuss extensions to variants of the standard wavelet tree.
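
For readers unfamiliar with the data structure itself, here is a minimal sequential sketch of a pointer-based wavelet tree over an integer alphabet, supporting `access` and `rank`. It does not reproduce the paper's parallel construction. The class name `WaveletTree` and the choice to store a prefix-count array instead of a compressed bitvector are assumptions made to keep the example short. The two recursive constructor calls work on disjoint subsequences, so they are independent of each other.

```python
class WaveletTree:
    """Pointer-based wavelet tree over integers (illustrative, sequential)."""

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:                    # top-level call: seq must be non-empty
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        self.prefix = None
        if lo == hi or not seq:
            return                        # leaf, or empty subtree
        mid = (lo + hi) // 2
        # prefix[i] = how many of the first i symbols go to the left child
        self.prefix = [0]
        for s in seq:
            self.prefix.append(self.prefix[-1] + (s <= mid))
        self.left = WaveletTree([s for s in seq if s <= mid], lo, mid)
        self.right = WaveletTree([s for s in seq if s > mid], mid + 1, hi)

    def access(self, i):
        """Return the symbol at position i of the original sequence."""
        node = self
        while node.lo != node.hi:
            before = node.prefix[i]
            if node.prefix[i + 1] - before == 1:   # symbol went left
                i, node = before, node.left
            else:                                  # symbol went right
                i, node = i - before, node.right
        return node.lo

    def rank(self, c, i):
        """Return the number of occurrences of c among the first i symbols."""
        if c < self.lo or c > self.hi:
            return 0
        node = self
        while node.lo != node.hi:
            if node.prefix is None:                # empty subtree
                return 0
            mid = (node.lo + node.hi) // 2
            before = node.prefix[i]
            if c <= mid:
                i, node = before, node.left
            else:
                i, node = i - before, node.right
        return i
```

Both queries walk one root-to-leaf path, so their cost is proportional to the tree height, which is logarithmic in the alphabet range for this balanced split.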

  • Internally deterministic parallel algorithms can be fast
    ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012
    Co-Authors: Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, Julian Shun
    Abstract:

    The virtues of deterministic parallelism have been argued for decades and many forms of deterministic parallelism have been described and analyzed. Here we are concerned with one of the strongest forms, requiring that for any input there is a unique dependence graph representing a trace of the computation annotated with every operation and value. This has been referred to as internal determinism, and implies a sequential semantics, i.e., considering any sequential traversal of the dependence graph is sufficient for analyzing the correctness of the code. In addition to returning deterministic results, internal determinism has many advantages including ease of reasoning about the code, ease of verifying correctness, ease of debugging, ease of defining invariants, ease of defining good coverage for testing, and ease of formally, informally and experimentally reasoning about performance. On the other hand one needs to consider the possible downsides of determinism, which might include making algorithms (i) more complicated, unnatural or special purpose and/or (ii) slower or less scalable. In this paper we study the effectiveness of this strong form of determinism through a broad set of benchmark problems. Our main contribution is to demonstrate that for this wide body of problems, there exist efficient internally deterministic algorithms, and moreover that these algorithms are natural to reason about and not complicated to code. We leverage an approach to determinism suggested by Steele (1990), which is to use nested parallelism with commutative operations. Our algorithms apply several diverse programming paradigms that fit within the model including (i) a strict functional style (no shared state among concurrent operations), (ii) an approach we refer to as deterministic reservations, and (iii) the use of commutative, linearizable operations on data structures. We describe algorithms for the benchmark problems that use these deterministic approaches and present performance results on a 32-core machine. Perhaps surprisingly, for all problems, our internally deterministic algorithms achieve good speedup and good performance even relative to prior nondeterministic solutions.
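
The abstract names deterministic reservations without describing the mechanism. The sketch below shows one common reading of the idea, applied to greedy maximal matching and simulated with ordinary sequential loops: in each round a prefix of the remaining edges tries to reserve both endpoints with a write-min of its index, and an edge commits only if it holds both reservations. Everything here is an assumption made for illustration, including the choice of problem, the function names, the `prefix_size` parameter, and the round structure; none of it is taken from the paper's code. Because write-min is commutative, the outcome of the reserve phase does not depend on the order in which the edges of a round are processed.

```python
def greedy_matching(edges):
    """Sequential reference: scan edges in order, keep those with free endpoints."""
    matched, chosen = set(), []
    for i, (u, v) in enumerate(edges):
        if u not in matched and v not in matched:
            matched.update((u, v))
            chosen.append(i)
    return chosen


def reservation_matching(edges, prefix_size=4):
    """Round-based reserve/commit simulation (hypothetical sketch).

    The loops over `live` stand in for parallel loops; they are run
    sequentially here, but their result is order-independent.
    """
    matched, chosen = set(), []
    pending = list(range(len(edges)))          # edge indices, ascending
    while pending:
        batch, rest = pending[:prefix_size], pending[prefix_size:]
        # Edges touching an already matched vertex can never be chosen.
        live = [i for i in batch
                if edges[i][0] not in matched and edges[i][1] not in matched]
        # Reserve phase: each vertex remembers the smallest edge index seen.
        reserved = {}
        for i in live:
            for x in edges[i]:
                reserved[x] = min(reserved.get(x, i), i)
        # Commit phase: an edge wins only if it holds both of its endpoints.
        retry = []
        for i in live:
            u, v = edges[i]
            if reserved[u] == i and reserved[v] == i:
                matched.update((u, v))
                chosen.append(i)
            else:
                retry.append(i)
        pending = retry + rest                 # still in ascending order
    return sorted(chosen)
```

The lowest-indexed live edge of every round always wins both endpoints, so each round makes progress, and the final matching coincides with the sequential greedy result regardless of `prefix_size`.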

Georg Gottlob - One of the best experts on this subject based on the ideXlab platform.

  • Tree projections and constraint optimization problems: Fixed-parameter tractability and parallel algorithms
    Journal of Computer and System Sciences, 2017
    Co-Authors: Georg Gottlob, Gianluigi Greco, Francesco Scarcello

  • Tree projections and constraint optimization problems: Fixed-parameter tractability and parallel algorithms
    arXiv: Artificial Intelligence, 2017
    Co-Authors: Georg Gottlob, Gianluigi Greco, Francesco Scarcello

Guy E Blelloch - One of the best experts on this subject based on the ideXlab platform.

  • Optimal parallel algorithms in the binary-forking model
    ACM Symposium on Parallel Algorithms and Architectures, 2020
    Co-Authors: Guy E Blelloch, Jeremy T Fineman, Yihan Sun
    Abstract:

    In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection and difference. In the binary-forking model, tasks can only fork into two child tasks, but can do so recursively and asynchronously. The tasks share memory, supporting reads, writes and test-and-sets. Costs are measured in terms of work (total number of instructions), and span (longest dependence chain). The binary-forking model is meant to capture both algorithm performance and algorithm-design considerations on many existing multithreaded languages, which are also asynchronous and rely on binary forks either explicitly or under the covers. In contrast to the widely studied PRAM model, it does not assume arbitrary-way forks nor synchronous operations, both of which are hard to implement in modern hardware. While optimal PRAM algorithms are known for the problems studied herein, it turns out that arbitrary-way forking and strict synchronization are powerful, if unrealistic, capabilities. Natural simulations of these PRAM algorithms in the binary-forking model (i.e., implementations in existing parallel languages) incur an Ω(log n) overhead in span. This paper explores techniques for designing optimal algorithms when limited to binary forking and assuming asynchrony. All algorithms described in this paper are the first algorithms with optimal work and span in the binary-forking model. Most of the algorithms are simple. Many are randomized.

  • Optimal parallel algorithms in the binary-forking model
    arXiv: Data Structures and Algorithms, 2019
    Co-Authors: Guy E Blelloch, Jeremy T Fineman, Yihan Sun
    Abstract:

    In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and set union, intersection and difference. In the binary-forking model, tasks can only fork into two child tasks, but can do so recursively and asynchronously, and join up later. The tasks share memory, and costs are measured in terms of work (total number of instructions), and span (longest dependence chain). Due to the asynchronous nature of the model, and a variety of schedulers that are efficient in both theory and practice, variants of the model are widely used in practice in languages such as Cilk and Java Fork-Join. PRAM algorithms can be simulated in the model but at a loss of a factor of Ω(log n), so most PRAM algorithms are not optimal in the model even if optimal on the PRAM. All algorithms we describe are optimal in work and span (logarithmic in span). Several are randomized. Beyond being the first optimal algorithms for their problems in the model, most are very simple.
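
The work and span measures defined in the abstract can be illustrated with a small accounting sketch. The function below sums a list by recursively splitting it in two, which mirrors a binary fork followed by a join, and returns the work and span of that computation alongside the result. It runs sequentially and only counts costs. The function name `fork_join_sum` and the unit-cost convention (one unit per leaf and one per fork-join node) are assumptions for illustration, not the paper's cost model in full detail.

```python
def fork_join_sum(xs, lo=0, hi=None):
    """Return (total, work, span) for summing xs[lo:hi] with binary forks.

    Assumed cost convention: every leaf and every fork/join node costs 1.
    Work adds up the costs of both children; span takes the longer child.
    """
    if hi is None:
        hi = len(xs)
    if hi - lo == 0:
        return 0, 1, 1
    if hi - lo == 1:
        return xs[lo], 1, 1
    mid = (lo + hi) // 2
    # In a real fork-join runtime these two calls would run as child tasks.
    left_sum, left_work, left_span = fork_join_sum(xs, lo, mid)
    right_sum, right_work, right_span = fork_join_sum(xs, mid, hi)
    total = left_sum + right_sum
    work = left_work + right_work + 1          # children run, then one combine
    span = max(left_span, right_span) + 1      # children overlap in time
    return total, work, span
```

For an input of length n that is a power of two, this convention gives work 2n - 1 and span log2(n) + 1, which shows the linear-work, logarithmic-span shape that the abstract refers to.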

  • Internally deterministic parallel algorithms can be fast
    ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012
    Co-Authors: Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, Julian Shun

Gianluigi Greco - One of the best experts on this subject based on the ideXlab platform.

  • Tree projections and constraint optimization problems: Fixed-parameter tractability and parallel algorithms
    Journal of Computer and System Sciences, 2017
    Co-Authors: Georg Gottlob, Gianluigi Greco, Francesco Scarcello

  • Tree projections and constraint optimization problems: Fixed-parameter tractability and parallel algorithms
    arXiv: Artificial Intelligence, 2017
    Co-Authors: Georg Gottlob, Gianluigi Greco, Francesco Scarcello