Sorting Algorithm

The Experts below are selected from a list of 12921 Experts worldwide ranked by ideXlab platform

Paolo Ienne - One of the best experts on this subject based on the ideXlab platform.

High performance comparison-based sorting algorithm on many-core GPUs

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010

Co-Authors: Xiaochun Ye, Nan Yuan, Paolo Ienne

Abstract:

Sorting is a kernel algorithm for a wide range of applications. In this paper, we present a new algorithm, GPU-Warpsort, to perform comparison-based parallel sort on Graphics Processing Units (GPUs). It mainly consists of a bitonic sort followed by a merge sort. Our algorithm achieves high performance by efficiently mapping the sorting tasks to GPU architectures. Firstly, we take advantage of the synchronous execution of threads in a warp to eliminate the barriers in the bitonic sorting network. We also provide sufficient homogeneous parallel operations for all the threads within a warp to avoid branch divergence. Furthermore, we implement the merge sort efficiently by assigning each warp independent pairs of sequences to be merged and by exploiting fully coalesced global memory accesses to eliminate the bandwidth bottleneck. Our experimental results indicate that GPU-Warpsort works well on different kinds of input distributions, and it achieves up to 30% higher performance than the previous optimized comparison-based GPU sorting algorithm on input sequences with millions of elements.
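The two-phase structure described in the abstract (a bitonic sort followed by a merge sort) can be sketched sequentially on the CPU. This is only an illustration of the general idea, not the authors' CUDA implementation: the function names, the padding trick, and the default block size of 32 (chosen to echo the usual NVIDIA warp size) are assumptions, and none of the warp-level or memory-coalescing optimizations are modeled.

```python
def bitonic_sort_block(block):
    """Sort a power-of-two-length list with a bitonic sorting network."""
    a = list(block)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "block length must be a power of two"
    k = 2
    while k <= n:                  # length of the bitonic sequences being merged
        j = k // 2
        while j >= 1:              # compare-exchange distance within this stage
            for i in range(n):     # on a GPU, each i would be handled by a thread
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if ascending and a[i] > a[partner]:
                        a[i], a[partner] = a[partner], a[i]
                    elif not ascending and a[i] < a[partner]:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


def merge_runs(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def warpsort_like(values, block_size=32):
    """Hypothetical two-phase sort: bitonic sort per block, then pairwise merges."""
    values = list(values)
    if not values:
        return []
    pad_value = max(values)        # padding copies sort to the end of the block
    runs = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        pad = block_size - len(block)
        sorted_block = bitonic_sort_block(block + [pad_value] * pad)
        runs.append(sorted_block[:block_size - pad])   # drop the padding again
    while len(runs) > 1:           # each pair of runs is an independent merge task
        merged = [merge_runs(runs[i], runs[i + 1])
                  for i in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:
            merged.append(runs[-1])
        runs = merged
    return runs[0]
```

In this sketch every pair of runs in a merge round is independent of the others, which loosely mirrors the abstract's point about assigning each warp independent pairs of sequences.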

Xiaochun Ye - One of the best experts on this subject based on the ideXlab platform.

High performance comparison-based sorting algorithm on many-core GPUs

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010

Co-Authors: Xiaochun Ye, Nan Yuan, Paolo Ienne

Viktor K Prasanna - One of the best experts on this subject based on the ideXlab platform.

An optimal sorting algorithm on reconfigurable mesh

Journal of Parallel and Distributed Computing, 1995

Co-Authors: Juwook Jang, Viktor K Prasanna

Abstract:

This paper shows nontrivial ways to use the Reconfigurable Mesh to solve several basic arithmetic problems in constant time. These solutions are obtained by novel ways to represent numbers and by exploiting the reconfigurability of the architecture. In particular, a constant time algorithm to add n k-bit numbers using an n × nk bit model of Reconfigurable Mesh is shown. Using these techniques, an optimal sorting algorithm on the Reconfigurable Mesh is derived. The algorithm sorts n numbers in constant time using n × n processors. Our algorithm uses a mesh of optimal size to sort n numbers in constant time and satisfies the AT² lower bound of Ω(n²) for sorting n numbers in a variation of the word model of VLSI. The sorting algorithm runs on all known variations of the Reconfigurable Mesh model.
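To give a feel for why n × n processors are a natural budget for sorting n numbers, here is a sequential sketch of rank (enumeration) sort. It is an assumption that this resembles the paper's approach: the abstract does not spell out the algorithm, and the names below are hypothetical. Each cell (i, j) of a conceptual n × n grid holds one comparison bit, and summing row i (an addition of n one-bit numbers) gives the final position of element i. With area A = n² and constant time T, the product AT² is on the order of n², which is consistent with the Ω(n²) bound quoted above.

```python
def rank_sort(values):
    """Enumeration sort: place each element at the index equal to its rank."""
    a = list(values)
    n = len(a)
    # wins[i][j] is the bit a hypothetical processor at grid cell (i, j) would
    # compute: 1 if a[j] must come before a[i]. Ties are broken by index so
    # that the ranks form a permutation of 0..n-1.
    wins = [[1 if (a[j] < a[i] or (a[j] == a[i] and j < i)) else 0
             for j in range(n)]
            for i in range(n)]
    # Summing a row counts how many elements precede a[i]. In this sketch the
    # sum is sequential; it does not model the mesh's constant-time addition.
    ranks = [sum(row) for row in wins]
    out = [None] * n
    for i, r in enumerate(ranks):
        out[r] = a[i]
    return out
```

For example, `rank_sort([3, 1, 2, 1])` compares all 16 index pairs and then routes each element directly to its rank, with no data-dependent control flow between the comparisons.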

An optimal sorting algorithm on reconfigurable mesh

International Parallel Processing Symposium, 1992

Co-Authors: Juwook Jang, Viktor K Prasanna

Abstract:

An optimal sorting algorithm on the reconfigurable mesh is proposed. The algorithm sorts n numbers in constant time using n × n processors. The best known previous result uses O(n × n log² n) processors. The presented algorithm satisfies the AT² lower bound of Ω(n²) for sorting n numbers in the word model of VLSI. A modification of the algorithm for area-time trade-off is shown, to achieve AT² optimality over 1 >

Ming Chen - One of the best experts on this subject based on the ideXlab platform.

A Fast Nondominated Sorting Algorithm

2005 International Conference on Neural Networks and Brain, 2005

Co-Authors: Ming Chen

Abstract:

The process of nondominated sorting is one of the main time-consuming parts of a multiobjective evolutionary algorithm (MOEA). Designing a fast nondominated sorting algorithm is crucial to improving the performance of an MOEA. The paper uses a Better function to compare solutions, and theoretical analysis shows that the Better function has the properties of general symmetry and transitivity. Based on these properties, the Better nondominated sorting algorithm (BNS) is designed to markedly reduce the number of comparisons among solutions. Simulation experiments and a comparative study show that the new algorithm does indeed speed up the process of nondominated sorting.
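For context, the task that BNS aims to accelerate can be written down as a naive baseline. The sketch below is not BNS and does not use the Better function, whose definition the abstract does not give. It assumes all objectives are minimized, and the function names are hypothetical. It simply peels off successive Pareto fronts by comparing every remaining pair of solutions, which is exactly the comparison cost a faster algorithm would try to cut.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and strictly better in
    at least one (all objectives are assumed to be minimized)."""
    return (all(x <= y for x, y in zip(p, q))
            and any(x < y for x, y in zip(p, q)))


def nondominated_sort(points):
    """Return a list of fronts, each a list of indices into `points`.

    Front 0 holds the solutions dominated by nobody, front 1 those dominated
    only by members of front 0, and so on.
    """
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        # A solution joins the current front if no other remaining solution
        # dominates it; this pass costs a quadratic number of comparisons.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        chosen = set(front)
        remaining = [i for i in remaining if i not in chosen]
    return fronts
```

With the points `[(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]`, the first three are mutually nondominated and form front 0, `(3, 3)` forms front 1, and `(4, 4)` forms front 2.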

Nan Yuan - One of the best experts on this subject based on the ideXlab platform.

High performance comparison-based sorting algorithm on many-core GPUs

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010

Co-Authors: Xiaochun Ye, Nan Yuan, Paolo Ienne

