Kernel Invocation

The Experts below are selected from a list of 72 Experts worldwide ranked by ideXlab platform

Vibha Patel - One of the best experts on this subject based on the ideXlab platform.

A Shared Memory Based Implementation of Needleman-Wunsch Algorithm Using Skewing Transformation

International Journal of Advanced Research in Computer Science, 2017

Co-Authors: Vibha Patel, Krunal Gandhi, Darshak Bhatti

Abstract:

Among the algorithms for protein and nucleotide alignment, the Needleman-Wunsch algorithm is widely accepted because it divides the problem into sub-problems. We present two parallel approaches to the Needleman-Wunsch algorithm, one with single-kernel invocation and one with multi-kernel invocation, both using a skewing transformation to traverse and compute the dynamic programming matrix. We also compare these with the traditional sequential CPU approach, over which they yield a six-fold performance improvement. Furthermore, we apply the same single-kernel approach using shared memory, which yields a two-fold performance improvement over our non-shared-memory approach.

A GPU Based Implementation of Needleman-Wunsch Algorithm Using Skewing Transformation

2015 Eighth International Conference on Contemporary Computing (IC3), 2015

Co-Authors: Anuj Chaudhary, Deepkumar Kagathara, Vibha Patel

Abstract:

We present a new parallel approach to the Needleman-Wunsch algorithm for global sequence alignment. This approach uses a skewing transformation for traversal and calculation of the dynamic programming matrix. We compare the execution time of a sequential CPU-based implementation with two parallel GPU-based implementations: single-kernel invocation with lock-free block synchronization, and multi-kernel invocation at block-synchronization points. Both GPU-based implementations gave up to a 6-fold performance improvement over the sequential CPU-based implementation.
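
The traversal order described in this abstract can be illustrated with a small CPU-side sketch. It is not the authors' GPU code: the function names, the scoring values (match +1, mismatch -1, gap -1), and the use of plain Python lists are all assumptions made for illustration. The idea shown is that cells on the same anti-diagonal (i + j = d) depend only on earlier anti-diagonals, so they could be computed in parallel. On a GPU, each step of the outer loop over d would correspond to a synchronization point, reached either by a barrier inside one kernel or by a separate kernel invocation.

```python
def init_matrix(n, m, gap):
    """Allocate the (n+1) x (m+1) score matrix with gap-penalty borders."""
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap
    return H


def cell_score(H, a, b, i, j, match, mismatch, gap):
    """Needleman-Wunsch recurrence for one cell."""
    s = match if a[i - 1] == b[j - 1] else mismatch
    return max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)


def nw_score_sequential(a, b, match=1, mismatch=-1, gap=-1):
    """Reference version: plain row-by-row traversal."""
    n, m = len(a), len(b)
    H = init_matrix(n, m, gap)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = cell_score(H, a, b, i, j, match, mismatch, gap)
    return H[n][m]


def nw_score_skewed(a, b, match=1, mismatch=-1, gap=-1):
    """Anti-diagonal (skewed) traversal: d = i + j is the wavefront index."""
    n, m = len(a), len(b)
    H = init_matrix(n, m, gap)
    for d in range(2, n + m + 1):          # one synchronization point per wavefront
        i_lo = max(1, d - m)               # keeps j = d - i within 1..m
        i_hi = min(n, d - 1)
        for i in range(i_lo, i_hi + 1):    # independent cells: one GPU thread each
            j = d - i
            H[i][j] = cell_score(H, a, b, i, j, match, mismatch, gap)
    return H[n][m]
```

Both functions fill the same matrix and return the same score; only the visiting order differs. The inner loop of `nw_score_skewed` is the part a GPU implementation would distribute across threads.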


Zebo Peng - One of the best experts on this subject based on the ideXlab platform.

Latency-aware packet processing on CPU-GPU heterogeneous systems

2017 54th ACM EDAC IEEE Design Automation Conference (DAC), 2017

Co-Authors: Arian Maghazeh, Unmesh D. Bordoloi, Usman Dastgeer, Alexandru Andrei, Petru Eles, Zebo Peng

Abstract:

In response to the tremendous growth of the Internet toward what we call the Internet of Things (IoT), there is a need to move from costly, high-time-to-market, special-purpose hardware to flexible, low-time-to-market, general-purpose devices for packet processing. Among several such devices, GPUs have attracted attention in the past, mainly because the high computing demand of packet processing applications can potentially be satisfied by these throughput-oriented machines. However, another important aspect of such applications is packet latency, which, if not handled carefully, will overshadow the throughput benefits. Unfortunately, this aspect has so far been mostly ignored. To address this issue, we propose a method that considers the variable bit rate of the traffic and, depending on the current rate, minimizes the latency while meeting the rate demand. We propose a persistent-kernel-based software architecture to overcome challenges inherent in GPU implementations, such as kernel invocation overhead, CPU-GPU communication, and memory access overhead. We have chosen packet classification as the packet processing application to demonstrate our technique. Using the proposed approach, we are able to reduce the packet latency by a factor of 3.5 on average, compared to state-of-the-art solutions, without any packet drops.
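
The persistent-kernel idea mentioned in this abstract can be pictured with a CPU-side analogy. The sketch below is not the paper's architecture: it uses a Python thread in place of a GPU kernel, and `classify`, `persistent_worker`, and `run_persistent` are hypothetical names with a toy even/odd rule standing in for real packet classification. What it shows is the pattern itself: the worker is started once and then polls a queue for batches, so the start-up cost is paid once and not once per batch.

```python
import queue
import threading


def classify(packet):
    """Toy stand-in for packet classification (assumption: even IDs are 'high')."""
    return "high" if packet % 2 == 0 else "low"


def persistent_worker(inbox, outbox):
    """Started once; keeps polling for batches until it receives None."""
    while True:
        batch = inbox.get()
        if batch is None:          # sentinel: shut the worker down
            break
        outbox.put([classify(p) for p in batch])


def run_persistent(batches):
    """Feed batches of any size to one long-lived worker and collect results."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=persistent_worker, args=(inbox, outbox))
    worker.start()                 # single "invocation" for the whole stream
    results = []
    for batch in batches:
        inbox.put(batch)
        results.append(outbox.get())
    inbox.put(None)
    worker.join()
    return results
```

Because the worker accepts batches of any length, the caller is free to choose a batch size per submission, which is where a rate-dependent policy could plug in.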


Bin Gong - One of the best experts on this subject based on the ideXlab platform.

PAAP - Option Pricing on the GPU with Backward Stochastic Differential Equation

2011 Fourth International Symposium on Parallel Architectures Algorithms and Programming, 2011

Co-Authors: Ying Peng, Bin Gong

Abstract:

In this paper, we develop acceleration strategies for option pricing with the non-linear Backward Stochastic Differential Equation (BSDE), which serves as a robust and valuable tool in financial markets. An efficient binomial-lattice-based method is adopted to solve the BSDE numerically. To reduce the frequency of global memory accesses, we avoid performing a kernel invocation at every time step. Furthermore, to evaluate the effect of load balance on performance, we provide two different acceleration strategies and compare them in running-time experiments. The accelerated algorithms exhibit tremendous speedup over the sequential CPU implementation and are therefore suitable for real-time applications.
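
To make the time-stepping structure of a binomial lattice concrete, here is a minimal sketch. It swaps in the standard Cox-Ross-Rubinstein lattice for a European call and is not the paper's BSDE solver, whose details are not given in the abstract. The function name and all parameters are assumptions. The point of interest is the backward loop: every time step reads the previous level and writes the next one, and the whole loop runs inside one function call, which is the CPU counterpart of iterating over time steps without a separate kernel invocation for each.

```python
import math


def crr_call_price(s0, strike, rate, sigma, maturity, steps):
    """European call on a Cox-Ross-Rubinstein lattice (illustrative only)."""
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))      # up factor
    d = 1.0 / u                              # down factor
    p = (math.exp(rate * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-rate * dt)              # one-step discount factor

    # Terminal payoffs; node k has k up moves and (steps - k) down moves.
    values = [max(s0 * u**k * d**(steps - k) - strike, 0.0)
              for k in range(steps + 1)]

    # Backward induction: all time steps are handled in this single loop.
    for step in range(steps, 0, -1):
        values = [disc * (p * values[k + 1] + (1.0 - p) * values[k])
                  for k in range(step)]      # nodes within a level are independent
    return values[0]
```

Within one level, each node depends only on two nodes of the following level, so the list comprehension in the loop body is the part that maps naturally onto parallel threads.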


Anuj Chaudhary - One of the best experts on this subject based on the ideXlab platform.

A GPU Based Implementation of Needleman-Wunsch Algorithm Using Skewing Transformation

2015 Eighth International Conference on Contemporary Computing (IC3), 2015

Co-Authors: Anuj Chaudhary, Deepkumar Kagathara, Vibha Patel


Darshak Bhatti - One of the best experts on this subject based on the ideXlab platform.

A Shared Memory Based Implementation of Needleman-Wunsch Algorithm Using Skewing Transformation

International Journal of Advanced Research in Computer Science, 2017

Co-Authors: Vibha Patel, Krunal Gandhi, Darshak Bhatti

