Lower Price Point - Explore the Science & Experts

The Experts below are selected from a list of 102 Experts worldwide, ranked by the ideXlab platform.

Matei Ripeanu - One of the best experts on this subject based on the ideXlab platform.

On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest

2013 IEEE 27th International Symposium on Parallel and Distributed Processing, 2013

Co-Authors: Abdullah Gharaibeh, Lauro Beltrão Costa, Elizeu Santos-Neto, Matei Ripeanu

Abstract:

Graph processing has gained renewed attention. The increasing scale and wealth of connected data, such as that accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable information from large-scale graphs. Hybrid systems that host processing units optimized for both fast sequential processing and bulk processing (e.g., GPU-accelerated systems) have the potential to cope with the heterogeneous structure of real graphs and enable high-performance graph processing. Reaching this point, however, poses multiple challenges. The heterogeneity of the processing elements (e.g., GPUs implement a different parallel processing model than CPUs and have much less memory) and the inherent irregularity of graph workloads require careful graph partitioning and load assignment. In particular, the workload generated by a partitioning scheme should match the strength of the processing element the partition is allocated to. This work explores the feasibility and quantifies the performance gains of such low-cost partitioning schemes. We propose to partition the workload between the two types of processing elements based on vertex connectivity. We show that such partitioning schemes offer a simple yet efficient way to boost the overall performance of the hybrid system. Our evaluation illustrates that processing a 4-billion-edge graph on a system with one CPU socket and one GPU, while offloading as little as 25% of the edges to the GPU, achieves a 2x performance improvement over state-of-the-art implementations running on a dual-socket symmetric system. Moreover, for the same graph, a hybrid system with dual sockets and dual GPUs is capable of 1.13 billion breadth-first search traversed edges per second, a performance rate that is competitive with the latest entries in the Graph500 list, yet at a much lower price point.
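
The connectivity-based partitioning described in the abstract can be illustrated with a minimal sketch. It assumes that "vertex connectivity" means vertex degree and that the lowest-degree vertices are the ones offloaded to the GPU; the abstract does not state the direction, and the function name `partition_by_degree` and the parameter `gpu_edge_share` are hypothetical, not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def partition_by_degree(
    adj: Dict[int, List[int]], gpu_edge_share: float
) -> Tuple[Set[int], Set[int]]:
    """Split vertices into (cpu, gpu) sets by out-degree.

    Assumption (not confirmed by the abstract): the lowest-degree
    vertices are placed on the GPU until the GPU partition holds at
    most `gpu_edge_share` of all edges; the rest stay on the CPU.
    """
    total_edges = sum(len(neighbours) for neighbours in adj.values())
    budget = gpu_edge_share * total_edges

    gpu: Set[int] = set()
    used = 0
    # Visit vertices from lowest to highest degree (ties broken by id).
    for v in sorted(adj, key=lambda v: (len(adj[v]), v)):
        degree = len(adj[v])
        if used + degree > budget:
            break  # every remaining vertex has at least this degree
        gpu.add(v)
        used += degree

    cpu = set(adj) - gpu
    return cpu, gpu


# Toy graph: vertex 0 is a hub, the others have low degree.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0, 1]}
cpu, gpu = partition_by_degree(adj, gpu_edge_share=0.25)
print(sorted(cpu), sorted(gpu))
```

Sorting by degree keeps the split cheap to compute, which matches the abstract's emphasis on low-cost partitioning; a real system would also have to respect the GPU's memory limit.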

Abdullah Gharaibeh - One of the best experts on this subject based on the ideXlab platform.

On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest

2013 IEEE 27th International Symposium on Parallel and Distributed Processing, 2013

Co-Authors: Abdullah Gharaibeh, Lauro Beltrão Costa, Elizeu Santos-Neto, Matei Ripeanu

Mo Li - One of the best experts on this subject based on the ideXlab platform.

ETFA - Localization for industrial warehouse storage rack using passive UHF RFID system

2017 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), 2017

Co-Authors: Sheng Huang, Sethu Jose, Mo Li

Abstract:

Manufacturing companies need to track the location of items on warehouse storage racks automatically in order to improve operational efficiency, productivity, and resource utilization. Because of its long reading range and fast data transfer rate, a passive ultra-high frequency (UHF) RFID system is explored in this study to localize the items on the rack. A tag is incorporated into each item on the rack for localization using radio signals. The received signal strength indication (RSSI) and radio frequency (RF) phase are used for localization. A supervised clustering approach is developed based on measurements of location tags in the different zones of the rack. The localization accuracy is improved by using a kernel as the similarity measure. When fine-grained item location is not required, the lower price point per tag makes passive RFID systems more economical than active RFID systems.
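
As a rough illustration of using a kernel as a similarity measure over RSSI and phase, the sketch below assigns an item tag to the rack zone whose reference-tag measurements it resembles most. This is a simplified kernel-similarity zone classifier, not the paper's clustering method: the RBF kernel, the `gamma` value, the zone names, and the sample measurements are all assumptions chosen for the example.

```python
import math
from typing import Dict, List, Tuple

# One reading: (RSSI in dBm, RF phase in radians). Hypothetical layout.
Measurement = Tuple[float, float]


def rbf_kernel(a: Measurement, b: Measurement, gamma: float = 0.5) -> float:
    """Gaussian (RBF) kernel: 1.0 for identical readings, near 0 for distant ones."""
    squared_distance = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return math.exp(-gamma * squared_distance)


def classify_zone(
    item: Measurement,
    references: Dict[str, List[Measurement]],
    gamma: float = 0.5,
) -> str:
    """Return the zone whose reference tags are most similar to the item on average."""

    def mean_similarity(zone: str) -> float:
        samples = references[zone]
        return sum(rbf_kernel(item, s, gamma) for s in samples) / len(samples)

    return max(references, key=mean_similarity)


# Made-up reference measurements for two rack zones.
references = {
    "top": [(-50.0, 1.0), (-51.0, 1.2)],
    "bottom": [(-60.0, 2.5), (-61.0, 2.4)],
}
print(classify_zone((-50.5, 1.1), references))
```

The sketch ignores phase wrap-around and the different scales of RSSI and phase, both of which a practical system would need to handle before computing distances.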

Elizeu Santos-Neto - One of the best experts on this subject based on the ideXlab platform.

On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest

2013 IEEE 27th International Symposium on Parallel and Distributed Processing, 2013

Co-Authors: Abdullah Gharaibeh, Lauro Beltrão Costa, Elizeu Santos-Neto, Matei Ripeanu

Lauro Beltrão Costa - One of the best experts on this subject based on the ideXlab platform.

On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest

2013 IEEE 27th International Symposium on Parallel and Distributed Processing, 2013

Co-Authors: Abdullah Gharaibeh, Lauro Beltrão Costa, Elizeu Santos-Neto, Matei Ripeanu
