Hadoop Platform

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The experts below were selected from a list of 3,369 experts worldwide, ranked by the ideXlab platform.

Piyang Chen - One of the best experts on this subject based on the ideXlab platform.

  • Optimizing the Cloud Platform Performance for Supporting Large-Scale Cognitive Radio Networks
    IEEE Wireless Communications and Networking Conference (WCNC), 2012
    Co-Authors: Shieyuan Wang, Pofan Wang, Piyang Chen
    Abstract:

    In this paper, we optimize the performance of a cloud platform to effectively support cooperative spectrum sensing in a cognitive radio (CR) cloud network. This cloud uses the Apache Hadoop platform to run a cooperative spectrum sensing algorithm in parallel over multiple servers. Such an algorithm must process a very large number of spectrum sensing reports per second to quickly update the database that stores the current activities of all primary users of the CR network. Because these database updates must finish as soon as possible for the CR approach to be effective, the cloud platform must run the algorithm in real time with as little overhead as possible. In this work, we first measured the execution time of such an algorithm on our own cloud and on the Amazon EC2 public cloud, using the original Hadoop design and implementation. We found that the original Hadoop platform has too much fixed overhead and adds too much delay to the cooperative spectrum sensing algorithm, making it unable to update the primary-user database within a few seconds. We therefore studied the source code, design, and implementation of the Hadoop platform to improve its performance. Our experimental results show that our improvements significantly reduce the time required by the cooperative spectrum sensing algorithm and make the Hadoop platform more suitable for large-scale CR networks.
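    The abstract does not spell out the fusion rule, but cooperative spectrum sensing maps naturally onto the MapReduce model Hadoop provides: mappers key each sensing report by channel, and a reducer fuses the votes per channel to decide primary-user occupancy. The following pure-Python sketch is an illustration only; the report schema and the majority-vote fusion rule are assumptions, not the paper's actual algorithm.

```python
from collections import defaultdict

# Each sensing report: (node_id, channel, detected_busy).
# Hypothetical schema -- the paper does not specify its report format.
reports = [
    ("n1", 7, True), ("n2", 7, True), ("n3", 7, False),
    ("n1", 11, False), ("n2", 11, False), ("n3", 11, True),
]

def map_phase(reports):
    """Map: key each report by channel; the value is a 0/1 busy vote."""
    for _node, channel, busy in reports:
        yield channel, 1 if busy else 0

def reduce_phase(pairs):
    """Reduce: majority vote per channel -> primary-user occupancy."""
    votes = defaultdict(list)
    for channel, vote in pairs:
        votes[channel].append(vote)
    return {ch: sum(v) * 2 > len(v) for ch, v in votes.items()}

occupancy = reduce_phase(map_phase(reports))
print(occupancy)  # {7: True, 11: False} -- channel 7 busy, channel 11 idle
```

    On Hadoop, `map_phase` and `reduce_phase` would run as Mapper and Reducer tasks over report batches; the paper's point is that the fixed per-job overhead of stock Hadoop dominates such a small, latency-critical job.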

Hsiaoping Tsai - One of the best experts on this subject based on the ideXlab platform.

  • PAKDD Workshops - Mining Uncertain Sequence Data on Hadoop Platform
    Lecture Notes in Computer Science, 2014
    Co-Authors: Ziyun Sun, Mingche Tsai, Hsiaoping Tsai
    Abstract:

    Sequence pattern mining discovers special and representative features hidden in sequence data. It has recently attracted a lot of attention, especially in bioinformatics and spatio-temporal trajectory mining. Observing that much sequence data is born with uncertainties and that huge volumes of sequence data are increasingly generated and accumulated, this paper aims to discover the features hidden in a large amount of uncertain sequence data. Specifically, the Probabilistic Suffix Tree (PST) is an implementation of the Variable-length Markov Chain (VMM) that has been widely applied in sequence data mining. However, the conventional PST construction algorithm neither handles uncertain data nor scales to huge data. Thus, to mine a large amount of sequence data with uncertainties, this paper proposes the uPST\(_{MR}^+\) algorithm on the Hadoop platform to fully utilize the computing power and storage capacity of cloud computing. The proposed uPST\(_{MR}^+\) algorithm constructs a PST in a progressive, multi-layered, and iterative manner, so as to avoid learning excessive patterns and to balance the overhead of distributed computing. In addition, to prevent repeated scans of the entire sequence data from dragging down overall performance, we trade space for time: a NodeArray data structure stores intermediate statistical results to reduce disk I/O. To verify the performance of uPST\(_{MR}^{+}\), we conducted several experiments. The results show that uPST\(_{MR}^{+}\) significantly outperforms the naive approach and exhibits good scalability and stability. Also, although the NodeArray costs a little extra memory, it significantly lowers the execution time.
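    To make the PST idea concrete: each PST node corresponds to a context (a recent suffix of the sequence) and stores how often each symbol follows that context, giving the variable-length conditional distributions of a VMM. The sketch below shows only this counting step on an ordinary (certain) sequence; the paper's uPST\(_{MR}^+\) additionally handles uncertain symbols and distributes the work over MapReduce, neither of which is attempted here.

```python
from collections import Counter, defaultdict

def pst_counts(seq, max_depth=2):
    """For every context s with len(s) <= max_depth, count how often each
    symbol follows s. These counts are the statistics a PST node stores."""
    counts = defaultdict(Counter)   # context -> Counter of next symbols
    for i in range(len(seq)):
        for d in range(0, max_depth + 1):
            if i - d < 0:
                break
            context = seq[i - d:i]  # the d symbols preceding position i
            counts[context][seq[i]] += 1
    return counts

def next_symbol_probs(counts, context):
    """Conditional distribution P(next symbol | context) from the counts."""
    c = counts[context]
    total = sum(c.values())
    return {sym: n / total for sym, n in c.items()}

counts = pst_counts("abababba", max_depth=2)
print(next_symbol_probs(counts, "ab"))  # P('a'|'ab') = 2/3, P('b'|'ab') = 1/3
```

    A full PST construction would additionally prune contexts whose distribution is close to their parent's; the scan-once counting above is what the paper's NodeArray caches to avoid rescanning the data on each layer.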

Shieyuan Wang - One of the best experts on this subject based on the ideXlab platform.

  • Optimizing the Cloud Platform Performance for Supporting Large-Scale Cognitive Radio Networks
    IEEE Wireless Communications and Networking Conference (WCNC), 2012
    Co-Authors: Shieyuan Wang, Pofan Wang, Piyang Chen
    Abstract: see the identical abstract under Piyang Chen above.

Ziyun Sun - One of the best experts on this subject based on the ideXlab platform.

  • PAKDD Workshops - Mining Uncertain Sequence Data on Hadoop Platform
    Lecture Notes in Computer Science, 2014
    Co-Authors: Ziyun Sun, Mingche Tsai, Hsiaoping Tsai
    Abstract: see the identical abstract under Hsiaoping Tsai above.

Pofan Wang - One of the best experts on this subject based on the ideXlab platform.

  • Optimizing the Cloud Platform Performance for Supporting Large-Scale Cognitive Radio Networks
    IEEE Wireless Communications and Networking Conference (WCNC), 2012
    Co-Authors: Shieyuan Wang, Pofan Wang, Piyang Chen
    Abstract: see the identical abstract under Piyang Chen above.